CN114626890A - Abnormal user detection method based on graph structure learning - Google Patents
- Publication number: CN114626890A (application CN202210275577.8A)
- Authority
- CN
- China
- Prior art keywords: graph, graph structure, learning, node, matrix
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06Q30/0201: Market modelling; Market analysis; Collecting market data
- G06N3/045: Combinations of networks
- G06N3/084: Backpropagation, e.g. using gradient descent
- G06Q30/018: Certifying business or products
Abstract
The invention discloses an abnormal user detection method based on graph structure learning. A graph neural network training layer builds the graph neural network model through a model definition method and learns low-dimensional vector representations of the nodes, while a model optimization method defines several constraint functions to jointly learn the model weights and the graph structure features, enhancing the robustness of the model. The method learns multiple graph structures from the node representations to mine latent node information, improves the quality of the node low-dimensional vectors through an attention mechanism, and greatly improves the accuracy of abnormal user detection when the sample classes are imbalanced.
Description
Technical Field
The invention relates to the application of graph neural networks to abnormal user detection, and in particular to an abnormal user detection method based on graph structure learning.
Background
In the e-commerce field, abnormal transactions such as fraudulent and false transactions occur frequently, and every year a great number of people suffer economic losses from them. For an e-commerce enterprise, using settled order data and data mining to detect abnormal transaction behavior in advance, and blocking the transaction before or as it happens, safeguards users' property and greatly reduces the damage caused by abnormal transactions.
Graph neural network models for abnormal user detection already exist. A distinctive feature of the task is that the training samples are extremely imbalanced: normal users usually form the overwhelming majority. Current graph neural network models for the task typically use negative sampling or data augmentation. Negative sampling makes the proportions of normal and abnormal users roughly equal by discarding normal users, which fails to make full use of valuable normal data; data augmentation learns the characteristics of abnormal users and synthesizes new ones until the proportion of abnormal users is roughly equal to that of normal users. A new method is therefore needed that improves the accuracy of abnormal user detection by improving the robustness of the model.
Disclosure of Invention
To address the difficulty of anomaly detection, the invention provides an abnormal user detection method based on graph structure learning.
A method for detecting abnormal users based on graph structure learning comprises the following steps:
S1, capturing user transaction behavior data and converting it into a graph structure;
S2, training the graph neural network and learning the model weight coefficients until the objective function converges;
S3, generating multiple graph structures to express multiple kinds of information;
S4, detecting new user transaction behavior data with the trained graph neural network model.
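As a minimal sketch of the conversion in step S1, the snippet below turns transaction pairs into a symmetric adjacency matrix. The function name and the pair-list input format are illustrative assumptions, not specified by the patent; node features and labels would be attached separately.

```python
import numpy as np

def transactions_to_adjacency(edges, n_users):
    """Build a symmetric adjacency matrix from (user_i, user_j)
    transaction pairs: each pair of interacting users gets an edge."""
    A = np.zeros((n_users, n_users))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return A
```

In practice the edge list would be extracted from cleaned user logs, as the data reading layer described later does.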
S2 includes a graph neural network model definition method, which defines a graph convolution layer for each of several graph structures and generates low-dimensional vector representations of the nodes through graph convolution networks. An attention fusion method is built for the characteristics of the abnormal user detection task: attention is paid only to the important low-dimensional vectors, so as to generate a minimal sufficient vector.
S21, the graph convolution layer is defined as follows:
H(k)=Relu(SH(k-1)W(k)),
representing the processed adjacency matrix S, which is suitable for convolution operations, wherein D represents the degree matrix of the adjacency matrix S,represents the adjacency matrix S plus the identity matrix I; h(k)Represents the data characteristics of the k-th layer graph convolution network, where H(0)The original data characteristics; w is a group ofkRepresenting weight coefficients of a k-th layer graph convolution network, wherein a Relu () function represents a nonlinear activation function; for graph convolution of l layers, the l-th layer uses softmax () activation function and the prediction matrix Z ═ Hl。
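The layer definition above can be sketched directly; this assumes the standard symmetric normalization S = D^(-1/2)(A + I)D^(-1/2) that the text describes, with the function names chosen here for illustration.

```python
import numpy as np

def normalize_adjacency(A):
    """S = D^{-1/2} (A + I) D^{-1/2}: add self-loops, then
    symmetrically normalize by the degree matrix D of A + I."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return (A_hat * d_inv_sqrt[:, None]) * d_inv_sqrt[None, :]

def gcn_layer(S, H, W):
    """One graph convolution layer: H^(k) = ReLU(S H^(k-1) W^(k))."""
    return np.maximum(S @ H @ W, 0.0)
```

Stacking l such layers and replacing the final ReLU with a row-wise softmax yields the prediction matrix Z = H^(l).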
Three graph structures are learned in the graph structure learning layer and applied to the graph convolution layers. The prediction matrix of the n nodes under the u-th graph structure is written Z^u ∈ R^(n×c), where Z^u_ic is the probability that node i belongs to class c under that structure, u ∈ {A, S_f, S_d, S_s}, and n is the number of nodes.
notably, the difference between the proposed graph convolution model and the mainstream graph convolution model is the challenge of dealing with the anomaly detection task by learning a variety of graph structures.
S22, for the four graph convolutions, an attention fusion method fuses the four learned low-dimensional vectors of each node into the single low-dimensional vector representation most beneficial to anomaly detection.
The attention fusion method is defined as follows.
First, define the importance coefficient of node i in the node low-dimensional vector Z^g generated by graph structure g, in terms of the largest and second-largest entries of the vector of node i in Z^g (both easy to obtain) and an artificial parameter λ ∈ (0,1). Notably, the attention fusion method has two advantages: 1. it accelerates the training of the graph neural network model, because if the prediction Z of some graph structure has a high maximum value and a large gap between its largest and second-largest entries, the method captures this and uses it to guide the optimization of the model parameters; 2. it requires no new weight coefficients, which effectively alleviates overfitting. The importance coefficients of node i under the other graph structures are obtained in the same way.
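The importance coefficient's exact formula is not reproduced in the text, so the following is a hypothetical form that matches the description: it rewards a high maximum entry and a large gap between the largest and second-largest entries. The function name and the precise combination are assumptions.

```python
import numpy as np

def importance(z_i, lam=0.5):
    """Hypothetical importance coefficient of node i under one graph
    structure: a mix of the top prediction value and the top-1/top-2
    margin, as the surrounding text describes. lam is the artificial
    parameter lambda in (0, 1)."""
    top = np.sort(z_i)[::-1]                      # entries in descending order
    z_max, z_sec = top[0], top[1]
    return lam * z_max + (1.0 - lam) * (z_max - z_sec)
```

Under this form a confident, well-separated prediction row scores higher than an ambiguous one, which is the behavior the text attributes to the coefficient.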
S23, obtaining a final prediction matrix of the graph neural network model based on the importance coefficient:
where R and K both represent the number of graph structures, ε ∈ (0,1) is an artificial parameter, and Z ∈ Rn*c. The attention fusion framework is not only applicable to the graph convolution form, but also to the fusion between any multiple low-dimensional node vectors generated by different graph structures as a low-dimensional vector fusion framework.
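A hedged sketch of the fusion step: each structure's prediction row is weighted by a hypothetical importance coefficient (top value plus top-1/top-2 margin, per the description) and the weights are plainly normalized per node. The patent's exact weighting with the parameter ε is not reproduced in the text, so this is one plausible instantiation.

```python
import numpy as np

def importance(z_i, lam=0.5):
    """Hypothetical importance: top value plus top-1/top-2 margin."""
    top = np.sort(z_i)[::-1]
    return lam * top[0] + (1.0 - lam) * (top[0] - top[1])

def fuse_predictions(Z_list, lam=0.5):
    """Fuse the per-structure prediction matrices row by row: weight each
    structure's row for node i by its normalized importance and sum."""
    Z = np.zeros_like(Z_list[0])
    for i in range(Z.shape[0]):
        w = np.array([importance(Zg[i], lam) for Zg in Z_list])
        Z[i] = (w / w.sum()) @ np.stack([Zg[i] for Zg in Z_list])
    return Z
```

Because each fused row is a convex combination of probability rows, the result stays a valid prediction matrix while leaning toward the more confident structures.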
S2 also includes a graph neural network model optimization method. It obtains the objective function by defining several constraint functions, optimizes the graph neural network model by back propagation under the guidance of the training set, and learns the prediction matrix for anomaly detection, thereby enhancing the robustness of the model. Specifically:
S201, define a consistency constraint function: reducing the loss L_u increases the similarity of the three prediction matrices and strengthens the commonality among them.
S202, defining an independence constraint function:
n is the number of the prediction matrix Z; matrix arrayWherein I is an n-order identity matrix, and the personality of the eigenvector Z can be amplified by calculating a matrix G; through LdTo enhance the difference between the generated graph structure and the original adjacency matrix a, ensuring that useful information with individuality can be captured.
S203, optimizing the weight coefficient W of the neural network model of the graph by defining a cross entropy loss function,assuming that the training set is L, the true label for each node L ∈ L is YlThe prediction label is a prediction matrix Z epsilon Rn*cThe nodes in all training sets are expressed as:
S204, combining the anomaly detection task with the constraints gives the objective function:
L = L_t + τL_u + υL_d,
where τ, υ ∈ (0,1) are artificial parameters weighting the consistency and independence constraint functions. Under the guidance of the training set, the graph neural network model is optimized by back propagation, and low-dimensional vector representations of the nodes for anomaly detection are learned.
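The objective L = L_t + τL_u + υL_d can be sketched as follows. The cross-entropy L_t follows the text; the exact forms of L_u and L_d are not reproduced there, so mean pairwise squared distance and a negated distance from A are used as assumed stand-ins with the described effects (pulling predictions together, pushing structures away from A).

```python
import numpy as np

def cross_entropy(Z, y, train_idx):
    """L_t: cross-entropy over labeled training nodes; rows of Z are
    softmax probabilities, y holds integer class labels."""
    return -np.mean(np.log(Z[train_idx, y[train_idx]] + 1e-12))

def consistency_loss(Z_list):
    """L_u (assumed form): mean pairwise squared distance between the
    prediction matrices; shrinking it pulls them together."""
    pairs = [(i, j) for i in range(len(Z_list)) for j in range(i + 1, len(Z_list))]
    return np.mean([np.mean((Z_list[i] - Z_list[j]) ** 2) for i, j in pairs])

def independence_loss(S_list, A):
    """L_d (assumed form): negative mean squared distance from the
    original adjacency A; minimizing it pushes each learned structure
    away from A."""
    return -np.mean([np.mean((S - A) ** 2) for S in S_list])

def total_loss(Z_list, S_list, A, y, train_idx, tau=0.5, ups=0.5):
    """L = L_t + tau * L_u + ups * L_d, as in the patent's objective."""
    return (cross_entropy(Z_list[0], y, train_idx)
            + tau * consistency_loss(Z_list)
            + ups * independence_loss(S_list, A))
```

Back propagation on this scalar would then update both the layer weights W and the learned structures, as described.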
S3 includes a graph structure learning method comprising feature-based graph structure learning, walk-based graph structure learning, and subgraph-based graph structure learning.
S31, the graph structure based on the characteristics SfThe learning method comprises the following steps:
firstly, fixing the weight coefficient W of the graph neural network, and taking out the vector representation H of the node as H ═ H0,H1,…,HlS to construct a feature map Sf={F0,F1,…,FlIn which FkIs the eigenvector H passing through the k-th layerkThe generated adjacency matrix based on the characteristics characterizes the similarity of k-th order neighbors, and the calculation formula is as follows:
obtaining an adjacency matrix F of each layer through the formula, wherein alpha and beta epsilon (0,1) are artificial parameters;
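Since the formula for F^k is not reproduced in the text, one plausible instantiation uses cosine similarity between the layer-k feature vectors, thresholded at α (the role of β is omitted in this sketch); the function name is illustrative.

```python
import numpy as np

def feature_graph(H_k, alpha=0.3):
    """Hypothetical F^k: cosine similarity between the layer-k feature
    vectors of every node pair, with entries below the threshold alpha
    zeroed out and self-loops removed."""
    Hn = H_k / (np.linalg.norm(H_k, axis=1, keepdims=True) + 1e-12)
    F = Hn @ Hn.T                 # pairwise cosine similarity
    F[F < alpha] = 0.0            # sparsify weak similarities
    np.fill_diagonal(F, 0.0)
    return F
```

Applied to each H^k in turn, this produces one similarity-based adjacency matrix per layer, i.e. per neighbor order.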
Notably, this calculation updates the feature vectors and the graph structure through iterative optimization: good feature vectors produce a graph structure that matches objective reality, and a graph structure closer to the ground truth is in turn more conducive to generating low-dimensional vector representations suited to the anomaly detection task; the two are optimized alternately until convergence.
S32, graph structure S based on wanderingdThe learning method comprises the following steps:
the wandering-based graph structure SdGenerated by the random walk of the nodes of the original adjacency matrix A and contains the global information of the graph structure, and therefore passes through the graph structure SdThe learned prediction matrix expresses global information of the data. The calculation formula is as follows:
here, α represents a transition probability in random walk, τ represents a walk cost, and the value becomes larger with the number of walks. In the task of abnormality detection, by learning a graph structure S having global informationdThe accuracy of the prediction matrix is improved.
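A hedged sketch of one common random-walk construction consistent with this description: a truncated diffusion of the transition matrix, where the restart probability plays the role of the transition parameter α and the geometric factor (1 − α)^t acts as a walk cost that grows with the number of steps. The patent's exact formula is not reproduced in the text.

```python
import numpy as np

def walk_graph(A, alpha=0.15, steps=4):
    """Hypothetical S_d: truncated random-walk diffusion
    S_d = sum_t alpha * (1 - alpha)^t * P^t over `steps` walk lengths."""
    deg = A.sum(axis=1)
    deg[deg == 0] = 1.0                      # avoid division by zero for isolated nodes
    P = A / deg[:, None]                     # row-stochastic transition matrix
    S, P_t, coef = np.zeros_like(P), np.eye(A.shape[0]), alpha
    for _ in range(steps):
        S += coef * P_t
        P_t = P_t @ P
        coef *= (1.0 - alpha)
    return S
```

Longer walks contribute with geometrically decaying weight, so S_d mixes local and global connectivity in one dense matrix.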
S33, the graph structure S based on the subgraphsThe learning method comprises the following steps:
for the original graph adjacency matrix A, a subgraph S is generated by randomly reserving a certain edgesAnd sub-graph SsPutting the graph volume layer into the graph volume layer to learn a prediction matrix Zs=f(X,Ss) Through ZsEvaluation subgraph SsThe probability of connecting edges between each pair of nodes in the set. The probability of connecting edges between the node i and the target node j is as follows:
where W iss∈R2c*1Representation application graph structure SsA mapping vector of bs∈R2c*1A vector of the offset is represented, and,representing node i application graph structure SsIn order to save space and time, only the limited range K of the node i is consideredsAnd (c) neighbor nodes in the set, wherein k is an artificial parameter.
After the node edge probabilities ρ_s have been computed, the graph structure S_s becomes:
S_s = S_s + μ_s ρ_s,
where μ_s ∈ (0,1) is an artificial parameter.
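A minimal sketch of the edge-probability evaluation and of the update S_s = S_s + μ_s ρ_s. The concatenate-and-sigmoid scoring is an assumption consistent with the stated shapes of W_s and b_s (the bias is treated here as a scalar for simplicity); the function names are illustrative.

```python
import numpy as np

def edge_probability(z_i, z_j, W_s, b_s):
    """rho_s(i, j): score the concatenated predictions of nodes i and j
    (a 2c-dimensional vector) with the mapping vector W_s and bias b_s,
    squashed into (0, 1) by a sigmoid."""
    x = np.concatenate([z_i, z_j])
    return 1.0 / (1.0 + np.exp(-(x @ W_s + b_s)))

def update_subgraph(S_s, rho, mu_s=0.5):
    """The patent's update rule: S_s <- S_s + mu_s * rho."""
    return S_s + mu_s * rho
```

In a full implementation ρ would only be evaluated for pairs within the k-hop range K_s of each node, as the text specifies.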
In the abnormal user detection method based on graph structure learning, abnormal users are judged by a graph neural network. Because most nodes belong to the non-abnormal class and the sample classes are severely imbalanced, the method re-expresses the adjacency matrix of the original graph through a graph structure learning model and improves robustness against edge attacks by learning information from multiple aspects. To fuse the various kinds of information into the minimal sufficient information most helpful for identifying abnormal nodes, the invention proposes an attention-based fusion method: importance is computed and the low-dimensional vector representations learned from each aspect are fused by weight, which reduces the noise in the final prediction matrix and improves the discrimination of abnormal nodes. For model optimization, several constraint functions are proposed that update the graph structure learning and the node vector representations in a direction beneficial to the anomaly detection task, reducing the negative effect of sample class imbalance on the graph convolution network.
Drawings
Fig. 1 is a flowchart of an abnormal user detection method according to a first embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example one
The embodiment provides an abnormal user detection system and method based on graph structure learning.
The data reading layer acquires transaction data of normal and abnormal users and converts it into graph structures; it comprises a data cleaning module, a data labeling module, and a data partitioning module. The data cleaning module examines the user log data, handles invalid and missing values, deletes duplicate information, corrects existing errors, and ensures data consistency; user information is extracted from the cleaned logs to form the nodes of a graph, and the relationships between users form the edges, thereby constructing the graph. The data labeling module labels known data by class, marking normal and abnormal users. The data partitioning module divides the labeled data into training, validation, and test sets in a 1:1:3 ratio.
The graph neural network training layer learns the parameters of the graph neural network model so that it can identify abnormal users; it comprises a model definition module and a model optimization module. The model definition module defines a graph convolution layer for each of several graph structures to learn low-dimensional node representations, and fuses the vectors learned by each graph convolution network by importance through an attention fusion mechanism, markedly reducing the probability of detection errors. The model optimization module optimizes the model parameters by defining a loss function, improving the identification of abnormal users.
The graph structure learning layer relearns the topology of the graph data: by defining several graph structure learning models, the original adjacency matrix is learned into several adjacency matrices that express different kinds of information, making the topology of the graph more objective and correct and markedly improving detection performance. The layer mines latent link relations from node features, fuses several more objective and correct graph structures, and reduces the negative influence of structural noise on the graph neural network.
The abnormal user detection layer handles human-machine interaction for the system and comprises a data definition and transmission module, a system management module, a file data management module, and an anomaly detection result display module. The data definition and transmission module provides an interactive interface for data cleaning, labeling, partitioning, and similar tasks; the system management module manages the system and its daily operation and maintenance; the result display module provides a visual interface for the abnormal user detection results.
The detection method comprises the following steps:
S1, capturing user transaction behavior data through the data reading layer and converting it into a graph structure;
S2, training the graph neural network on the transaction behavior data and learning the model weight coefficients until the objective function converges;
S3, generating multiple graph structures through the graph structure learning layer to express multiple aspects of the information, making the topology of the graph more objective and correct and improving the robustness of the graph neural network model;
S4, detecting new user transaction behavior data at the abnormal user detection layer with the trained graph neural network model.
Step S2 comprises the graph neural network model definition method and the graph neural network model optimization method of the graph neural network training layer.
The model definition method defines a graph convolution layer for each of the several graph structures and generates low-dimensional vector representations of the nodes through the graph convolution networks. An attention fusion method is built for the characteristics of the abnormal user detection task: attention is paid only to the important low-dimensional vectors, so as to generate a minimal sufficient vector.
The graph convolution layer is defined as follows:
H^(k) = ReLU(S H^(k-1) W^(k)),
where S = D^(-1/2) (A + I) D^(-1/2) is the processed adjacency matrix suitable for the convolution operation, D is the degree matrix of A + I, and A + I is the adjacency matrix plus the identity matrix I; H^(k) is the data-feature matrix of the k-th graph convolution layer, with H^(0) the original data features; W^(k) is the weight matrix of the k-th layer, and ReLU() is the nonlinear activation function. For a graph convolution of l layers, the l-th layer uses the softmax() activation function, and the prediction matrix is Z = H^(l).
Three graph structures are learned in the graph structure learning layer and applied to the graph convolution layers. The prediction matrix of the n nodes under the u-th graph structure is written Z^u ∈ R^(n×c), where Z^u_ic is the probability that node i belongs to class c under that structure, u ∈ {A, S_f, S_d, S_s}, and n is the number of nodes.
For the four graph convolutions, an attention fusion method fuses the four learned low-dimensional vectors of each node into the single low-dimensional vector representation most beneficial to anomaly detection.
The attention fusion method is defined as follows.
First, define the importance coefficient of node i in the node low-dimensional vector Z^g generated by graph structure g, in terms of the largest and second-largest entries of the vector of node i in Z^g (both easy to obtain) and an artificial parameter λ ∈ (0,1).
Finally, the prediction matrix of the final graph neural network model is obtained from the importance coefficients, where R and K both denote the number of graph structures, ε ∈ (0,1) is an artificial parameter, and Z ∈ R^(n×c).
The graph neural network model optimization method obtains the objective function by defining several constraint functions, optimizes the graph neural network model by back propagation under the guidance of the training set, and learns low-dimensional vector representations of the nodes for anomaly detection, enhancing the robustness of the model.
A consistency constraint function is defined to strengthen the commonality among the three prediction matrices: reducing the loss L_u increases their similarity. An independence constraint function is defined to increase the difference between each generated graph structure and the original adjacency matrix, ensuring that they capture different information:
where n is the order of the prediction matrix Z; the matrix G is built from Z and the n-order identity matrix I, and computing G amplifies the individuality of the feature matrix Z. L_d thus strengthens the difference between each generated graph structure and the original adjacency matrix A, ensuring that distinctive useful information can be captured.
The weight coefficients W of the graph neural network model are optimized by defining a cross-entropy loss function. Let the training set be L; the true label of each node l ∈ L is Y_l, and the predicted label comes from the prediction matrix Z ∈ R^(n×c). Over all nodes in the training set this is expressed as:
combining the abnormal detection task and the constraint condition to obtain the following objective function:
L=Lt+τLu+υLd,
here, τ and υ e (0,1) are artificial parameters of the coherence constraint function and the independence constraint function. Under the guidance of a training set, the graph neural network model is optimized through back propagation, and low-dimensional vector representations of nodes for anomaly detection are learned.
Step S3 further comprises the graph structure learning method of the graph structure learning layer, which includes feature-based, walk-based, and subgraph-based graph structure learning.
In particular, the feature-based graph structure S_f is learned as follows:
First, fix the weight coefficients W of the graph neural network and take out the node vector representations H = {H^0, H^1, …, H^l} to construct the feature graphs S_f = {F^0, F^1, …, F^l}, where F^k is the feature-based adjacency matrix generated from the layer-k feature matrix H^k; it characterizes the similarity of k-th-order neighbors. The calculation yields the adjacency matrix F of each layer, where α, β ∈ (0,1) are artificial parameters.
In particular, the walk-based graph structure S_d is learned as follows:
S_d is generated by random walks over the nodes of the original adjacency matrix A and contains global information about the graph structure, so the prediction matrix learned through S_d expresses global information about the data. In the calculation, α denotes the transition probability of the random walk and τ denotes the walk cost, whose value grows with the number of steps. In the anomaly detection task, learning the graph structure S_d with global information improves the accuracy of the prediction matrix.
In particular, the subgraph-based graph structure S_s is learned as follows:
From the original adjacency matrix A, generate a subgraph S_s by randomly retaining some of the edges, put S_s into the graph convolution layers to learn the nodes' prediction matrix Z_s = f(X, S_s), and use Z_s to evaluate the edge probability between every node pair in S_s. The edge probability between node i and a target node j is computed from the predictions of nodes i and j under S_s, where W_s ∈ R^(2c×1) is the mapping vector applied to graph structure S_s and b_s ∈ R^(2c×1) is the bias vector. To save space and time, only the neighbor nodes within the limited k-hop range K_s of node i are considered, where k is an artificial parameter.
After the node edge probabilities ρ_s have been computed, the graph structure S_s becomes:
S_s = S_s + μ_s ρ_s,
where μ_s ∈ (0,1) is an artificial parameter.
Therefore, while the foregoing is directed to embodiments of the present invention, other and further embodiments of the invention may be devised without departing from the basic scope thereof, and the scope thereof is determined by the claims that follow.
Claims (4)
1. An abnormal user detection method based on graph structure learning, characterized by comprising the following steps:
S1, capturing user transaction behavior data and converting it into a graph structure;
S2, training the graph neural network and learning the model weight coefficients until the objective function converges;
S3, generating multiple graph structures to express multiple kinds of information;
S4, detecting new user transaction behavior data with the trained graph neural network model.
2. The abnormal user detection method based on graph structure learning according to claim 1,
in S2, the graph neural network model is defined as follows:
S21, defining a graph convolution layer:
H^(k) = ReLU(S H^(k-1) W^(k)),
where S = D^(-1/2) Â D^(-1/2) is the processed adjacency matrix suitable for the convolution operation, D is the degree matrix of Â, and Â = A + I is the adjacency matrix A plus the identity matrix I; H^(k) is the data feature of the k-th graph convolution layer, with H^(0) the original data features; W^(k) is the weight matrix of the k-th graph convolution layer, and ReLU() is the nonlinear activation function; for a graph convolution network of l layers, the l-th layer uses the softmax() activation function, and the prediction matrix is Z = H^(l);
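A minimal numpy sketch of the layer defined above, under the standard graph convolution reading in which S is the symmetrically normalized adjacency matrix with self-loops; the function names and toy dimensions are illustrative, not part of the claim.

```python
import numpy as np

def normalize_adjacency(A):
    """S = D^(-1/2) (A + I) D^(-1/2): self-loops plus symmetric degree
    normalization, as in the standard graph convolution layer."""
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def relu(x):
    return np.maximum(x, 0.0)

def softmax(x):
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def gcn_forward(A, X, weights):
    """H^(k) = ReLU(S H^(k-1) W^(k)) for the hidden layers; the last layer
    uses softmax, giving the prediction matrix Z = H^(l)."""
    S = normalize_adjacency(A)
    H = X
    for W in weights[:-1]:
        H = relu(S @ H @ W)
    return softmax(S @ H @ weights[-1])  # rows of Z are class distributions
```

Because Â includes self-loops, every node degree is at least 1, so the inverse square root is always defined.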
3 graph structures are learned in the graph structure learning layer and applied to the graph convolution layer; the prediction matrix of the n nodes under the u-th graph structure is denoted Z^u ∈ R^(n*c), where Z^u_ic is the probability that node i belongs to class c under the u-th graph structure, u ∈ {A, S_f, S_d, S_s}, and n is the number of nodes;
S22, for the 4 graph convolutions, fusing the 4 learned low-dimensional vectors of each node into the single low-dimensional vector representation most beneficial to anomaly detection, using an attention fusion method;
the attention fusion method is defined as follows:
first, the importance coefficient of node i in the low-dimensional node representation Z_g generated by graph structure g is defined:
where the two terms are, respectively, the maximum and the second-largest entry of node i's vector in the low-dimensional representation Z_g, and λ ∈ (0,1) is an artificial parameter;
S23, obtaining the prediction matrix of the final graph neural network model based on the importance coefficients:
where r and K both denote the number of graph structures, ε ∈ (0,1) is an artificial parameter, and Z ∈ R^(n*c).
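The importance-coefficient and fusion equations are not reproduced in this text, so the sketch below assumes a common confidence-margin form (the gap between each node's largest and second-largest prediction entries, scaled by λ) and a per-node normalized weighted sum; the role of ε in the final combination is not recoverable here and is omitted.

```python
import numpy as np

def importance(Z_g, lam=0.5):
    """Assumed importance of node i under structure g: lam times the gap
    between the largest and second-largest entries of its prediction vector."""
    top2 = np.sort(Z_g, axis=1)[:, -2:]      # per row: [second max, max]
    return lam * (top2[:, 1] - top2[:, 0])

def fuse_predictions(Z_list, lam=0.5):
    """Per-node attention fusion: weight each structure's prediction for
    node i by its normalized importance coefficient, then sum."""
    W = np.stack([importance(Z, lam) for Z in Z_list])   # shape (K, n)
    W = W / np.maximum(W.sum(axis=0, keepdims=True), 1e-12)
    return sum(w[:, None] * Z for w, Z in zip(W, Z_list))
```

Because each node's weights over the K structures sum to 1, fusing row-stochastic prediction matrices yields another row-stochastic matrix.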
3. The abnormal user detection method based on graph structure learning according to claim 1 or 2,
the S2 further includes a method for optimizing the graph neural network model:
S201, defining a consistency constraint function:
by reducing the loss function L_u, the similarity of the three prediction matrices is increased and the commonality among the prediction matrices is strengthened;
S202, defining an independence constraint function:
where n is the number of rows of the prediction matrix Z; the matrix G, in which I is the identity matrix of order n, amplifies the individuality of the feature representation Z; through L_d, the difference between each generated graph structure and the original adjacency matrix A is enlarged, so that structure-specific useful information can be captured;
S203, defining a cross-entropy loss function to optimize the weight coefficients W of the graph neural network model; assuming the training set is L, the true label of each node l ∈ L is Y_l, and the predicted label comes from the prediction matrix Z ∈ R^(n*c), the loss over all training-set nodes is expressed as:
S204, combining the anomaly detection task and the constraint terms, the following objective function is obtained:
L = L_t + τL_u + υL_d,
where τ, υ ∈ (0,1) are the artificial parameters of the consistency constraint function and the independence constraint function, respectively.
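The three loss terms can be sketched as follows. The printed equations for L_u and L_d are not reproduced in this text, so the concrete forms below (mean pairwise squared Frobenius distance for the consistency term, negated distance to A for the independence term) are assumptions chosen only to match the stated behavior: minimizing L_u pulls the prediction matrices together, and minimizing L_d pushes the generated structures away from A.

```python
import numpy as np

def cross_entropy(Z, y, train_idx):
    """L_t: cross-entropy over the labelled training nodes."""
    p = Z[train_idx, y[train_idx]]
    return -np.log(np.clip(p, 1e-12, None)).mean()

def consistency(Z_list):
    """L_u (assumed form): mean pairwise squared Frobenius distance
    between the prediction matrices."""
    total, pairs = 0.0, 0
    for i in range(len(Z_list)):
        for j in range(i + 1, len(Z_list)):
            total += np.sum((Z_list[i] - Z_list[j]) ** 2)
            pairs += 1
    return total / max(pairs, 1)

def independence(S_gen, A):
    """L_d (assumed form): negated mean distance to the original adjacency
    matrix, so minimizing L_d enlarges the difference from A."""
    return -np.mean([np.sum((S - A) ** 2) for S in S_gen])

def objective(Z, Z_list, S_gen, A, y, train_idx, tau=0.5, upsilon=0.5):
    """L = L_t + tau * L_u + upsilon * L_d, with tau, upsilon in (0, 1)."""
    return (cross_entropy(Z, y, train_idx)
            + tau * consistency(Z_list)
            + upsilon * independence(S_gen, A))
```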
4. The abnormal user detection method based on graph structure learning according to claim 1,
in S3, a graph structure learning method is included; the graph structure learning method comprises feature-based graph structure learning, walk-based graph structure learning, and subgraph-based graph structure learning;
S31, the feature-based graph structure S_f is learned as follows:
first, the weight coefficients W of the graph neural network are fixed and the node vector representations H = {H^(0), H^(1), …, H^(l)} are taken to construct the feature graphs S_f = {F^(0), F^(1), …, F^(l)}, where F^(k) is the feature-based adjacency matrix generated from the k-th layer feature vectors H^(k); it characterizes the similarity of k-th order neighbors, and its calculation formula is as follows:
the adjacency matrix F of each layer is obtained through the above formula, where α, β ∈ (0,1) are artificial parameters;
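The similarity formula for F^(k) is not reproduced in this text; a common instantiation, sketched below under that assumption, scores node pairs by cosine similarity of their layer-k features and keeps only pairs above a threshold α (β, which the claim also names, is omitted because its role is not recoverable here).

```python
import numpy as np

def feature_graph(H, alpha=0.5):
    """F^(k) (assumed form): cosine similarity between the layer-k feature
    vectors, keeping only pairs whose similarity exceeds alpha."""
    norms = np.linalg.norm(H, axis=1, keepdims=True)
    Hn = H / np.clip(norms, 1e-12, None)
    F = Hn @ Hn.T
    np.fill_diagonal(F, 0.0)            # no self-similarity edges
    return np.where(F > alpha, F, 0.0)

def feature_graphs(H_list, alpha=0.5):
    """S_f = {F^(0), ..., F^(l)}: one feature-based adjacency per layer."""
    return [feature_graph(H, alpha) for H in H_list]
```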
S32, the walk-based graph structure S_d is learned as follows:
the walk-based graph structure S_d is generated by random walks of the nodes over the original adjacency matrix A and contains the global information of the graph structure; the prediction matrix learned through S_d therefore expresses the global information of the data; the calculation formula is as follows:
where α represents the transition probability in the random walk and τ represents the walk cost, whose value grows with the number of walk steps; in the anomaly detection task, learning the graph structure S_d with global information improves the accuracy of the prediction matrix;
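The walk formula for S_d is likewise not reproduced in this text; the sketch below assumes a common truncated random-walk diffusion in which P = D^(-1)A is the transition matrix and longer walks contribute with geometrically shrinking weight, standing in for the growing walk cost τ. The function name, parameters, and step count are illustrative.

```python
import numpy as np

def walk_graph(A, alpha=0.15, steps=4):
    """S_d (assumed form): truncated random-walk diffusion. P = D^(-1) A is
    the row-stochastic transition matrix; each extra step is discounted, and
    the sum mixes multi-hop (global) connectivity into one matrix."""
    d = A.sum(axis=1)
    P = A / np.clip(d[:, None], 1e-12, None)
    S_d = np.zeros_like(A, dtype=float)
    P_t = np.eye(A.shape[0])
    for t in range(1, steps + 1):
        P_t = P_t @ P                       # t-step transition probabilities
        S_d += alpha * (1 - alpha) ** (t - 1) * P_t
    return S_d
```

Each row of S_d then carries total mass 1 - (1 - α)^steps, distributed over nodes reachable within `steps` hops, which is how nodes with no direct edge still become connected in S_d.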
S33, the subgraph-based graph structure S_s is learned as follows:
for the original adjacency matrix A, a subgraph S_s is generated by randomly retaining a subset of edges, and the subgraph S_s is fed into the graph convolution layer to learn the node prediction matrix Z_s = f(X, S_s); Z_s is then used to evaluate the edge probability between each node pair; the edge probability between node i and target node j is:
where W_s ∈ R^(2c*1) is the mapping vector applied to graph structure S_s, b_s ∈ R^(2c*1) is the offset vector, and the remaining term denotes the low-dimensional representation of node i under graph structure S_s; to save space and time, only the neighbor nodes within the limited range K_s of node i are considered, where k is an artificial parameter;
after the node edge probability ρ_s is computed, the graph structure S_s is updated as:
S_s = S_s + μ_s ρ_s,
where μ_s ∈ (0,1) is an artificial parameter.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210275577.8A CN114626890A (en) | 2022-03-21 | 2022-03-21 | Abnormal user detection method based on graph structure learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114626890A true CN114626890A (en) | 2022-06-14 |
Family
ID=81904048
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210275577.8A Pending CN114626890A (en) | 2022-03-21 | 2022-03-21 | Abnormal user detection method based on graph structure learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114626890A (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115545467A (en) * | 2022-09-30 | 2022-12-30 | 广东工业大学 | Risk commodity identification model based on graph neural network |
CN115545467B (en) * | 2022-09-30 | 2024-01-23 | 广东工业大学 | Risk commodity identification model based on graphic neural network |
CN116646072A (en) * | 2023-05-18 | 2023-08-25 | 肇庆医学高等专科学校 | Training method and device for prostate diagnosis neural network model |
CN116993433A (en) * | 2023-07-14 | 2023-11-03 | 重庆邮电大学 | Internet E-commerce abnormal user detection method based on big data |
CN116708029A (en) * | 2023-08-04 | 2023-09-05 | 烟台大学 | Method, system, equipment and storage medium for detecting abnormal nodes of blockchain |
CN116708029B (en) * | 2023-08-04 | 2023-10-20 | 烟台大学 | Method, system, equipment and storage medium for detecting abnormal nodes of blockchain |
CN117093928A (en) * | 2023-10-18 | 2023-11-21 | 南开大学 | Self-adaptive graph node anomaly detection method based on spectral domain graph neural network |
CN117520995A (en) * | 2024-01-03 | 2024-02-06 | 中国海洋大学 | Abnormal user detection method and system in network information platform |
CN117520995B (en) * | 2024-01-03 | 2024-04-02 | 中国海洋大学 | Abnormal user detection method and system in network information platform |
CN117910519A (en) * | 2024-03-20 | 2024-04-19 | 烟台大学 | Graph application method, system and recommendation method for generating evolutionary graph to fight against network |
CN117910519B (en) * | 2024-03-20 | 2024-06-07 | 烟台大学 | Recommendation method for generating countermeasure network by evolutionary graph |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20220614 |