WO2022088408A1 - Graph neural network-based transaction fraud detection method and system - Google Patents

Graph neural network-based transaction fraud detection method and system


Publication number
WO2022088408A1
Authority
WO
WIPO (PCT)
Prior art keywords
transaction
graph
behavior
data
neural network
Prior art date
Application number
PCT/CN2020/135271
Other languages
French (fr)
Chinese (zh)
Inventor
王欢
李青山
司华友
Original Assignee
南京博雅区块链研究院有限公司
北京大学
博雅正链(北京)科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 南京博雅区块链研究院有限公司, 北京大学, 博雅正链(北京)科技有限公司 filed Critical 南京博雅区块链研究院有限公司
Publication of WO2022088408A1 publication Critical patent/WO2022088408A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2323 Non-hierarchical techniques based on graph theory, e.g. minimum spanning trees [MST] or graph cuts
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q20/00 Payment architectures, schemes or protocols
    • G06Q20/38 Payment protocols; Details thereof
    • G06Q20/40 Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
    • G06Q20/401 Transaction verification
    • G06Q20/4016 Transaction verification involving fraud or risk level assessment in transaction processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange

Definitions

  • The invention relates to the field of financial technology, in particular to a transaction fraud detection method and system based on a graph neural network.
  • Transaction data refers to directed transactions between many transaction accounts. Because of scams, malware, terrorist organizations, ransomware, Ponzi schemes, and the like, some fraudulent transactions appear in the transaction network. These data are time series data, that is, sequences of transaction behavior over a period of time, so illegal and legitimate transactions must be classified in order to detect transaction fraud.
  • a transaction fraud detection method and system based on a graph neural network.
  • the present invention proposes an embodiment of a transaction fraud detection method based on a graph neural network, comprising the following steps:
  • the transaction data preprocessing step obtains transaction data and preprocesses the transaction data to obtain a panel-shaped transaction sample set;
  • the transaction behavior historical feature extraction step performs long short-term memory (LSTM) network processing on the transaction sample set to obtain the historical features of the transaction behavior;
  • the transaction behavior aggregation feature extraction step performs graph convolutional network processing on the transaction behavior historical features to obtain the transaction behavior aggregation features;
  • the prediction step processes the historical features and the aggregation features of the transaction behavior through a fully connected layer and performs fraud prediction of transaction nodes through binary classification.
  • the preprocessing includes the following sub-steps:
  • the spectral clustering sample labeling step is to perform spectral clustering sample labeling processing on the transaction sample set to obtain a spectral clustering transaction sample set.
  • the spectral clustering sample labeling process includes the following sub-steps:
  • the feature matrix is clustered.
  • the graph convolutional network processing includes the following sub-steps:
  • the adjacency matrix is input into a graph convolutional network with 2 to 4 graph learning layers for feature propagation among neighbors, and a nonlinear activation is applied to the output of each layer.
  • the transaction fraud detection system based on a graph neural network proposed by the present invention includes the following modules:
  • a transaction data preprocessing module which is used for acquiring transaction data and preprocessing the transaction data to obtain a panel-shaped transaction sample set
  • a transaction behavior historical feature extraction module which is used to perform long-short-term memory network processing on the transaction sample set to obtain transaction behavior historical features
  • the transaction behavior aggregation feature extraction module is configured to perform graph convolution network processing on the transaction historical behavior features to obtain transaction behavior aggregation features
  • a prediction module which is configured to perform full-connection layer processing on the historical characteristics of the transaction behavior and the aggregated characteristics of the transaction behavior, and perform fraud prediction of transaction nodes through binary classification.
  • the preprocessing includes the following sub-steps:
  • the graph neural network-based transaction fraud detection system further includes a spectral clustering sample labeling module, which is configured to perform spectral clustering sample labeling processing on the transaction sample set to obtain a spectral clustering transaction sample set.
  • the spectral clustering sample labeling process includes the following sub-steps:
  • the feature matrix is clustered.
  • the graph convolutional network processing includes the following sub-steps:
  • the adjacency matrix is input into a graph convolutional network with 2 to 4 graph learning layers for feature propagation among neighbors, and a nonlinear activation is applied to the output of each layer.
  • by extracting the historical features and the aggregated features of transaction behavior and then processing them through a fully connected layer, the above-mentioned transaction fraud detection method and system based on a graph neural network overcome the defects of traditional transaction fraud detection methods, which ignore both the relationships among the data and the fact that transaction behavior is time series data. This ensures comprehensive transaction fraud detection and improves its accuracy.
  • Fig. 1 is the flow chart of the transaction fraud detection method based on graph neural network of the present invention
  • Fig. 2 is the flow chart of transaction data preprocessing steps in the transaction fraud detection method based on the graph neural network of the present invention
  • Fig. 3 is the time variation curve of a local transaction node feature (transaction fee) after transaction data preprocessing in the transaction fraud detection method based on graph neural network of the present invention
  • Fig. 4 is the time change curve of the summary characteristic of the transaction node (the maximum value in the transaction fee of the local transaction node and its neighbor transaction node) after transaction data preprocessing in the transaction fraud detection method based on the graph neural network of the present invention
  • Fig. 5 is the illegal transaction graph formed by the transaction data marked as illegal transactions in the Bitcoin transaction data
  • Fig. 7 is a comparison diagram of the spectral clustering sample labeling result and the true binary-class distribution in the transaction fraud detection method based on graph neural network of the present invention.
  • Fig. 8 is a flow chart of parallelized construction of distance matrix in the method for detecting transaction fraud based on graph neural network of the present invention
  • FIG. 9 is a structural diagram of a transaction fraud detection system based on a graph neural network according to an embodiment of the present invention.
  • FIG. 10 is a data flow diagram of the transaction fraud detection system based on the graph neural network of the present invention.
  • the method and system for detecting transaction fraud based on a graph neural network of the present invention are described in detail by taking the digital currency transaction fraud detection of Bitcoin (BTC) as an example. It should be noted that the graph neural network-based transaction fraud detection method and system of the present invention can also be used in fraud detection of other transaction data, such as digital currency transaction data, traffic data, and stock data.
  • the transaction fraud detection method based on the graph neural network proposed by the present invention includes the following steps:
  • a transaction data preprocessing step: acquiring transaction data and preprocessing the transaction data to obtain a panel-shaped transaction sample set;
  • S500, a prediction step: performing fully connected layer processing on the historical features and the aggregated features of the transaction behavior, and performing fraud prediction of transaction nodes through binary classification.
  • by extracting the historical features and the aggregated features of transaction behavior and then processing them through a fully connected layer, the above-mentioned transaction fraud detection method based on a graph neural network overcomes the defects of traditional transaction fraud detection methods, which ignore both the relationships among the data and the fact that transaction behavior is time series data. This ensures comprehensive transaction fraud detection and improves its accuracy.
  • the transaction data preprocessing step is mainly used to collect and preprocess the transaction data required for transaction fraud detection, so that the preprocessed transaction sample set is in panel form, with connections between transaction nodes, and the samples constitute a dynamically changing transaction flow graph.
  • the transaction features at each time step are obtained through preprocessing, and the preprocessing includes the following sub-steps:
  • the Bitcoin real transaction data used is a transaction graph collected from the Bitcoin blockchain.
  • the data description of the transaction graph is as follows: a node in the graph represents a transaction, and an edge can be seen as the flow of Bitcoin between one transaction and another. It consists of 203769 nodes and 234355 edges. Among them, 2% of the nodes are marked as illegal nodes, 21% of the nodes are marked as legal transaction nodes, and the rest of the transactions are not marked.
  • each transaction node is associated with time information, where the time information refers to the estimated time when the Bitcoin network confirms the transaction.
  • the data are divided into 49 time steps of about 2 weeks each, covering about two years of Bitcoin transaction data.
  • within a time step, transaction nodes are connected only if the interval between their appearances on the blockchain is less than 3 hours; transaction nodes in different time steps are not connected by edges. The time interval here can be modified to other reasonable values.
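  • The time-step and edge rule described above can be sketched as follows. This is a minimal illustration under one reading of the text (an edge is kept only when both transactions fall in the same two-week step and their confirmation times are less than 3 hours apart). The function names, the `origin` parameter, and the constants are hypothetical and are not part of the original.

```python
from datetime import datetime, timedelta

STEP_LENGTH = timedelta(weeks=2)  # length of one time step (about 2 weeks in the text)
MAX_GAP = timedelta(hours=3)      # edge threshold; the text notes it can be changed


def time_step(ts, origin):
    """Index of the two-week time step that contains timestamp ts."""
    return (ts - origin) // STEP_LENGTH


def keep_edge(ts_a, ts_b, origin):
    """Keep an edge only if both transactions share a time step and
    their confirmation times are less than MAX_GAP apart (assumed reading)."""
    same_step = time_step(ts_a, origin) == time_step(ts_b, origin)
    return same_step and abs(ts_a - ts_b) < MAX_GAP


origin = datetime(2020, 1, 1)          # hypothetical start of the data
a = origin + timedelta(days=1)
b = a + timedelta(hours=2)             # same step, 2 h apart
c = a + timedelta(hours=5)             # same step, 5 h apart
print(keep_edge(a, b, origin), keep_edge(a, c, origin))
```

  • With 49 steps of 2 weeks each, the data span 98 weeks, which is consistent with the "about two years" stated above.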
  • the local transaction node features in step S110 represent transaction data of the local transaction node, such as time step, number of input transactions (node in-degree), number of output transactions (node out-degree), transaction fee, output amount, and derived statistical features.
  • the derived statistical features refer to some average features of neighboring nodes, such as the average BTC fee received by the number of input transactions, the average BTC fee received by the number of output transactions, the average BTC fee spent by the number of input transactions, and the average number of output transactions. BTC fees spent, average number of input/output transactions related to the number of input transactions (average number of input related transactions), average number of input/output transactions related to the number of output transactions (average number of output related transactions), etc.
  • the summary features of the transaction nodes in step S120 are obtained from the local transaction node features of the one-hop forward and/or backward neighbor transaction nodes of the local transaction node. That is, for all neighbor transaction nodes of the local transaction node, the values of the same local transaction node feature obtained in step S100 are processed, and descriptive statistics such as the maximum, minimum, median, mode, standard deviation, range, and correlation coefficient are taken as the summary features of the transaction node.
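  • As a minimal sketch of the one-hop aggregation described above, the following computes several of the listed descriptive statistics over the neighbors of a node. The function names and the toy data are hypothetical; the population standard deviation is an assumption, and the correlation coefficient is omitted because the text does not say which two quantities it relates.

```python
import statistics


def summary_features(values):
    """Descriptive statistics of one local feature over a node's one-hop neighbours."""
    return {
        "max": max(values),
        "min": min(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "std": statistics.pstdev(values),   # population std (assumption)
        "range": max(values) - min(values),
    }


def aggregate(node, edges, feature):
    """Collect the feature over forward and backward one-hop neighbours of node."""
    neighbours = {dst for src, dst in edges if src == node}
    neighbours |= {src for src, dst in edges if dst == node}
    return summary_features([feature[n] for n in sorted(neighbours)])


edges = [("a", "b"), ("c", "a"), ("a", "d")]        # directed toy transaction graph
fee = {"a": 1.0, "b": 2.0, "c": 2.0, "d": 5.0}      # hypothetical transaction fees
print(aggregate("a", edges, fee))
```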
  • Step S130 is to obtain the local topology information of a transaction node, which is obtained by calculating the spectral information of the graph of all transaction nodes radiating an appropriate number of layers with the local transaction node as the center.
  • the node characteristics of the transaction graph are described as follows: the time step is 2 weeks, with a total of 49 steps.
  • the first 93 node characteristics are the characteristics of local transaction nodes, which are the characteristics and transaction data of local transaction nodes, including time step, number of input transactions (node in-degree), number of output transactions (node out-degree), transaction fee, output amount and Derived statistical features.
  • the last 72 node features are the aggregated features of the transaction nodes: the maximum, minimum, median, mode, standard deviation, range, and correlation coefficient of the same feature parameter (a local transaction node feature) obtained from the forward and/or backward neighbor transaction nodes of the local (central) transaction node.
  • Figure 3 and Figure 4 are drawn to observe the change curve of transaction characteristics over time after the transaction data preprocessing.
  • Figure 3 is the time curve of a local transaction node feature (such as the transaction fee), and Figure 4 is the time curve of a transaction node summary feature (such as the maximum transaction fee among the local transaction node and its neighbor transaction nodes).
  • the figures show how three types of nodes change over time on two different attributes (local transaction node features and transaction node summary features). These two attributes distinguish legal transaction nodes from illegal transaction nodes well: the attribute curve of the legal transaction nodes, in the lower-middle part of each figure, is relatively stable over time, whereas the curve of the illegal transaction nodes, in the upper part, is more jagged and steeper.
  • after the transaction data preprocessing step S100 is completed, if the number of transaction samples with known classes is sufficient, the transaction behavior historical feature extraction of step S300 can be performed directly.
  • if the number of transaction samples with known classes is too small to detect transaction fraud accurately, spectral clustering sample labeling must first be performed to label the unlabeled transaction nodes, avoiding the problem of too few samples.
  • a spectral clustering sample labeling step performing spectral clustering sample labeling processing on the transaction sample set to obtain a spectral clustering transaction sample set.
  • the transaction sample set consists of 203,769 nodes and 234,355 edges. Of these, 2% of the transaction nodes are marked as illegal, 21% are marked as legal, and the remaining 77% are unmarked. Because the class of these transaction nodes is unknown, the present invention uses the unsupervised spectral clustering method to classify them, learning labels for the unknown transaction nodes so that they increase the sample size and can be used as data for subsequent training. Optionally, because the number of samples to be learned is large, parallelized spectral clustering can be used.
  • spectral clustering is used to label unlabeled nodes for samples.
  • Spectral clustering can overcome the defect that K-means clustering is affected by data shape, and is a globally optimal clustering method.
  • the main idea of spectral clustering is to regard the data as points in an n-dimensional space. Figure 5 and Figure 6 show the illegal transaction graph formed by the transaction data marked as illegal transactions and the legitimate transaction graph formed by the transaction data marked as legitimate transactions in the Bitcoin transaction data. Points with a certain similarity are connected by edges, and clustering is achieved by cutting the resulting graph into multiple subgraphs such that the sum of the edge weights within each subgraph is as high as possible.
  • the implementation connects the graph cut problem with the eigenvalue decomposition of the Laplacian matrix through the Rayleigh quotient, so that the NP-hard discrete problem is relaxed into a continuous eigenvalue problem.
  • the spectral clustering sample labeling process includes the following sub-steps:
  • K-Means clustering method can be selected.
  • the left side of the figure is the real distribution of the two classifications of the original data
  • the right side is the spectral clustering sample labeling result processed by the spectral clustering algorithm of the present invention.
  • the labeling produced by spectral clustering is very similar to the true binary-class distribution, indicating that the spectral clustering sample labeling of the present invention is highly accurate and can greatly improve the accuracy of the detection results.
  • the sample set data (x 1 , x 2 , .
  • the distance here can be measured using the Euclidean distance, the shortest path, or the geodesic distance, preferably the shortest path or the geodesic distance.
  • the geodesic distance is the length of the shortest path from point A to point B along a surface in three-dimensional space, without leaving the surface.
  • the process of parallelizing the construction of the distance matrix is shown in the figure.
  • reduce() combines the results of all partitions: it traverses and merges the values that map() wrote under the same new key and fills in the values of each row column by column, yielding the complete distance matrix.
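  • The map/reduce construction of the distance matrix can be sketched as below. This is a sequential, single-process illustration assuming Euclidean distance and a row-wise partitioning; the helper names (`map_partition`, `merge`, `distance_matrix`) and the partitioning scheme are hypothetical, and a real deployment would dispatch the map calls to parallel workers.

```python
import math
from functools import reduce


def map_partition(rows, points):
    """map(): for each row index in this partition, emit (row -> distances to all points)."""
    return {i: [math.dist(points[i], q) for q in points] for i in rows}


def merge(acc, part):
    """reduce(): merge the rows written by each partition, keyed by row index."""
    acc.update(part)
    return acc


def distance_matrix(points, n_partitions=2):
    n = len(points)
    partitions = [range(p, n, n_partitions) for p in range(n_partitions)]
    mapped = [map_partition(rows, points) for rows in partitions]  # parallelisable step
    rows = reduce(merge, mapped, {})
    return [rows[i] for i in range(n)]                             # rows in original order


print(distance_matrix([(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]))
```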
  • the Gaussian similarity is calculated to obtain the similarity matrix W.
  • the final sparse real symmetric matrix L' is obtained.
  • the Lanczos method is suitable for iterative approximation to solve the eigenvalues and eigenvectors of such large sparse matrices.
  • the idea is to convert the Laplace matrix into a real symmetric tridiagonal matrix by means of orthogonal similarity transformation.
  • the eigenvalues obtained by decomposing $T_{kk}$ approximate the eigenvalues of L', and its eigenvectors, mapped back through the orthogonal transformation, give the corresponding eigenvectors of L'. If only the first k eigenvalues are needed, about k iterations suffice, so the method is efficient.
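  • A minimal sketch of the Lanczos tridiagonalization is given below, assuming a dense NumPy matrix and full re-orthogonalization for numerical stability (a simplification that a large sparse implementation would usually avoid). The function name, the random start vector, and the `seed` parameter are hypothetical. Breakdown (a zero off-diagonal entry) is not handled.

```python
import numpy as np


def lanczos(A, k, seed=0):
    """Return the k x k tridiagonal matrix T and the orthonormal Lanczos basis Q (n x k)
    for a real symmetric matrix A, so that Q.T @ A @ Q is approximately T."""
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    q = rng.standard_normal(n)
    q /= np.linalg.norm(q)
    Q = np.zeros((n, k))
    alpha = np.zeros(k)               # diagonal of T
    beta = np.zeros(max(k - 1, 0))    # off-diagonal of T
    for j in range(k):
        Q[:, j] = q
        w = A @ q
        alpha[j] = q @ w
        # Remove components along all basis vectors found so far.
        w -= Q[:, : j + 1] @ (Q[:, : j + 1].T @ w)
        if j < k - 1:
            beta[j] = np.linalg.norm(w)
            q = w / beta[j]
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    return T, Q


# Toy symmetric matrix: Laplacian of a 6-node path graph.
A = 2.0 * np.eye(6) - np.eye(6, k=1) - np.eye(6, k=-1)
A[0, 0] = A[-1, -1] = 1.0
T, Q = lanczos(A, 6)
print(np.round(np.linalg.eigvalsh(T), 4))
```

  • When k equals the matrix size, the eigenvalues of T match those of the input; for smaller k they are approximations (Ritz values) that tend to converge first at the ends of the spectrum.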
  • the number of clusters k is set to 2 (legal transactions and illegal transactions), and the matrix composed of feature vectors h 1 , h 2 , .
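  • The spectral clustering labeling pipeline (similarity matrix, Laplacian, eigenvectors, clustering with k = 2) can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: it uses Euclidean distance, a Gaussian kernel with a hypothetical bandwidth `sigma`, the symmetric normalized Laplacian as L', a dense eigensolver instead of Lanczos, and a plain k-means with a deterministic farthest-point initialization.

```python
import numpy as np


def spectral_labels(X, sigma=1.0, k=2, n_iter=20):
    # Pairwise squared Euclidean distances and Gaussian similarity matrix W.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Symmetric normalised Laplacian L' = I - D^{-1/2} W D^{-1/2} (assumed form).
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # Eigenvectors of the k smallest eigenvalues form the feature matrix H.
    _, vecs = np.linalg.eigh(L)
    H = vecs[:, :k]
    H = H / np.linalg.norm(H, axis=1, keepdims=True)
    # k-means on the rows of H, initialised with the farthest-point heuristic.
    centres = [H[0]]
    for _ in range(1, k):
        dist = np.min([((H - c) ** 2).sum(axis=1) for c in centres], axis=0)
        centres.append(H[np.argmax(dist)])
    centres = np.array(centres)
    for _ in range(n_iter):
        d2 = ((H[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(d2, axis=1)
        centres = np.array([
            H[labels == j].mean(axis=0) if np.any(labels == j) else centres[j]
            for j in range(k)
        ])
    return labels


# Two well-separated toy groups of samples (hypothetical feature vectors).
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
print(spectral_labels(X))
```

  • The cluster indices are arbitrary; mapping them to "legal" and "illegal" would require the already-labeled nodes, which this sketch does not cover.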
  • the transaction behavior historical feature extraction step performs long short-term memory (LSTM) network processing on the transaction sample set; by learning from the history of transaction behavior, it obtains the historical features of the transaction behavior.
  • LSTM is designed to solve the long-term dependency problem. On top of the RNN it adds three gates, namely the input gate, the forget gate, and the output gate, to filter historical information effectively; the final output $h_t$ is determined by the output gate $o_t$ and the long-term cell state $C_t$.
  • the transaction node time series data in the transaction sample set obtained in the transaction data preprocessing step, or the transaction node time series data in the spectral clustering transaction sample set obtained in the spectral clustering sample labeling step, are input into the LSTM neural network.
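  • To make the gate description concrete, here is a minimal NumPy sketch of one standard LSTM step and of running it over a node's time series to obtain a historical feature vector. The weight layout (one stacked matrix `W` of shape (4H, D+H)), the function names, and the random toy inputs are assumptions for illustration; the text does not specify the network's dimensions or training.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step. W has shape (4*H, D+H) and b has shape (4*H,)."""
    H = h_prev.shape[0]
    z = W @ np.concatenate([x_t, h_prev]) + b
    i_t = sigmoid(z[0:H])            # input gate
    f_t = sigmoid(z[H:2 * H])        # forget gate
    o_t = sigmoid(z[2 * H:3 * H])    # output gate
    g_t = np.tanh(z[3 * H:4 * H])    # candidate cell state
    c_t = f_t * c_prev + i_t * g_t   # long-term cell state C_t
    h_t = o_t * np.tanh(c_t)         # output h_t, set by o_t and C_t
    return h_t, c_t


def lstm_history_feature(sequence, W, b, hidden):
    """Run the cell over a (T, D) sequence and return the last hidden state."""
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for x_t in sequence:
        h, c = lstm_step(x_t, h, c, W, b)
    return h


rng = np.random.default_rng(0)
D, H = 3, 4                                   # hypothetical feature and hidden sizes
W = rng.standard_normal((4 * H, D + H))
b = np.zeros(4 * H)
sequence = rng.standard_normal((49, D))       # one value per time step
print(lstm_history_feature(sequence, W, b, H))
```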
  • the graph convolutional network processing includes the following sub-steps:
  • the adjacency matrix is input into a graph convolutional network with 2 to 4 graph learning layers for feature propagation among neighbors, and a nonlinear activation is applied to the output of each layer.
  • the number of layers is set to 2 to 4 so that too many layers do not impair the learning of local node features, in which case what is learned would be global features.
  • for example, the adjacency matrix is input into a 2-layer graph convolutional network for feature propagation among neighbors, and a nonlinear activation is applied to the output of each layer.
  • the adjacency matrix of the transaction behavior historical features can be computed from two parts. The first part indicates whether an edge exists between two nodes and is set to 1 if so. Because the data are time series, the second part can be the similarity between the feature sequences. The weighted sum of the two parts is the similarity between a node and its neighbor nodes.
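  • A minimal sketch of this two-part adjacency follows. The cosine similarity, the mixing weight `alpha`, and the choice to compute the weight for every node pair are assumptions; the text only says that an edge indicator and a sequence similarity are combined by a weighted sum.

```python
import math


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def weighted_adjacency(n, edges, features, alpha=0.5):
    """A[i][j] = alpha * (1 if i and j share an edge else 0)
               + (1 - alpha) * similarity of their feature sequences."""
    connected = {frozenset(e) for e in edges}
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            edge = 1.0 if frozenset((i, j)) in connected else 0.0
            sim = cosine_similarity(features[i], features[j])
            A[i][j] = alpha * edge + (1.0 - alpha) * sim
    return A


features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]   # hypothetical historical features
print(weighted_adjacency(3, [(0, 1)], features))
```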
  • the present invention mainly uses graph-based methods for fraud detection.
  • the objective combines two terms: the first is the loss for the specific prediction task, which measures the error between the true value and the predicted value; the second is the regularization term of the graph, which makes the prediction smooth on the graph; a hyperparameter balances the two terms.
  • the regularization term usually implements the smoothness assumption of the graph signal, that is, similar vertices tend to have similar predictions, preserving the topological relationship of the graph.
  • a widely used regularization term is defined as follows. It is a weighted measure based on the Euclidean distance, belongs to the variation measures of graph signals, and describes the overall smoothness. When $g(x_i, x_j)$ is 1, it reduces to the Euclidean distance:
  • $g(x_i, x_j)$ is the similarity measure between the feature vectors of an entity pair, and each prediction is normalized by the degree of its vertex i.
  • the regularizer smooths each pair of entities so that their predictions (after normalization by degree) are close to each other.
  • the strength of the smoothing is determined by the similarity $g(x_i, x_j)$ of the feature vectors. This can be equivalently written in a more compact matrix form:
  • L is the Laplacian matrix of the graph i.e.
  • A is the similarity matrix, each element of which is $g(x_i, x_j)$.
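  • The formulas for this part are not preserved in the text. As a hedged reconstruction, assuming the commonly used degree-normalized graph regularizer and a symmetric similarity matrix A, one consistent way to write the objective and the regularization term is shown below. The symbols Γ (overall objective), Ω (task loss), Φ (graph regularizer), λ (balancing hyperparameter), f (predictor), D (degree matrix), and Ŷ (matrix whose i-th row is the prediction f(x_i)) are assumptions introduced here, not taken from the original.

```latex
\Gamma = \Omega + \lambda \Phi, \qquad
\Phi = \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} g(x_i, x_j)
       \left\| \frac{f(x_i)}{\sqrt{D_{ii}}} - \frac{f(x_j)}{\sqrt{D_{jj}}} \right\|^{2}
     = \operatorname{tr}\!\left( \hat{Y}^{\top} L \hat{Y} \right),
\qquad
L = D^{-1/2} (D - A) D^{-1/2}, \quad
D_{ii} = \sum_{j} A_{ij}, \quad
A_{ij} = g(x_i, x_j).
```

  • Expanding the squared norm gives $\sum_i \|f(x_i)\|^2 - \sum_{i,j} A_{ij} f(x_i)^{\top} f(x_j) / \sqrt{D_{ii} D_{jj}}$, which is the trace form with the normalized Laplacian; the factor 1/2 only rescales λ.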
  • the graph convolutional network (GCN) is a special graph-based learning method that has developed rapidly in recent years. It brings the core idea of graph-based learning into advanced convolutional neural networks (CNNs).
  • the core idea of standard CNNs is to use convolutions (such as 3 × 3 filter matrices) to capture local patterns in the input data (such as oblique lines in images).
  • like CNNs, the goal of GCN is to capture the local connection patterns on the graph through convolution.
  • KNN K-Nearest Neighbor
  • Sort Sort to get the normalized output of the vertices
  • graph convolution network methods such as LGCL (learnable graph convolutional layer): the learnable graph convolution layer automatically selects a fixed number of neighbor nodes for each feature based on value ranking, so as to transform the graph-structured data into regular one-dimensional grid data, and then applies standard CNN operations to the one-dimensional grid data;
  • f represents the filtering operation T of parameterized convolution
  • U is the matrix whose columns are the eigenvectors of L.
  • $U^T X$ is the forward graph Fourier transform (GFT): X is projected onto each eigenvector to obtain the Fourier coefficients in the spectral domain. The next step multiplies by F, which scales each coefficient. The elements of the diagonal matrix F are the eigenvalues of L, so the higher the frequency, the larger the scaling coefficient; that is, L acts as a high-pass filter. Left-multiplying the scaled vector by U is the inverse GFT, which transforms the frequency-domain information back to the original (vertex) domain.
  • F is regarded as a function of the eigenvalues, so that a k-order Chebyshev polynomial $T_k(x)$ approximation can be used to represent F:
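  • The three stages of spectral filtering (forward GFT, scaling in the spectral domain, inverse GFT) can be sketched with NumPy as follows. The function name `spectral_filter` and the `response` callback are hypothetical; a dense eigendecomposition is used for clarity only.

```python
import numpy as np


def spectral_filter(L, X, response):
    """Apply y = U F U^T X, where F = diag(response(eigenvalues of L))."""
    eigvals, U = np.linalg.eigh(L)              # columns of U: eigenvectors of L
    coeffs = U.T @ X                            # forward GFT: Fourier coefficients
    scaled = response(eigvals)[:, None] * coeffs  # scale each frequency component
    return U @ scaled                           # inverse GFT


# Laplacian of a 4-node path graph and a toy two-channel graph signal.
A = np.eye(4, k=1) + np.eye(4, k=-1)
L = np.diag(A.sum(axis=1)) - A
X = np.array([[1.0, 0.0], [2.0, 1.0], [0.0, 3.0], [1.0, 1.0]])

# With F equal to the eigenvalues themselves, the filter is multiplication by L.
print(np.allclose(spectral_filter(L, X, lambda lam: lam), L @ X))
```

  • Passing a response that returns all ones leaves the signal unchanged, which is a quick way to check that the forward and inverse transforms are consistent.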
  • X is the original vertex feature matrix, with dimension N × C; W is the parameter matrix to be learned, with dimension C × F; and F is the output feature dimension. The output after one first-order graph convolution therefore has dimension N × F.
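  • The dimensions above can be checked with a minimal first-order graph convolution layer. The propagation rule used here (self-loops plus symmetric degree normalization, followed by ReLU) is the widely used first-order GCN formulation and is an assumption, since the exact formula is not preserved in the text; the function name and toy sizes are hypothetical.

```python
import numpy as np


def gcn_layer(A, X, W):
    """One first-order graph convolution: ReLU(D^{-1/2} (A + I) D^{-1/2} X W)."""
    A_hat = A + np.eye(A.shape[0])                       # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = d_inv_sqrt[:, None] * A_hat * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ X @ W, 0.0)               # nonlinear activation


rng = np.random.default_rng(0)
N, C, F = 4, 3, 2                                        # nodes, input and output dims
A = np.eye(N, k=1) + np.eye(N, k=-1)                     # toy path-graph adjacency
X = rng.standard_normal((N, C))
W1 = rng.standard_normal((C, 5))
W2 = rng.standard_normal((5, F))

# Two stacked layers, matching the 2-layer configuration mentioned in the text.
out = gcn_layer(A, gcn_layer(A, X, W1), W2)
print(out.shape)
```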
  • the prediction step performs fully connected layer processing on the historical features and the aggregated features of the transaction behavior, and performs fraud prediction of transaction nodes through binary classification.
  • the historical features of transaction behavior, the aggregated features of transaction behavior, and the output results of traditional machine learning models are processed through the fully connected layer and then classified into two classes to predict whether the transaction node is an illegal transaction (that is, to predict whether the label of the transaction node under test is a legal transaction or an illegal transaction).
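  • A minimal sketch of the prediction step follows: the two feature vectors are concatenated, passed through a single fully connected layer, and turned into a binary decision with a sigmoid. The single-layer form, the 0.5 threshold, and all names and numbers are assumptions for illustration; outputs of other models could be appended to the concatenated vector in the same way.

```python
import math


def predict_fraud(history_feat, aggregate_feat, weights, bias, threshold=0.5):
    """Concatenate both feature vectors, apply one fully connected layer with a
    sigmoid, and return (probability of illegal, predicted-illegal flag)."""
    z = list(history_feat) + list(aggregate_feat)          # concatenation
    logit = sum(w * v for w, v in zip(weights, z)) + bias  # fully connected layer
    prob = 1.0 / (1.0 + math.exp(-logit))                  # sigmoid
    return prob, prob >= threshold


prob, is_illegal = predict_fraud([1.0, 0.0], [0.0, 1.0],
                                 weights=[2.0, 0.0, 0.0, 2.0], bias=-1.0)
print(round(prob, 4), is_illegal)
```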
  • the present invention also proposes a transaction fraud detection system based on a graph neural network.
  • the transaction fraud detection system based on the graph neural network includes the following modules:
  • the transaction data preprocessing module is used to obtain transaction data and preprocess the transaction data to obtain a panel-shaped transaction sample set
  • the transaction behavior historical feature extraction module is used to perform long-short-term memory network processing on the transaction sample set to obtain the transaction behavior historical features
  • the transaction behavior aggregation feature extraction module is used to perform graph convolution network processing on transaction historical behavior features to obtain transaction behavior aggregation features;
  • the prediction module is used to perform fully connected layer processing on the historical features and the aggregated features of the transaction behavior, and to perform fraud prediction of transaction nodes through binary classification.
  • by extracting the historical features and the aggregated features of transaction behavior and then processing them through a fully connected layer, the above-mentioned transaction fraud detection system based on a graph neural network overcomes the defects of traditional transaction fraud detection methods, which ignore both the relationships among the data and the fact that transaction behavior is time series data. This ensures comprehensive transaction fraud detection and improves its accuracy.
  • the transaction data preprocessing module is mainly used to collect and preprocess the transaction data required for transaction fraud detection, so that the preprocessed transaction sample set is in panel form, with connections between transaction nodes, and the samples constitute a dynamically changing transaction flow graph.
  • the transaction features at each time step are obtained through preprocessing, and the preprocessing includes the following sub-steps:
  • the Bitcoin real transaction data used is a transaction graph collected from the Bitcoin blockchain.
  • the data description of the transaction graph is as follows: a node in the graph represents a transaction, and an edge can be seen as the flow of Bitcoin between one transaction and another. It consists of 203769 nodes and 234355 edges. Among them, 2% of the nodes are marked as illegal nodes, 21% of the nodes are marked as legitimate transaction nodes, and the rest of the transactions are not marked.
  • each transaction node is associated with time information, where time information refers to the estimated time when the Bitcoin network confirms the transaction.
  • the data are divided into 49 time steps of about 2 weeks each, covering about two years of Bitcoin transaction data.
  • within a time step, transaction nodes are connected only if the interval between their appearances on the blockchain is less than 3 hours; transaction nodes in different time steps are not connected by edges. The time interval here can be modified to other reasonable values.
  • the various trading characteristics of each time step are explained in detail below.
  • the above-mentioned local transaction node features represent transaction data of the local transaction node, such as time step, number of input transactions (node in-degree), number of output transactions (node out-degree), transaction fee, output amount, and derived statistical features.
  • the derived statistical features refer to some average features of neighboring nodes, such as the average BTC fee received by the number of input transactions, the average BTC fee received by the number of output transactions, the average BTC fee spent by the number of input transactions, and the average number of output transactions. BTC fees spent, average number of input/output transactions related to the number of input transactions (average number of input related transactions), average number of input/output transactions related to the number of output transactions (average number of output related transactions), etc.
  • the summary features of the above-mentioned transaction nodes are obtained from the local transaction node features of the neighbor transaction nodes one hop forward and/or one hop backward from the local transaction node, that is, from all neighbor transaction nodes of the local transaction node.
  • for each local transaction node feature obtained in S100, descriptive statistics over these neighbors, such as the maximum, minimum, median, mode, standard deviation, range and correlation coefficient, are computed and used as the summary features of the transaction node.
  • the transaction node sub-graph spectral information of the above transaction data captures the local topology of a transaction node.
  • the sub-graph spectral information of a transaction node reflects the topology of the graph in the frequency domain; if the eigenvalues are similar, the sub-graph topologies in which the transaction nodes are located are similar.
  • the node characteristics of the transaction graph are described as follows: the time step is 2 weeks, with a total of 49 steps.
  • the first 93 node features are the local transaction node features, which describe the transaction data of the local transaction node, including the time step, number of input transactions (node in-degree), number of output transactions (node out-degree), transaction fee, output amount and derived statistical features.
  • the last 72 node features are the aggregated features of the transaction nodes: the maximum, minimum, median, mode, standard deviation, range and correlation coefficient of the same feature (a local transaction node feature) over the forward and/or backward neighbor transaction nodes of the local (central) transaction node.
  • Figure 3 and Figure 4 are drawn to observe the change curve of transaction characteristics over time after the transaction data preprocessing.
  • Figure 3 is the time change curve of a certain local transaction node characteristics (such as transaction fees)
  • Figure 4 is the summary characteristics of transaction nodes (such as the maximum value of the transaction fees of the local transaction node and its neighbor transaction nodes) time curve.
  • the figures show how three types of nodes change over time on two different attributes (local transaction node features and transaction node summary features). These two attributes distinguish legal transaction nodes from illegal transaction nodes well: the attribute curve of the legal transaction nodes, in the lower part of the figure, is relatively stable over time, while the curve of the illegal transaction nodes, in the upper part of the figure, is more tortuous and changes more steeply.
  • the module for extracting historical features of transaction behavior can be directly executed.
  • when the number of transaction samples with known classes is small and transaction fraud cannot be detected accurately, the spectral clustering sample labeling module must additionally be executed to label the unlabeled transaction nodes, so as to avoid an insufficient sample size.
  • the graph neural network-based transaction fraud detection system further includes a spectral clustering sample labeling module, which is configured to perform spectral clustering sample labeling processing on the transaction sample set to obtain a spectral clustering transaction sample set.
  • the transaction sample set consists of 203,769 nodes and 234,355 edges. Of the transaction nodes, 2% are marked as illegal, 21% are marked as legal, and the remaining 77% are not marked. Since the class of these transaction nodes is unknown, the present invention uses spectral clustering, an unsupervised method, to classify them and learn their labels, which increases the sample size available for subsequent training. Optionally, because the number of samples to be learned is large, parallelized spectral clustering is used.
  • spectral clustering is used to assign labels to the unlabeled nodes in the sample set.
  • Spectral clustering overcomes the sensitivity of K-means clustering to the shape of the data, and its relaxed objective can be solved to a global optimum.
  • the main idea of spectral clustering is to regard the data as points in an n-dimensional space. Figure 5 and Figure 6 show, respectively, the illegal transaction graph formed by the transaction data marked as illegal in the Bitcoin transaction data and the legal transaction graph formed by the transaction data marked as legal. Points with a certain similarity are connected by edges, and clustering is achieved by cutting the resulting graph into multiple subgraphs such that the sum of the edge weights within each subgraph is as high as possible and the sum of the edge weights between subgraphs is as low as possible.
  • the implementation links the graph cut to the eigenvalue decomposition of the Laplacian matrix through the Rayleigh quotient, so that the NP-hard discrete problem is relaxed into a continuous eigenvalue problem.
  • the spectral clustering sample labeling process includes the following sub-steps:
  • K-Means clustering method can be selected.
  • the left side of the figure is the real distribution of the two classifications of the original data
  • the right side is the spectral clustering sample labeling result processed by the spectral clustering algorithm of the present invention.
  • the labels produced by the spectral clustering sample labeling of the present invention are very similar to the true binary class distribution, indicating that the labeling accuracy is very high, which can greatly improve the accuracy of the detection results.
  • the sample set data (x 1 , x 2 , .
  • the distance here can be measured using the Euclidean distance, the shortest-path distance or the geodesic distance, preferably the shortest-path distance or the geodesic distance.
  • the geodesic distance is the length of the shortest path from point A to point B along a surface in three-dimensional space, where the path is not allowed to leave the surface.
  • the process of parallelizing the construction of the distance matrix is shown in the figure.
  • reduce() reduces the results of all partitions, that is, it traverses and merges the values written by map() under the same new key and fills in the values of each row column by column, resulting in a complete distance matrix.
  • the Gaussian similarity is calculated to obtain the similarity matrix W.
  • the final sparse real symmetric matrix L' is obtained.
  • the Lanczos method is suitable for iterative approximation to solve the eigenvalues and eigenvectors of such large sparse matrices.
  • the idea is to convert the Laplacian matrix into a real symmetric tridiagonal matrix by means of an orthogonal similarity transformation.
  • the eigenvalues and eigenvectors obtained by decomposing Tkk are the eigenvalues and eigenvectors of L'. If only the first k eigenvalues are calculated, the calculation can be completed with only k iterations, so it is more efficient.
  • the number of clusters k is set to 2 (legal transactions and illegal transactions), and the matrix composed of the eigenvectors h 1 , h 2 , .
  • the transaction behavior historical feature extraction module performs long short-term memory (LSTM) network processing on the transaction sample set to obtain transaction behavior historical features; that is, the historical features are learned from the time series of each transaction node's behavior.
  • LSTM is designed to address the long-term dependency problem. It adds three gates to the RNN, namely the input gate, the forget gate and the output gate, to filter historical information effectively, and the final output h t is determined by the output gate o t and the long-term cell state C t.
  • the transaction node time series data in the transaction sample set obtained in the transaction data preprocessing step or the transaction node time series data in the spectral cluster transaction sample set obtained in the spectral clustering sample labeling step are input into the LSTM neural network
  • the graph convolutional network processing includes the following substeps:
  • the adjacency matrix is input into 2 to 4 graph convolutional network graph learning layers for feature propagation among neighbors, with a nonlinear activation applied after each layer.
  • the number of layers is limited to 2 to 4 because too many layers impair the learning of local node features, so that only global features are learned.
  • the adjacency matrix is input into a 2-layer graph convolutional network graph learning layer for feature propagation among neighbors, with a nonlinear activation applied after each layer.
  • the calculation of the adjacency matrix of the historical feature of the transaction behavior can include two parts, the first part is whether there is an edge connection, and if so, it is set to 1; because it is a time series, the second part can be the similarity of each feature sequence. Finally, the weighted sum of the above two parts according to the weight is the similarity between a node and its neighbor nodes.
  • the present invention mainly uses graph-based methods for fraud detection.
  • is the loss for a specific prediction task, which measures the error between the true value and the predicted value
  • is the regularization term of the graph, which makes the prediction smooth on the graph
  • is a hyperparameter to balance the above ratio of the two.
  • the regularization term usually implements the smoothness assumption of the graph signal, that is, similar vertices tend to have similar predictions, preserving the topological relationship of the graph.
  • a widely used regularization term is defined as follows; it is a weighted measure based on Euclidean distance, belongs to the variation measures of graph signals, and describes the overall smoothness. When g(x i , x j ) is 1, it reduces to the Euclidean distance:
  • g(x i , x j ) is the similarity measure between feature vectors of entity pairs, is the degree of vertex i.
  • the regularizer smoothes each pair of entities so that their predictions (after normalization by degrees) are close to each other.
  • the strength of the smoothing is determined by the similarity g(x i , x j ) of the feature vectors. This can be equivalently written in a more compact matrix form:
  • L is the Laplacian matrix of the graph i.e.
  • A is the similarity matrix
  • each element is g(x i , x j ).
  • Graph Convolutional Network (GCN) is a special graph-based learning method that has developed rapidly in recent years. It combines the core idea of graph-based learning with advanced convolutional neural networks (CNNs).
  • the core idea of standard CNNs is to use convolutions (such as 3 × 3 filter matrices) to capture local patterns in the input data (such as oblique lines in images).
  • like CNNs, the goal of GCN is to capture the local connection patterns on the graph through convolution.
  • KNN K-Nearest Neighbor
  • Sort: sorting to obtain a normalized output for the vertices
  • graph convolution network methods such as LGCL (Learnable Graph Convolutional Layer): the learnable graph convolutional layer automatically selects a fixed number of neighbor nodes for each feature by value-based sorting, so as to transform the graph-structured data into regular one-dimensional grid data, and then applies standard CNN operations to the one-dimensional grid data;
  • f represents the filtering operation T of parameterized convolution
  • U is the matrix whose columns are the eigenvectors of L.
  • U T X is the forward graph Fourier transform (GFT): X is projected onto each eigenvector to obtain the Fourier coefficients (in the spectral domain). The next step multiplies these coefficients by the diagonal matrix F, which scales them; the elements of F are the eigenvalues of L, and the higher the frequency, the larger the scaling coefficient, that is to say, L is a high-pass filter.
  • multiplying the scaled vector on the left by U is the inverse GFT, which transforms the spectral-domain information back to the vertex domain (the graph analogue of the time domain).
  • F is regarded as a function of the eigenvalues, so that a k-th order approximation by Chebyshev polynomials T k (x) can be used to represent F:
  • X is the original vertex feature
  • the dimension is N*C
  • W is the parameter to be learned
  • the dimension is C*F
  • F is the output feature dimension. Then the dimension of the output after a first-order graph convolution is N*F.
  • the prediction module is used to perform full-connection layer processing on historical features of transaction behavior and aggregated features of transaction behaviors, and perform fraud prediction of transaction nodes through binary classification.
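
The formulas for the graph-based learning objective described in the items above are not present in this text. As a hedged reconstruction, one standard form that agrees with the stated definitions (a task loss, a graph regularization term, a balancing hyperparameter, the similarity g(x i , x j ), the vertex degree, the Laplacian L and the similarity matrix A) is sketched below. The symbols Γ, Ω, Φ, λ, f, D and Ŷ are assumptions chosen here for illustration, and the identity with the trace form assumes that A is symmetric, with the factor 1/2 compensating for counting each pair twice.

```latex
\Gamma = \Omega + \lambda \Phi, \qquad
\Phi = \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} g(x_i, x_j)
       \left\| \frac{f(x_i)}{\sqrt{D_{ii}}} - \frac{f(x_j)}{\sqrt{D_{jj}}} \right\|^{2}
     = \operatorname{trace}\!\left( \hat{Y} L \hat{Y}^{\top} \right),
\qquad
L = D^{-1/2} (D - A) D^{-1/2}, \quad
A_{ij} = g(x_i, x_j), \quad
D_{ii} = \sum_{j} A_{ij}, \quad
\hat{Y} = \left[ f(x_1), \dots, f(x_N) \right]
```

With g(x i , x j ) = 1 for every pair, the summand is the squared Euclidean distance between degree-normalized predictions, which matches the remark about the Euclidean distance above.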


Abstract

A graph neural network-based transaction fraud detection method and system. The method comprises the following steps: a transaction data preprocessing step (S100): obtaining transaction data, preprocessing the transaction data, and obtaining a transaction sample set in panel form; a transaction behavior historical feature extraction step (S300): performing long short-term memory network processing on the transaction sample set to obtain a transaction behavior historical feature; a transaction behavior aggregation feature extraction step (S400): performing graph convolutional network processing on the transaction behavior historical feature to obtain a transaction behavior aggregation feature; and a prediction step (S500): performing fully-connected layer processing on the transaction behavior historical feature and the transaction behavior aggregation feature, and performing fraud prediction of a transaction node by means of binary classification. The method overcomes the defect that traditional transaction fraud detection methods ignore both the relationships between data and the fact that transaction behavior is time series data, ensures the comprehensiveness of transaction fraud detection, and improves the accuracy of transaction fraud detection.

Description

Transaction Fraud Detection Method and System Based on Graph Neural Network
Technical Field
The present invention relates to the field of financial technology, and in particular to a transaction fraud detection method and system based on a graph neural network.
Background
In the era of big data, online transactions are becoming more and more frequent, and some of them are illegal transactions carried out through malicious attacks, phishing and similar means. Before a transaction becomes an illegal transaction, it therefore needs to be detected from its features to prevent huge losses.
Transaction data refers to the directed transactions that take place between many transaction accounts. Because of scams, malware, terrorist organizations, ransomware, Ponzi schemes and the like, some fraudulent transactions appear in the transaction network. These data are also time series data, that is, sequences of transaction behavior over a period of time. Illegal and legal transactions therefore need to be classified in order to detect transaction fraud.
Traditional detection methods use machine learning methods or time series classification methods. On the one hand, such methods ignore the relationships inherent in this kind of data: transactions are nodes on a network, and if a transaction occurs, there is a connection between the two nodes. On the other hand, they ignore the fact that transaction behavior is time series data. These defects greatly affect the comprehensiveness and accuracy of detection.
Summary of the Invention
In view of this, it is necessary to address the technical problem that traditional transaction fraud detection methods ignore the relationships inherent in the data and the fact that transaction behavior is time series data, which leads to poor comprehensiveness and low accuracy of detection. To this end, the present invention proposes a transaction fraud detection method and system based on a graph neural network.
In one embodiment, the present invention proposes a transaction fraud detection method based on a graph neural network, comprising the following steps:
a transaction data preprocessing step: acquiring transaction data and preprocessing the transaction data to obtain a transaction sample set in panel form;
a transaction behavior historical feature extraction step: performing long short-term memory network processing on the transaction sample set to obtain transaction behavior historical features;
a transaction behavior aggregation feature extraction step: performing graph convolutional network processing on the transaction behavior historical features to obtain transaction behavior aggregation features;
a prediction step: performing fully connected layer processing on the transaction behavior historical features and the transaction behavior aggregation features, and predicting fraud for transaction nodes by binary classification.
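
The prediction step above can be sketched as follows. This is a minimal illustration rather than the patented implementation: it assumes that the two feature sets are concatenated per node before a single fully connected layer with a softmax over two classes, and the names `predict_fraud`, `hist`, `agg`, `weight` and `bias` are hypothetical. Which output index stands for "illegal" is also an assumption.

```python
import numpy as np

def predict_fraud(hist, agg, weight, bias):
    """Binary fraud prediction from two per-node feature matrices.

    hist:   (N, H) transaction behavior historical features (assumed layout).
    agg:    (N, G) transaction behavior aggregation features (assumed layout).
    weight: (H + G, 2) fully connected layer weights; bias: (2,).
    Returns (probabilities of shape (N, 2), predicted class per node).
    """
    # Assumption: the two feature sets are joined by concatenation.
    z = np.concatenate([hist, agg], axis=1) @ weight + bias
    z = z - z.max(axis=1, keepdims=True)            # numerical stability
    expz = np.exp(z)
    prob = expz / expz.sum(axis=1, keepdims=True)   # softmax over 2 classes
    return prob, prob.argmax(axis=1)
```

In training, the weights would be fitted with a cross-entropy loss on the labeled transaction nodes; that part is omitted here.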
In one of the embodiments, in the transaction data preprocessing step, the preprocessing includes the following sub-steps:
obtaining the local transaction node features of the transaction data;
obtaining the transaction node summary features of the transaction data;
obtaining the transaction node sub-graph spectral information of the transaction data.
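
The second sub-step above, deriving summary features from one-hop neighbors, can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name `summary_features`, the array layout and the choice of four statistics (maximum, minimum, median, standard deviation) are hypothetical, and other statistics such as mode, range and correlation coefficient are left out for brevity.

```python
import numpy as np

def summary_features(local, edges):
    """Aggregate one-hop neighbor features per transaction node.

    local: (N, F) array of local transaction node features (assumed layout).
    edges: iterable of (src, dst) node index pairs, one per directed edge.
    Returns an (N, 4 * F) array holding max, min, median and standard
    deviation over each node's forward and backward one-hop neighbors;
    rows stay zero for nodes without neighbors.
    """
    n, f = local.shape
    neighbors = [set() for _ in range(n)]
    for src, dst in edges:
        neighbors[src].add(dst)   # forward (outgoing) neighbor
        neighbors[dst].add(src)   # backward (incoming) neighbor
    out = np.zeros((n, 4 * f))
    for i, nbrs in enumerate(neighbors):
        if not nbrs:
            continue
        block = local[sorted(nbrs)]
        out[i] = np.concatenate([block.max(0), block.min(0),
                                 np.median(block, 0), block.std(0)])
    return out
```

For the 203,769-node graph described in the source, a vectorized or sparse-matrix formulation would be preferable to the Python loop; the loop is kept here for readability.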
In one of the embodiments, the following step is further included before the transaction behavior historical feature extraction step:
a spectral clustering sample labeling step: performing spectral clustering sample labeling on the transaction sample set to obtain a spectral clustering transaction sample set.
In one of the embodiments, in the spectral clustering sample labeling step, the spectral clustering sample labeling includes the following sub-steps:
constructing the spectral matrix of the transaction sample set;
performing eigenvalue decomposition on the spectral matrix to obtain a feature matrix;
clustering the feature matrix.
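
The three sub-steps above can be sketched in a few lines of NumPy. The sketch makes several assumptions that are not taken from the source: it uses the Euclidean distance with a Gaussian similarity of bandwidth `sigma`, the symmetric normalized Laplacian as the spectral matrix, a dense `numpy.linalg.eigh` decomposition in place of a Lanczos iteration, and a plain k-means with farthest-point initialization. The function name `spectral_labels` is hypothetical.

```python
import numpy as np

def spectral_labels(x, k=2, sigma=1.0, iters=50):
    """Cluster the rows of x into k groups by spectral clustering."""
    # Sub-step 1: Gaussian similarity matrix W and normalized Laplacian.
    sq = ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)
    w = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(w, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(w.sum(1))
    lap = np.eye(len(x)) - d_inv_sqrt[:, None] * w * d_inv_sqrt[None, :]
    # Sub-step 2: eigenvectors of the k smallest eigenvalues form the
    # feature matrix; eigh returns eigenvalues in ascending order.
    _, vecs = np.linalg.eigh(lap)
    h = vecs[:, :k]
    h = h / np.linalg.norm(h, axis=1, keepdims=True)
    # Sub-step 3: k-means on the rows of h, farthest-point initialization.
    centers = [h[0]]
    for _ in range(1, k):
        dist = np.min([((h - c) ** 2).sum(1) for c in centers], axis=0)
        centers.append(h[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(iters):
        labels = np.argmin(((h[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = h[labels == j].mean(0)
    return labels
```

The dense similarity matrix needs memory quadratic in the number of nodes, which is why the source turns to parallelized construction and a sparse eigen-solver for the full data set; this sketch is only practical for small samples.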
In one of the embodiments, in the transaction behavior aggregation feature extraction step, the graph convolutional network processing includes the following sub-steps:
obtaining the adjacency matrix of the transaction behavior historical features;
inputting the adjacency matrix into 2 to 4 graph convolutional network graph learning layers for feature propagation among neighbors, with a nonlinear activation applied after each layer.
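
The propagation described above can be sketched as follows. The sketch assumes the widely used first-order graph convolution with self-loops and symmetric degree normalization, and ReLU as the nonlinear activation; the source does not fix either choice in this passage, and the name `gcn_forward` is hypothetical. Passing two to four weight matrices gives the 2 to 4 layers mentioned above.

```python
import numpy as np

def gcn_forward(adj, features, weights):
    """Propagate node features through len(weights) graph-convolution layers.

    adj:      (N, N) adjacency or similarity matrix.
    features: (N, C) input node features.
    weights:  list of weight matrices, shapes (C, F1), (F1, F2), ...
    """
    a_hat = adj + np.eye(adj.shape[0])           # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(1))
    a_norm = d_inv_sqrt[:, None] * a_hat * d_inv_sqrt[None, :]
    h = features
    for w in weights:
        # One layer: mix neighbor features, project, then apply ReLU.
        h = np.maximum(a_norm @ h @ w, 0.0)
    return h
```

Each layer widens the receptive field by one hop, so a small layer count keeps the output tied to a node's local neighborhood.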
The present invention also proposes a transaction fraud detection system based on a graph neural network, which includes the following modules:
a transaction data preprocessing module, configured to acquire transaction data and preprocess the transaction data to obtain a transaction sample set in panel form;
a transaction behavior historical feature extraction module, configured to perform long short-term memory network processing on the transaction sample set to obtain transaction behavior historical features;
a transaction behavior aggregation feature extraction module, configured to perform graph convolutional network processing on the transaction behavior historical features to obtain transaction behavior aggregation features;
a prediction module, configured to perform fully connected layer processing on the transaction behavior historical features and the transaction behavior aggregation features, and to predict fraud for transaction nodes by binary classification.
In one of the embodiments, in the transaction data preprocessing module, the preprocessing includes the following sub-steps:
obtaining the local transaction node features of the transaction data;
obtaining the transaction node summary features of the transaction data;
obtaining the transaction node sub-graph spectral information of the transaction data.
In one of the embodiments, the transaction fraud detection system based on a graph neural network further includes a spectral clustering sample labeling module, configured to perform spectral clustering sample labeling on the transaction sample set to obtain a spectral clustering transaction sample set.
In one of the embodiments, in the spectral clustering sample labeling module, the spectral clustering sample labeling includes the following sub-steps:
constructing the spectral matrix of the transaction sample set;
performing eigenvalue decomposition on the spectral matrix to obtain a feature matrix;
clustering the feature matrix.
In one of the embodiments, in the transaction behavior aggregation feature extraction module, the graph convolutional network processing includes the following sub-steps:
obtaining the adjacency matrix of the transaction behavior historical features;
inputting the adjacency matrix into 2 to 4 graph convolutional network graph learning layers for feature propagation among neighbors, with a nonlinear activation applied after each layer.
By extracting transaction behavior historical features and transaction behavior aggregation features and then applying fully connected layer processing, the above transaction fraud detection method and system based on a graph neural network overcome the defect that traditional transaction fraud detection methods ignore the relationships inherent in the data and the fact that transaction behavior is time series data, ensure the comprehensiveness of transaction fraud detection, and improve the accuracy of transaction fraud detection.
Brief Description of the Drawings
To illustrate the technical solutions in the embodiments of the present invention and in the designed system architecture more clearly, the drawings needed for the system embodiments, the system architecture and the technical solutions are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of the transaction fraud detection method based on a graph neural network of the present invention;
Fig. 2 is a flowchart of the transaction data preprocessing step in the transaction fraud detection method based on a graph neural network of the present invention;
Fig. 3 is the time curve of a local transaction node feature (transaction fee) after transaction data preprocessing in the transaction fraud detection method based on a graph neural network of the present invention;
Fig. 4 is the time curve of a transaction node summary feature (the maximum of the transaction fees of the local transaction node and all its neighbor transaction nodes) after transaction data preprocessing in the transaction fraud detection method based on a graph neural network of the present invention;
Fig. 5 is the illegal transaction graph formed by the transaction data marked as illegal in the Bitcoin transaction data;
Fig. 6 is the legal transaction graph formed by the transaction data marked as legal in the Bitcoin transaction data;
Fig. 7 compares the spectral clustering sample labels with the true binary class distribution in the transaction fraud detection method based on a graph neural network of the present invention;
Fig. 8 is a flowchart of the parallelized construction of the distance matrix in the transaction fraud detection method based on a graph neural network of the present invention;
Fig. 9 is a structural diagram of the transaction fraud detection system based on a graph neural network according to an embodiment of the present invention;
Fig. 10 is a data flow diagram of the transaction fraud detection system based on a graph neural network of the present invention.
Detailed Description
It should be noted that the following detailed description is exemplary and is intended to illustrate the present invention. All technical and scientific terms used herein have the same meaning as commonly understood by those of ordinary skill in the art to which the invention belongs.
The system architecture in the embodiments of the present invention and the solutions in the prior art are described clearly and completely below with reference to the accompanying drawings. It should be noted that the described embodiments serve only to explain and illustrate the present invention and are not exhaustive. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments provided herein, without creative effort, fall within the protection scope of the present application.
In the following embodiments, the transaction fraud detection method and system based on a graph neural network are described in detail using fraud detection for transactions in the digital currency Bitcoin (BTC) as an example. It should be noted that the method and system can also be used for fraud detection on other transaction data, such as digital currency transaction data, traffic data and stock data.
Referring to Fig. 1, the transaction fraud detection method based on a graph neural network proposed by the present invention includes the following steps:
S100, a transaction data preprocessing step: acquiring transaction data and preprocessing the transaction data to obtain a transaction sample set in panel form;
S300, a transaction behavior historical feature extraction step: performing long short-term memory network processing on the transaction sample set to obtain transaction behavior historical features;
S400, a transaction behavior aggregation feature extraction step: performing graph convolutional network processing on the transaction behavior historical features to obtain transaction behavior aggregation features;
S500, a prediction step: performing fully connected layer processing on the transaction behavior historical features and the transaction behavior aggregation features, and predicting fraud for transaction nodes by binary classification.
By extracting historical features and aggregated features of transaction behavior and then processing them through a fully connected layer, the above method overcomes two shortcomings of traditional transaction fraud detection methods: they ignore the relationships that inherently exist among the data, and they ignore that transaction behavior is time-series data. This makes transaction fraud detection more comprehensive and improves its accuracy.
In the above embodiment, the transaction data preprocessing step is mainly used to collect and preprocess the transaction data required for fraud detection, so that the preprocessed transaction sample set takes panel form and contains both the transaction nodes and the connections between them. Together, the transaction samples form a dynamically changing transaction flow graph.
As an optional implementation, referring to Fig. 2, the transaction features at each time step are obtained through preprocessing, which includes the following sub-steps:
S110, acquiring the local transaction node features of the transaction data;
S120, acquiring the transaction node summary features of the transaction data;
S130, acquiring the transaction node subgraph spectrum information of the transaction data.
The above preprocessing step acquires not only the local transaction node features but also the transaction node summary features and the transaction node subgraph spectrum information. Detection therefore also covers the relationships that inherently exist between transaction behaviors, which makes the detection result more accurate.
In this embodiment, the real Bitcoin transaction data used is a transaction graph collected from the Bitcoin blockchain. In this graph, a node represents a transaction, and an edge represents a flow of bitcoin from one transaction to another. The graph consists of 203,769 nodes and 234,355 edges. Of the nodes, 2% are labeled as illegal transaction nodes and 21% as legal transaction nodes; the remaining transactions are unlabeled.
In the transaction data, each transaction node is associated with time information, namely the estimated time at which the Bitcoin network confirmed the transaction. To take this time information into account, this embodiment divides the roughly two years of Bitcoin transaction data into 49 time steps at intervals of about 2 weeks. Each time step contains a single connected component of transactions that appeared on the blockchain less than 3 hours apart from one another, and no edge connects transaction nodes in different time steps. The interval length can be changed to another reasonable value. The transaction features at each time step are explained in detail below.
The local transaction node features in step S110 represent the transaction data of the local transaction node itself, such as the time step, the number of input transactions (node in-degree), the number of output transactions (node out-degree), the transaction fee, the output volume, and derived statistical features. The derived statistical features are averages over neighbor nodes, for example: the average BTC received by the input transactions, the average BTC received by the output transactions, the average BTC spent by the input transactions, the average BTC spent by the output transactions, the average number of input/output transactions associated with the input transactions, and the average number of input/output transactions associated with the output transactions.
The transaction node summary features in step S120 are obtained from the local transaction node features of the neighbor transaction nodes one hop forward and/or backward from the local transaction node. That is, for all neighbor transaction nodes of the local transaction node, the values of the same local transaction node feature obtained in step S100 are processed, and descriptive statistics such as the maximum, minimum, median, mode, standard deviation, range, and correlation coefficient are taken as the summary features of the transaction node.
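The one-hop aggregation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names, the toy edge list, and the single `fee` feature are hypothetical, and the correlation coefficient is omitted because it requires a pair of features rather than a single one.

```python
import statistics

def aggregate_neighbor_feature(values):
    """Descriptive statistics of one local feature over a node's one-hop neighbors."""
    if not values:
        return None
    return {
        "max": max(values),
        "min": min(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "std": statistics.pstdev(values),
        "range": max(values) - min(values),
    }

def summary_features(edges, local_feature):
    """edges: (src, dst) transaction pairs; local_feature: dict node -> feature value."""
    neighbors = {node: set() for node in local_feature}
    for src, dst in edges:
        neighbors[src].add(dst)   # one hop forward
        neighbors[dst].add(src)   # one hop backward
    return {
        node: aggregate_neighbor_feature([local_feature[n] for n in sorted(nbrs)])
        for node, nbrs in neighbors.items()
    }

# Toy data (hypothetical): t1 pays t2 and t3, and t4 pays t1.
edges = [("t1", "t2"), ("t1", "t3"), ("t4", "t1")]
fee = {"t1": 0.5, "t2": 0.1, "t3": 0.3, "t4": 0.1}
features = summary_features(edges, fee)
print(features["t1"])
```

Repeating the call for each local feature would give one group of summary columns per feature.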
Step S130 obtains the local topological information of a transaction node by computing the spectral information of the graph formed by all transaction nodes within a suitable number of layers radiating outward from the local transaction node. In this embodiment, the graph consists of all transaction nodes within 2 layers of the local transaction node. The eigenvalues of the resulting Laplacian matrix $L' = D'WD'$ are used as an additional feature, the transaction node subgraph spectrum information, which reflects the topology of the graph in the frequency domain: if the eigenvalues of two transaction nodes are similar, the topologies of the subgraphs in which they lie are similar.
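A minimal sketch of this subgraph spectrum feature is given below, assuming NumPy, an undirected 0/1 adjacency matrix as $W$, and $D' = D^{-1/2}$ as defined later in the description. The helper names and the toy path graph are hypothetical.

```python
import numpy as np

def two_hop_nodes(adj, center, hops=2):
    """Indices of nodes within `hops` edges of `center` (breadth-first expansion)."""
    frontier, seen = {center}, {center}
    for _ in range(hops):
        frontier = {int(j) for i in frontier for j in np.nonzero(adj[i])[0]} - seen
        seen |= frontier
    return sorted(seen)

def subgraph_spectrum(adj, center, hops=2):
    """Eigenvalues of L' = D' W D' with D' = D^{-1/2} on the subgraph around `center`."""
    nodes = two_hop_nodes(adj, center, hops)
    w = adj[np.ix_(nodes, nodes)].astype(float)
    deg = w.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg)
    d_inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
    l_prime = d_inv_sqrt[:, None] * w * d_inv_sqrt[None, :]
    return np.linalg.eigvalsh(l_prime)   # eigenvalues in ascending order

# Toy graph (hypothetical): the path 0-1-2-3-4, treated as undirected.
adj = np.zeros((5, 5), dtype=int)
for i in range(4):
    adj[i, i + 1] = adj[i + 1, i] = 1

spectrum = subgraph_spectrum(adj, center=0)
print(spectrum)
```

Because subgraphs differ in size, a practical feature vector would need a fixed length, for example by keeping only the largest few eigenvalues; that choice is not specified in the text.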
In this embodiment, the node features of the transaction graph are as follows. The time step is 2 weeks, and there are 49 steps in total. The first 93 node features are local transaction node features, that is, the node's own features and transaction data, including the time step, the number of input transactions (node in-degree), the number of output transactions (node out-degree), the transaction fee, the output volume, and derived statistical features. The last 72 node features are transaction node summary features: the maximum, minimum, median, mode, standard deviation, range, and correlation coefficient of the same feature parameter (a given local transaction node feature) over the neighbor transaction nodes one hop forward and/or backward from the local (central) transaction node.
After the transaction data preprocessing step S100, Fig. 3 and Fig. 4 are plotted to observe how the transaction features change over time after preprocessing. Fig. 3 shows the time curve of a local transaction node feature (for example, the transaction fee), and Fig. 4 shows the time curve of a transaction node summary feature (for example, the maximum transaction fee among the local transaction node and all its neighbor transaction nodes). The figures show how the three classes of nodes vary over time on these two attributes. The two example attributes distinguish legal transaction nodes from illegal ones well: the attribute curves of legal transaction nodes lie at the bottom of each plot and are relatively stable over time, while those of illegal transaction nodes lie at the top and change steeply over time.
In the above embodiment, once the transaction data preprocessing step S100 is complete, the transaction behavior historical feature extraction of step S300 can be performed directly if enough transaction samples have a known classification. If too few samples have a known classification for fraud detection to be accurate, spectral clustering sample labeling is additionally required to label the unlabeled transaction nodes and avoid an undersized sample.
As an optional implementation, the following step is included before the transaction behavior historical feature extraction step:
S200, a spectral clustering sample labeling step: performing spectral clustering sample labeling on the transaction sample set to obtain a spectral-clustered transaction sample set.
The above transaction sample set consists of 203,769 nodes and 234,355 edges. Of the transaction nodes, 2% are labeled as illegal, 21% are labeled as legal, and the remaining 77% are unlabeled. Because the classification of part of the samples (transaction nodes) is unknown, the present invention classifies these nodes with the unsupervised method of spectral clustering, learning the labels of the unknown transaction nodes so as to enlarge the sample and use it as data for subsequent training. Optionally, because the number of samples to be learned is large, parallelized spectral clustering should be used.
In this implementation, spectral clustering is used to label the unlabeled nodes. Spectral clustering overcomes the sensitivity of K-means clustering to the shape of the data and is a globally optimal clustering method. Its main idea is to treat the data as points in an n-dimensional space. Fig. 5 and Fig. 6 show, respectively, the illegal transaction graph formed by the Bitcoin transaction data labeled as illegal and the legal transaction graph formed by the transaction data labeled as legal. Points that are sufficiently similar are connected by edges, and clustering is achieved by cutting the resulting graph into several subgraphs such that the sum of edge weights within each subgraph is as high as possible and the sum of edge weights between subgraphs is as low as possible. This is implemented by relating the objective function of the graph cut to the eigenvalue decomposition of the Laplacian matrix through the Rayleigh quotient, which turns an NP-hard problem into a continuous eigenvalue problem.
As an optional implementation, the spectral clustering sample labeling in this step includes the following sub-steps:
constructing the spectral matrix of the transaction sample set;
performing eigenvalue decomposition on the spectral matrix to obtain a feature matrix;
clustering the feature matrix.
The clustering method applied to the feature matrix can be chosen as needed; for example, K-means clustering can be used.
Further, the spectral matrix of the transaction sample set can be constructed in different ways. One optional spectral clustering procedure is shown in Table 1.
Table 1. Spectral clustering processing method
Figure PCTCN2020135271-appb-000001
In step 5), because the known classes in the Bitcoin transaction data are the two classes legal and illegal, the number of clusters is set to k = 2.
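The three sub-steps of spectral clustering sample labeling (building the spectral matrix, eigendecomposition, clustering the feature matrix) can be sketched as follows. This is an illustration under stated assumptions and not a reproduction of Table 1, whose contents are available only as a figure: the dense Gaussian similarity with width `sigma`, the farthest-point initialization of K-means, and the two synthetic blobs are all assumptions.

```python
import numpy as np

def spectral_labels(x, k=2, sigma=1.0, n_iter=20):
    # 1) Spectral matrix: Gaussian similarity W, then L' = D^{-1/2} W D^{-1/2}.
    sq_dist = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-sq_dist / (2 * sigma ** 2))
    np.fill_diagonal(w, 0.0)
    d_inv_sqrt = 1.0 / np.sqrt(w.sum(axis=1))
    l_prime = d_inv_sqrt[:, None] * w * d_inv_sqrt[None, :]
    # 2) Feature matrix H: eigenvectors of the k largest eigenvalues, row-normalized.
    _, vecs = np.linalg.eigh(l_prime)
    h = vecs[:, -k:]
    h = h / np.linalg.norm(h, axis=1, keepdims=True)
    # 3) K-means on the rows of H, with deterministic farthest-point initialization.
    centers = [h[0]]
    for _ in range(1, k):
        dist = np.min([((h - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(h[int(np.argmax(dist))])
    centers = np.array(centers)
    for _ in range(n_iter):
        sq = ((h[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(sq, axis=1)
        centers = np.array([
            h[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
    return labels

# Synthetic stand-in for transaction feature vectors: two well-separated blobs.
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=0.0, scale=0.3, size=(20, 2))
blob_b = rng.normal(loc=5.0, scale=0.3, size=(20, 2))
labels = spectral_labels(np.vstack([blob_a, blob_b]), k=2)
print(labels)
```

The cluster indices carry no meaning by themselves; mapping them to "legal" and "illegal" would need the already labeled nodes that fall into each cluster.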
Referring to Fig. 7, the left side shows the true two-class distribution of the original data, and the right side shows the sample labeling produced by the spectral clustering algorithm of the present invention. The sample size is n = 1000. The algorithm achieves good clustering results on spherical data. The KNN-based spectral clustering algorithm is affected by the values of the scale parameter and the number of neighbors, which default here to 2 and 5, respectively. The figure shows that the classification obtained after spectral clustering sample labeling closely resembles the true two-class distribution, which indicates that the labeling is accurate and can improve the accuracy of the detection result.
Further, because the number of transaction nodes in practice is very large, each step can be computed in parallel and the results then combined, for example with the MapReduce programming model. In this case, the spectral clustering procedure is as shown in Table 2.
Table 2. Spectral clustering processing method for larger-scale transaction data
Figure PCTCN2020135271-appb-000002
In step 6), because the known classes in the Bitcoin transaction data are the two classes legal and illegal, the number of clusters is set to k = 2. In other embodiments, the value of k can be set or computed according to the specific classification.
During spectral clustering, the sample set $data = (x_1, x_2, \ldots, x_n)$ is taken as input, the distance matrix of the transaction samples is constructed in parallel, and it is then adjusted into a symmetric distance matrix. The distance can be measured as Euclidean distance, shortest-path distance, or geodesic distance, preferably shortest-path or geodesic distance. The geodesic distance is the length of the shortest path from point A to point B along a curved surface (in three-dimensional space) without leaving the surface. The parallel construction of the distance matrix is shown in Fig. 8: once all map() tasks have finished, new key-value pairs have been generated; reduce() then reduces the results of all partitions, that is, it traverses and merges the values written by map() under the same new key and fills the values of each row column by column, yielding a complete distance matrix.
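The map()/reduce() construction of the distance matrix can be imitated in a single process as follows. This is only a sketch of the data flow, not a distributed implementation: `map_rows`, `reduce_rows`, and the row partitioning are hypothetical names and choices, and Euclidean distance is used for brevity even though the text prefers shortest-path or geodesic distance.

```python
import math
from collections import defaultdict

def map_rows(row_block, data):
    """map(): emit (row index, (column index, distance)) pairs for one partition of rows."""
    for i in row_block:
        for j, other in enumerate(data):
            yield i, (j, math.dist(data[i], other))

def reduce_rows(pairs, n):
    """reduce(): merge the values sharing a key and fill each row column by column."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    matrix = [[0.0] * n for _ in range(n)]
    for i, values in grouped.items():
        for j, d in values:
            matrix[i][j] = d
    return matrix

data = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
partitions = [[0, 1], [2]]   # rows assigned to two hypothetical workers
emitted = [pair for block in partitions for pair in map_rows(block, data)]
distance = reduce_rows(emitted, len(data))
print(distance)
```

In a real MapReduce job each partition would be processed by a separate mapper and the grouping by key would be done by the framework's shuffle phase.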
Next, the symmetric distance matrix is sparsified row by row in parallel, and the Gaussian similarity is computed to obtain the similarity matrix $W$. The degree matrix $D' = D^{-1/2}$ is computed row by row in parallel, and $L' = D'WD'$ is computed in parallel. Because these matrices are sparse, the computations can be carried out in parallel.
The parallelized implementation above yields the final sparse real symmetric matrix $L'$. The Lanczos method is suited to iteratively approximating the eigenvalues and eigenvectors of such a large sparse matrix. Its idea is to convert the Laplacian matrix, through an orthogonal similarity transformation, into a real symmetric tridiagonal matrix
Figure PCTCN2020135271-appb-000003
The eigenvalues and eigenvectors obtained by decomposing $T_{kk}$ are taken as the eigenvalues and eigenvectors of $L'$. If only the first $k$ eigenvalues are computed, only $k$ iterations are needed, which makes the method more efficient. The number of clusters $k$ is set to 2 (legal transactions and illegal transactions), and the matrix formed by the eigenvectors $h_1, h_2, \ldots, h_k$ is normalized row by row to form the feature matrix $H_{n \times k}$.
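A minimal NumPy sketch of the Lanczos tridiagonalization is shown below. It is a textbook variant with full re-orthogonalization, run on a small dense random symmetric matrix that merely stands in for the sparse $L'$; the function name, the seed, and the matrix size are assumptions for illustration.

```python
import numpy as np

def lanczos(a, k, seed=0):
    """Run k Lanczos steps on symmetric `a`; return tridiagonal T (k x k) and basis Q (n x k)."""
    n = a.shape[0]
    rng = np.random.default_rng(seed)
    q = rng.normal(size=n)
    q /= np.linalg.norm(q)
    basis = np.zeros((n, k))
    alpha = np.zeros(k)
    beta = np.zeros(k - 1)
    q_prev = np.zeros(n)
    for j in range(k):
        basis[:, j] = q
        w = a @ q
        alpha[j] = q @ w
        w = w - alpha[j] * q - (beta[j - 1] * q_prev if j > 0 else 0.0)
        # Full re-orthogonalization keeps the basis numerically orthogonal.
        w -= basis[:, : j + 1] @ (basis[:, : j + 1].T @ w)
        if j < k - 1:
            beta[j] = np.linalg.norm(w)
            q_prev, q = q, w / beta[j]
    t = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    return t, basis

rng = np.random.default_rng(1)
m = rng.normal(size=(30, 30))
sym = (m + m.T) / 2                 # small dense stand-in for the sparse L'
t, basis = lanczos(sym, k=10)
ritz = np.linalg.eigvalsh(t)        # eigenvalues of the small tridiagonal matrix
print(ritz[-1], np.linalg.eigvalsh(sym)[-1])
```

The eigenvalues of `t` approximate the extreme eigenvalues of the input matrix, and approximate eigenvectors of the input are obtained as `basis @ v` for each eigenvector `v` of `t`.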
Finally, $H$ is clustered with a parallelized K-means method to obtain the cluster partition $C(c_1, c_2)$, which completes the labeling of the transaction nodes whose labels were unknown.
In the above embodiment, the transaction behavior historical feature extraction step processes the transaction sample set with a long short-term memory network, learning from the history of transaction behavior to obtain the historical features of transaction behavior.
In Bitcoin, each transaction takes time to be broadcast to the Bitcoin network, and the transaction history during this period is sufficient to influence the prediction of whether a transaction is legal at the next step. In other words, a historical feature sequence covering a suitable number of time steps is sufficient to influence whether the transaction is legal at the next prediction step. This step therefore passes the time series of each transaction node through a long short-term memory (LSTM) network to learn the node's historical behavior and extract the historical features of transaction behavior.
The LSTM addresses the long-term dependency problem. On top of the RNN it adds three gates (the input gate, the forget gate, and the output gate) that filter historical information effectively. The final output $h_t$ is determined by the output gate $o_t$ and the long-term cell state $C_t$.
Here, $h_t = \mathrm{LSTM}(x_t)$.
The LSTM computation is as follows:
$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$
$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$
Figure PCTCN2020135271-appb-000004
Figure PCTCN2020135271-appb-000005
$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$
$h_t = o_t * \tanh(C_t)$
In this step, the transaction node time-series data from the transaction sample set obtained in the transaction data preprocessing step, or from the spectral-clustered transaction sample set obtained in the spectral clustering sample labeling step, is fed into the LSTM neural network layer, whose output is $h_t = \mathrm{LSTM}(x_t)$. In this embodiment, the step length is set to 10 to learn the historical features of transaction behavior.
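The LSTM recurrence over a sequence of 10 time steps can be sketched in NumPy as follows. Two of the source equations are available only as figures; the sketch assumes they are the standard candidate state $\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)$ and cell update $C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$. The hidden size, input size, and random weights are arbitrary illustrative choices, and no training is performed.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step; each W_* has shape (hidden, hidden + input), each b_* shape (hidden,)."""
    z = np.concatenate([h_prev, x_t])                      # [h_{t-1}, x_t]
    f_t = sigmoid(params["W_f"] @ z + params["b_f"])       # forget gate
    i_t = sigmoid(params["W_i"] @ z + params["b_i"])       # input gate
    c_tilde = np.tanh(params["W_C"] @ z + params["b_C"])   # candidate state (assumed form)
    c_t = f_t * c_prev + i_t * c_tilde                     # cell update (assumed form)
    o_t = sigmoid(params["W_o"] @ z + params["b_o"])       # output gate
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

hidden, n_in, steps = 4, 3, 10
rng = np.random.default_rng(0)
params = {}
for gate in "fiCo":
    params[f"W_{gate}"] = rng.normal(scale=0.1, size=(hidden, hidden + n_in))
    params[f"b_{gate}"] = np.zeros(hidden)

h, c = np.zeros(hidden), np.zeros(hidden)
for x_t in rng.normal(size=(steps, n_in)):   # one node's feature sequence, 10 steps
    h, c = lstm_step(x_t, h, c, params)
print(h)
```

The final `h` plays the role of the historical feature vector of one transaction node.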
As an optional implementation, the graph convolutional network processing in the transaction behavior aggregated feature extraction step includes the following sub-steps:
obtaining the adjacency matrix for the historical features of transaction behavior;
feeding the adjacency matrix into 2 to 4 graph convolutional network graph learning layers to propagate features between neighbors, with a nonlinear activation applied on the outside after each layer.
Preferably, the number of layers is set to 2 to 4, because too many layers hinder the learning of a node's local features, so that global features are learned instead.
Preferably, the adjacency matrix is fed into 2 graph convolutional network graph learning layers to propagate features between neighbors, with a nonlinear activation applied on the outside after each layer.
The adjacency matrix for the historical features of transaction behavior can be computed from two parts. The first part indicates whether two nodes are connected by an edge; if so, it is set to 1. Because the data are time series, the second part can be a similarity measure between the feature sequences. The weighted sum of the two parts is then the similarity between a node and its neighbor node.
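One way to read this two-part construction is sketched below. The text does not fix the similarity measure or the weights, so cosine similarity between flattened feature sequences, the `edge_weight` parameter, and the decision to keep values only between connected nodes are all assumptions.

```python
import numpy as np

def weighted_adjacency(edges, sequences, edge_weight=0.5):
    """Combine the edge indicator (part 1) with feature-sequence similarity (part 2)."""
    n = sequences.shape[0]
    connected = np.zeros((n, n))
    for i, j in edges:
        connected[i, j] = connected[j, i] = 1.0         # 1 if an edge exists
    flat = sequences.reshape(n, -1)
    unit = flat / np.linalg.norm(flat, axis=1, keepdims=True)
    similarity = unit @ unit.T                          # cosine similarity (assumed choice)
    adjacency = edge_weight * connected + (1.0 - edge_weight) * similarity
    return adjacency * connected                        # keep entries between neighbors only

# Three nodes, each with a sequence of 2 time steps and 2 features (toy values).
seq = np.array([[[1, 0], [1, 0]],
                [[1, 0], [1, 0]],
                [[0, 1], [0, 1]]], dtype=float)
adjacency = weighted_adjacency([(0, 1), (1, 2)], seq, edge_weight=0.5)
print(adjacency)
```

Nodes 0 and 1 have identical sequences and an edge, so their entry is the largest; nodes 1 and 2 are connected but dissimilar, so only the edge term contributes.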
Further, the principle and process of the graph convolutional network graph learning layer are as follows:
Because Bitcoin transaction data forms a graph, the present invention mainly uses graph-based methods to detect transaction fraud. The goal of graph-based learning is to train a prediction function
Figure PCTCN2020135271-appb-000006
which maps the feature space of an entity to the target label space. This is usually achieved by minimizing an objective loss function, which can be abstracted as $I = \Omega + \lambda\Phi$.
Here, $\Omega$ is the loss for the specific prediction task, which measures the error between the true and predicted values; $\Phi$ is the graph regularization term, which makes the predictions smooth over the graph; and $\lambda$ is a hyperparameter that balances the two terms. The regularization term usually implements the smoothness assumption for graph signals, namely that similar vertices tend to have similar predictions, which preserves the topological relationships of the graph. A widely used regularization term $\Phi$ is defined below. It is a weighted measure based on Euclidean distance, belongs to the variation measures of graph signals, and characterizes overall smoothness; when $g(x_i, x_j)$ is 1, it reduces to the Euclidean distance:
Figure PCTCN2020135271-appb-000007
where $g(x_i, x_j)$ is the similarity measure between the feature vectors of a pair of entities, and
Figure PCTCN2020135271-appb-000008
is the degree of vertex $i$. The regularization term smooths each pair of entities so that their predictions (after normalization by degree) are close to each other. The strength of the smoothing is determined by the similarity $g(x_i, x_j)$ of the feature vectors. The term can be written equivalently in a more compact matrix form:
Figure PCTCN2020135271-appb-000009
Figure PCTCN2020135271-appb-000010
is the vector of predicted values, and $L$ is the Laplacian matrix of the graph, namely
Figure PCTCN2020135271-appb-000011
Here $A$ is the similarity matrix, each element of which is $g(x_i, x_j)$.
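The equivalence between the pairwise and the matrix form can be checked numerically. Because the source gives both formulas only as figures, the sketch assumes the widely used form $\Phi = \tfrac{1}{2}\sum_{i,j} g(x_i, x_j)\,(\hat{y}_i/\sqrt{d_i} - \hat{y}_j/\sqrt{d_j})^2 = \hat{y}^{T} L \hat{y}$ with $L = D^{-1/2}(D - A)D^{-1/2}$ and scalar predictions; the random similarity matrix is illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
a = rng.random((n, n))
a = (a + a.T) / 2            # symmetric similarity matrix A with entries g(x_i, x_j)
np.fill_diagonal(a, 0.0)
d = a.sum(axis=1)            # vertex degrees d_i
y = rng.normal(size=n)       # predicted values, one scalar per vertex

# Pairwise form: similarity-weighted squared differences of degree-normalized predictions.
u = y / np.sqrt(d)
pairwise = 0.5 * sum(a[i, j] * (u[i] - u[j]) ** 2 for i in range(n) for j in range(n))

# Matrix form with the normalized Laplacian L = D^{-1/2} (D - A) D^{-1/2}.
laplacian = np.diag(d ** -0.5) @ (np.diag(d) - a) @ np.diag(d ** -0.5)
matrix_form = y @ laplacian @ y
print(pairwise, matrix_form)
```

The two printed values agree up to floating-point rounding, which is the identity the compact matrix form relies on.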
The Graph Convolutional Network (GCN) is a graph-based learning method that has developed rapidly in recent years. It combines the core idea of graph-based learning with advanced convolutional neural networks (CNNs). The core idea of a standard CNN is to use convolutions (for example, 3×3 filter matrices) to capture local patterns in the input data (for example, diagonal lines in an image). Following this idea, a GCN also aims to capture local connection patterns on the graph through convolution. However, the intuitive solution of applying a convolution directly to the adjacency matrix of the graph is not feasible: when two rows of the adjacency matrix are swapped, the filter output of the convolution may change even though the swapped adjacency matrix still represents the same graph structure. This is because graph nodes are unordered. The present invention solves the problem in two ways. One solution is to use the K-Nearest Neighbor (KNN) algorithm to process and sort the neighbor nodes around a vertex to obtain a normalized output for the vertex, and then to apply a graph convolution method such as LGCL (Learnable Graph Convolutional Layer): the learnable graph convolutional layer automatically selects a fixed number of neighbor nodes for each feature by ranking on values, which transforms the graph-structured data into regular one-dimensional grid data on which standard CNN operations are then applied. The other solution is to use spectral convolution to capture local connections in the Fourier domain, applying the Graph Fourier Transform (GFT) to the graph, for example: $y = Tx = f(F, X) = U F U^{T} X$.
Here, $f$ denotes the filtering operation $T$ of the parameterized convolution, and $U$ is the matrix whose columns are the eigenvectors of $L$. Reading the formula from right to left, $U^{T}X$ is the forward GFT, which projects $X$ onto each eigenvector to obtain the Fourier coefficients $\alpha$ (in the spectral domain). Next comes $F\alpha$, a scaling step: the elements of the diagonal matrix $F$ are the eigenvalues of $L$, and the higher the frequency, the larger the scaling coefficient $\lambda$, which means that $L$ is a high-pass filter. The vector obtained by this scaling,
Figure PCTCN2020135271-appb-000012
is then left-multiplied by $U$, which is the inverse GFT and transforms the frequency-domain information back to the original vertex domain.
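The three stages of the spectral filter (forward GFT, scaling, inverse GFT) can be traced on a tiny graph. The sketch assumes the combinatorial Laplacian $L = D - A$ for simplicity, whereas the source defines $L$ in a figure, and takes $F$ to be the diagonal matrix of the eigenvalues of $L$ as the text states; the 4-node path graph and signal are toy values.

```python
import numpy as np

adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
laplacian = np.diag(adj.sum(axis=1)) - adj   # combinatorial L = D - A (assumed for simplicity)
eigvals, u = np.linalg.eigh(laplacian)       # columns of U are eigenvectors of L

x = np.array([1.0, 2.0, 3.0, 4.0])           # one scalar signal value per vertex
alpha = u.T @ x                              # forward GFT: Fourier coefficients
scaled = np.diag(eigvals) @ alpha            # scaling by F = diag(eigenvalues of L)
y = u @ scaled                               # inverse GFT back to the vertex domain
print(y)
```

With this choice of $F$ the result equals `laplacian @ x`, which makes the high-pass behavior visible: the smooth linear ramp in `x` is reduced to values at the two ends of the path.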
The computation above involves the eigendecomposition of $L$, which is computationally expensive. Preferably, $F$ is regarded as a function of $\Lambda$, so that $F$ can be represented by a $k$-th order approximation using the Chebyshev polynomials $T_k(x)$:
Figure PCTCN2020135271-appb-000013
where
Figure PCTCN2020135271-appb-000014
$\lambda_{\max}$ is the largest eigenvalue of $L$; $\theta_k$ are the coefficients of the Chebyshev polynomials; $T_k(x) = 2xT_{k-1}(x) - T_{k-2}(x)$, with $T_1(x) = x$ and $T_0(x) = 1$. With this approximation, the $N$ parameters are reduced to $k$ parameters with complexity $O(1)$, and no eigenvalues or eigenvectors need to be computed, but interpretability in the frequency domain is lost. The Chebyshev GCN approximation for $K = 1$ is derived below:
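The Chebyshev recurrence can be applied directly to a graph signal without forming any eigenvectors, as sketched below. The sketch uses the standard initial terms $T_0(x) = 1$ and $T_1(x) = x$ and assumes the usual rescaled Laplacian $\tilde{L} = 2L/\lambda_{\max} - I$, since the source defines the rescaling only in a figure. The function name and coefficients are hypothetical, and $\lambda_{\max}$ is computed exactly here only to keep the demo short.

```python
import numpy as np

def chebyshev_filter(laplacian, x, theta):
    """Apply sum_k theta[k] * T_k(L_tilde) @ x using the Chebyshev recurrence."""
    lam_max = np.linalg.eigvalsh(laplacian)[-1]
    l_tilde = 2.0 * laplacian / lam_max - np.eye(laplacian.shape[0])
    t_prev, t_curr = x, l_tilde @ x            # T_0(L~) x = x, T_1(L~) x = L~ x
    out = theta[0] * t_prev
    if len(theta) > 1:
        out = out + theta[1] * t_curr
    for k in range(2, len(theta)):
        t_prev, t_curr = t_curr, 2.0 * (l_tilde @ t_curr) - t_prev
        out = out + theta[k] * t_curr
    return out

lap = np.array([[1.0, -1.0, 0.0],
                [-1.0, 2.0, -1.0],
                [0.0, -1.0, 1.0]])     # Laplacian of a 3-node path graph
x = np.array([1.0, 0.0, -2.0])
theta = [0.5, -1.0, 0.25]              # hypothetical filter coefficients
out = chebyshev_filter(lap, x, theta)
print(out)
```

Each extra order costs one more multiplication by the (sparse) Laplacian, so an order-$K$ filter only mixes information from vertices at most $K$ hops apart.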
Figure PCTCN2020135271-appb-000015
Here $X$ is the matrix of original vertex features, with dimension $N \times C$; $W$ is the parameter matrix to be learned, with dimension $C \times F$, where $F$ is the output feature dimension. The output of one first-order graph convolution therefore has dimension $N \times F$.
When a second approximate graph convolution (two-hop) is applied to the above (one-hop) output, the mathematical expression is as follows:
Figure PCTCN2020135271-appb-000016
where
Figure PCTCN2020135271-appb-000017
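A two-layer forward pass of this kind can be sketched in NumPy as follows. The layer formulas in the source are available only as figures, so the sketch assumes the commonly used first-order form with the renormalized adjacency $\hat{A} = \tilde{D}^{-1/2}(A + I)\tilde{D}^{-1/2}$ and a ReLU after each layer. The layer sizes and random weights are arbitrary, and no training is shown.

```python
import numpy as np

def normalize_adjacency(adj):
    """Renormalized adjacency A_hat = D~^{-1/2} (A + I) D~^{-1/2} (assumed form)."""
    a_tilde = adj + np.eye(adj.shape[0])
    d_inv_sqrt = a_tilde.sum(axis=1) ** -0.5
    return d_inv_sqrt[:, None] * a_tilde * d_inv_sqrt[None, :]

def gcn_forward(adj, x, w0, w1):
    a_hat = normalize_adjacency(adj)
    hidden = np.maximum(a_hat @ x @ w0, 0.0)       # first layer (one-hop) + ReLU
    return np.maximum(a_hat @ hidden @ w1, 0.0)    # second layer (two-hop) + ReLU

rng = np.random.default_rng(0)
n, c, f_hidden, f_out = 5, 3, 8, 4
adj = np.zeros((n, n))
for i in range(n - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1.0   # toy path graph
x = rng.normal(size=(n, c))               # N x C vertex features
w0 = rng.normal(size=(c, f_hidden))       # C x F weights of the first layer
w1 = rng.normal(size=(f_hidden, f_out))   # weights of the second layer
z = gcn_forward(adj, x, w0, w1)
print(z.shape)
```

After two layers each row of `z` depends on vertices up to two hops away, which matches the one-hop and two-hop reading in the text.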
In the above embodiment, the prediction step processes the historical features and the aggregated features of transaction behavior through a fully connected layer and predicts fraud for each transaction node by binary classification.
考虑到传统的机器学习对欺诈检测也起到了不错的效果,进一步优选的,Considering that traditional machine learning also has a good effect on fraud detection, it is further preferred that
将交易行为历史特征、交易行为聚合特征以及传统机器学习模型的输出结果通过全联接层处理进而进行二分类获得最终的交易节点是否为非法交易的预测(即预测待测交易节点的标签为合法交易或非法交易)。The historical characteristics of transaction behavior, aggregation characteristics of transaction behavior, and the output results of traditional machine learning models are processed through the full connection layer and then classified into two categories to obtain the prediction of whether the final transaction node is an illegal transaction (that is, predicting that the label of the transaction node to be tested is a legal transaction). or illegal transactions).
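The fusion described above can be sketched as a single dense unit followed by a sigmoid. This is only an illustration under stated assumptions: the inputs are concatenated, the fully connected layer has one output unit, a probability at or above the threshold is read as "illegal", and the names `predict_fraud`, `ml_score`, `weights`, and `bias` are hypothetical.

```python
import math
from typing import Sequence, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict_fraud(history: Sequence[float], aggregated: Sequence[float],
                  ml_score: float, weights: Sequence[float], bias: float,
                  threshold: float = 0.5) -> Tuple[float, str]:
    """Concatenate the three inputs, apply one dense unit, and threshold the result."""
    # history: LSTM output; aggregated: GCN output; ml_score: traditional model output.
    features = list(history) + list(aggregated) + [ml_score]
    if len(weights) != len(features):
        raise ValueError("weights must match the concatenated feature length")
    prob = sigmoid(sum(w * f for w, f in zip(weights, features)) + bias)
    # Assumed label convention: high probability means an illegal transaction.
    return prob, ("illegal" if prob >= threshold else "legal")
```

In practice the weights and bias would be learned jointly with the upstream layers; here they are passed in so the example stays self-contained.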
Based on the same inventive concept, and referring to FIG. 9 and FIG. 10, the present invention further provides a graph neural network-based transaction fraud detection system, which includes the following modules:
a transaction data preprocessing module, configured to acquire transaction data and preprocess it to obtain a transaction sample set in panel form;
a transaction behavior history feature extraction module, configured to process the transaction sample set with a long short-term memory network to obtain transaction behavior history features;
a transaction behavior aggregation feature extraction module, configured to process the transaction behavior history features with a graph convolutional network to obtain transaction behavior aggregation features;
a prediction module, configured to pass the transaction behavior history features and the transaction behavior aggregation features through a fully connected layer and perform fraud prediction for transaction nodes by binary classification.
By extracting transaction behavior history features and transaction behavior aggregation features and then applying a fully connected layer, the above graph neural network-based transaction fraud detection system overcomes two defects of traditional transaction fraud detection methods: they ignore the relationships that inherently exist between data, and they ignore the fact that transaction behavior is time-series data. The system thereby ensures comprehensive transaction fraud detection and improves its accuracy.
In the above embodiment, the transaction data preprocessing module is mainly used to collect and preprocess the transaction data required for transaction fraud detection, so that the preprocessed transaction sample set is in panel form and contains both transaction nodes and the connections between them. The transaction samples together form a dynamically changing transaction flow graph.
As an optional implementation, the transaction features at each time step are obtained through preprocessing, which includes the following sub-steps:
obtaining the local transaction node features of the transaction data;
obtaining the transaction node summary features of the transaction data;
obtaining the transaction node subgraph spectral information of the transaction data.
The above transaction data preprocessing module obtains not only the local transaction node features but also the transaction node summary features and the transaction node subgraph spectral information. Because the detection target thus includes the relationships that inherently exist between transaction behaviors, the detection result is more accurate.
In this embodiment, the real Bitcoin transaction data used is a transaction graph collected from the Bitcoin blockchain. In the graph, a node represents a transaction, and an edge represents a flow of bitcoin from one transaction to another. The graph consists of 203,769 nodes and 234,355 edges. Of the nodes, 2% are labeled as illegal transaction nodes, 21% are labeled as legal transaction nodes, and the remaining transactions are unlabeled.
In the transaction data, each transaction node is associated with time information, namely the estimated time at which the Bitcoin network confirmed the transaction. To take this time information into account, this embodiment divides about two years of Bitcoin transaction data into 49 time steps spaced about two weeks apart. Each time step contains a single connected component of transactions that appeared on the blockchain within less than three hours of one another, and there are no edges connecting transaction nodes in different time steps. The length of this time interval can be changed to other reasonable values. The transaction features at each time step are explained in detail below.
The above local transaction node features represent the transaction data of the local transaction node itself, such as the time step, the number of input transactions (node in-degree), the number of output transactions (node out-degree), the transaction fee, the output volume, and derived statistical features. The derived statistical features are averages over neighbor nodes, for example the average BTC fee received by the input transactions, the average BTC fee received by the output transactions, the average BTC fee spent by the input transactions, the average BTC fee spent by the output transactions, the average number of input/output transactions associated with the input transactions, and the average number of input/output transactions associated with the output transactions.
The summary features of a transaction node are obtained from the local transaction node features of the neighbor transaction nodes one hop forward and/or backward from the local transaction node. That is, for each local transaction node feature obtained in step S100, the values of that feature over all neighbor transaction nodes of the local transaction node are processed, and descriptive statistics such as their maximum, minimum, median, mode, standard deviation, range, and correlation coefficient are used as the summary features of the transaction node.
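As an illustration of the summary features, the following sketch computes several of the listed statistics for one feature over a node's one-hop neighbors. The function name, the dictionary-based graph representation, and the use of the population standard deviation are assumptions. The correlation coefficient is omitted because it needs a second feature to correlate against.

```python
import statistics
from typing import Dict, List


def summarize_neighbor_feature(node: str, neighbors: Dict[str, List[str]],
                               feature: Dict[str, float]) -> Dict[str, float]:
    """Descriptive statistics of one feature over the one-hop neighbors of `node`."""
    values = [feature[n] for n in neighbors[node]]
    return {
        "max": max(values),
        "min": min(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "std": statistics.pstdev(values),  # population standard deviation (assumption)
        "range": max(values) - min(values),
    }


# Hypothetical toy data: transaction "tx0" has four one-hop neighbors.
neighbors = {"tx0": ["tx1", "tx2", "tx3", "tx4"]}
fee = {"tx1": 1.0, "tx2": 2.0, "tx3": 2.0, "tx4": 5.0}
summary = summarize_neighbor_feature("tx0", neighbors, fee)
```

Repeating this for every local feature of interest yields the block of summary features attached to each node.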
The transaction node subgraph spectral information captures the local topology around a transaction node. It is obtained by computing the spectrum of the graph formed by all transaction nodes within a suitable number of layers radiating outward from the local transaction node. In this embodiment, two layers are used. The eigenvalues of the resulting Laplacian matrix $L' = D'WD'$ are used as additional features, namely the transaction node subgraph spectral information, which reflects the topology of the graph in the frequency domain: if the eigenvalues are similar, the topologies of the subgraphs containing the transaction nodes are similar.
In this embodiment, the node features of the transaction graph are as follows. The time step is two weeks, with 49 steps in total. The first 93 node features are local transaction node features, that is, the local transaction node's own features and transaction data, including the time step, the number of input transactions (node in-degree), the number of output transactions (node out-degree), the transaction fee, the output volume, and derived statistical features. The last 72 node features are summary features of the transaction node: the maximum, minimum, median, mode, standard deviation, range, and correlation coefficient of a given local transaction node feature over the neighbor transaction nodes one hop forward and/or backward from the local (central) transaction node.
After the transaction data preprocessing step S100, FIG. 3 and FIG. 4 are plotted to observe how the transaction features vary over time after preprocessing. FIG. 3 shows the time curve of a local transaction node feature (such as the transaction fee), and FIG. 4 shows the time curve of a summary feature of the transaction node (such as the maximum transaction fee among the local transaction node and all of its neighbor transaction nodes).
The figures show how the three classes of nodes vary over time on two different attributes (a local transaction node feature and a summary feature). The two example attributes distinguish legal transaction nodes (the relatively flat curves in the lower part of the figures) from illegal transaction nodes (the more jagged curves in the upper part) fairly well: the attribute curves of legal transaction nodes stay relatively stable over time at the bottom of the plot, whereas those of illegal transaction nodes, at the top, change more steeply over time.
In the above embodiment, after transaction data preprocessing, the transaction behavior history feature extraction module can be executed directly when there are enough transaction samples with known classes. When there are too few such samples for transaction fraud detection to be accurate, the spectral clustering sample labeling module is executed first to label the unlabeled transaction nodes, so as to avoid an insufficient sample size.
As an optional implementation, the graph neural network-based transaction fraud detection system further includes a spectral clustering sample labeling module, configured to perform spectral clustering sample labeling on the transaction sample set to obtain a spectral clustering transaction sample set.
The above transaction sample set consists of 203,769 nodes and 234,355 edges. Of the transaction nodes, 2% are labeled as illegal transaction nodes and 21% as legal transaction nodes, and the remaining 77% are unlabeled. Because the classes of some samples (transaction nodes) in the transaction data are unknown, the present invention uses unsupervised spectral clustering to classify these transaction nodes and learn their labels, which increases the sample size and makes them available as data for subsequent training. Optionally, because the number of samples to be learned is large, parallelized spectral clustering is advisable.
In this implementation, spectral clustering is used to label the unlabeled nodes. Spectral clustering overcomes the sensitivity of K-means clustering to the shape of the data and is a globally optimal clustering method. Its main idea is to regard the data as points in an n-dimensional space. FIG. 5 and FIG. 6 show, respectively, the illegal transaction graph formed by the Bitcoin transaction data labeled as illegal and the legal transaction graph formed by the transaction data labeled as legal. Points that are sufficiently similar are connected by edges, and clustering is achieved by cutting the graph formed by these points into multiple subgraphs such that the sum of edge weights within each subgraph is as high as possible and the sum of edge weights between subgraphs is as low as possible. This is implemented by linking the graph-cut objective function to the eigenvalue decomposition of the Laplacian matrix through the Rayleigh quotient, which converts the NP-hard problem into a continuous eigenvalue problem.
As an optional implementation, in the spectral clustering sample labeling module, the spectral clustering sample labeling includes the following sub-steps:
constructing the spectral matrix of the transaction sample set;
performing eigenvalue decomposition on the spectral matrix to obtain a feature matrix;
clustering the feature matrix.
The clustering method applied to the feature matrix can be chosen as needed, for example the K-Means clustering method.
Further, the spectral matrix of the transaction sample set can be constructed in different ways. For example, one optional spectral clustering procedure is shown in Table 1.
Table 1. Spectral clustering procedure
Figure PCTCN2020135271-appb-000018
In step 5), because the known classes in the Bitcoin transaction data are the two classes legal and illegal, the number of clusters k is set to k=2.
Referring to FIG. 7, the left side shows the true two-class distribution of the original data, and the right side shows the spectral clustering sample labeling result obtained with the spectral clustering algorithm of the present invention, for a sample size of n=1000. The algorithm achieves a good clustering result on spherical data. The KNN-based spectral clustering algorithm is affected by the values of the scale parameter and the number of neighbors, which here take the defaults 2 and 5. As the figure shows, the classes assigned by spectral clustering sample labeling closely match the true two-class distribution, which indicates that the labeling is highly accurate and can greatly improve the accuracy of the detection result.
Further, because the number of transaction nodes in practice is huge, each step can be executed in parallel and the results then merged, for example using the MapReduce programming model. In this case, the spectral clustering procedure is as shown in Table 2.
Table 2. Spectral clustering procedure for larger-scale transaction data
Figure PCTCN2020135271-appb-000019
In step 6), because the known classes in the Bitcoin transaction data are the two classes legal and illegal, the number of clusters k is set to k=2. In other embodiments, the value of k can be set or calculated according to the classes involved.
In the spectral clustering process, the sample set data = $(x_1, x_2, \ldots, x_n)$ is first taken as input, the distance matrix of the transaction samples is constructed in parallel, and the result is then adjusted into a symmetric distance matrix. The distance can be measured using the Euclidean distance, the shortest path, or the geodesic distance, preferably the shortest path or the geodesic distance. The geodesic distance is the length of the shortest path from point A to point B along a curved surface (in three-dimensional space) without leaving the surface. The parallel construction of the distance matrix is shown in FIG. 8: once all of the map() tasks have finished, new key-value pairs have been generated. reduce() then merges the results of all partitions, that is, it traverses and combines the values written by map() under the same new key and fills in the values of each row column by column, yielding the complete distance matrix.
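The map/reduce construction can be emulated in a single process to show the data flow. This is a sketch under assumptions, not a Hadoop job and not the exact scheme of FIG. 8: the row index is used as the key, the Euclidean distance stands in for the preferred shortest-path or geodesic distance, and the names `map_row`, `reduce_rows`, and `distance_matrix` are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Sequence, Tuple

Point = Sequence[float]
KeyValue = Tuple[int, Tuple[int, float]]


def map_row(i: int, data: List[Point]) -> Iterator[KeyValue]:
    """map(): for sample i, emit (row key, (column index, distance)) pairs."""
    for j, other in enumerate(data):
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(data[i], other)))
        yield i, (j, dist)


def reduce_rows(pairs: Iterable[KeyValue], n: int) -> List[List[float]]:
    """reduce(): group values by row key and fill each row column by column."""
    grouped: Dict[int, List[Tuple[int, float]]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    matrix = [[0.0] * n for _ in range(n)]
    for i, values in grouped.items():
        for j, dist in values:
            matrix[i][j] = dist
    return matrix


def distance_matrix(data: List[Point]) -> List[List[float]]:
    """Run the emulated map and reduce phases, then symmetrize the result."""
    n = len(data)
    pairs = [kv for i in range(n) for kv in map_row(i, data)]
    dist = reduce_rows(pairs, n)
    return [[0.5 * (dist[i][j] + dist[j][i]) for j in range(n)] for i in range(n)]
```

Each call to `map_row` is independent of the others, which is what allows the rows to be computed on separate workers before the reduce phase assembles them.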
Next, the symmetric distance matrix is sparsified in parallel by row, and the Gaussian similarity is computed to obtain the similarity matrix W. The matrix $D' = D^{-1/2}$ is computed from the degree matrix D in parallel by row, and $L' = D'WD'$ is computed in parallel. Because these matrices are sparse, the computations can be carried out in parallel.
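A dense, single-process sketch of this step follows. It assumes the common Gaussian kernel $W_{ij} = \exp(-d_{ij}^2 / (2\sigma^2))$ with a hypothetical scale parameter `sigma`, and it skips the sparsification, so it illustrates only the formulas $D' = D^{-1/2}$ and $L' = D'WD'$.

```python
import math
from typing import List

Matrix = List[List[float]]


def gaussian_similarity(dist: Matrix, sigma: float = 1.0) -> Matrix:
    """W_ij = exp(-d_ij^2 / (2 sigma^2)); the kernel form is an assumption."""
    return [[math.exp(-(d * d) / (2.0 * sigma * sigma)) for d in row] for row in dist]


def normalized_matrix(w: Matrix) -> Matrix:
    """L' = D' W D' with D' = D^{-1/2}, where D holds the row sums of W."""
    d_inv_sqrt = [1.0 / math.sqrt(sum(row)) for row in w]
    n = len(w)
    return [[d_inv_sqrt[i] * w[i][j] * d_inv_sqrt[j] for j in range(n)]
            for i in range(n)]
```

Each row of L′ depends only on the matching row of W and on the degree vector, so the rows can be computed independently. A useful sanity check is that the vector of square-rooted degrees is an eigenvector of L′ with eigenvalue 1.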
The above parallel implementation yields the final sparse real symmetric matrix L′. The Lanczos method is suited to iteratively approximating the eigenvalues and eigenvectors of such large sparse matrices. Its idea is to convert the Laplacian matrix, by an orthogonal similarity transformation, into a real symmetric tridiagonal matrix
Figure PCTCN2020135271-appb-000020
The eigenvalues and eigenvectors obtained by decomposing $T_{kk}$ are taken as the eigenvalues and eigenvectors of L′. If only the first k eigenvalues are needed, the calculation can be completed in only k iterations, which is more efficient. The number of clusters k is set to 2 (legal transactions and illegal transactions), and the matrix formed by the eigenvectors $h_1, h_2, \ldots, h_k$ is normalized by row to form the feature matrix $H_{n \times k}$.
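The tridiagonalization at the heart of the Lanczos method can be sketched as follows. This is a textbook three-term recurrence without re-orthogonalization, written for a small dense matrix, so it is an illustration rather than the parallel sparse implementation the text has in mind. The function name `lanczos` and the choice of start vector are assumptions. The sketch returns only the diagonal and off-diagonal entries of the tridiagonal matrix and leaves its eigendecomposition to a separate solver.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]
Vector = List[float]


def lanczos(a: Matrix, v0: Vector, steps: int) -> Tuple[List[float], List[float]]:
    """Return the diagonal (alphas) and off-diagonal (betas) of the tridiagonal matrix."""
    n = len(a)
    norm = math.sqrt(sum(x * x for x in v0))
    q = [x / norm for x in v0]        # current Lanczos vector
    q_prev = [0.0] * n                # previous Lanczos vector
    beta_prev = 0.0
    alphas: List[float] = []
    betas: List[float] = []
    for step in range(steps):
        w = [sum(a[i][j] * q[j] for j in range(n)) for i in range(n)]  # w = A q
        alpha = sum(wi * qi for wi, qi in zip(w, q))
        # Remove the components along the current and previous Lanczos vectors.
        w = [wi - alpha * qi - beta_prev * pi for wi, qi, pi in zip(w, q, q_prev)]
        alphas.append(alpha)
        beta = math.sqrt(sum(x * x for x in w))
        if step == steps - 1 or beta < 1e-12:
            break  # finished, or the Krylov subspace is exhausted
        betas.append(beta)
        q_prev, q, beta_prev = q, [x / beta for x in w], beta
    return alphas, betas
```

When the iteration runs for the full dimension of the matrix, the tridiagonal matrix is orthogonally similar to the input, so its trace and Frobenius norm match those of the input. Production code would normally add re-orthogonalization or rely on an established sparse eigensolver.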
Finally, H is clustered with a parallelized K-means method to obtain the cluster partition $C(c_1, c_2)$, which completes the labeling of the transaction nodes with unknown labels.
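The final clustering step can be illustrated with a small, single-process version of Lloyd's K-means applied to a row-normalized feature matrix. The parallelization is not shown. Initializing the centres with the first k rows and the fixed iteration count are assumptions made to keep the example deterministic, and the toy matrix `h` is hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def row_normalize(h: Matrix) -> Matrix:
    """Scale every row of H to unit Euclidean length (all-zero rows are left as is)."""
    out = []
    for row in h:
        norm = math.sqrt(sum(v * v for v in row)) or 1.0
        out.append([v / norm for v in row])
    return out


def kmeans(points: Matrix, k: int, iterations: int = 20) -> List[int]:
    """Lloyd's algorithm, using the first k points as initial centres (assumption)."""
    centres = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centre.
        labels = [min(range(k),
                      key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centres[c])))
                  for p in points]
        # Update step: each centre moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical 4 x 2 feature matrix whose rows point in two distinct directions.
h = [[1.0, 1.0], [1.0, -1.0], [2.0, 2.1], [3.0, -2.9]]
labels = kmeans(row_normalize(h), k=2)
```

The two resulting clusters play the role of $c_1$ and $c_2$. Which cluster corresponds to legal and which to illegal transactions still has to be decided, for example from the nodes whose labels are already known.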
In the above embodiment, the transaction behavior history feature extraction module is configured to process the transaction sample set with a long short-term memory network to obtain transaction behavior history features. That is, the transaction behavior history features are obtained by learning the historical characteristics of the transaction behavior.
During Bitcoin transactions, each transaction is broadcast to the Bitcoin network at a particular time, and the transaction history over the preceding period is sufficient to influence the prediction of whether the transaction at the next step is legal. In other words, a historical feature sequence over a suitable number of time steps is sufficient to influence whether the transaction is legal at the next prediction step. This step therefore feeds the time series of each transaction node to a long short-term memory (LSTM) network, which learns the historical behavior of the transaction node and distills the transaction behavior history features.
LSTM is designed to address the long-term dependency problem. On top of an RNN it adds three gates, namely the input gate, the forget gate, and the output gate, which filter historical information effectively. The final output $h_t$ is determined by the output gate $o_t$ and the long-term cell state $C_t$,
where $h_t = \mathrm{LSTM}(x_t)$.
The LSTM computation is as follows:
$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$
$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$
Figure PCTCN2020135271-appb-000021
Figure PCTCN2020135271-appb-000022
$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$
$h_t = o_t * \tanh(C_t)$
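One LSTM time step can be written out directly from these equations. The two formulas that are only available as images are assumed here to be the standard candidate state $\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)$ and cell update $C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$. The parameter dictionary and its key names are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def affine(w: List[Vector], v: Vector, b: Vector) -> Vector:
    """Compute W·v + b for a dense weight matrix W."""
    return [sum(wij * vj for wij, vj in zip(row, v)) + bi for row, bi in zip(w, b)]


def lstm_step(x_t: Vector, h_prev: Vector, c_prev: Vector,
              params: Dict[str, list]) -> Tuple[Vector, Vector]:
    """Advance the LSTM by one time step and return (h_t, C_t)."""
    z = h_prev + x_t  # concatenation [h_{t-1}, x_t]
    f_t = [sigmoid(v) for v in affine(params["W_f"], z, params["b_f"])]      # forget gate
    i_t = [sigmoid(v) for v in affine(params["W_i"], z, params["b_i"])]      # input gate
    c_hat = [math.tanh(v) for v in affine(params["W_c"], z, params["b_c"])]  # candidate (assumed)
    o_t = [sigmoid(v) for v in affine(params["W_o"], z, params["b_o"])]      # output gate
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_hat)]     # cell update (assumed)
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t
```

Calling `lstm_step` repeatedly over the feature vectors of one transaction node, carrying `h_t` and `c_t` forward, produces the history feature that is passed to the later layers.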
In this step, the transaction node time-series data from the transaction sample set obtained in the transaction data preprocessing step, or from the spectral clustering transaction sample set obtained in the spectral clustering sample labeling step, are fed into the LSTM neural network layer, whose output is $h_t = \mathrm{LSTM}(x_t)$. In this embodiment, the step length is set to 10 to learn the transaction behavior history features.
As an optional implementation, in the transaction behavior aggregation feature extraction module, the graph convolutional network processing includes the following sub-steps:
obtaining the adjacency matrix of the transaction behavior history features;
inputting the adjacency matrix into a graph convolutional network graph learning layer of 2 to 4 layers for feature propagation among neighbors, with a nonlinear activation applied on the outside after each layer.
Preferably, the number of layers is set to 2 to 4. With too many layers, the learning of local node features is impaired and what is learned is global features.
Preferably, the adjacency matrix is input into a 2-layer graph convolutional network graph learning layer for feature propagation among neighbors, with a nonlinear activation applied on the outside after each layer.
The adjacency matrix of the transaction behavior history features can be computed from two parts. The first part indicates whether two nodes are connected by an edge, and is set to 1 if they are. Because the data are time series, the second part can be a similarity measure between the feature sequences of the two nodes. The weighted sum of the two parts is the similarity between a node and its neighbor node.
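A minimal sketch of this two-part adjacency matrix is shown below. The text does not fix the similarity measure or the weights, so the cosine similarity and the equal weights `w_edge` and `w_sim` are assumptions, as are the function names. The sketch scores every pair of nodes; restricting the score to connected pairs would be an equally plausible reading.

```python
import math
from typing import List, Sequence, Set, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two feature sequences (an assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def weighted_adjacency(edges: Set[Tuple[int, int]], sequences: List[Sequence[float]],
                       w_edge: float = 0.5, w_sim: float = 0.5) -> List[List[float]]:
    """A_ij = w_edge * [i and j share an edge] + w_sim * similarity(seq_i, seq_j)."""
    n = len(sequences)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # no self-loops in this sketch
            has_edge = 1.0 if (i, j) in edges or (j, i) in edges else 0.0
            sim = cosine_similarity(sequences[i], sequences[j])
            adj[i][j] = w_edge * has_edge + w_sim * sim
    return adj
```

The resulting matrix is symmetric and can be passed to the graph convolution layers in place of a purely binary adjacency matrix.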
Further, the principle and process of the graph convolutional network graph learning layer are as follows:
Because the Bitcoin transaction data form a graph, the present invention mainly uses graph-based methods to detect transaction fraud. The goal of graph-based learning is to train a prediction function
Figure PCTCN2020135271-appb-000023
that maps the feature space of an entity to the target label space. This is usually achieved by minimizing an objective loss function, which can be abstracted as $I = \Omega + \lambda\Phi$.
where Ω is the loss for the specific prediction task, which measures the error between the true and predicted values; Φ is the graph regularization term, which makes the predictions smooth over the graph; and λ is a hyperparameter that balances the two terms. The regularization term usually implements the smoothness assumption for graph signals, namely that similar vertices tend to have similar predictions, which preserves the topological relationships of the graph. A widely used definition of the regularization term Φ is given below. It is a weighted measure based on the Euclidean distance, belongs to the variation measures of graph signals, and characterizes overall smoothness. When $g(x_i, x_j)$ is 1, it reduces to the Euclidean distance:
Figure PCTCN2020135271-appb-000024
where $g(x_i, x_j)$ is the similarity measure between the feature vectors of a pair of entities, and
Figure PCTCN2020135271-appb-000025
is the degree of vertex i. The regularization term smooths each pair of entities so that their predictions (after normalization by degree) are close to each other. The strength of the smoothing is determined by the similarity $g(x_i, x_j)$ of the feature vectors. The term can be written equivalently in a more compact matrix form:
Figure PCTCN2020135271-appb-000026
Figure PCTCN2020135271-appb-000027
is the vector of predicted values, and L is the Laplacian matrix of the graph, that is,
Figure PCTCN2020135271-appb-000028
where A is the similarity matrix, whose elements are $g(x_i, x_j)$.
The graph convolutional network (GCN) is a special graph-based learning method that has developed rapidly in recent years. It combines the core idea of graph-based learning with advanced convolutional neural networks (CNNs). The core idea of standard CNNs is to use convolutions (for example, 3×3 filter matrices) to capture local patterns in the input data (for example, oblique lines in an image). Following this idea, the goal of GCN is likewise to capture local connection patterns on the graph through convolution. However, the intuitive solution of applying the convolution operation directly to the adjacency matrix of the graph is not feasible: when two rows of the adjacency matrix are swapped, the filtered output of the convolution may change even though the swapped adjacency matrix still represents the same graph structure. This is caused by the unordered nature of graph nodes. The present invention addresses the problem in two ways. One approach uses the K-nearest neighbor (KNN) algorithm to process and sort the neighbor nodes around a vertex to obtain a normalized output for the vertex, and then applies a graph convolutional network method such as LGCL (Learn Graph Convolution Layer): the learnable graph convolution layer automatically selects a fixed number of neighbor nodes for each feature by ranking on value, thereby transforming the graph-structured data into regular one-dimensional grid data, on which standard CNN operations are then applied. The other approach uses spectral convolution to capture local connections in the Fourier domain, applying the graph Fourier transform (GFT) to the graph, for example: $y = Tx = f(F, X) = UFU^{T}X$.
where f denotes the filtering operation T of the parameterized convolution, and U is the matrix whose columns are the eigenvectors of L. Reading the formula from the right, $U^{T}X$ is the forward GFT, which projects X onto each eigenvector to obtain the Fourier coefficients α (in the spectral domain). Next comes Fα, which scales by the eigenvalues: the elements of the diagonal matrix F are the eigenvalues of L, and the higher the frequency, the larger the scaling coefficient λ, which means that L is a high-pass filter. The scaled vector
Figure PCTCN2020135271-appb-000029
is then left-multiplied by U, which is the inverse GFT and is equivalent to transforming the frequency-domain information back to the time domain.
The above calculation requires the eigendecomposition of L, which is computationally expensive. Preferably, F is treated as a function of Λ, so that F can be represented by a K-th order approximation using the Chebyshev polynomials $T_k(x)$:
Figure PCTCN2020135271-appb-000030
where
Figure PCTCN2020135271-appb-000031
$\lambda_{max}$ is the largest eigenvalue of L; $\theta_k$ are the coefficients of the Chebyshev polynomials; $T_k(x) = 2xT_{k-1}(x) - T_{k-2}(x)$, with $T_1(x) = x$ and $T_0(x) = 1$. With this approximation, the N parameters are reduced to k parameters and the complexity is O(1). Eigenvalues and eigenvectors no longer need to be computed, but interpretability in the frequency domain is lost. The Chebyshev GCN approximation for K=1 is derived as follows:
Figure PCTCN2020135271-appb-000032
Here X is the matrix of original vertex features, with dimension N×C; W is the parameter matrix to be learned, with dimension C×F, where F is the output feature dimension. The output of one first-order graph convolution therefore has dimension N×F.
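The Chebyshev approximation discussed above can be checked numerically with a short sketch. The formula images are not recoverable from this text, so the sketch assumes the commonly used rescaling 2L/λ_max − I and the standard initial conditions T_0(x) = 1, T_1(x) = x; the function name `chebyshev_filter` is hypothetical. For simplicity the example computes λ_max exactly, whereas in practice an upper bound would typically be used.

```python
import numpy as np

def chebyshev_filter(lap, x, theta, lam_max):
    """Compute sum_k theta[k] * T_k(L_scaled) @ x via the three-term recurrence."""
    n = lap.shape[0]
    lap_scaled = 2.0 * lap / lam_max - np.eye(n)  # assumed rescaling of the spectrum to [-1, 1]
    t_prev = x                                    # T_0(L_scaled) x = x
    out = theta[0] * t_prev
    if len(theta) == 1:
        return out
    t_curr = lap_scaled @ x                       # T_1(L_scaled) x = L_scaled x
    out = out + theta[1] * t_curr
    for k in range(2, len(theta)):
        t_next = 2.0 * (lap_scaled @ t_curr) - t_prev
        out = out + theta[k] * t_next
        t_prev, t_curr = t_curr, t_next
    return out

# 4-node path graph; a unit impulse on vertex 0 as the input signal.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
lap = np.diag(adj.sum(axis=1)) - adj
lam_max = np.linalg.eigvalsh(lap).max()
x = np.array([[1.0], [0.0], [0.0], [0.0]])
theta = [0.5, -0.3, 0.2]                          # illustrative coefficients
y_cheb = chebyshev_filter(lap, x, theta, lam_max)
```

Only matrix-vector products with the (sparse, in practice) Laplacian are needed, which is why the eigendecomposition can be avoided.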
When a second approximate graph convolution (two-hop) is applied to the above output (one-hop), the mathematical expression is as follows:
Figure PCTCN2020135271-appb-000033
where
Figure PCTCN2020135271-appb-000034
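A two-layer (two-hop) forward pass can be sketched as below. Because the formula images are not recoverable from this text, the sketch assumes the widely used renormalized adjacency matrix (self-loops added, then symmetric degree normalization) and a ReLU activation; `normalize_adjacency`, `two_layer_gcn`, and the weight shapes are illustrative choices, not the patent's definitions.

```python
import numpy as np

def normalize_adjacency(adj):
    """Assumed normalization: D~^{-1/2} (A + I) D~^{-1/2}."""
    a_tilde = adj + np.eye(adj.shape[0])            # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_tilde.sum(axis=1))
    return a_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def relu(z):
    return np.maximum(z, 0.0)

def two_layer_gcn(adj, x, w0, w1):
    """One-hop propagation with activation, followed by a second hop."""
    a_hat = normalize_adjacency(adj)
    h = relu(a_hat @ x @ w0)                        # N x C -> N x H (one-hop)
    return a_hat @ h @ w1                           # N x H -> N x F (two-hop)

rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)         # path graph 0-1-2-3
x = rng.normal(size=(4, 3))                         # N = 4 vertices, C = 3 features
w0 = rng.normal(size=(3, 5))                        # hidden width H = 5
w1 = rng.normal(size=(5, 2))                        # output width F = 2
out = two_layer_gcn(adj, x, w0, w1)                 # shape (4, 2)
```

Each application of the normalized adjacency mixes information from direct neighbors only, so stacking two layers lets a vertex see vertices two hops away.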
In the above embodiment, the prediction module is configured to pass the historical features of the transaction behavior and the aggregated features of the transaction behavior through a fully connected layer, and to predict fraud for each transaction node by binary classification.
Considering that traditional machine learning also performs well for fraud detection, it is further preferred to pass the historical features of the transaction behavior, the aggregated features of the transaction behavior, and the output of a traditional machine learning model through the fully connected layer, and then to perform binary classification to obtain the final prediction of whether a transaction node is an illegal transaction (that is, to predict whether the label of the transaction node under test is "legal transaction" or "illegal transaction").
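The prediction step can be illustrated with a small sketch that concatenates the feature groups, applies a single fully connected layer, and thresholds a sigmoid output. The single-layer head, the 0.5 threshold, and all names (`predict_fraud`, `hist_feat`, `agg_feat`, `ml_score`) are assumptions made for illustration; the source does not specify the layer sizes or the decision rule.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_fraud(hist_feat, agg_feat, w, b, ml_score=None, threshold=0.5):
    """Concatenate features, apply one fully connected layer, classify in two classes."""
    parts = [hist_feat, agg_feat]
    if ml_score is not None:                        # optional traditional-model output
        parts.append(ml_score.reshape(-1, 1))
    z = np.concatenate(parts, axis=1)               # N x (D_hist + D_agg [+ 1])
    prob = sigmoid(z @ w + b)                       # probability of "illegal" per node
    label = (prob >= threshold).astype(int)         # 1 = illegal, 0 = legal (assumed coding)
    return prob, label

rng = np.random.default_rng(0)
hist = rng.normal(size=(5, 4))                      # stand-in for historical (LSTM) features
agg = rng.normal(size=(5, 3))                       # stand-in for aggregated (GCN) features
ml = rng.uniform(size=5)                            # stand-in for a traditional model's score
w = rng.normal(size=8)                              # 4 + 3 + 1 input features
prob, label = predict_fraud(hist, agg, w, 0.0, ml_score=ml)
```

In a trained system the weights would be learned jointly with the upstream networks; random values are used here only to show the shapes involved.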
The above description of the disclosed embodiments is provided to enable those skilled in the art to make or use the present invention. Modifications to the above embodiments, or extensions of them to other e-commerce platforms, will be readily apparent to those skilled in the art, and the general principles defined herein may be applied in other embodiments without departing from the spirit or scope of the invention. Therefore, the present invention is not limited to the embodiments described above, but is to be accorded the widest scope consistent with the technical principles and novel features disclosed herein.

Claims (10)

  1. A transaction fraud detection method based on a graph neural network, characterized in that the method comprises the following steps:
    a transaction data preprocessing step of acquiring transaction data and preprocessing the transaction data to obtain a transaction sample set in panel form;
    a transaction behavior historical feature extraction step of processing the transaction sample set with a long short-term memory network to obtain historical features of the transaction behavior;
    a transaction behavior aggregated feature extraction step of processing the historical features of the transaction behavior with a graph convolutional network to obtain aggregated features of the transaction behavior;
    a prediction step of applying fully connected processing to the historical features of the transaction behavior and the aggregated features of the transaction behavior, and predicting fraud for transaction nodes by binary classification.
  2. The transaction fraud detection method based on a graph neural network according to claim 1, characterized in that, in the transaction data preprocessing step, the preprocessing comprises the following sub-steps:
    obtaining local transaction node features of the transaction data;
    obtaining aggregated transaction node features of the transaction data;
    obtaining transaction node subgraph information of the transaction data.
  3. The transaction fraud detection method based on a graph neural network according to claim 1, characterized in that, before the transaction behavior historical feature extraction step, the method further comprises the following step:
    a spectral clustering sample labeling step of applying spectral clustering sample labeling to the transaction sample set to obtain a spectrally clustered transaction sample set.
  4. The transaction fraud detection method based on a graph neural network according to claim 3, characterized in that, in the spectral clustering sample labeling step, the spectral clustering sample labeling comprises the following sub-steps:
    constructing a spectral matrix of the transaction sample set;
    decomposing the spectral matrix by eigenvalue decomposition into a feature matrix;
    clustering the feature matrix.
  5. The transaction fraud detection method based on a graph neural network according to any one of claims 1 to 4, characterized in that, in the transaction behavior aggregated feature extraction step, the graph convolutional network processing comprises the following sub-steps:
    obtaining an adjacency matrix for the historical features of the transaction behavior;
    inputting the adjacency matrix into 2 to 4 graph learning layers of the graph convolutional network for feature propagation between neighbors, with a nonlinear activation applied to the output of each layer.
  6. A transaction fraud detection system based on a graph neural network, characterized in that the system comprises the following modules:
    a transaction data preprocessing module configured to acquire transaction data and preprocess the transaction data to obtain a transaction sample set in panel form;
    a transaction behavior historical feature extraction module configured to process the transaction sample set with a long short-term memory network to obtain historical features of the transaction behavior;
    a transaction behavior aggregated feature extraction module configured to process the historical features of the transaction behavior with a graph convolutional network to obtain aggregated features of the transaction behavior;
    a prediction module configured to apply fully connected processing to the historical features of the transaction behavior and the aggregated features of the transaction behavior, and to predict fraud for transaction nodes by binary classification.
  7. The transaction fraud detection system based on a graph neural network according to claim 6, characterized in that, in the transaction data preprocessing module, the preprocessing comprises the following sub-steps:
    obtaining local transaction node features of the transaction data;
    obtaining aggregated transaction node features of the transaction data;
    obtaining transaction node subgraph information of the transaction data.
  8. The transaction fraud detection system based on a graph neural network according to claim 6, characterized in that the system further comprises a spectral clustering sample labeling module configured to apply spectral clustering sample labeling to the transaction sample set to obtain a spectrally clustered transaction sample set.
  9. The transaction fraud detection system based on a graph neural network according to claim 8, characterized in that, in the spectral clustering sample labeling module, the spectral clustering sample labeling comprises the following sub-steps:
    constructing a spectral matrix of the transaction sample set;
    decomposing the spectral matrix by eigenvalue decomposition into a feature matrix;
    clustering the feature matrix.
  10. The transaction fraud detection system based on a graph neural network according to any one of claims 6 to 9, characterized in that, in the transaction behavior aggregated feature extraction module, the graph convolutional network processing comprises the following sub-steps:
    obtaining an adjacency matrix for the historical features of the transaction behavior;
    inputting the adjacency matrix into 2 to 4 graph learning layers of the graph convolutional network for feature propagation between neighbors, with a nonlinear activation applied to the output of each layer.
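The three sub-steps of spectral clustering sample labeling recited in the claims (construct the spectral matrix, eigendecompose it into a feature matrix, cluster the feature matrix) can be sketched as follows. The claims do not define the spectral matrix or the clustering algorithm, so this sketch assumes the unnormalized Laplacian of a Gaussian similarity graph and a basic k-means; every function name and parameter here is hypothetical.

```python
import numpy as np

def spectral_matrix(samples, sigma=2.0):
    """Assumed spectral matrix: Laplacian D - W of a Gaussian similarity graph."""
    sq = ((samples[:, None, :] - samples[None, :, :]) ** 2).sum(axis=-1)
    w = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(w, 0.0)                        # no self-similarity edges
    return np.diag(w.sum(axis=1)) - w

def eigen_features(lap, k):
    """Feature matrix: eigenvectors of the k smallest eigenvalues, one row per sample."""
    _, vecs = np.linalg.eigh(lap)                   # eigenvalues come back in ascending order
    return vecs[:, :k]

def kmeans(points, k, iters=50):
    """Basic k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    for _ in range(1, k):
        d = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels

# Two well-separated groups of toy "transaction" samples.
samples = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
features = eigen_features(spectral_matrix(samples), k=2)
labels = kmeans(features, k=2)
```

The cluster indices are arbitrary; mapping a cluster to a "legal" or "illegal" label would require additional information that the claims leave open.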
PCT/CN2020/135271 2020-11-02 2020-12-10 Graph neural network-based transaction fraud detection method and system WO2022088408A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011203297.3A CN112396160A (en) 2020-11-02 2020-11-02 Transaction fraud detection method and system based on graph neural network
CN202011203297.3 2020-11-02

Publications (1)

Publication Number Publication Date
WO2022088408A1 true WO2022088408A1 (en) 2022-05-05

Family

ID=74599110

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/135271 WO2022088408A1 (en) 2020-11-02 2020-12-10 Graph neural network-based transaction fraud detection method and system

Country Status (2)

Country Link
CN (1) CN112396160A (en)
WO (1) WO2022088408A1 (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11575695B2 (en) * 2021-04-02 2023-02-07 Sift Sciences, Inc. Systems and methods for intelligently constructing a backbone network graph and identifying and mitigating digital threats based thereon in a machine learning task-oriented digital threat mitigation platform
CN113362071A (en) * 2021-06-21 2021-09-07 浙江工业大学 Pompe fraudster identification method and system for Ether house platform
CN113627947A (en) * 2021-08-10 2021-11-09 同盾科技有限公司 Transaction behavior detection method and device, electronic equipment and storage medium
CN114372803A (en) * 2021-12-14 2022-04-19 同济大学 Quick anti-money laundering detection method based on transaction map
CN117408806A (en) * 2022-07-07 2024-01-16 汇丰软件开发(广东)有限公司 Method for identifying price manipulation behavior in cryptocurrency market
CN115345736B (en) * 2022-07-14 2023-12-29 上海即科智能技术集团有限公司 Abnormal behavior detection method for financial transaction
CN114972366B (en) * 2022-07-27 2022-11-18 山东大学 Full-automatic segmentation method and system for cerebral cortex surface based on graph network

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160364794A1 (en) * 2015-06-09 2016-12-15 International Business Machines Corporation Scoring transactional fraud using features of transaction payment relationship graphs
CN108960304A (en) * 2018-06-20 2018-12-07 东华大学 A kind of deep learning detection method of network trading fraud
CN110084603A (en) * 2018-01-26 2019-08-02 阿里巴巴集团控股有限公司 Method, detection method and the corresponding intrument of training fraudulent trading detection model
CN111311416A (en) * 2020-02-28 2020-06-19 杭州云象网络技术有限公司 Block chain money laundering node detection method based on multichannel graph and graph neural network
CN111462088A (en) * 2020-04-01 2020-07-28 深圳前海微众银行股份有限公司 Data processing method, device, equipment and medium based on graph convolution neural network

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109816535A (en) * 2018-12-13 2019-05-28 中国平安财产保险股份有限公司 Cheat recognition methods, device, computer equipment and storage medium

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115423542A (en) * 2022-11-07 2022-12-02 中邮消费金融有限公司 Old belt new activity anti-fraud identification method and system
CN116128130A (en) * 2023-01-31 2023-05-16 广东电网有限责任公司 Short-term wind energy data prediction method and device based on graphic neural network
CN116128130B (en) * 2023-01-31 2023-10-24 广东电网有限责任公司 Short-term wind energy data prediction method and device based on graphic neural network
CN116032670A (en) * 2023-03-30 2023-04-28 南京大学 Ethernet phishing fraud detection method based on self-supervision depth map learning
CN116629080A (en) * 2023-07-24 2023-08-22 福建农林大学 Method for predicting rolling of steel pipe concrete superposed member impact displacement time course chart
CN116629080B (en) * 2023-07-24 2023-09-26 福建农林大学 Method for predicting rolling of steel pipe concrete superposed member impact displacement time course chart
CN117057929A (en) * 2023-10-11 2023-11-14 中邮消费金融有限公司 Abnormal user behavior detection method, device, equipment and storage medium
CN117057929B (en) * 2023-10-11 2024-01-26 中邮消费金融有限公司 Abnormal user behavior detection method, device, equipment and storage medium
CN117455518A (en) * 2023-12-25 2024-01-26 连连银通电子支付有限公司 Fraudulent transaction detection method and device
CN117455518B (en) * 2023-12-25 2024-04-19 连连银通电子支付有限公司 Fraudulent transaction detection method and device

Also Published As

Publication number Publication date
CN112396160A (en) 2021-02-23

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20959567

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 111023)