CN113469804A - Abnormal key account discovery method, system, equipment and storage medium based on graph neural network - Google Patents


Info

Publication number
CN113469804A
Authority
CN
China
Prior art keywords
account
abnormal
node
neural network
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110805932.3A
Other languages
Chinese (zh)
Other versions
CN113469804B (en)
Inventor
魏学光
黄俊恒
魏玉良
王佰玲
刘红日
王巍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute of Technology Weihai
Original Assignee
Harbin Institute of Technology Weihai
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute of Technology Weihai filed Critical Harbin Institute of Technology Weihai
Priority to CN202110805932.3A priority Critical patent/CN113469804B/en
Publication of CN113469804A publication Critical patent/CN113469804A/en
Application granted granted Critical
Publication of CN113469804B publication Critical patent/CN113469804B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/02 Banking, e.g. interest calculation or account maintenance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Finance (AREA)
  • Software Systems (AREA)
  • Accounting & Taxation (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Molecular Biology (AREA)
  • Technology Law (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The invention relates to a method, system, device and storage medium for discovering abnormal key accounts based on a graph neural network. The method comprises the following steps: (1) data preprocessing: the historical transaction records of abnormal financial accounts are subjected in sequence to operations such as data cleaning, key data item extraction and construction of intra-organization account transaction relationships; (2) construction of the abnormal-organization financial transaction network graph: the graph is built from the intra-organization account transaction relationships constructed in step (1); (3) discovery of abnormal-organization key accounts: key accounts are discovered by the trained TRGA model. The invention achieves a good abnormal key account discovery effect. It can provide auxiliary analysis and judgment information for investigators, improving working efficiency and saving time. As more labeled abnormal data are discovered, the classification model can be further improved, and the accuracy of detection and identification tends to increase.

Description

Abnormal key account discovery method, system, equipment and storage medium based on graph neural network
Technical Field
The invention relates to a method, a system, equipment and a storage medium for discovering an abnormal key account based on a graph neural network, belonging to the technical field of machine learning.
Background
If a financial account is a leader account, a subscription account or a rebate account of an abnormal organization, it is called a key account of the abnormal organization. The funds of an abnormal organization flow with a certain regularity and particularity, and can be divided into two parts: subscription funds and rebate funds. Subscription funds are mainly transferred to the subscription accounts among the key accounts; this fund flow is characterized by frequent transactions with a small amount per transaction. After accumulating a portion of the subscription funds, the subscription account transfers them to a leader account of the organization. The leader account withholds part of the funds as its share and then transfers the remaining funds to the rebate account, which pays rebates to lower-level accounts. Traditional key account discovery relies on manually constructed features for analysis, which is costly in labor and time and cannot effectively process the large-scale data sets that continue to emerge.
Existing graph-based research generally falls into methods based on manual features, methods based on random walks, and methods based on graph neural networks. Manual-feature methods perform graph learning on hand-constructed features such as node degree, centrality and PageRank values; they depend strongly on manual intervention, and the effectiveness of the constructed features for the learning task cannot be guaranteed. Random-walk methods convert the graph into sequences and cannot make full use of the structural information of the graph. Graph-neural-network methods can learn end to end for the target task and thereby obtain effective features, but existing graph neural network models do not sufficiently extract the topological structure features of the graph.
At present, great progress has been made in abnormal organization identification based on machine learning or deep learning. Many studies apply neural network models such as GNNs and CNNs to abnormal organization identification with remarkable effect. However, when the data volume is insufficient or the abnormal organization data are incomplete, the results obtained by identification methods based on the topological features of the abnormal network are likely to be inaccurate and of low usability. At present, no well-performing deep learning model has been combined with a graph neural network to detect abnormal key accounts, so as to reduce the labor cost of labor-intensive feature engineering and improve the ability to discover abnormal key accounts.
Disclosure of Invention
In view of the defects of the prior art, the invention provides a method and a system for discovering abnormal key accounts based on a graph neural network.
Targeting the abnormal financial transaction behavior of abnormal organizations, and according to the characteristics of account transactions and fund flows, the method applies machine learning, deep learning, complex network research and combinatorial optimization methods to analyze and mine information such as abnormal financial accounts and the key fund flows of an organization from a complex financial transaction network. The invention can be used for: 1) feature mining of abnormal key accounts based on abnormal transaction flow data; 2) key account discovery based on abnormal transaction flow data.
The invention also provides computer equipment and a storage medium.
Interpretation of terms:
1. Transaction account: a bank account that actively initiates a fund transfer-out action;
2. Counterparty account (also rendered "opponent account" in this text): a bank account that receives the funds.
The technical scheme of the invention is as follows:
An abnormal key account discovery method based on a graph neural network comprises the following steps:
(1) Data preprocessing: data cleaning, key data item extraction and construction of intra-organization account transaction relationships are performed in sequence on the historical transaction records of abnormal financial accounts.
Data cleaning means: removing all transaction data involving normal accounts and keeping only the historical transaction records in which both parties to the transaction are abnormal financial accounts.
Key data item extraction means: extracting the transaction account, counterparty account and in/out flag data items from the historical transaction records of the abnormal financial accounts.
Construction of intra-organization account transaction relationships means:
Two data items are created for the new data, i.e. the intra-organization account transaction relationships: a source account and a target account. The source account is the account from which a certain amount is transferred out in the current transaction, and the target account is the account that receives the amount transferred from the source account. For each historical transaction record of an abnormal financial account, if the in/out flag is 'out', the source account is the transaction account and the target account is the counterparty account; if the in/out flag is 'in', the source account is the counterparty account and the target account is the transaction account.
If a record does not yet exist in the new data, the record containing the source account and the target account is added to the new data. At the same time, each abnormal financial account is re-encoded, being mapped to a code in the interval from 0 to the number of abnormal accounts.
(2) Construction of the abnormal-organization financial transaction network graph:
The abnormal-organization financial transaction network graph is constructed from the intra-organization account transaction relationships built in step (1). In this graph, a node represents the code of an abnormal financial account, a directed edge connecting two nodes indicates that a transfer transaction has taken place between the two abnormal financial accounts, and the direction of the arrow indicates the direction of the fund flow.
(3) Discovery of abnormal-organization key accounts: key accounts of the abnormal organization are discovered by the trained TRGA model.
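Steps (1) and (2) above can be illustrated with a minimal sketch. The record layout `(transaction_account, counterparty_account, in_out_flag)`, the function names and the sample data are assumptions made for illustration; the patent does not specify a concrete data format. Codes are assigned consecutively from 0 in order of first appearance.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical record layout: (transaction_account, counterparty_account, in_out_flag)
Record = Tuple[str, str, str]
Edge = Tuple[int, int]


def build_relationships(records: Iterable[Record]) -> Tuple[List[Edge], Dict[str, int]]:
    """Turn in/out-flagged records into deduplicated (source, target) code pairs."""
    codes: Dict[str, int] = {}
    seen = set()
    edges: List[Edge] = []
    for account, counterparty, flag in records:
        if flag == "out":
            source, target = account, counterparty
        elif flag == "in":
            source, target = counterparty, account
        else:
            raise ValueError(f"unexpected in/out flag: {flag!r}")
        # Re-encode each account as a consecutive integer starting at 0.
        for name in (source, target):
            codes.setdefault(name, len(codes))
        pair = (codes[source], codes[target])
        # Add the relationship only if it is not already present.
        if pair not in seen:
            seen.add(pair)
            edges.append(pair)
    return edges, codes


def build_directed_graph(edges: Iterable[Edge]) -> Tuple[Dict[int, List[int]], Dict[int, List[int]]]:
    """Return (out_neighbors, in_neighbors); edge direction follows the fund flow."""
    out_neighbors = defaultdict(list)
    in_neighbors = defaultdict(list)
    for source, target in edges:
        out_neighbors[source].append(target)
        in_neighbors[target].append(source)
    return dict(out_neighbors), dict(in_neighbors)


# Illustrative data only: the second record repeats the first from B's side.
records = [("A", "B", "out"), ("B", "A", "in"), ("C", "A", "out")]
edges, codes = build_relationships(records)
out_nb, in_nb = build_directed_graph(edges)
```

Keeping separate outgoing and incoming neighbor lists is a design choice that matches the later treatment of forward and reverse edges as two edge types.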
According to the invention, in step (1), a threshold method is used for data cleaning, specifically: if the absolute difference between the number of fund inflows and the number of fund outflows for the current transaction record is smaller than a given threshold, the account is determined to be a normal account and is removed; otherwise, the current transaction record is kept.
According to the invention, the step (3) is preferably realized by the following steps:
3.1 constructing and training a TRGA model;
3.2 extracting the topological characteristics of the account transaction, and finding the key account of abnormal organization.
Preferably, according to the invention, the TRGA model comprises an input layer, a three-route graph neural network layer, a multi-head attention mechanism layer, a linear layer and a Softmax layer, connected in sequence.
The input to the input layer of the TRGA model is the abnormal-organization financial transaction network graph Graph and the one-hot features X of the account nodes.
The three-route graph neural network layer aggregates features for the nodes of the abnormal-organization financial transaction network graph from three different perspectives to update the node features. The node features obtained from the different routes are then concatenated, and the three resulting node features are weighted by the multi-head attention mechanism layer, so that the TRGA model focuses on the more effective node topological structure information. The TRGA model then integrates the output of the multi-head attention mechanism layer: the linear layer integrates the feature vectors of the nodes and reduces their dimensionality. The TRGA model finally outputs a vector of length 2, and abnormal-organization key account discovery is achieved based on this output vector.
Preferably, according to the invention, each route of the graph neural network layer of the TRGA model aggregates the adjacent-node information of the nodes independently.
Preferably, according to the invention, in the TRGA model the abnormal-organization financial transaction network graph Graph and the one-hot features X of the account nodes are input into each of the three routes of the graph neural network layer. Within each route, the input of every network layer other than the first is the output of the previous network layer in that route. Each route finally yields a node feature matrix of the same dimension.
Preferably, according to the invention, the first route of the graph neural network layer of the TRGA model extracts the account node features in the abnormal-organization financial transaction network graph through a multi-head graph attention layer, specifically: the multi-head graph attention layer uses multiple groups of independent attention mechanisms to discover multiple kinds of correlation between a central node and all of its adjacent nodes, assigns different attention weights to the adjacent nodes of the central node, and learns multiple correlation features between the central node and its adjacent nodes.
Further preferably, let the central node be $v_i$. The complete formula for the attention weight between the central node $v_i$ and its adjacent node $v_j$ is given by formula (I):
[Formula (I): provided as an image in the original; not reproduced in this text]
In formula (I), $\alpha_{ij}^{(k)}$ is the attention weight coefficient between the central node $v_i$ and its adjacent node $v_j$ at the $k$-th layer, $h_i^{(k)}$ is the feature vector of node $v_i$ at the $k$-th layer, $h_j^{(k)}$ is the feature vector of node $v_j$ at the $k$-th layer, $W$ is a weight matrix, $a^T$ is a weight parameter, the activation function is LeakyReLU(·), $N(v_i)$ denotes the set of adjacent nodes of $v_i$, and $v_j$ denotes an adjacent node of $v_i$;
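The definitions above (a weight matrix $W$, a weight parameter $a^T$, a LeakyReLU activation and normalisation over the adjacent-node set $N(v_i)$) match the standard graph attention coefficient. Assuming formula (I) takes that standard form, and writing $\alpha_{ij}^{(k)}$ for the attention weight and $h_i^{(k)}$ for the feature vector of $v_i$ at layer $k$ (notation chosen here, since the original symbols are images), a plausible reconstruction is:

```latex
\alpha_{ij}^{(k)} =
\frac{\exp\!\left(\mathrm{LeakyReLU}\!\left(a^{T}\left[\,W h_i^{(k)} \,\Vert\, W h_j^{(k)}\right]\right)\right)}
     {\sum_{v_l \in N(v_i)} \exp\!\left(\mathrm{LeakyReLU}\!\left(a^{T}\left[\,W h_i^{(k)} \,\Vert\, W h_l^{(k)}\right]\right)\right)}
```

This is a reconstruction under the stated assumption, not the formula as printed in the patent.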
The feature vector of node $v_i$ at the $(k+1)$-th layer is given by formula (II):
[Formula (II): provided as an image in the original; not reproduced in this text]
In formula (II), $w^{(k)}$ is the weight parameter of the $k$-th layer node feature transformation, $\sigma(\cdot)$ is the sigmoid activation function, and $\Vert$ denotes the concatenation operation. The final feature embedding of the first route of the graph neural network layer is obtained by aggregating the feature vectors of its nodes.
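Given a per-layer transformation weight $w^{(k)}$, a sigmoid activation and a concatenation operator, formula (II) is consistent with the usual multi-head graph attention update, in which each head aggregates attention-weighted neighbor features and the head outputs are concatenated. The following is a hedged sketch of that form; the head index $m$, the head count $M$, and the symbols $\alpha_{ij}^{(k,m)}$ and $h_j^{(k)}$ are assumptions introduced for illustration:

```latex
h_i^{(k+1)} = \Big\Vert_{m=1}^{M} \;
\sigma\!\left( \sum_{v_j \in N(v_i)} \alpha_{ij}^{(k,m)} \, w^{(k,m)} \, h_j^{(k)} \right)
```

Whether the original formula carries a separate weight per head, or includes a self-connection term, cannot be determined from the surviving text.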
Further preferably, the multi-head attention mechanism layer comprises multiple groups of independent, identically distributed self-attention mechanisms; the self-attention mechanism is given by formula (III):
[Formula (III): provided as an image in the original; not reproduced in this text]
In formula (III), $Q$, $K$ and $V$ are the products of the input node feature vectors with weight matrices, and $d_k$ is the feature vector dimension. The self-attention mechanism computes the association between each node feature vector and the other node feature vectors, and thus takes contextual features into account.
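With $Q$, $K$, $V$ and the dimension $d_k$ as defined above, formula (III) most likely corresponds to standard scaled dot-product attention. Assuming so, and using the name Att(·) that appears in formula (V), it would read:

```latex
\mathrm{Att}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{T}}{\sqrt{d_k}}\right) V
```

The division by $\sqrt{d_k}$ keeps the dot products from growing with the feature dimension before the softmax is applied.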
Preferably, according to the invention, in the other two routes of the graph neural network layer of the TRGA model, the forward edges and reverse edges of the abnormal-organization financial transaction network graph are treated as two different edge types. In the second route, a graph convolution layer aggregates features over the central node and the neighbor nodes connected to it by forward edges in order to update the features of the central node; in the third route, a graph convolution layer aggregates features over the central node and the neighbor nodes connected to it by reverse edges in order to update the features of the central node.
More preferably, the graph convolution layer is computed as shown in formula (IV):
[Formula (IV): provided as an image in the original; not reproduced in this text]
In formula (IV), $N_i^r$ is, in the second route, the set of nodes connected to node $v_i$ by outgoing edges and, in the third route, the set of nodes connected to node $v_i$ by incoming edges; $W_r^{(k)}$ is the weight of the $k$-th layer graph neural network; and $c_{i,r}$ is the total number of nodes connected to node $v_i$;
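The definitions above suggest a relational graph convolution of roughly the form $h_i^{(k+1)} = \sigma\big(\sum_{j \in N_i^r} \tfrac{1}{c_{i,r}} W_r^{(k)} h_j^{(k)}\big)$; this form is an assumption, since the formula image is not available. The sketch below illustrates only the direction-specific, count-normalised neighbor aggregation and deliberately omits the learned weight and the activation. The function name and sample data are hypothetical.

```python
from typing import Dict, List

Vector = List[float]


def aggregate_by_direction(
    features: Dict[int, Vector],
    neighbors: Dict[int, List[int]],
) -> Dict[int, Vector]:
    """Average neighbor features along one edge direction (the 1 / c normalisation)."""
    result: Dict[int, Vector] = {}
    for node, feat in features.items():
        nbrs = neighbors.get(node, [])
        if not nbrs:
            # No neighbor in this direction: contribute a zero vector.
            result[node] = [0.0] * len(feat)
            continue
        count = len(nbrs)
        result[node] = [
            sum(features[j][d] for j in nbrs) / count for d in range(len(feat))
        ]
    return result


# Illustrative graph with edges 0->1, 0->2 and 2->0.
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
out_neighbors = {0: [1, 2], 2: [0]}
in_neighbors = {0: [2], 1: [0], 2: [0]}

second_route = aggregate_by_direction(features, out_neighbors)  # outgoing edges
third_route = aggregate_by_direction(features, in_neighbors)    # incoming edges
```

Running the same function once per direction mirrors the description of the second and third routes as the same graph convolution applied to two edge types.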
The three resulting node features are weighted by the multi-head attention mechanism layer, so that the TRGA model focuses on the more effective node topological structure information. The multi-head attention layer is computed as shown in formulas (V) and (VI):
$\mathrm{head}_i = \mathrm{Att}(Q W_i^Q,\ K W_i^K,\ V W_i^V)$ (V)
$h = \mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h)\, W^O$ (VI)
In formulas (V) and (VI), $\mathrm{head}_i$ is the feature embedding of the $i$-th head; Att(·) performs one attention calculation; $W_i^Q$, $W_i^K$, $W_i^V$ and $W^O$ are weight coefficients of the neural network; Concat(·) denotes vector concatenation; and $h$ is the final output vector of the TRGA model.
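Formulas (V) and (VI) can be illustrated with a small framework-free sketch. It assumes self-attention (the same input matrix serves as Q, K and V) and assumes Att(·) is scaled dot-product attention; matrices are plain nested lists, and all weights and inputs below are illustrative values rather than anything taken from the patent.

```python
import math
from typing import Dict, List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row: List[float]) -> List[float]:
    peak = max(row)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(q: Matrix, k: Matrix, v: Matrix) -> Matrix:
    """Assumed form of Att: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, k_t)]
    return matmul([softmax(row) for row in scores], v)


def multi_head(x: Matrix, heads: List[Dict[str, Matrix]], w_o: Matrix) -> Matrix:
    """Formula (V) per head, then formula (VI): concatenate and project with W^O."""
    outputs = [
        attention(matmul(x, h["W_Q"]), matmul(x, h["W_K"]), matmul(x, h["W_V"]))
        for h in heads
    ]
    concat = [sum((out[r] for out in outputs), []) for r in range(len(x))]
    return matmul(concat, w_o)


identity = [[1.0, 0.0], [0.0, 1.0]]
x = [[1.0, 0.0], [0.0, 1.0]]  # two nodes with two-dimensional features
heads = [{"W_Q": identity, "W_K": identity, "W_V": identity} for _ in range(2)]
w_o = [[0.5, 0.0], [0.0, 0.5], [0.5, 0.0], [0.0, 0.5]]  # averages the two heads
out = multi_head(x, heads, w_o)
```

With identity projections each node attends more strongly to itself than to the other node, so the first output row puts more weight on its first component.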
According to the invention, the discovery of the key account of the abnormal organization is realized based on the output vector, which specifically includes: when the value of the first element of the output vector is larger than the value of the second element, the account represented by the node is considered to be a non-key account, otherwise, the account represented by the node is considered to be a key account.
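The decision rule on the length-2 output vector can be written as a short sketch. The function name is hypothetical; following the wording above, a tie between the two elements falls under "otherwise" and is treated as a key account.

```python
from typing import Sequence


def is_key_account(output: Sequence[float]) -> bool:
    """Return True when the node is classified as a key account."""
    if len(output) != 2:
        raise ValueError("expected an output vector of length 2")
    first, second = output
    # First element larger -> non-key account; otherwise -> key account.
    return not first > second


labels = [is_key_account(v) for v in ([0.2, 0.8], [0.9, 0.1], [0.5, 0.5])]
```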
An abnormal key account discovery system based on a graph neural network comprises a data preprocessing module, an abnormal-organization financial transaction network graph construction module and an abnormal-organization key account discovery module.
The data preprocessing module is used for performing, in sequence, data cleaning, key data item extraction and construction of intra-organization account transaction relationships on the historical transaction records of abnormal financial accounts. The abnormal-organization financial transaction network graph construction module is used for constructing the abnormal-organization financial transaction network graph from the intra-organization account transaction relationships built by the data preprocessing module. The abnormal-organization key account discovery module is used for discovering abnormal-organization key accounts through the trained TRGA model.
A computer device comprises a memory storing a computer program and a processor which, when executing the computer program, implements the steps of the abnormal key account discovery method based on a graph neural network.
A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the graph neural network-based abnormal key account discovery method.
The invention has the beneficial effects that:
1. Targeting the abnormal financial transaction behavior of abnormal organizations, and according to the characteristics of account transactions and fund flows, the abnormal key account discovery method based on a graph neural network applies machine learning, deep learning, complex network research and combinatorial optimization methods to analyze and mine information such as abnormal financial accounts and the key fund flows of an organization from a complex financial transaction network.
2. Based on the proposed TRGA neural network model, the abnormal key account discovery method provided by the invention learns the transaction relationship features of the account nodes in the financial transaction network inside an abnormal organization in order to classify the account nodes, thereby discovering the key accounts of the abnormal organization and reducing, to a certain extent, the investment in labor-intensive feature engineering. A good abnormal key account discovery effect can be achieved using a single type of abnormal transaction flow data and few features. The method can provide auxiliary analysis and judgment information for investigators, improving working efficiency and saving time. As more labeled abnormal data are discovered, the classification model can be further improved, and the accuracy of detection and identification tends to increase.
3. The scope of application of the invention includes feature mining of abnormal key accounts based on abnormal transaction flow data and key account discovery based on abnormal transaction flow data, and the invention has broad application prospects.
Drawings
FIG. 1 is a schematic flow chart of an abnormal key account discovery method based on a graph neural network according to the present invention;
FIG. 2 is a partial schematic diagram of a network diagram of a financial transaction for an abnormal organization according to the present invention;
FIG. 3 is a schematic diagram of a network structure of the TRGA model of the present invention;
FIG. 4 is a diagram illustrating a multi-head graph attention layer according to the present invention, which discovers multiple relevant features of a central node and all neighboring nodes thereof through multiple independent attention mechanisms and assigns different attention weights to the neighboring nodes of the central node;
FIG. 5 is a schematic flowchart of the abnormal key account discovery method based on a graph neural network according to Example 1;
FIG. 6 is a detailed network structure diagram of the TRGA model of the present invention.
Detailed Description
The invention is further described below with reference to the figures and examples, but is not limited thereto.
Example 1
An abnormal key account discovery method based on a graph neural network, as shown in FIG. 1 and FIG. 5, comprises the following steps:
(1) Data preprocessing: the historical transaction records of abnormal financial accounts are subjected in sequence to operations such as data cleaning, key data item extraction and construction of intra-organization account transaction relationships.
Data cleaning means: removing all transaction data involving normal accounts and keeping only the historical transaction records in which both parties to the transaction are abnormal financial accounts.
The invention focuses mainly on the financial transaction network inside an abnormal organization and aims to discover the key accounts of the abnormal organization from the transaction behavior among its internal financial accounts. The key accounts include the high-level accounts of the organization (organizers and leaders), the subscription accounts responsible for absorbing funds, and the rebate accounts responsible for releasing funds. Once a financial account participates in the subscription-rebate behavior of an abnormal organization, it becomes an abnormal financial account. The invention therefore assumes that abnormal financial transactions flow only within the abnormal organization.
Key data item extraction means: extracting the transaction account, counterparty account and in/out flag data items from the historical transaction records of the abnormal financial accounts.
Specifically, key data items are extracted as follows. First, data denoising is performed to remove transaction records in which individual data items are missing. Next, the historical transaction records are structurally split by manually defined rules, and data are extracted by fields such as transaction account and counterparty account. Finally, an in/out flag item is added according to the direction of fund flow between the transaction account and the counterparty account in the historical transaction record.
The inbound and outbound transaction information of an abnormal account is critical to key account discovery. For a single account, the transaction characteristics formed by its incoming and outgoing transactions are reflected in its transaction relationships with other accounts. The key to building the abnormal-organization financial transaction network is therefore whether a transaction exists between two accounts and in which direction the transaction amount flows. For this reason, the invention extracts the transaction account, counterparty account and in/out flag data items from the data.
Construction of intra-organization account transaction relationships means:
To facilitate the subsequent construction of the transaction network graph, the invention restructures the data according to the in/out flag of each transaction record. Two data items are created for the new data, i.e. the intra-organization account transaction relationships: a source account and a target account. Source and target are defined here with respect to the direction of fund flow in the transaction. The source account is the account from which a certain amount is transferred out in the current transaction, and the target account is the account that receives the amount transferred from the source account. For each historical transaction record of an abnormal financial account, if the in/out flag is 'out', the source account is the transaction account and the target account is the counterparty account; if the in/out flag is 'in', the source account is the counterparty account and the target account is the transaction account.
If a record does not yet exist in the new data, the record containing the source account and the target account is added to the new data. At the same time, each abnormal financial account is re-encoded, being mapped to a code in the interval from 0 to the number of abnormal accounts.
Part of the finally constructed intra-organization account transaction relationship data is shown in Table 1:
TABLE 1
[Table 1: provided as an image in the original; not reproduced in this text]
(2) Construction of the abnormal-organization financial transaction network graph:
The abnormal-organization financial transaction network graph is constructed from the intra-organization account transaction relationships built in step (1). In this graph, a node represents the code of an abnormal financial account, a directed edge connecting two nodes indicates that a transfer transaction has taken place between the two abnormal financial accounts, and the direction of the arrow indicates the direction of the fund flow.
Specifically: a random number seed is set according to the account node code, rectangular coordinates of the node are generated from the random number seed, and normalization is finally applied to obtain the graph coordinates of the account node. Graph coordinates are generated for every node to obtain the abnormal-organization financial transaction network graph.
(3) Discovery of abnormal-organization key accounts: key accounts of the abnormal organization are discovered by the trained TRGA (Three-Route Graph Attention Network) model.
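The coordinate generation described in step (2) of this example can be sketched as follows. The text does not say which normalization is used, so min-max scaling to [0, 1] is assumed here; the function name and the `scale` parameter are likewise hypothetical.

```python
import random
from typing import Dict, Iterable, Tuple

Point = Tuple[float, float]


def node_coordinates(codes: Iterable[int], scale: float = 100.0) -> Dict[int, Point]:
    """Generate reproducible, normalised layout coordinates for account nodes."""
    raw: Dict[int, Point] = {}
    for code in codes:
        rng = random.Random(code)  # random seed set from the account node code
        raw[code] = (rng.uniform(0.0, scale), rng.uniform(0.0, scale))
    if not raw:
        return {}
    xs = [p[0] for p in raw.values()]
    ys = [p[1] for p in raw.values()]
    # Assumed normalisation: min-max scaling; guard against a zero range.
    x_span = (max(xs) - min(xs)) or 1.0
    y_span = (max(ys) - min(ys)) or 1.0
    return {
        code: ((x - min(xs)) / x_span, (y - min(ys)) / y_span)
        for code, (x, y) in raw.items()
    }


coords = node_coordinates(range(5))
```

Seeding from the node code means the same account always lands at the same position before normalization, which keeps repeated drawings of the graph comparable.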
Example 2
The abnormal key account discovery method based on a graph neural network according to Example 1, further characterized as follows:
The method is suitable for a computing platform with a CPU of performance not lower than an Intel i5, more than 4 GB of memory, and a Linux operating system configured with the TensorFlow and Keras frameworks. Beyond this configuration, a computing platform with strong GPU computing power is a preferable choice for running the method.
In step (1), a threshold method is used for data cleaning, specifically: if the absolute difference between the number of fund inflows and the number of fund outflows for the current transaction record is smaller than a given threshold, the account is determined to be a normal account and is removed; otherwise, the current transaction record is kept.
The invention uses a threshold method to clean the data. The numbers of fund inflows and outflows differ greatly between key accounts and normal accounts inside an abnormal organization, and the number of fund outflows of a key account is markedly higher than that of a normal account. Therefore, if the number of fund outflows for the current transaction record is smaller than a given threshold, the account is regarded as a normal account and is removed from the data set. Statistical analysis of the data shows that the number of fund inflows of a key account is markedly smaller than its number of outflows; therefore, if the absolute difference between the number of fund inflows and outflows for the current transaction record is smaller than a given threshold, the account is determined to be a normal account and is removed from the data set.
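The absolute-difference rule can be sketched as below. The sketch assumes the inflow and outflow counts are tallied per account over a list of `(source_account, target_account)` transfers, which is one reading of "the current transaction record"; the function names, the threshold value and the sample data are illustrative only.

```python
from collections import Counter
from typing import List, Set, Tuple

Transfer = Tuple[str, str]  # (source_account, target_account), one entry per transfer


def normal_accounts(transfers: List[Transfer], threshold: int) -> Set[str]:
    """Accounts whose |inflow count - outflow count| is below the threshold."""
    inflow = Counter(target for _, target in transfers)
    outflow = Counter(source for source, _ in transfers)
    accounts = set(inflow) | set(outflow)
    return {a for a in accounts if abs(inflow[a] - outflow[a]) < threshold}


def clean(transfers: List[Transfer], threshold: int) -> List[Transfer]:
    """Keep only transfers in which neither party was judged a normal account."""
    normal = normal_accounts(transfers, threshold)
    return [(s, t) for s, t in transfers if s not in normal and t not in normal]


transfers = [("H", "a"), ("H", "b"), ("H", "c"), ("S", "H"), ("a", "b")]
normal = normal_accounts(transfers, threshold=2)
kept = clean(transfers, threshold=2)
```

Here account `H` has three outflows and one inflow, so its difference of 2 is not below the threshold and it is retained.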
The concrete implementation steps of the step (3) comprise:
3.1 constructing and training a TRGA model;
for account nodes in the financial transaction network of abnormal organization, the invention abstracts the key account discovery problem of the abnormal organization into the classification problem of graph nodes;
The TRGA model is a deep learning model, so its neural network is built on the PyTorch deep learning framework. Neural network hyper-parameters such as the learning rate, the training step size and the number of training iterations are passed into the network, the gradient descent strategy of the optimizer is set to stochastic gradient descent, the loss function is set to cross entropy, and training of the TRGA model is started.
3.2 extracting the topological characteristics of the account transaction, and finding the key account of abnormal organization.
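The training configuration in step 3.1 (stochastic gradient descent with a cross-entropy loss) can be illustrated with a small stand-in. The sketch below is not the TRGA model and does not use PyTorch: it trains a two-class linear softmax classifier in plain Python so that the update rule is visible. The function names, the learning rate, the epoch count and the toy samples are all assumptions made for illustration.

```python
import math
import random


def softmax(z):
    m = max(z)  # subtract the maximum for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def cross_entropy(probs, label):
    return -math.log(probs[label])


def train_sgd(samples, lr=0.1, epochs=50, seed=0):
    """samples: list of (features, label) pairs with label in {0, 1}."""
    rng = random.Random(seed)
    dim = len(samples[0][0])
    W = [[0.0] * dim for _ in range(2)]  # one weight row per class
    losses = []
    for _ in range(epochs):
        order = samples[:]
        rng.shuffle(order)  # stochastic: visit samples in random order
        total = 0.0
        for x, y in order:
            logits = [sum(w * xi for w, xi in zip(row, x)) for row in W]
            p = softmax(logits)
            total += cross_entropy(p, y)
            for c in range(2):
                # Gradient of cross entropy w.r.t. logit c is p_c - 1[c == y].
                g = p[c] - (1.0 if c == y else 0.0)
                for d in range(dim):
                    W[c][d] -= lr * g * x[d]
        losses.append(total / len(order))
    return W, losses


samples = [([1.0, 0.0], 0), ([0.9, 0.1], 0), ([0.0, 1.0], 1), ([0.2, 0.8], 1)]
W, losses = train_sgd(samples)
```

In a PyTorch implementation the same roles would be played by an SGD optimizer and a cross-entropy loss module; the loop above only makes the per-sample update explicit.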
The TRGA comprises an input layer, a three-way graph neural network layer, a multi-head attention mechanism layer, a linear layer and a Softmax layer which are sequentially connected;
The three-way graph neural network layer is divided into three routes of graph neural networks. The first-route graph neural network layer treats the Graph as an undirected graph and iteratively weights the adjacent nodes with respect to a central node, after which each graph neural network layer processes the feature vectors of the nodes; in the second-route graph neural network layer, a graph convolution layer aggregates the features of the central node and the adjacent nodes connected to it through the forward edge type to update the features of the central node; in the third-route graph neural network layer, a graph convolution layer aggregates the features of the central node and the adjacent nodes connected to it through the reverse edge type to update the features of the central node. The multi-head attention mechanism layer is composed of 8 independent, identically distributed self-attention layers, and the output feature embeddings are converted into the final output vector h by concatenation.
Figs. 3 and 6 are neural network structure diagrams of the TRGA model. The directed financial transaction network graph and the one-hot features of the nodes are provided independently to each route of the three-way graph neural network layer; that is, the inputs to the first network layer of each of the three routes are identical. In each route, the input of every other network layer is the output of the previous network layer in that route. Finally, each route obtains a node feature matrix of the same dimension.
In terms of the overall architecture, the input to the input layer of the TRGA model is the abnormal organization financial transaction network graph Graph and the one-hot features X of the account nodes;
the one-hot feature X of an account node is obtained as follows: one-hot encoding uses an N-bit status register to encode N states, each state having its own independent register bit, with only one bit active at any one time. For example, the one-hot features of nodes A, B and C are [1,0,0], [0,1,0] and [0,0,1], respectively.
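As a small illustration of this encoding, the sketch below builds the one-hot feature matrix for N nodes indexed 0 to N-1. The function name `one_hot_features` is hypothetical.

```python
def one_hot_features(num_nodes):
    """Row i is the one-hot feature of node i: a single 1 at position i."""
    return [
        [1 if col == row else 0 for col in range(num_nodes)]
        for row in range(num_nodes)
    ]


# Nodes A, B and C correspond to indices 0, 1 and 2.
X = one_hot_features(3)
```

For three nodes this yields the rows [1,0,0], [0,1,0] and [0,0,1] given in the example above.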
The core part of the TRGA model is divided into three routes. The three-way graph neural network layer performs feature aggregation on the nodes of the abnormal organization financial transaction network graph from different perspectives to update the node features; the node features obtained from the different routes are then concatenated, and the three resulting node features are information-weighted by the multi-head attention mechanism layer so that the TRGA model focuses on the more effective node topological structure information. The TRGA integrates the output of the multi-head attention mechanism layer; that is, it integrates the feature vectors of the nodes through the linear layer and reduces the data dimensionality. The TRGA finally outputs a vector of length 2, and abnormal organization key account discovery is achieved based on this output vector.
Each route of the graph neural network layer of the TRGA model independently aggregates the adjacent node information of the nodes.
In the TRGA model, the abnormal organization financial transaction network graph Graph and the one-hot features X of the account nodes are input separately into each route of the three-way graph neural network layer; that is, the inputs to the first network layer of each of the three routes are identical. Within each route, the input of every other network layer is the output of the previous network layer in that route; each route finally obtains a node feature matrix of the same dimension;
the first-route graph neural network layer of the TRGA model extracts the account node features in the abnormal organization financial transaction network graph through a multi-head attention mechanism layer. Specifically, the multi-head graph attention layer uses multiple groups of independent attention mechanisms to discover multiple kinds of correlation features between a central node and all of its adjacent nodes, assigning different attention weights to the adjacent nodes of the central node and thereby learning multiple kinds of correlation features between the central node and its adjacent nodes.
Assume the central node is $v_i$. The complete attention weight between the central node $v_i$ and its adjacent node $v_j$ is calculated as shown in formula (I):
$$\alpha_{ij}^{(k)} = \frac{\exp\left(\mathrm{LeakyReLU}\left(a^{T}\left[W h_i^{(k)} \,\Vert\, W h_j^{(k)}\right]\right)\right)}{\sum_{v_m \in N(v_i)} \exp\left(\mathrm{LeakyReLU}\left(a^{T}\left[W h_i^{(k)} \,\Vert\, W h_m^{(k)}\right]\right)\right)} \quad (\mathrm{I})$$
In formula (I), $\alpha_{ij}^{(k)}$ is the attention weight coefficient between the k-th layer central node $v_i$ and its adjacent node $v_j$; $h_i^{(k)}$ is the feature vector of node $v_i$ at the k-th layer; $h_j^{(k)}$ is the feature vector of node $v_j$ at the k-th layer; $W$ is a weight matrix; $a^{T}$ is the weight parameter; the activation function is LeakyReLU(·); $N(v_i)$ denotes the set of adjacent nodes of $v_i$; and $v_j$ denotes an adjacent node of $v_i$;
the feature vector of node $v_i$ at the (k+1)-th layer is given by formula (II):
Figure RE-GDA0003206134920000101
In formula (II), $w^{(k)}$ is the weight parameter of the k-th layer node feature transformation,
Figure RE-GDA0003206134920000102
$\sigma(\cdot)$ is the sigmoid activation function and $\Vert$ denotes the concatenation operation; the final feature embedding of the first-route graph neural network layer is obtained by aggregating the feature vectors of the nodes of the first-route graph neural network layer.
In Fig. 2, the central node 548 has 4 neighbor nodes in total: 466, 497, 457 and 454. The multi-head graph attention layer in Fig. 4 uses 8 independent attention mechanisms to learn the correlation features between node 548 and its 4 neighbor nodes, which are shown by the 8 dotted lines in the figure. In the calculation of each attention mechanism, the invention first calculates the degree of correlation between the central node and each of its neighbor nodes and then applies the LeakyReLU function for activation. To distribute the weights better, the degrees of correlation calculated between the central node and all of its neighbors are normalized together using softmax.
The lower half of fig. 4 shows an illustration of the calculation process of the attention weight of node 548 and its neighbor node 454.
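The attention weight calculation for a single attention mechanism (correlation score, LeakyReLU activation, softmax over the neighbors) can be sketched in plain Python as follows. The node numbers follow Fig. 2, but the feature vectors, the weight matrix `W`, the parameter vector `a` and the LeakyReLU slope of 0.2 are made-up values chosen only for illustration.

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def matvec(W, h):
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def attention_weights(h_center, h_neighbors, W, a):
    """h_neighbors: {node_id: feature vector}. Returns {node_id: weight}."""
    wh_i = matvec(W, h_center)
    scores = {}
    for j, h_j in h_neighbors.items():
        concat = wh_i + matvec(W, h_j)  # concatenation [W h_i || W h_j]
        scores[j] = leaky_relu(sum(ak * ck for ak, ck in zip(a, concat)))
    # Softmax over all neighbors of the central node.
    m = max(scores.values())
    exps = {j: math.exp(s - m) for j, s in scores.items()}
    z = sum(exps.values())
    return {j: e / z for j, e in exps.items()}


W = [[1.0, 0.0], [0.0, 1.0]]   # illustrative weight matrix
a = [0.5, -0.5, 1.0, 0.25]     # illustrative attention parameters
h_548 = [1.0, 2.0]             # illustrative feature of central node 548
neighbors = {
    466: [0.0, 1.0],
    497: [1.0, 0.0],
    457: [2.0, 2.0],
    454: [-1.0, 0.5],
}
alpha = attention_weights(h_548, neighbors, W, a)
```

The returned weights sum to 1 over the four neighbors. A multi-head layer would repeat this computation with 8 independent sets of parameters.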
The multi-head attention mechanism layer comprises multiple groups of independent, identically distributed self-attention mechanisms; the self-attention mechanism is given by formula (III):
$$\mathrm{Att}(Q, K, V) = \mathrm{softmax}\left(\frac{Q K^{T}}{\sqrt{d_k}}\right) V \quad (\mathrm{III})$$
In formula (III), Q, K and V are the matrices obtained as dot products of the input node feature vectors and the weights, and $d_k$ is the feature vector dimension. The self-attention mechanism calculates the association between each node feature vector and the other node feature vectors and thereby takes context features into account well, which enhances the learning ability of the system.
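Assuming formula (III) is the usual scaled dot-product form, softmax(QK^T / sqrt(d_k)) V, a single self-attention mechanism can be sketched in plain Python as below. The matrices are small made-up examples, and the function names are hypothetical.

```python
import math


def softmax(row):
    m = max(row)  # subtract the maximum for numerical stability
    e = [math.exp(v - m) for v in row]
    s = sum(e)
    return [v / s for v in e]


def self_attention(Q, K, V):
    """Q, K, V are lists of row vectors; returns one output row per query."""
    d_k = len(K[0])
    out = []
    for q in Q:
        # Scaled dot product of this query with every key.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K
        ]
        weights = softmax(scores)
        # Weighted sum of the value rows.
        out.append(
            [sum(w * v[c] for w, v in zip(weights, V)) for c in range(len(V[0]))]
        )
    return out


Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
out = self_attention(Q, K, V)
```

Each output row is a convex combination of the rows of V, weighted toward the value whose key best matches the query.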
In the other two routes of the graph neural network layer of the TRGA model, the forward edges and the reverse edges of the abnormal organization financial transaction network graph are regarded as two different edge types. In the second-route graph neural network layer, a graph convolution layer aggregates the features of the central node and the neighbor nodes connected to it through the forward edge type to update the features of the central node; in the third-route graph neural network layer, a graph convolution layer aggregates the features of the central node and the neighbor nodes connected to it through the reverse edge type to update the features of the central node.
The calculation method of the graph convolution layer is shown as the formula (IV):
Figure RE-GDA0003206134920000104
In formula (IV),
Figure RE-GDA0003206134920000105
denotes, in the second-route graph neural network layer, the set of nodes connected to node $v_i$ by an outgoing edge and, in the third-route graph neural network layer, the set of nodes connected to node $v_i$ by an incoming edge;
Figure RE-GDA0003206134920000106
is the weight of the k-th layer graph neural network, and $c_{i,r}$ is the total number of nodes connected to node $v_i$.
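The direction-specific aggregation of the second and third routes can be sketched as follows. This is a simplified illustration under stated assumptions, not the exact formula (IV): it averages the features of the neighbors reached through one edge direction (dividing by their count, in the role of $c_{i,r}$), applies a weight matrix `W`, omits any activation function and self-connection, and returns a zero vector for nodes with no neighbor in that direction.

```python
def directed_graph_conv(features, edges, W, direction):
    """features: {node: vector}; edges: list of (source, target) pairs.

    direction="out" aggregates over nodes reached by outgoing edges,
    direction="in" aggregates over nodes reached by incoming edges.
    """
    neighbors = {v: [] for v in features}
    for src, dst in edges:
        if direction == "out":
            neighbors[src].append(dst)
        else:
            neighbors[dst].append(src)

    updated = {}
    for v, nbrs in neighbors.items():
        if not nbrs:
            updated[v] = [0.0] * len(W)  # assumption: no neighbors, zero vector
            continue
        c = len(nbrs)  # normalization constant for this node and direction
        dim = len(features[v])
        mean = [sum(features[u][d] for u in nbrs) / c for d in range(dim)]
        updated[v] = [sum(w * m for w, m in zip(row, mean)) for row in W]
    return updated


# Illustrative graph: 0 -> 1, 0 -> 2, 1 -> 2, with one-hot node features.
features = {0: [1.0, 0.0, 0.0], 1: [0.0, 1.0, 0.0], 2: [0.0, 0.0, 1.0]}
edges = [(0, 1), (0, 2), (1, 2)]
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
out_route = directed_graph_conv(features, edges, W, "out")
in_route = directed_graph_conv(features, edges, W, "in")
```

Running the same function with the two directions gives the two different views of each account: where its funds go and where they come from.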
In each route of the graph neural network layer, every additional stacked network layer aggregates information from neighbor nodes one order higher. After an account node in the abnormal organization financial transaction network has passed through feature aggregation in the three-way graph neural network layer of the TRGA, three feature vectors Q, K and V of the node are obtained. The three resulting node features are information-weighted by the multi-head attention mechanism layer so that the TRGA model focuses on the more effective node topological structure information; the calculation of the multi-head attention layer is given by formulas (V) and (VI):
$$\mathrm{head}_i = \mathrm{Att}\left(Q W_i^{Q},\, K W_i^{K},\, V W_i^{V}\right) \quad (\mathrm{V})$$
$$h = \mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h)\, W^{O} \quad (\mathrm{VI})$$
In formulas (V) and (VI), $\mathrm{head}_i$ is the feature embedding of the i-th head; Att(·) denotes one attention calculation; $W_i^{Q}$, $W_i^{K}$, $W_i^{V}$ and $W^{O}$ are weight coefficients of the neural network; Concat(·) denotes one vector concatenation; and h is the final output vector of the TRGA model.
Abnormal organization key account discovery is achieved based on the output vector as follows: when the value of the first element of the output vector is greater than the value of the second element, the account represented by the node is considered a non-key account; otherwise, the account represented by the node is considered a key account.
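This decision rule can be written directly in code. In the sketch below the function names and the example output vectors are hypothetical; only the comparison itself comes from the text.

```python
def is_key_account(output_vector):
    """Non-key if the first element is strictly greater, key otherwise."""
    first, second = output_vector
    return not (first > second)


def discover_key_accounts(outputs):
    """outputs: {node_id: length-2 output vector}. Returns the key node ids."""
    return [node for node, vec in outputs.items() if is_key_account(vec)]


# Illustrative model outputs for three account nodes.
outputs = {548: [0.2, 0.8], 466: [0.9, 0.1], 497: [0.5, 0.5]}
keys = discover_key_accounts(outputs)
```

Note that, following the "otherwise" branch of the rule, a tie between the two elements is classified as a key account.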
Example 3
An abnormal key account discovery system based on a graph neural network is used for realizing the abnormal key account discovery method based on the graph neural network in the embodiment 1 or 2, and comprises a data preprocessing module, an abnormal organization financial transaction network graph construction module and an abnormal organization key account discovery module;
After preprocessing such as data cleaning, the historical transaction records of the abnormal financial accounts yield the transaction relationships of the accounts within the organization. The data preprocessing module is used for sequentially performing data cleaning, key data item extraction and intra-organization account transaction relationship construction on the historical transaction records of the abnormal financial accounts; the abnormal organization financial transaction network graph construction module is used for constructing the abnormal organization financial transaction network graph according to the intra-organization account transaction relationships constructed by the data preprocessing module; the abnormal organization key account discovery module is used for realizing abnormal organization key account discovery through a trained TRGA (Three-Route Graph Attention Network) model.
Example 4
A computer device comprising a memory storing a computer program and a processor implementing the steps of the graph neural network based abnormal key account discovery method of embodiments 1 or 2 when the processor executes the computer program.
Example 5
A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the graph neural network-based abnormal key account discovery method of embodiment 1 or 2.

Claims (10)

1. An abnormal key account discovery method based on a graph neural network is characterized by comprising the following steps:
(1) data preprocessing: sequentially performing data cleaning, key data item extraction and intra-organization account transaction relationship construction on the historical transaction records of the abnormal financial accounts;
data cleaning means: cleaning out all transaction data involving normal accounts, and keeping only the historical transaction records in which both transaction parties are abnormal financial accounts;
key data item extraction means: extracting the transaction account, counterparty account and in/out flag information items from the historical transaction records of the abnormal financial accounts;
intra-organization account transaction relationship construction means:
two data items are created for the new data, i.e., the intra-organization account transaction relationships: a source account and a target account; the source account is the account from which a certain amount is transferred out in the current transaction, and the target account is the account that receives the amount transferred from the source account; for each historical transaction record of an abnormal financial account, if the in/out flag information item is 'out', the source account is the transaction account and the target account is the counterparty account; if the in/out flag information item is 'in', the source account is the counterparty account and the target account is the transaction account;
when this entry does not yet exist in the new data, the entry containing the source account and the target account is added to the new data; in addition, each abnormal financial account is re-encoded and mapped to a code in the interval from 0 to the number of abnormal accounts;
(2) abnormal organization financial transaction network graph construction:
constructing the abnormal organization financial transaction network graph according to the intra-organization account transaction relationships constructed in step (1); in the abnormal organization financial transaction network graph, a node represents the code of an abnormal financial account, a directed edge connecting two nodes represents that a transfer transaction has occurred between the two abnormal financial accounts, and the direction of the arrow represents the flow direction of the funds;
(3) abnormal organization key account discovery: realizing abnormal organization key account discovery through the trained TRGA model.
2. The abnormal key account discovery method based on the graph neural network as claimed in claim 1, wherein the step (3) is implemented by:
3.1 constructing and training a TRGA model;
3.2 extracting the topological characteristics of the account transaction, and finding the key account of abnormal organization.
3. The abnormal key account discovery method based on the graph neural network is characterized in that the TRGA model comprises an input layer, a three-way graph neural network layer, a multi-head attention mechanism layer, a linear layer and a Softmax layer which are sequentially connected;
the input to the input layer of the TRGA model is the abnormal organization financial transaction network graph Graph and the one-hot features X of the account nodes;
the three-way graph neural network layer performs feature aggregation on the nodes of the abnormal organization financial transaction network graph from different perspectives to update the node features; the node features obtained from the different routes are then concatenated, and the three resulting node features are information-weighted by the multi-head attention mechanism layer so that the TRGA (Three-Route Graph Attention Network) model focuses on the more effective node topological structure information; the TRGA integrates the output of the multi-head attention mechanism layer, that is, the TRGA integrates the feature vectors of the nodes through the linear layer and reduces the data dimensionality; the TRGA finally outputs a vector of length 2, and abnormal organization key account discovery is achieved based on the output vector.
4. The method of claim 3, wherein each route of the graph neural network layer of the TRGA model independently aggregates the adjacent node information of the nodes;
in the TRGA model, the abnormal organization financial transaction network graph Graph and the one-hot features X of the account nodes are input separately into each route of the three-way graph neural network layer; within each route, the input of every other network layer is the output of the previous network layer in that route; and each route finally obtains a node feature matrix of the same dimension.
5. The method for discovering abnormal key accounts based on the graph neural network as claimed in claim 3, wherein the first-route graph neural network layer of the TRGA model extracts the account node features in the abnormal organization financial transaction network graph through a multi-head attention mechanism layer, specifically: the multi-head graph attention layer uses multiple groups of independent attention mechanisms to discover multiple kinds of correlation features between a central node and all of its adjacent nodes, assigning different attention weights to the adjacent nodes of the central node and thereby learning multiple kinds of correlation features between the central node and its adjacent nodes;
further preferably, assume that the central node is $v_i$; the complete attention weight between the central node $v_i$ and its adjacent node $v_j$ is calculated as shown in formula (I):
$$\alpha_{ij}^{(k)} = \frac{\exp\left(\mathrm{LeakyReLU}\left(a^{T}\left[W h_i^{(k)} \,\Vert\, W h_j^{(k)}\right]\right)\right)}{\sum_{v_m \in N(v_i)} \exp\left(\mathrm{LeakyReLU}\left(a^{T}\left[W h_i^{(k)} \,\Vert\, W h_m^{(k)}\right]\right)\right)} \quad (\mathrm{I})$$
in formula (I), $\alpha_{ij}^{(k)}$ is the attention weight coefficient between the k-th layer central node $v_i$ and its adjacent node $v_j$; $h_i^{(k)}$ is the feature vector of node $v_i$ at the k-th layer; $h_j^{(k)}$ is the feature vector of node $v_j$ at the k-th layer; $W$ is a weight matrix; $a^{T}$ is the weight parameter; the activation function is LeakyReLU(·); $N(v_i)$ denotes the set of adjacent nodes of $v_i$; and $v_j$ denotes an adjacent node of $v_i$;
the feature vector of node $v_i$ at the (k+1)-th layer is given by formula (II):
Figure RE-FDA0003206134910000025
in formula (II), $w^{(k)}$ is the weight parameter of the k-th layer node feature transformation,
Figure RE-FDA0003206134910000026
$\sigma(\cdot)$ is the sigmoid activation function and $\Vert$ denotes the concatenation operation; the final feature embedding of the first-route graph neural network layer is obtained by aggregating the feature vectors of the nodes of the first-route graph neural network layer;
further preferably, the multi-head attention mechanism layer comprises multiple groups of independent, identically distributed self-attention mechanisms, and the self-attention mechanism is given by formula (III):
$$\mathrm{Att}(Q, K, V) = \mathrm{softmax}\left(\frac{Q K^{T}}{\sqrt{d_k}}\right) V \quad (\mathrm{III})$$
in formula (III), Q, K and V are the matrices obtained as dot products of the input node feature vectors and the weights, and $d_k$ is the feature vector dimension; the self-attention mechanism calculates the association between each node feature vector and the other node feature vectors and thereby takes context features into account well.
6. The method for discovering the abnormal key account based on the graph neural network as claimed in claim 3, wherein in the other two routes of the graph neural network layer of the TRGA model, the forward edges and the reverse edges of the abnormal organization financial transaction network graph are regarded as two different edge types, wherein in the second-route graph neural network layer, a graph convolution layer aggregates the features of the central node and the neighbor nodes connected to it through the forward edge type to update the features of the central node; in the third-route graph neural network layer, a graph convolution layer aggregates the features of the central node and the neighbor nodes connected to it through the reverse edge type to update the features of the central node;
more preferably, the calculation method of the graph convolution layer is given by formula (IV):
Figure RE-FDA0003206134910000032
in formula (IV),
Figure RE-FDA0003206134910000033
denotes, in the second-route graph neural network layer, the set of nodes connected to node $v_i$ by an outgoing edge and, in the third-route graph neural network layer, the set of nodes connected to node $v_i$ by an incoming edge;
Figure RE-FDA0003206134910000034
is the weight of the k-th layer graph neural network, and $c_{i,r}$ is the total number of nodes connected to node $v_i$;
the three resulting node features are information-weighted by the multi-head attention mechanism layer so that the TRGA model focuses on the more effective node topological structure information; the calculation of the multi-head attention layer is given by formulas (V) and (VI):
$$\mathrm{head}_i = \mathrm{Att}\left(Q W_i^{Q},\, K W_i^{K},\, V W_i^{V}\right) \quad (\mathrm{V})$$
$$h = \mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h)\, W^{O} \quad (\mathrm{VI})$$
in formulas (V) and (VI), $\mathrm{head}_i$ is the feature embedding of the i-th head; Att(·) denotes one attention calculation; $W_i^{Q}$, $W_i^{K}$, $W_i^{V}$ and $W^{O}$ are weight coefficients of the neural network; Concat(·) denotes one vector concatenation; and h is the final output vector of the TRGA model.
7. The method for discovering the abnormal key account based on the neural network of the graph according to any one of claims 1 to 6, wherein the abnormal organization key account discovery is realized based on the output vector, specifically: when the value of the first element of the output vector is larger than the value of the second element, the account represented by the node is considered to be a non-key account, otherwise, the account represented by the node is considered to be a key account.
8. An abnormal key account discovery system based on a graph neural network, which is used for realizing the abnormal key account discovery method based on the graph neural network as claimed in any one of claims 1 to 7, and is characterized by comprising a data preprocessing module, an abnormal organization financial transaction network graph construction module and an abnormal organization key account discovery module;
the data preprocessing module is used for: sequentially performing data cleaning, key data item extraction and intra-organization account transaction relationship construction on the historical transaction records of the abnormal financial accounts; the abnormal organization financial transaction network graph construction module is used for: constructing the abnormal organization financial transaction network graph according to the intra-organization account transaction relationships constructed by the data preprocessing module; the abnormal organization key account discovery module is used for: realizing abnormal organization key account discovery through the trained TRGA model.
9. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor when executing the computer program implements the steps of the graph neural network-based abnormal key account discovery method of any one of claims 1-7.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the graph neural network-based abnormal key account discovery method of any one of claims 1 to 7.
CN202110805932.3A 2021-07-16 2021-07-16 Abnormal key account discovery method, system, equipment and storage medium based on graph neural network Active CN113469804B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110805932.3A CN113469804B (en) 2021-07-16 2021-07-16 Abnormal key account discovery method, system, equipment and storage medium based on graph neural network

Publications (2)

Publication Number Publication Date
CN113469804A true CN113469804A (en) 2021-10-01
CN113469804B CN113469804B (en) 2024-03-12

Family

ID=77880890

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CB03 Change of inventor or designer information
Inventor after: Wang Bailing, Wei Xueguang, Huang Junheng, Wei Yuliang, Liu Hongri, Wang Wei
Inventor before: Wei Xueguang, Huang Junheng, Wei Yuliang, Wang Bailing, Liu Hongri, Wang Wei