CN114862587A - Abnormal transaction account identification method and device and computer readable storage medium - Google Patents

Publication number
CN114862587A
Authority
CN
China
Prior art keywords
node
training
feature vector
transaction
graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210589196.7A
Other languages
Chinese (zh)
Inventor
唐旻俊
池纪锋
吴垠
高玉森
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Application filed by Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202210589196.7A
Publication of CN114862587A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Abstract

The application discloses a method and a device for identifying an abnormal transaction account and a computer-readable storage medium, and relates to the field of financial technology or other related fields. The method comprises the following steps: acquiring a fund transaction graph; generating a node feature set based on the fund transaction graph; and inputting each node feature vector into a pre-trained graph attention network model to obtain a prediction result corresponding to each node, wherein the prediction result indicates whether the account corresponding to each node is an account with abnormal fund transactions, and the graph attention network model is trained on a graph different from the fund transaction graph. The method and the device solve the technical problem that existing graph neural network models predict new abnormal transaction accounts inefficiently.

Description

Abnormal transaction account identification method and device and computer readable storage medium
Technical Field
The application relates to the field of financial technology or other related fields, in particular to a method and a device for identifying an abnormal transaction account and a computer-readable storage medium.
Background
In the financial field, abnormal fund transaction behavior can seriously disrupt the normal operation of financial institutions and endanger social security, so combating such behavior is an important strategy for maintaining long-term national security. Abnormal fund transaction behavior can be understood as the process of disguising and concealing the source and nature of the proceeds of crime through various means and transaction methods.
With the rapid development of the internet finance industry, transaction modes and economic activities have become more complex and diverse, which makes identifying abnormal fund transaction behavior even more difficult. In the prior art, an abnormal transaction account that performs abnormal fund transaction behavior is usually found among many accounts by a graph neural network model. However, existing graph neural network models are suited to transductive learning tasks, in which all training data and test data are considered together during learning and the training stage and the testing stage work on the same graph. In other words, if a graph neural network model is trained on training graph A, it can only make predictions on training graph A and identify the abnormal nodes in that graph that represent abnormal transaction accounts. It cannot make predictions for nodes in other graphs; to predict nodes in another graph, a new graph neural network model has to be built from that graph.
Therefore, for a fund transaction graph that changes continuously and dynamically, existing graph neural network models cannot predict new nodes in real time; that is, they predict new abnormal transaction accounts inefficiently.
Disclosure of Invention
The embodiment of the application provides an identification method and device of an abnormal transaction account and a computer readable storage medium, which are used for at least solving the technical problem that the prediction efficiency of the existing graph neural network model for a new abnormal transaction account is low.
According to an aspect of the embodiments of the present application, there is provided a method for identifying an abnormal transaction account, including: acquiring a fund transaction graph, wherein the fund transaction graph consists of nodes and edges, each node represents the account data of the account corresponding to that node, and an edge between two nodes represents the direction of transaction flow between them; generating a node feature set based on the fund transaction graph, wherein the node feature set is composed of at least one node feature vector, and each node feature vector corresponds to one node in the fund transaction graph; and inputting each node feature vector into a pre-trained graph attention network model to obtain a prediction result corresponding to each node, wherein the prediction result indicates whether the account corresponding to each node is an account with abnormal fund transactions, and the graph attention network model is trained on a graph different from the fund transaction graph.
Further, the identification method of the abnormal transaction account further comprises the following steps: before inputting each node feature vector into a pre-trained graph attention network model and obtaining a prediction result corresponding to each node, obtaining historical transaction data, wherein the historical transaction data at least comprises a plurality of abnormal accounts, a plurality of normal accounts and an account label of each account, the account label of the abnormal account represents that the abnormal account has fund transaction abnormal behavior, and the account label of the normal account represents that the normal account has not fund transaction abnormal behavior; constructing a training map according to historical transaction data; generating a first feature set and a second feature set based on a training map, wherein the first feature set comprises a plurality of historical node feature vectors, the second feature set comprises a plurality of historical edge feature vectors, each historical node feature vector corresponds to one training node in the training map, and each historical edge feature vector corresponds to one training edge in the training map; and training to obtain the graph attention network model according to the first feature set and the second feature set.
Further, the identification method of the abnormal transaction account further comprises the following steps: step 1: determining a target edge and the two target nodes associated with the target edge from the training graph, wherein the target edge is any training edge in the training graph; step 2: acquiring the first node feature vector corresponding to each target node from the first feature set, and acquiring the first edge feature vector corresponding to the target edge from the second feature set; step 3: performing superposition processing on the first node feature vectors and the first edge feature vector to obtain a second edge feature vector corresponding to the target edge; step 4: performing nonlinear processing on the second edge feature vector to obtain a target edge feature vector corresponding to the target edge; step 5: repeating steps 1 to 4 to obtain a target edge feature vector corresponding to each training edge in the training graph; step 6: training to obtain the graph attention network model according to the target edge feature vector corresponding to each training edge in the training graph.
Further, the identification method of the abnormal transaction account further comprises the following steps: determining at least one first training edge connected with each training node based on the training graph; determining a first target edge feature vector corresponding to a first training edge; determining an attention coefficient corresponding to the first target edge feature vector; performing aggregation processing on at least one first target edge feature vector based on the attention coefficient to obtain a second node feature vector corresponding to each training node; and training to obtain a graph attention network model according to the second node feature vector corresponding to each training node.
Further, the identification method of the abnormal transaction account further comprises the following steps: performing superposition processing on the second node feature vector corresponding to each training node and the historical node feature vector corresponding to the training node to obtain a third node feature vector corresponding to each training node; carrying out nonlinear processing on the third node feature vector to obtain a target node feature vector corresponding to each training node; and training to obtain the graph attention network model according to the target node feature vector and the account label corresponding to the target node feature vector.
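The edge update, attention-based aggregation and node update described in the three preceding paragraphs can be illustrated with a small sketch. It is not the trained model of the application and has no learnable parameters. It assumes that node and edge feature vectors share one dimension, that "superposition" means element-wise addition, that the nonlinear processing is tanh, and that the attention coefficients are a softmax over dot products between a node vector and its incident edge vectors. All function names are hypothetical.

```python
import math


def add(u, v):
    """Element-wise sum of two equal-length vectors (assumed 'superposition')."""
    return [a + b for a, b in zip(u, v)]


def tanh_vec(v):
    """Assumed nonlinear processing: tanh applied to every component."""
    return [math.tanh(a) for a in v]


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def update_edges(node_feats, edge_feats):
    """Superpose both endpoint vectors with the edge vector, then apply tanh."""
    updated = {}
    for (src, dst), f_edge in edge_feats.items():
        second = add(add(node_feats[src], node_feats[dst]), f_edge)
        updated[(src, dst)] = tanh_vec(second)
    return updated


def aggregate_to_nodes(node_feats, updated_edges):
    """Attention-weighted sum of the updated vectors of each node's incident edges."""
    aggregated = {}
    for node, f_node in node_feats.items():
        incident = [vec for (src, dst), vec in updated_edges.items() if node in (src, dst)]
        if not incident:
            aggregated[node] = [0.0] * len(f_node)
            continue
        # Assumed attention score: dot product of node vector and edge vector.
        scores = [sum(a * b for a, b in zip(f_node, vec)) for vec in incident]
        alphas = softmax(scores)
        aggregated[node] = [
            sum(alpha * vec[k] for alpha, vec in zip(alphas, incident))
            for k in range(len(f_node))
        ]
    return aggregated


def update_nodes(node_feats, aggregated):
    """Superpose the aggregated vector with the original node vector, then apply tanh."""
    return {n: tanh_vec(add(aggregated[n], f)) for n, f in node_feats.items()}


# Toy training graph with two nodes and one directed edge (made-up values).
node_feats = {"u1": [1.0, 0.0], "u2": [0.0, 1.0]}
edge_feats = {("u1", "u2"): [0.5, 0.5]}
edges = update_edges(node_feats, edge_feats)
targets = update_nodes(node_feats, aggregate_to_nodes(node_feats, edges))
```

In a trained graph attention network the superposition and the attention scores would normally involve learned weight matrices; they are omitted here to keep the data flow visible.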
Further, the identification method of the abnormal transaction account further comprises the following steps: acquiring a target fund transaction map, wherein the target fund transaction map at least comprises a plurality of nodes to be confirmed and at least one abnormal node in the fund transaction map, an account corresponding to the abnormal node is a fund transaction abnormal account, and the plurality of nodes to be confirmed are nodes which do not appear in the fund transaction map; and constructing a social network model based on the target fund transaction graph, wherein in the social network model, the larger the transaction amount between the two nodes is, the larger the weight value of the social relationship between the two nodes is, and the tighter the social relationship between the two nodes is.
Further, the identification method of the abnormal transaction account further comprises the following steps: after a social network model is established based on a fund transaction map, determining transaction nodes which have fund transaction behaviors with abnormal nodes in a plurality of nodes to be confirmed; determining the similarity between the transaction node and the abnormal node according to the social relationship weight values of the abnormal node and the transaction node in the social network model; and under the condition that the similarity is greater than the preset similarity, determining the transaction node as a candidate node, wherein the candidate node is a node with fund transaction risk.
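As a rough illustration of the candidate selection above, the following sketch derives social relationship weights from transaction amounts and keeps the nodes whose weight toward an abnormal node exceeds a preset similarity. The application does not specify the weighting or similarity formulas, so the normalization by the largest pairwise total and the use of the weight itself as the similarity are assumptions, as are all names and the sample amounts.

```python
def social_weights(transfers):
    """Sum the amounts per unordered node pair and scale them to the range [0, 1]."""
    totals = {}
    for a, b, amount in transfers:
        pair = frozenset((a, b))
        totals[pair] = totals.get(pair, 0.0) + amount
    if not totals:
        return {}
    largest = max(totals.values())
    # Larger total amount between two nodes -> larger social relationship weight.
    return {pair: total / largest for pair, total in totals.items()}


def candidate_nodes(weights, abnormal_node, pending_nodes, min_similarity):
    """Return pending nodes whose weight toward the abnormal node exceeds the threshold."""
    candidates = []
    for node in pending_nodes:
        # Nodes with no transaction with the abnormal node get similarity 0.
        similarity = weights.get(frozenset((abnormal_node, node)), 0.0)
        if similarity > min_similarity:
            candidates.append(node)
    return candidates


# Made-up transfers: (payer, payee, amount).
transfers = [("bad", "n1", 900.0), ("bad", "n2", 100.0), ("n2", "n3", 500.0)]
weights = social_weights(transfers)
candidates = candidate_nodes(weights, "bad", ["n1", "n2", "n3"], min_similarity=0.5)
```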
Further, the identification method of the abnormal transaction account further comprises the following steps: after the transaction node is determined to be a candidate node, determining a fourth node feature vector corresponding to the abnormal node from the node feature set; generating a fifth node feature vector corresponding to the candidate node based on the target fund transaction map; determining covariance and variance between the feature vector of the fourth node and the feature vector of the fifth node; determining a correlation coefficient between the candidate node and the abnormal node according to the covariance and the variance; and when the correlation coefficient is larger than a preset threshold value, determining that the candidate node is a new abnormal node, and determining that the account corresponding to the candidate node is also a fund transaction abnormal account.
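The correlation check above can be sketched as follows, assuming that the correlation coefficient is the Pearson coefficient, i.e. the covariance of the two feature vectors divided by the square root of the product of their variances. The application only states that the coefficient is determined from the covariance and the variance, so this choice, the function names and the threshold value are assumptions.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def covariance(xs, ys):
    """Population covariance of two equal-length vectors."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def variance(xs):
    return covariance(xs, xs)


def correlation(xs, ys):
    """Pearson correlation coefficient: cov(x, y) / sqrt(var(x) * var(y))."""
    denom = math.sqrt(variance(xs) * variance(ys))
    if denom == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return covariance(xs, ys) / denom


def is_new_abnormal_node(abnormal_vec, candidate_vec, threshold):
    """A candidate is flagged when its correlation with the abnormal node exceeds the threshold."""
    return correlation(abnormal_vec, candidate_vec) > threshold
```

For example, `is_new_abnormal_node([1, 2, 3], [2, 4, 7], 0.9)` compares two made-up feature vectors against a hypothetical threshold of 0.9.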
According to another aspect of the embodiments of the present application, there is also provided an apparatus for identifying an abnormal transaction account, including: an acquisition module, configured to acquire a fund transaction graph, wherein the fund transaction graph consists of nodes and edges, each node represents the account data of the account corresponding to that node, and an edge between two nodes represents the direction of transaction flow between them; a generation module, configured to generate a node feature set based on the fund transaction graph, wherein the node feature set is composed of at least one node feature vector, and each node feature vector corresponds to one node in the fund transaction graph; and an input module, configured to input each node feature vector into a pre-trained graph attention network model to obtain a prediction result corresponding to each node, wherein the prediction result indicates whether the account corresponding to each node is an account with abnormal fund transactions, and the graph attention network model is trained on a graph different from the fund transaction graph.
According to another aspect of the embodiments of the present application, there is also provided a computer-readable storage medium, in which a computer program is stored, where the computer program is configured to execute the above-mentioned method for identifying an abnormal transaction account when running.
According to another aspect of embodiments of the present application, there is also provided an electronic device, including one or more processors and a memory for storing one or more programs, wherein when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the above-mentioned method for identifying an abnormal transaction account.
In the embodiments of the present application, the prediction result corresponding to each node is obtained by a pre-trained graph attention network model. A fund transaction graph is acquired, and a node feature set is generated based on the fund transaction graph, wherein the node feature set is composed of at least one node feature vector, each node feature vector corresponds to one node in the fund transaction graph, the fund transaction graph is composed of nodes and edges, each node represents the account data of the account corresponding to that node, and an edge between two nodes represents the direction of transaction flow between them. Finally, each node feature vector is input into the pre-trained graph attention network model to obtain the prediction result corresponding to each node, wherein the prediction result indicates whether the account corresponding to each node is an account with abnormal fund transactions, and the graph attention network model is trained on a graph different from the fund transaction graph.
According to the above, the fund transaction graph is predicted through the graph attention network model instead of the traditional graph neural network model, so that the prediction result corresponding to each node is obtained. Because the graph attention network model is obtained based on graph training different from the fund transaction graph, in other words, the graph used by the graph attention network model in the application is different from the fund transaction graph in the training stage, on the basis, the application realizes the effect of separating and processing the training data and the test data, so that after one graph attention network model is obtained through training, any fund transaction graph can be predicted through the graph attention network model, and the fund transaction graph which changes continuously and dynamically can be responded to, thereby solving the problem that the existing graph neural network model cannot predict new nodes in real time, and improving the prediction efficiency of new abnormal transaction accounts.
Therefore, by the technical scheme, the purpose of predicting the dynamically-changed fund transaction map in real time is achieved, the effect of improving the prediction timeliness of the abnormal transaction account is achieved, and the problem that the prediction efficiency of the prior art on the new abnormal transaction account is low is solved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a flow chart of an alternative method of anomalous transaction account identification in accordance with an embodiment of the present application;
FIG. 2 is a flow chart of an alternative method of identifying anomalous transaction accounts in accordance with an embodiment of the present application;
FIG. 3 is a schematic diagram of an alternative anomalous transaction account identification mechanism according to an embodiment of the present application;
FIG. 4 is a schematic diagram of an alternative electronic device according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only partial embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In addition, it should be noted that the relevant information (including but not limited to user equipment information, user personal information, etc.) and data (including but not limited to data for presentation, analyzed data, etc.) referred to in the present disclosure are information and data authorized by the user or sufficiently authorized by each party. For example, an interface is provided between the system and the relevant user or organization, before obtaining the relevant information, an obtaining request needs to be sent to the user or organization through the interface, and after receiving the consent information fed back by the user or organization, the relevant information is obtained.
Example 1
In accordance with an embodiment of the present application, there is provided an embodiment of a method for identifying anomalous transaction accounts, wherein the steps illustrated in the flowchart of the figure may be performed in a computer system, such as a set of computer-executable instructions, and wherein although a logical order is illustrated in the flowchart, in some cases the steps illustrated or described may be performed in an order different than that illustrated.
It should be noted that an abnormal account identification system may be used as an execution subject of the identification method for an abnormal transaction account in the embodiment of the present application, where the abnormal account identification system may be run on an electronic device.
Fig. 1 is a flowchart of an alternative abnormal transaction account identification method according to an embodiment of the present application, as shown in fig. 1, the method includes the following steps:
Step S101, obtaining a fund transaction graph.
In step S101, the fund transaction graph is composed of nodes and edges, the nodes represent account data of accounts corresponding to the nodes, and the edges between the two nodes represent transaction flow between the two nodes.
The fund transaction graph is constructed from test data, which may be the transaction data to be predicted. For each piece of transaction data, the abnormal account identification system may assign a transaction ID (a unique identifier) and collect the source user information (including user ID, age, occupation and gender), the source account ID, the transaction time, the transaction type (including electronic payment purchases, POS (point of sale) terminal purchases, ATM (automatic teller machine) withdrawals, ATM deposits, bank counter deposits, electronic transfers, paper transfers and the like), the beneficiary account ID and the beneficiary ID. Here, the source account is the paying account in the transaction, the source user is the holder of the source account, the beneficiary account is the receiving account in the transaction, and the beneficiary is the holder of the beneficiary account.
After obtaining the transaction data and the above information for each piece of transaction data, the abnormal account identification system constructs the fund transaction graph by taking each user as a node and each transaction flow direction as a directed edge. In this application, the fund transaction graph may be denoted G(V, E), where V is the set of nodes and E is the set of edges between nodes.
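A minimal sketch of this construction is given below. The record fields mirror the attributes listed above, but their names, the `build_fund_transaction_graph` helper and the sample records are illustrative assumptions and do not come from the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    # Hypothetical field names for the attributes collected per transaction.
    transaction_id: str
    source_user_id: str
    source_account_id: str
    beneficiary_id: str
    beneficiary_account_id: str
    transaction_time: str
    transaction_type: str


def build_fund_transaction_graph(transactions):
    """Return (V, E): V is the set of user nodes, E the set of directed edges."""
    nodes = set()
    edges = set()
    for tx in transactions:
        nodes.add(tx.source_user_id)
        nodes.add(tx.beneficiary_id)
        # The edge direction follows the flow of funds: payer -> payee.
        edges.add((tx.source_user_id, tx.beneficiary_id))
    return nodes, edges


# Made-up sample records for illustration only.
transactions = [
    Transaction("t1", "u1", "a1", "u2", "a2", "2022-01-01T10:00", "electronic transfer"),
    Transaction("t2", "u2", "a2", "u3", "a3", "2022-01-02T11:30", "paper transfer"),
    Transaction("t3", "u1", "a1", "u2", "a2", "2022-01-03T09:15", "electronic transfer"),
]
V, E = build_fund_transaction_graph(transactions)
```

Repeated transactions between the same pair of users collapse into one directed edge here; keeping one edge per transaction would be an equally plausible reading.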
Step S102, generating a node feature set based on the fund transaction graph.
In step S102, the node feature set is composed of at least one node feature vector, each node feature vector corresponding to one node in the fund transaction graph.
In an alternative embodiment, the abnormal account identification system may generate an adjacency matrix H based on the constructed fund transaction graph, where $h_{ij}$ is the element in the i-th row and j-th column of the adjacency matrix: if user i has initiated a transaction to user j, then $h_{ij} = 1$; if user j has initiated a transaction to user i, then $h_{ji} = 1$. Secondly, the abnormal account identification system may represent the features of each node in the fund transaction graph by a node feature vector and the features of each edge by an edge feature vector. For example, the node feature vector of user i may be denoted $f_i$, and the edge feature vector of a transaction initiated by user i to user j may be denoted $f_{ij}$. The features of a user (i.e., the features of each node) include at least the user ID, age, occupation, gender and the like. In addition, the abnormal account identification system may use one-hot encoding to convert the user features into numerical form; for example, the gender "male" is represented as "10" and the gender "female" as "01", and the occupations ["actor", "chef", "official", "engineer", "lawyer"] are encoded in order as ["10000", "01000", "00100", "00010", "00001"]. The features of a transaction (i.e., the features of each edge) include at least the source user ID, source account ID, transaction time, transaction type, beneficiary account ID and beneficiary ID. Similarly, the abnormal account identification system also uses one-hot encoding to convert the transaction features into numerical form.
It should be noted that the features of transactions and the features of users are discrete features, such as the occupations "doctor" and "teacher" for a user. These discrete features need to be mapped into Euclidean space by one-hot encoding before they can be used in machine learning algorithms such as regression, classification and clustering. Because distances and similarities between features in machine learning are computed in Euclidean space, the discrete features have to be converted.
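The adjacency matrix H and the one-hot encoding described above can be sketched as follows. The category lists reproduce the examples in the text; the helper names and the choice to concatenate the per-attribute codes into one node feature vector are assumptions made for illustration.

```python
def adjacency_matrix(users, edges):
    """Build H where H[i][j] = 1 if user i initiated a transaction to user j."""
    index = {user: k for k, user in enumerate(users)}
    h = [[0] * len(users) for _ in users]
    for src, dst in edges:
        h[index[src]][index[dst]] = 1
    return h


def one_hot(value, categories):
    """Encode one categorical value as a 0/1 list over the ordered categories."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == c else 0 for c in categories]


GENDERS = ["male", "female"]
OCCUPATIONS = ["actor", "chef", "official", "engineer", "lawyer"]


def node_feature_vector(gender, occupation):
    # Assumption: the per-attribute one-hot codes are concatenated.
    return one_hot(gender, GENDERS) + one_hot(occupation, OCCUPATIONS)


users = ["u1", "u2", "u3"]
H = adjacency_matrix(users, [("u1", "u2"), ("u2", "u3")])
f_1 = node_feature_vector("female", "engineer")
```

Numeric attributes such as age and high-cardinality attributes such as IDs would need their own treatment (for example bucketing), which the text does not detail.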
After the features of each node and the features of each edge are one-hot encoded, a node feature vector corresponding to each node and an edge feature vector corresponding to each edge are obtained. The abnormal account identification system forms the node feature set from all the node feature vectors and the edge feature set from all the edge feature vectors. For example, if there are m nodes and e edges in the fund transaction graph, the node feature set generated by the abnormal account identification system is represented as $F_V = \{f_1, f_2, \ldots, f_m\}$, where $f_i \in \mathbb{R}^n$, n represents the dimension of the node feature vector, and $\mathbb{R}^n$ represents an n-dimensional vector space. The edge feature set generated by the abnormal account identification system is represented as $F_E = \{f_{11}, f_{12}, \ldots, f_{mm}\}$, where $f_{ij} \in \mathbb{R}^e$, e represents the dimension of the edge feature vector, and $\mathbb{R}^e$ represents an e-dimensional vector space.
Step S103, inputting the feature vector of each node into a pre-trained graph attention network model to obtain a prediction result corresponding to each node.
In step S103, the prediction result is used to characterize whether the account corresponding to each node is an abnormal account for fund transaction, and the graph attention network model is trained based on a graph different from the fund transaction graph.
In an optional embodiment, the graph attention network model updates each edge feature vector in the edge feature set according to an update function, and then aggregates the updated edge feature vectors to each node feature vector according to an aggregation function to obtain an aggregated node feature vector. Because the fund transaction graph contains many nodes, there are also multiple aggregated node feature vectors. Subsequently, the graph attention network model updates the aggregated node feature vectors according to an update function to obtain a plurality of node feature vectors to be predicted; for example, for a node feature vector $f_i$ in the node feature set, a corresponding node feature vector to be predicted is finally generated.
The graph attention network model takes the node feature vectors to be predicted as the input of a multilayer perceptron and generates, through a softmax (normalized exponential function) layer, the probability distribution of each node over the labels. The loss function of the graph attention network model is a cross-entropy function, with L2 regularization added to prevent the model from overfitting. The loss function is:
$$L = -\sum_{i}\sum_{j} y_{ij}\log p_{ij} + \lambda\lVert\theta\rVert^{2}$$
where $y_{ij}$ is the label corresponding to the node; $p_{ij}$ is the probability, predicted by the model, of the class corresponding to the node; and $\lambda\lVert\theta\rVert^{2}$ is the regularization term. The labels may be a normal account label and an abnormal account label: the normal account label indicates that the account corresponding to the node has no abnormal fund transaction behavior, and the abnormal account label indicates that the account corresponding to the node has abnormal fund transaction behavior. If the probability of the abnormal account label of a node is greater than the probability of its normal account label, the account corresponding to the node is an account with abnormal fund transactions.
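The softmax output, the cross-entropy loss with an L2 penalty and the final label decision can be sketched in a few lines. This is a hedged illustration: the class order (index 0 for normal, index 1 for abnormal), the use of the squared L2 norm of a flat parameter list, and all function names are assumptions, and the logits stand in for the multilayer perceptron output.

```python
import math


def softmax(logits):
    """Normalized exponential function over one node's class scores."""
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def loss_value(logits_per_node, labels_one_hot, params, lam):
    """Cross-entropy summed over nodes and classes plus lam * squared L2 norm."""
    cross_entropy = 0.0
    for logits, y in zip(logits_per_node, labels_one_hot):
        p = softmax(logits)
        # Only classes with a non-zero label contribute; clamp to avoid log(0).
        cross_entropy -= sum(
            y_j * math.log(max(p_j, 1e-12)) for y_j, p_j in zip(y, p) if y_j
        )
    l2_penalty = lam * sum(w * w for w in params)
    return cross_entropy + l2_penalty


def predict_label(logits):
    """Assumed class order: index 0 = normal account, index 1 = abnormal account."""
    p_normal, p_abnormal = softmax(logits)
    return "abnormal" if p_abnormal > p_normal else "normal"
```

For instance, `predict_label([0.2, 1.3])` assigns the abnormal label because the second class receives the higher probability.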
It should be noted that, compared with a traditional graph neural network model, the graph attention network model can handle inductive tasks. In an inductive task, the graphs processed in the training stage and the testing stage are different: typically, training is performed only on a subgraph, while the testing stage has to process an unknown graph.
Based on steps S101 to S103 above, in the embodiments of the present application the prediction result corresponding to each node is obtained by a pre-trained graph attention network model. A fund transaction graph is acquired, and a node feature set is generated based on the fund transaction graph, wherein the node feature set is composed of at least one node feature vector, each node feature vector corresponds to one node in the fund transaction graph, the fund transaction graph is composed of nodes and edges, each node represents the account data of the account corresponding to that node, and an edge between two nodes represents the direction of transaction flow between them. Finally, each node feature vector is input into the pre-trained graph attention network model to obtain the prediction result corresponding to each node, wherein the prediction result indicates whether the account corresponding to each node is an account with abnormal fund transactions, and the graph attention network model is trained on a graph different from the fund transaction graph.
According to the above, the fund transaction graph is predicted through the graph attention network model instead of the traditional graph neural network model, so that the prediction result corresponding to each node is obtained. Because the graph attention network model is obtained based on graph training different from the fund transaction graph, in other words, the graph used by the graph attention network model in the application is different from the fund transaction graph in the training stage, on the basis, the application realizes the effect of separating and processing the training data and the test data, so that after one graph attention network model is obtained through training, any fund transaction graph can be predicted through the graph attention network model, and the fund transaction graph which changes continuously and dynamically can be responded to, thereby solving the problem that the existing graph neural network model cannot predict new nodes in real time, and improving the prediction efficiency of new abnormal transaction accounts.
Therefore, by the technical scheme, the purpose of predicting the dynamically-changed fund transaction map in real time is achieved, the effect of improving the prediction timeliness of the abnormal transaction account is achieved, and the problem that the prediction efficiency of the prior art on the new abnormal transaction account is low is solved.
In an optional embodiment, before each node feature vector is input into the pre-trained graph attention network model to obtain the prediction result for each node, the abnormal account identification system first needs to train the graph attention network model. Specifically, the abnormal account identification system first obtains historical transaction data, which includes at least a plurality of abnormal accounts, a plurality of normal accounts, and an account label for each account. The account label of an abnormal account indicates that the account has exhibited fund transaction abnormal behavior, and the account label of a normal account indicates that the account has not. The abnormal account identification system then constructs a training graph from the historical transaction data and generates a first feature set and a second feature set based on the training graph, where the first feature set includes a plurality of historical node feature vectors, the second feature set includes a plurality of historical edge feature vectors, each historical node feature vector corresponds to one training node in the training graph, and each historical edge feature vector corresponds to one training edge in the training graph. Finally, the abnormal account identification system trains the graph attention network model from the first feature set and the second feature set.
Optionally, the abnormal account identification system may select transaction data from a historical time period in the database as the historical transaction data. The historical transaction data includes at least one abnormal account and one normal account, together with the account information of each account, the information of the account holder, the transaction details, and the account label of the account. The account label may be annotated manually or generated automatically through machine learning. After obtaining the historical transaction data, the abnormal account identification system constructs a training graph from it and, based on the training graph, one-hot encodes the historical node features corresponding to each training node and the historical edge features corresponding to each training edge, obtaining the historical node feature vectors and the historical edge feature vectors. Because the training graph has a plurality of training nodes and a plurality of training edges, the abnormal account identification system obtains a plurality of historical node feature vectors and a plurality of historical edge feature vectors, generates the first feature set from the historical node feature vectors, and generates the second feature set from the historical edge feature vectors.
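The one-hot encoding step described above can be illustrated with a minimal sketch. The attribute names and category lists (`ACCOUNT_TYPES`, `CHANNELS`) are hypothetical and are not taken from the application; real node and edge features would come from the account and transaction detail information.

```python
import numpy as np

def one_hot_encode(values, categories):
    """Return a (len(values), len(categories)) matrix with a single 1 per row."""
    index = {category: k for k, category in enumerate(categories)}
    encoded = np.zeros((len(values), len(categories)), dtype=np.float32)
    for row, value in enumerate(values):
        encoded[row, index[value]] = 1.0
    return encoded

# Hypothetical categorical attributes: account type per training node,
# transaction channel per training edge.
ACCOUNT_TYPES = ["personal", "corporate"]
CHANNELS = ["online", "counter", "atm"]

node_features = one_hot_encode(["personal", "corporate", "personal"], ACCOUNT_TYPES)
edge_features = one_hot_encode(["online", "atm"], CHANNELS)
```

Each row of `node_features` plays the role of one historical node feature vector, and each row of `edge_features` the role of one historical edge feature vector.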
In an alternative embodiment, the transaction features between transaction nodes can fully reflect users' transaction behaviors and fund flow directions. Therefore, when features are aggregated with the graph attention network, it is not enough to consider only the node features and the simple topological structure between nodes; the edge features should also be considered when the nodes are embedded. Specifically, to aggregate the features of the training edges to each training node, the following steps are performed:
Step 1: determining a target edge and the two target nodes associated with the target edge from the training graph, wherein the target edge is any one training edge in the training graph;
Step 2: acquiring the first node feature vectors corresponding to the target nodes from the first feature set, and acquiring the first edge feature vector corresponding to the target edge from the second feature set;
Step 3: performing superposition processing on the first node feature vectors and the first edge feature vector to obtain a second edge feature vector corresponding to the target edge;
Step 4: performing nonlinear processing on the second edge feature vector to obtain a target edge feature vector corresponding to the target edge;
Step 5: repeating steps 1 to 4 to obtain the target edge feature vector corresponding to each training edge in the training graph;
Step 6: training to obtain the graph attention network model according to the target edge feature vector corresponding to each training edge in the training graph.
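Steps 1 to 5 can be sketched as follows. The application does not specify how the superposition or the nonlinear processing is carried out, so this sketch assumes concatenation followed by a linear map and a ReLU; the function name `update_edges` and the matrix `weight` are hypothetical.

```python
import numpy as np

def update_edges(edges, node_feats, edge_feats, weight):
    """For every training edge (i, j), superpose f_ij, f_i and f_j and apply
    a nonlinearity to obtain the target edge feature vector f'_ij."""
    updated = {}
    for (i, j) in edges:                                     # steps 1 and 5: loop over edges
        f_i, f_j = node_feats[i], node_feats[j]              # step 2: look up node vectors
        f_ij = edge_feats[(i, j)]                            # step 2: look up edge vector
        stacked = np.concatenate([f_ij, f_i, f_j])           # step 3 (assumed: concatenation)
        updated[(i, j)] = np.maximum(weight @ stacked, 0.0)  # step 4 (assumed: linear map + ReLU)
    return updated

rng = np.random.default_rng(0)
node_feats = {0: rng.normal(size=4), 1: rng.normal(size=4), 2: rng.normal(size=4)}
edge_feats = {(0, 1): rng.normal(size=3), (1, 2): rng.normal(size=3)}
weight = rng.normal(size=(5, 3 + 4 + 4))  # hypothetical trainable matrix

target_edge_feats = update_edges(list(edge_feats), node_feats, edge_feats, weight)
```

In a trained model `weight` would be learned rather than sampled; the random values here only make the sketch runnable.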
Optionally, take the first feature set $F_V=\{f_1,f_2,\ldots,f_m\}$ and the second feature set $F_E=\{f_{11},f_{12},\ldots,f_{mm}\}$ as an example. The node embedding process of the graph attention network can be expressed as Algorithm 1:
Input: $G(V,E)$, $F_V$, $F_E$
Output: $F'_V=\{f'_1,f'_2,\ldots,f'_m\}$
For $k\in\{1,\ldots,\varepsilon\}$ do
    Update edge attributes: $f'_{ij}=\phi(f_{ij},f_i,f_j)$
End for
For $h\in\{1,\ldots,m\}$ do
    Let $F'_{Eh}=\{f'_{h1},f'_{h2},\ldots,f'_{hm}\}$
    Aggregate the attributes of edges to each node: $\bar{f}'_h=\rho(F'_{Eh})$
    Update node attributes: $f'_h=\phi(\bar{f}'_h,f_h)$
End for
Let $F'_V=\{f'_1,f'_2,\ldots,f'_m\}$
Let $F'_E=\{f'_{11},f'_{12},\ldots,f'_{mm}\}$
Aggregate the attributes of global edges: $\bar{f}'_E=\rho(F'_E)$
Aggregate the attributes of global nodes: $\bar{f}'_V=\rho(F'_V)$
Return $(F'_E,F'_V)$
In Algorithm 1, $\rho$ is the aggregation function and $\phi$ is the update function. In the graph attention network, a self-attention mechanism $a$ is applied to each node, giving the attention coefficient $e_{ij}=a([Wf_i\,\|\,Wf_j])$. The correlation between node $i$ and node $j$ in the self-attention mechanism is:
$$\alpha_{ij}=\mathrm{softmax}_j(e_{ij})=\frac{\exp(e_{ij})}{\sum_{k\in N(i)}\exp(e_{ik})}$$
where $\alpha_{ij}$ is the attention coefficient of node $j$ to node $i$; $N(i)$ is the set of neighbors of node $i$; and $W\in\mathbb{R}^{n'\times n}$ is the weight matrix to be trained, which represents the relationship between the $n$ input features and the $n'$ output features and converts the node features into higher-level features. LeakyReLU is used as the activation function, the coefficients are normalized by the softmax function, and $\|$ denotes vector concatenation. Thus, the aggregation function in Algorithm 1 is expressed as follows:
$$f'_i=\sigma\Big(\sum_{j\in N(i)}\alpha_{ij}Wf_j\Big)$$
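The attention coefficients and the attention-weighted aggregation can be illustrated with a small NumPy sketch of a single attention head. This is not the implementation of the application: the function names, the attention vector `a` (used here as a dot product inside LeakyReLU, as in the standard graph attention network formulation), and the choice of tanh for $\sigma$ are assumptions made for illustration.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def gat_layer(features, neighbors, W, a):
    """features: (m, n) node features; neighbors: dict node -> list of neighbor ids;
    W: (n_out, n) weight matrix; a: (2 * n_out,) attention vector."""
    projected = features @ W.T                       # W f_i for every node
    out = np.zeros_like(projected)
    alphas = {}
    for i, nbrs in neighbors.items():
        # e_ij = LeakyReLU(a . [W f_i || W f_j]) for each j in N(i)
        e = np.array([leaky_relu(a @ np.concatenate([projected[i], projected[j]]))
                      for j in nbrs])
        alpha = np.exp(e - e.max())
        alpha /= alpha.sum()                         # softmax over N(i)
        alphas[i] = alpha
        # f'_i = sigma(sum_j alpha_ij W f_j); sigma is assumed to be tanh here
        out[i] = np.tanh(alpha @ projected[nbrs])
    return out, alphas

rng = np.random.default_rng(0)
features = rng.normal(size=(3, 4))
neighbors = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}     # each node also attends to itself
W = rng.normal(size=(2, 4))
a = rng.normal(size=4)
embeddings, alphas = gat_layer(features, neighbors, W, a)
```

Because `W` and `a` depend only on the feature dimensions and not on the number of nodes, the same parameters can be applied to a graph other than the one they were fitted on, which is the property the description relies on when predicting on a fund transaction graph that differs from the training graph.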
In an alternative embodiment, the first edge feature vector corresponding to the target edge may be $f_{ij}$ in Algorithm 1, and the first node feature vectors may be $f_i$ and $f_j$ in Algorithm 1. The edge-attribute update step of Algorithm 1 indicates that $f_{ij}$, $f_i$ and $f_j$ are first superposed according to the update function to obtain the second edge feature vector, and the second edge feature vector is then processed nonlinearly to obtain the target edge feature vector $f'_{ij}$. After the first for loop of Algorithm 1 has been executed, the target edge feature vector corresponding to each training edge is obtained.
In an optional embodiment, after obtaining the target edge feature vector corresponding to each training edge, the abnormal account recognition system determines, based on the training graph, at least one first training edge connected to each training node, determines a first target edge feature vector corresponding to the first training edge, and determines an attention coefficient corresponding to the first target edge feature vector, and then the abnormal account recognition system performs aggregation processing on the at least one first target edge feature vector based on the attention coefficient to obtain a second node feature vector corresponding to each training node, and trains to obtain the graph attention network model according to the second node feature vector corresponding to each training node.
As shown in Algorithm 1, "For $h\in\{1,\ldots,m\}$ do" indicates that a training node $h$ is selected from the $m$ training nodes, and "Let $F'_{Eh}=\{f'_{h1},f'_{h2},\ldots,f'_{hm}\}$" indicates that the first target edge feature vectors corresponding to the at least one training edge connected to training node $h$ are $f'_{h1},f'_{h2},\ldots,f'_{hm}$. The step that aggregates the attributes of edges to each node indicates that the at least one first target edge feature vector is aggregated according to the aggregation function $\rho$ to obtain the second node feature vector corresponding to training node $h$. The aggregation function $\rho$ includes the attention coefficient.
In an optional embodiment, after obtaining the second node feature vector corresponding to each training node, the abnormal account identification system performs superposition processing on the second node feature vector corresponding to each training node and the historical node feature vector corresponding to the training node to obtain a third node feature vector corresponding to each training node, performs nonlinear processing on the third node feature vector to obtain a target node feature vector corresponding to each training node, and finally trains to obtain the graph attention network model according to the target node feature vector and the account label corresponding to the target node feature vector.
Optionally, as shown in Algorithm 1, the node-attribute update step indicates that, according to the update function, the second node feature vector corresponding to training node $h$ and the historical node feature vector $f_h$ corresponding to training node $h$ are superposed to obtain the third node feature vector corresponding to training node $h$, and the third node feature vector is then processed nonlinearly to obtain the target node feature vector corresponding to training node $h$. It should be noted that the present application only performs feature update and feature aggregation on the node feature vectors of the training nodes; the number of node feature vectors and their correspondence with the account labels are not changed. In other words, if the first feature set is $F_V=\{f_1,f_2,\ldots,f_m\}$, the set of target node feature vectors is $F'_V=\{f'_1,f'_2,\ldots,f'_m\}$, where the target node feature vector $f'_1$ corresponds to the historical node feature vector $f_1$ and the account label of $f'_1$ is the account label of $f_1$; likewise, the target node feature vector $f'_m$ corresponds to the historical node feature vector $f_m$ and the account label of $f'_m$ is the account label of $f_m$. On this basis, the abnormal account identification system can train the graph attention network model from the obtained target node feature vectors and their corresponding account labels.
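The aggregation and node update for a single training node can be sketched as below. The sketch assumes the attention coefficients are already available, takes the aggregation to be an attention-weighted sum, and takes the superposition and nonlinear processing to be concatenation followed by a linear map and ReLU; none of these choices, nor the names used, are specified by the application.

```python
import numpy as np

def aggregate_and_update(f_h, incident_edge_feats, attention, weight):
    """f_h: historical node feature vector of training node h;
    incident_edge_feats: (k, d_e) target edge feature vectors of edges at h;
    attention: (k,) attention coefficients summing to 1;
    weight: hypothetical trainable matrix of shape (d_out, d_e + d_v)."""
    second = attention @ incident_edge_feats   # aggregation: second node feature vector
    third = np.concatenate([second, f_h])      # superposition (assumed: concatenation)
    return np.maximum(weight @ third, 0.0)     # nonlinearity (assumed: ReLU)

f_h = np.array([1.0, 0.0])
edge_feats = np.array([[2.0, 0.0, 0.0],
                       [0.0, 4.0, 0.0]])
attention = np.array([0.5, 0.5])
weight = np.eye(5)                             # identity, so the result is easy to inspect
f_h_new = aggregate_and_update(f_h, edge_feats, attention, weight)
```

The output keeps a one-to-one correspondence with the input node, so the account label attached to `f_h` carries over unchanged to `f_h_new`.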
Optionally, in Algorithm 1, "Let $F'_V=\{f'_1,f'_2,\ldots,f'_m\}$" together with the step that aggregates the attributes of global nodes indicates that the target node feature vector corresponding to each training node is obtained in a loop and the plurality of target node feature vectors are aggregated again. Similarly, "Let $F'_E=\{f'_{11},f'_{12},\ldots,f'_{mm}\}$" together with the step that aggregates the attributes of global edges indicates that the target edge feature vector corresponding to each training edge is obtained in a loop and the plurality of target edge feature vectors are aggregated again. However, the present application only needs $F'_V$, that is, the target node feature vector corresponding to each training node.
In an optional embodiment, after inputting each node feature vector into a pre-trained graph attention network model and obtaining a prediction result corresponding to each node, the abnormal account identification system may further obtain a target fund transaction map, where the target fund transaction map at least includes a plurality of nodes to be confirmed and at least one abnormal node in the fund transaction map, an account corresponding to the abnormal node is a fund transaction abnormal account, and the plurality of nodes to be confirmed are nodes that do not appear in the fund transaction map. And then the abnormal account identification system constructs a social network model based on the target fund transaction graph, wherein in the social network model, the larger the transaction amount between the two nodes is, the larger the weight value of the social relationship between the two nodes is, and the closer the social relationship between the two nodes is.
Optionally, the present application further provides another method for detecting fund transaction abnormal accounts. Based on the abnormal nodes already identified by the graph attention network model, if an account corresponding to an abnormal node also appears in a batch of new transaction data, that batch may contain new fund transaction abnormal accounts; in other words, new abnormal nodes associated with the abnormal node may exist in the new transaction data. To find these new abnormal nodes more quickly and to determine the association between them and the abnormal nodes already found, the abnormal account identification system of the present application may first construct a target fund transaction graph from the new transaction data and then construct the social network model $G'(V,E,\omega)$ based on the target fund transaction graph. In the social network model, the fund in-and-out relationships between nodes are mapped to social relationships, and the transaction amount between two nodes is normalized into the social relationship weight value $\omega$ between them: the larger the transaction amount, the larger $\omega$, and the closer the social relationship between the two nodes. In addition, $v\in V$, where $V$ is the node set; $e\in E$, where $E$ is the edge set; and $\omega_{ij}\in\omega$, where $\omega$ is the weight set. The normalized weight is calculated by the formula shown in Figure BDA0003666843490000131, where $N(i)$ denotes all neighbors of node $i$ and $N_{ij}$ denotes the amount of funds transferred out by node $i$ to node $j$.
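As an illustration only, one normalization consistent with these definitions divides the amount node $i$ transfers to node $j$ by the total amount node $i$ transfers to all of its neighbors, i.e. $\omega_{ij}=N_{ij}/\sum_{k\in N(i)}N_{ik}$. This form is an assumption; the exact formula is given only as an image in the publication. The function name `social_weights` is hypothetical.

```python
from collections import defaultdict

def social_weights(transfers):
    """transfers: iterable of (source, target, amount).
    Returns {(i, j): w_ij} with w_ij = N_ij / sum_k N_ik (assumed normalization)."""
    amount = defaultdict(float)      # N_ij: total transferred from i to j
    total_out = defaultdict(float)   # sum over k of N_ik
    for i, j, value in transfers:
        amount[(i, j)] += value
        total_out[i] += value
    return {(i, j): n_ij / total_out[i] for (i, j), n_ij in amount.items()}

weights = social_weights([("A", "B", 300.0), ("A", "C", 100.0), ("B", "C", 50.0)])
```

Under this assumption the outgoing weights of each node sum to 1, so a larger transferred amount always yields a larger weight relative to the node's other counterparties.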
In an optional embodiment, after a social network model is constructed based on a fund transaction graph, the abnormal account identification system firstly determines transaction nodes which have fund transaction behaviors with the abnormal nodes in a plurality of nodes to be confirmed, then determines similarity between the transaction nodes and the abnormal nodes according to social relationship weight values of the abnormal nodes and the transaction nodes in the social network model, and finally determines the transaction nodes as candidate nodes under the condition that the similarity is greater than a preset similarity, wherein the candidate nodes are nodes with fund transaction risks.
Optionally, to evade the fund transaction abnormal behavior identification rules set in the monitoring system, the fund transaction abnormal accounts participating in the same abnormal behavior usually split a large amount of funds into small amounts and then perform multiple transfers through several intermediate nodes. As a result, in most cases the abnormal nodes participating in the same fund transaction abnormal behavior have similar characteristics. On this basis, the present application first finds, from the determined abnormal nodes, the transaction nodes that have fund transaction behaviors with them, and then calculates the similarity between each transaction node and the abnormal node. The similarity between nodes is calculated in combination with the closeness of the social relationship, by the formula shown in Figure BDA0003666843490000132, where $j\in N(i)$ and $N(i)$ denotes all transaction nodes associated with node $i$.
In an optional embodiment, after the transaction node is determined to be a candidate node, the abnormal account identification system determines a fourth node feature vector corresponding to the abnormal node from the node feature set, generates a fifth node feature vector corresponding to the candidate node based on the target fund transaction map, then determines covariance and variance between the fourth node feature vector and the fifth node feature vector, determines a correlation coefficient between the candidate node and the abnormal node according to the covariance and the variance, and finally, when the correlation coefficient is greater than a preset threshold value, the abnormal account identification system determines that the candidate node is a new abnormal node and an account corresponding to the candidate node is also an abnormal fund transaction account.
In practical applications, a high similarity between a transaction node and an abnormal node does not sufficiently prove that the transaction node is itself an abnormal node. Therefore, after the candidate nodes whose similarity is greater than the preset similarity are determined from the transaction nodes, correlation analysis is used to further mine the association between each candidate node and the abnormal node. The correlation coefficient is defined as:
$$\rho=\frac{\mathrm{cov}(f_i,f_j)}{\sqrt{D(f_i)}\,\sqrt{D(f_j)}}$$
where $j$ is a candidate node whose similarity with the abnormal node $i$ is higher than the preset similarity, $\mathrm{cov}(f_i,f_j)$ is the covariance of the feature vectors of node $i$ and node $j$, and $D(f_i)$ and $D(f_j)$ are the variances of the feature vectors of node $i$ and node $j$, respectively. The correlation coefficient $\rho$ takes values in $[-1,1]$: the closer $|\rho|$ is to 1, the higher the linear correlation between node $i$ and node $j$, and the closer it is to 0, the lower the correlation. Finally, if the correlation coefficient of node $j$ is greater than the preset threshold, the abnormal account identification system regards node $j$ as a new abnormal node and adds it to the set $M_i$ of fund transaction abnormal accounts associated with node $i$.
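The correlation check can be sketched with NumPy as follows. The threshold value 0.8 and the names `correlation` and `new_abnormal_nodes` are hypothetical; the application only states that a preset threshold is used.

```python
import numpy as np

def correlation(f_i, f_j):
    """Correlation coefficient cov(f_i, f_j) / (sqrt(D(f_i)) * sqrt(D(f_j)))."""
    f_i, f_j = np.asarray(f_i, float), np.asarray(f_j, float)
    cov = np.mean((f_i - f_i.mean()) * (f_j - f_j.mean()))
    return cov / np.sqrt(f_i.var() * f_j.var())

def new_abnormal_nodes(abnormal_vec, candidates, threshold=0.8):
    """candidates: dict node id -> feature vector; threshold is a hypothetical preset value.
    Returns the candidate nodes whose correlation with the abnormal node exceeds it."""
    return [node for node, vec in candidates.items()
            if correlation(abnormal_vec, vec) > threshold]

abnormal = [1.0, 2.0, 3.0, 4.0]
candidates = {"j1": [2.0, 4.0, 6.0, 8.5], "j2": [4.0, 1.0, 3.0, 2.0]}
flagged = new_abnormal_nodes(abnormal, candidates)
```

Note that a feature vector with zero variance makes the coefficient undefined, so a practical implementation would need to handle constant vectors separately.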
In an alternative embodiment, as shown in Fig. 2, when a fund transaction abnormal account is identified by the technical method of the present application, the transaction data is first obtained and then preprocessed; for example, a transaction ID is assigned to each transaction record, and the source user information, source account ID, transaction time, transaction type, beneficiary account ID, and beneficiary ID are collected. A fund transaction graph is then constructed from the preprocessed transaction data, the node feature vector of each node is obtained from the fund transaction graph and input into the pre-trained graph attention network model, and the graph attention network model generates the predicted account labels through a multilayer perceptron, yielding the abnormal nodes. After the abnormal nodes are determined, if a target fund transaction graph containing an abnormal node exists, the transaction nodes that have transaction behaviors with the abnormal node are determined from the target fund transaction graph, the candidate nodes are determined by comparing the similarity between each transaction node and the abnormal node, and whether each candidate node is a new abnormal node is then determined by calculating the correlation coefficient between the candidate node and the abnormal node, thereby mining all fund transaction abnormal accounts.
Therefore, with this technical scheme, a dynamic fund transaction graph can be predicted in real time, which solves the problem that existing graph neural network models cannot predict fund transaction abnormal accounts in real time. In addition, the association among a plurality of abnormal nodes can be obtained by establishing the social network model, which further improves the identification efficiency for fund transaction abnormal accounts and allows all fund transaction abnormal accounts to be fully identified.
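The preprocessing and graph construction at the start of this flow can be sketched as follows. The record keys (`source_account`, `beneficiary_account`, `amount`) and the choice to store the total transferred amount on each directed edge are assumptions for illustration, not details given in the application.

```python
from collections import defaultdict

def build_fund_transaction_graph(transactions):
    """transactions: list of dicts with hypothetical keys
    'source_account', 'beneficiary_account' and 'amount'.
    Returns (nodes, edges): the set of account IDs, and a dict mapping each
    directed edge (source, beneficiary) to the total amount transferred."""
    nodes = set()
    edges = defaultdict(float)
    for tx_id, tx in enumerate(transactions):
        tx["transaction_id"] = tx_id  # preprocessing: assign a transaction ID
        source, beneficiary = tx["source_account"], tx["beneficiary_account"]
        nodes.update((source, beneficiary))
        edges[(source, beneficiary)] += tx["amount"]  # edge direction = fund flow
    return nodes, dict(edges)

txs = [
    {"source_account": "A", "beneficiary_account": "B", "amount": 100.0},
    {"source_account": "A", "beneficiary_account": "B", "amount": 50.0},
    {"source_account": "B", "beneficiary_account": "C", "amount": 30.0},
]
nodes, edges = build_fund_transaction_graph(txs)
```

Node feature vectors would then be generated per account in `nodes` and passed to the pre-trained model.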
Example 2
According to an embodiment of the present application, there is further provided an apparatus for identifying an abnormal transaction account, where fig. 3 is a schematic diagram of an optional apparatus for identifying an abnormal transaction account according to an embodiment of the present application, and as shown in fig. 3, the apparatus includes: an obtaining module 301, configured to obtain a fund transaction graph, where the fund transaction graph is composed of nodes and edges, the nodes represent account data of accounts corresponding to the nodes, and the edge between two nodes represents a transaction flow direction between two nodes; a generating module 302, configured to generate a node feature set based on the fund transaction graph, where the node feature set is composed of at least one node feature vector, and each node feature vector corresponds to one node in the fund transaction graph; the input module 303 is configured to input each node feature vector into a pre-trained graph attention network model to obtain a prediction result corresponding to each node, where the prediction result is used to characterize whether an account corresponding to each node is an abnormal account for fund transaction, and the graph attention network model is obtained based on a graph different from the fund transaction graph through training.
It should be noted that the obtaining module 301, the generating module 302, and the input module 303 correspond to steps S101 to S103 in Embodiment 1 above. The three modules implement the same examples and application scenarios as their corresponding steps, but are not limited to the disclosure of Embodiment 1 above.
Optionally, the device for identifying an abnormal transaction account further includes a first acquisition module, a construction module, a first generation module, and a training module. The first acquisition module is used for acquiring historical transaction data, wherein the historical transaction data at least comprises a plurality of abnormal accounts, a plurality of normal accounts and an account label of each account, the account label of an abnormal account represents that the abnormal account has exhibited fund transaction abnormal behavior, and the account label of a normal account represents that the normal account has not exhibited fund transaction abnormal behavior; the construction module is used for constructing a training graph according to the historical transaction data; the first generation module is used for generating a first feature set and a second feature set based on the training graph, wherein the first feature set comprises a plurality of historical node feature vectors, the second feature set comprises a plurality of historical edge feature vectors, each historical node feature vector corresponds to one training node in the training graph, and each historical edge feature vector corresponds to one training edge in the training graph; and the training module is used for training to obtain the graph attention network model according to the first feature set and the second feature set.
Optionally, the training module further includes a first execution module, a second execution module, a third execution module, a fourth execution module, a fifth execution module, and a sixth execution module. The first execution module is configured to execute step 1: determining a target edge and the two target nodes associated with the target edge from the training graph, wherein the target edge is any one training edge in the training graph; the second execution module is configured to execute step 2: acquiring the first node feature vectors corresponding to the target nodes from the first feature set, and acquiring the first edge feature vector corresponding to the target edge from the second feature set; the third execution module is configured to execute step 3: performing superposition processing on the first node feature vectors and the first edge feature vector to obtain a second edge feature vector corresponding to the target edge; the fourth execution module is configured to execute step 4: performing nonlinear processing on the second edge feature vector to obtain a target edge feature vector corresponding to the target edge; the fifth execution module is configured to execute step 5: repeating steps 1 to 4 to obtain the target edge feature vector corresponding to each training edge in the training graph; and the sixth execution module is configured to execute step 6: training to obtain the graph attention network model according to the target edge feature vector corresponding to each training edge in the training graph.
Optionally, the sixth execution module further includes: the device comprises a first determination module, a second determination module, a third determination module, an aggregation module and a first training module. The first determining module is used for determining at least one first training edge connected with each training node based on the training graph; the second determining module is used for determining a first target edge feature vector corresponding to the first training edge; the third determining module is used for determining an attention coefficient corresponding to the first target edge feature vector; the aggregation module is used for carrying out aggregation processing on at least one first target edge feature vector based on the attention coefficient to obtain a second node feature vector corresponding to each training node; and the first training module is used for training to obtain the graph attention network model according to the second node feature vector corresponding to each training node.
Optionally, the first training module further includes: the device comprises a first superposition processing module, a first nonlinear processing module and a second training module. The first superposition processing module is used for carrying out superposition processing on a second node feature vector corresponding to each training node and a historical node feature vector corresponding to the training node to obtain a third node feature vector corresponding to each training node; the first nonlinear processing module is used for carrying out nonlinear processing on the third node feature vector to obtain a target node feature vector corresponding to each training node; and the second training module is used for training to obtain the graph attention network model according to the target node feature vector and the account label corresponding to the target node feature vector.
Optionally, the device for identifying an abnormal transaction account further includes a second acquisition module and a first construction module. The second acquisition module is used for acquiring a target fund transaction graph, wherein the target fund transaction graph at least comprises a plurality of nodes to be confirmed and at least one abnormal node in the fund transaction graph, the account corresponding to the abnormal node is a fund transaction abnormal account, and the plurality of nodes to be confirmed are nodes that do not appear in the fund transaction graph; the first construction module is used for constructing a social network model based on the target fund transaction graph, wherein in the social network model, the larger the transaction amount between two nodes is, the larger the social relationship weight value between the two nodes is, and the closer the social relationship between the two nodes is.
Optionally, the device for identifying an abnormal transaction account further includes: a fourth determination module, a fifth determination module, and a sixth determination module. The fourth determining module is used for determining a transaction node which has a fund transaction behavior with the abnormal node in the plurality of nodes to be confirmed; the fifth determining module is used for determining the similarity between the transaction node and the abnormal node according to the social relationship weight value of the abnormal node and the transaction node in the social network model; and the sixth determining module is used for determining the transaction node as a candidate node under the condition that the similarity is greater than the preset similarity, wherein the candidate node is a node with fund transaction risk.
Optionally, the device for identifying an abnormal transaction account further includes: the device comprises a seventh determining module, a second generating module, an eighth determining module, a ninth determining module and a tenth determining module. The seventh determining module is configured to determine a fourth node feature vector corresponding to the abnormal node from the node feature set; the second generation module is used for generating a fifth node feature vector corresponding to the candidate node based on the target fund transaction map; an eighth determining module, configured to determine a covariance and a variance between the feature vector of the fourth node and the feature vector of the fifth node; a ninth determining module, configured to determine a correlation coefficient between the candidate node and the abnormal node according to the covariance and the variance; and the tenth determining module is used for determining the candidate node as a new abnormal node when the correlation coefficient is greater than the preset threshold, and the account corresponding to the candidate node is also an abnormal fund transaction account.
Example 3
According to an embodiment of the present application, there is also provided a computer-readable storage medium, in which a computer program is stored, where the computer program is configured to execute the method for identifying an abnormal transaction account in embodiment 1 when the computer program is run.
Example 4
According to an embodiment of the present application, there is also provided an embodiment of an electronic device, where fig. 4 is a schematic diagram of an alternative electronic device according to the embodiment of the present application, as shown in fig. 4, the electronic device includes a processor, a memory, and a program stored in the memory and executable on the processor, and the processor implements the following steps when executing the program:
acquiring a fund transaction map, wherein the fund transaction map consists of nodes and edges, the nodes represent account data of accounts corresponding to the nodes, and the edges between the two nodes represent transaction flow directions between the two nodes; generating a node feature set based on the fund transaction graph, wherein the node feature set is composed of at least one node feature vector, and each node feature vector corresponds to one node in the fund transaction graph; and inputting the feature vector of each node into a pre-trained graph attention network model to obtain a prediction result corresponding to each node, wherein the prediction result is used for representing whether an account corresponding to each node is an abnormal account for fund transaction or not, and the graph attention network model is obtained based on graph training different from a fund transaction graph.
Optionally, the processor executes the program to further implement the following steps: before inputting each node feature vector into the pre-trained graph attention network model to obtain the prediction result corresponding to each node, obtaining historical transaction data, wherein the historical transaction data at least comprises a plurality of abnormal accounts, a plurality of normal accounts and an account label of each account, the account label of an abnormal account represents that the abnormal account has exhibited abnormal fund transaction behavior, and the account label of a normal account represents that the normal account has not exhibited abnormal fund transaction behavior; constructing a training graph according to the historical transaction data; generating a first feature set and a second feature set based on the training graph, wherein the first feature set comprises a plurality of historical node feature vectors, the second feature set comprises a plurality of historical edge feature vectors, each historical node feature vector corresponds to one training node in the training graph, and each historical edge feature vector corresponds to one training edge in the training graph; and training to obtain the graph attention network model according to the first feature set and the second feature set.
Optionally, the processor executes the program to further implement the following steps: step 1: determining a target edge and two target nodes associated with the target edge from the training graph, wherein the target edge is any one training edge in the training graph; step 2: acquiring a first node feature vector corresponding to each target node from the first feature set, and acquiring a first edge feature vector corresponding to the target edge from the second feature set; step 3: performing superposition processing on the first node feature vectors and the first edge feature vector to obtain a second edge feature vector corresponding to the target edge; step 4: carrying out nonlinear processing on the second edge feature vector to obtain a target edge feature vector corresponding to the target edge; step 5: repeating steps 1 to 4 to obtain a target edge feature vector corresponding to each training edge in the training graph; step 6: training to obtain the graph attention network model according to the target edge feature vector corresponding to each training edge in the training graph.
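As a non-limiting illustration, steps 1 to 5 above might be sketched as follows. The application does not define the superposition or the nonlinear processing; this example assumes element-wise addition of the two endpoint vectors and the edge vector (all of equal length) and a ReLU nonlinearity. The function names are hypothetical.

```python
def relu(vec):
    """Assumed nonlinear processing: clamp negative components to zero."""
    return [max(0.0, x) for x in vec]

def update_edge(h_src, h_dst, e):
    """Steps 3 and 4 for one target edge: superpose, then apply the nonlinearity."""
    # Assumed superposition: element-wise sum of both endpoint vectors and the edge vector.
    second_edge_vector = [a + b + c for a, b, c in zip(h_src, h_dst, e)]
    return relu(second_edge_vector)

def update_all_edges(node_feats, edge_feats):
    """Step 5: repeat steps 1 to 4 for every training edge (u, v)."""
    return {
        (u, v): update_edge(node_feats[u], node_feats[v], e)
        for (u, v), e in edge_feats.items()
    }

# Toy first and second feature sets.
node_feats = {"A": [1.0, -2.0], "B": [0.5, 0.5]}
edge_feats = {("A", "B"): [-3.0, 2.0]}
target_edge_vectors = update_all_edges(node_feats, edge_feats)
```

A learned linear projection could be applied before the sum if node and edge vectors differ in length; that variant is omitted here for brevity.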
Optionally, the processor executes the program to further implement the following steps: determining at least one first training edge connected with each training node based on the training graph; determining a first target edge feature vector corresponding to a first training edge; determining an attention coefficient corresponding to the first target edge feature vector; performing aggregation processing on at least one first target edge feature vector based on the attention coefficient to obtain a second node feature vector corresponding to each training node; and training to obtain a graph attention network model according to the second node feature vector corresponding to each training node.
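As a non-limiting illustration, the attention-weighted aggregation above might be sketched as follows. The scoring function is an assumption: each target edge feature vector is scored by a dot product with a hypothetical learned vector `attn_weights`, and the scores are normalized with a softmax over the edges connected to the node. The application does not state how the attention coefficient is computed.

```python
import math

def softmax(scores):
    """Normalize raw scores into attention coefficients that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]

def aggregate(edge_vectors, attn_weights):
    """Return the second node feature vector and the attention coefficients.

    edge_vectors: target edge feature vectors of the edges connected to one node.
    attn_weights: hypothetical learned scoring vector (an assumption of this sketch).
    """
    scores = [sum(w * x for w, x in zip(attn_weights, vec)) for vec in edge_vectors]
    alphas = softmax(scores)
    dim = len(edge_vectors[0])
    aggregated = [
        sum(a * vec[i] for a, vec in zip(alphas, edge_vectors))
        for i in range(dim)
    ]
    return aggregated, alphas

second_node_vector, alphas = aggregate([[1.0, 0.0], [0.0, 1.0]], [1.0, 1.0])
```

In this toy call both edges receive the same score, so each contributes half of the aggregated vector.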
Optionally, the processor executes the program to further implement the following steps: performing superposition processing on the second node feature vector corresponding to each training node and the historical node feature vector corresponding to the training node to obtain a third node feature vector corresponding to each training node; carrying out nonlinear processing on the third node feature vector to obtain a target node feature vector corresponding to each training node; and training to obtain the graph attention network model according to the target node feature vector and the account label corresponding to the target node feature vector.
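As a non-limiting illustration, the node update and the label-driven training signal above might be sketched as follows. Element-wise addition followed by ReLU is assumed for the superposition and the nonlinear processing, and a logistic readout with binary cross-entropy is assumed as the training objective; none of these choices is specified by the application, and `weights` and `bias` are hypothetical parameters.

```python
import math

def target_node_vector(second_node_vector, historical_vector):
    """Superpose the aggregated vector with the historical one, then apply ReLU."""
    return [max(0.0, a + h) for a, h in zip(second_node_vector, historical_vector)]

def predict(vec, weights, bias):
    """Assumed readout: probability that the account is abnormal."""
    z = sum(w * x for w, x in zip(weights, vec)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def bce_loss(p, label):
    """Binary cross-entropy against the account label (1 = abnormal, 0 = normal)."""
    eps = 1e-12  # avoids log(0)
    return -(label * math.log(p + eps) + (1 - label) * math.log(1 - p + eps))

h = target_node_vector([0.5, -2.0], [0.5, 1.0])
p = predict(h, [0.0, 0.0], 0.0)
loss = bce_loss(p, 1)
```

Minimizing this loss over all labeled training nodes, for example by gradient descent, would adjust the parameters of the model.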
Optionally, the processor executes the program to further implement the following steps: acquiring a target fund transaction graph, wherein the target fund transaction graph at least comprises a plurality of nodes to be confirmed and at least one abnormal node in the fund transaction graph, the account corresponding to the abnormal node is an abnormal fund transaction account, and the plurality of nodes to be confirmed are nodes that do not appear in the fund transaction graph; and constructing a social network model based on the target fund transaction graph, wherein in the social network model, the larger the transaction amount between two nodes is, the larger the social relationship weight value between the two nodes is, and the closer the social relationship between the two nodes is.
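As a non-limiting illustration, a social network model in which the weight grows with the transaction amount might be sketched as follows. The normalization by the largest pairwise total and the undirected treatment of each pair are assumptions of this example; the application only requires that a larger transaction amount yields a larger weight.

```python
def social_weights(transactions):
    """Map each unordered account pair to a weight in (0, 1].

    transactions: iterable of (payer, payee, amount) records (hypothetical layout).
    """
    totals = {}
    for payer, payee, amount in transactions:
        pair = frozenset((payer, payee))  # direction is ignored for the social tie
        totals[pair] = totals.get(pair, 0.0) + amount
    if not totals:
        return {}
    largest = max(totals.values())
    # Assumed weighting: proportional to the total amount exchanged by the pair.
    return {pair: total / largest for pair, total in totals.items()}

weights = social_weights([
    ("X", "U1", 300.0),
    ("U1", "X", 100.0),
    ("X", "U2", 100.0),
])
```

Here the pair (X, U1) exchanged 400 in total and therefore receives the largest weight.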
Optionally, the processor executes the program to further implement the following steps: after the social network model is constructed based on the fund transaction graph, determining, among the plurality of nodes to be confirmed, transaction nodes that have conducted fund transactions with the abnormal node; determining the similarity between each transaction node and the abnormal node according to the social relationship weight value between the abnormal node and the transaction node in the social network model; and when the similarity is greater than a preset similarity, determining the transaction node as a candidate node, wherein a candidate node is a node with fund transaction risk.
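As a non-limiting illustration, the candidate selection above might be sketched as follows. This example assumes the similarity equals the social relationship weight itself, which is the simplest reading; the application does not give the similarity formula. The dictionary layout and the function name are hypothetical.

```python
def select_candidates(weights, abnormal, pending, threshold):
    """Return pending nodes whose similarity to the abnormal node exceeds the threshold.

    weights: dict mapping frozenset({a, b}) to a social relationship weight.
    pending: nodes to be confirmed.
    """
    candidates = []
    for node in pending:
        weight = weights.get(frozenset((abnormal, node)))
        if weight is None:
            continue  # never transacted with the abnormal node, so not a transaction node
        similarity = weight  # assumption: similarity is taken to be the weight
        if similarity > threshold:
            candidates.append(node)
    return candidates

weights = {
    frozenset(("X", "U1")): 1.0,
    frozenset(("X", "U2")): 0.25,
}
candidates = select_candidates(weights, "X", ["U1", "U2", "U3"], 0.5)
```

Node U3 is skipped because it has no edge to the abnormal node X, and U2 is rejected because its weight does not exceed the preset similarity.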
Optionally, the processor executes the program to further implement the following steps: after the transaction node is determined to be a candidate node, determining a fourth node feature vector corresponding to the abnormal node from the node feature set; generating a fifth node feature vector corresponding to the candidate node based on the target fund transaction graph; determining a covariance and a variance between the fourth node feature vector and the fifth node feature vector; determining a correlation coefficient between the candidate node and the abnormal node according to the covariance and the variance; and when the correlation coefficient is greater than a preset threshold, determining that the candidate node is a new abnormal node, wherein the account corresponding to the candidate node is also an abnormal fund transaction account.
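As a non-limiting illustration, a correlation coefficient derived from a covariance and variances might be computed as follows. This example assumes the Pearson correlation coefficient, i.e. the covariance divided by the square root of the product of the two variances; the application does not name the coefficient, and the threshold value used is hypothetical.

```python
import math

def correlation(u, v):
    """Pearson correlation between two equal-length feature vectors."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / n
    var_u = sum((a - mean_u) ** 2 for a in u) / n
    var_v = sum((b - mean_v) ** 2 for b in v) / n
    if var_u == 0.0 or var_v == 0.0:
        return 0.0  # a constant vector has no defined correlation; treat as uncorrelated
    return cov / math.sqrt(var_u * var_v)

fourth = [1.0, 2.0, 3.0]  # feature vector of the abnormal node
fifth = [2.0, 4.0, 6.0]   # feature vector of the candidate node
threshold = 0.9           # hypothetical preset threshold
is_new_abnormal = correlation(fourth, fifth) > threshold
```

Because the two toy vectors are proportional, the coefficient is approximately 1 and the candidate node would be marked as a new abnormal node.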
The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
In the above embodiments of the present application, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, a division of a unit may be a division of a logic function, and an actual implementation may have another division, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or may not be executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method of the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, and various other media capable of storing program code.
The foregoing is only a preferred embodiment of the present application. It should be noted that those skilled in the art can make several improvements and modifications without departing from the principle of the present application, and these improvements and modifications should also be considered to fall within the protection scope of the present application.

Claims (11)

1. A method for identifying an abnormal transaction account, comprising:
acquiring a fund transaction graph, wherein the fund transaction graph consists of nodes and edges, the nodes represent account data of accounts corresponding to the nodes, and the edges between the two nodes represent transaction flow directions between the two nodes;
generating a node feature set based on the fund transaction graph, wherein the node feature set is composed of at least one node feature vector, and each node feature vector corresponds to one node in the fund transaction graph;
and inputting the feature vector of each node into a pre-trained graph attention network model to obtain a prediction result corresponding to each node, wherein the prediction result is used for representing whether an account corresponding to each node is an abnormal fund transaction account, and the graph attention network model is obtained by training on a graph different from the fund transaction graph.
2. The method of claim 1, wherein before inputting the feature vector of each node into the pre-trained graph attention network model to obtain a prediction result corresponding to each node, the method further comprises:
acquiring historical transaction data, wherein the historical transaction data at least comprises a plurality of abnormal accounts, a plurality of normal accounts and an account label of each account, the account label of an abnormal account represents that the abnormal account has exhibited abnormal fund transaction behavior, and the account label of a normal account represents that the normal account has not exhibited abnormal fund transaction behavior;
constructing a training graph according to the historical transaction data;
generating a first feature set and a second feature set based on the training graph, wherein the first feature set comprises a plurality of historical node feature vectors, the second feature set comprises a plurality of historical edge feature vectors, each historical node feature vector corresponds to one training node in the training graph, and each historical edge feature vector corresponds to one training edge in the training graph;
and training to obtain the graph attention network model according to the first feature set and the second feature set.
3. The method of claim 2, wherein training the graph attention network model based on the first feature set and the second feature set comprises:
step 1: determining a target edge and two target nodes associated with the target edge from the training graph, wherein the target edge is any one of the training edges in the training graph;
step 2: acquiring a first node feature vector corresponding to the target node from the first feature set, and acquiring a first edge feature vector corresponding to the target edge from the second feature set;
step 3: performing superposition processing on the first node feature vector and the first edge feature vector to obtain a second edge feature vector corresponding to the target edge;
step 4: carrying out nonlinear processing on the second edge feature vector to obtain a target edge feature vector corresponding to the target edge;
step 5: repeating steps 1 to 4 to obtain a target edge feature vector corresponding to each training edge in the training graph;
step 6: and training to obtain the graph attention network model according to the target edge feature vector corresponding to each training edge in the training graph.
4. The method according to claim 3, wherein training the graph attention network model according to the target edge feature vector corresponding to each training edge in the training graph comprises:
determining at least one first training edge to which each training node is connected based on the training graph;
determining a first target edge feature vector corresponding to the first training edge;
determining an attention coefficient corresponding to the first target edge feature vector;
performing aggregation processing on at least one first target edge feature vector based on the attention coefficient to obtain a second node feature vector corresponding to each training node;
and training to obtain the graph attention network model according to the second node feature vector corresponding to each training node.
5. The method of claim 4, wherein the training of the graph attention network model according to the second node feature vector corresponding to each training node comprises:
performing superposition processing on the second node feature vector corresponding to each training node and the historical node feature vector corresponding to the training node to obtain a third node feature vector corresponding to each training node;
carrying out nonlinear processing on the third node feature vector to obtain a target node feature vector corresponding to each training node;
and training to obtain the graph attention network model according to the target node feature vector and the account label corresponding to the target node feature vector.
6. The method of claim 1, wherein after inputting the feature vector of each node into the pre-trained graph attention network model to obtain a prediction result corresponding to each node, the method further comprises:
acquiring a target fund transaction graph, wherein the target fund transaction graph at least comprises a plurality of nodes to be confirmed and at least one abnormal node in the fund transaction graph, the account corresponding to the abnormal node is the abnormal fund transaction account, and the plurality of nodes to be confirmed are nodes that do not appear in the fund transaction graph;
and constructing a social network model based on the target fund transaction graph, wherein in the social network model, the larger the transaction amount between two nodes is, the larger the weight value of the social relationship between the two nodes is, and the closer the social relationship between the two nodes is.
7. The method of claim 6, wherein after constructing the social network model based on the fund transaction graph, the method further comprises:
determining a transaction node which has performed fund transaction with the abnormal node in the plurality of nodes to be confirmed;
determining the similarity between the transaction node and the abnormal node according to the social relationship weight values of the abnormal node and the transaction node in the social network model;
and under the condition that the similarity is greater than the preset similarity, determining the transaction node as a candidate node, wherein the candidate node is a node with fund transaction risk.
8. The method of claim 7, wherein after determining that the transaction node is a candidate node, the method further comprises:
determining a fourth node feature vector corresponding to the abnormal node from the node feature set;
generating a fifth node feature vector corresponding to the candidate node based on the target fund transaction graph;
determining a covariance and a variance between the fourth node feature vector and the fifth node feature vector;
determining a correlation coefficient between the candidate node and the abnormal node according to the covariance and the variance;
and when the correlation coefficient is larger than a preset threshold value, determining that the candidate node is a new abnormal node, wherein the account corresponding to the candidate node is also the fund transaction abnormal account.
9. An apparatus for identifying an anomalous transaction account, comprising:
an acquisition module, configured to acquire a fund transaction graph, wherein the fund transaction graph consists of nodes and edges, the nodes represent account data of accounts corresponding to the nodes, and an edge between two nodes represents the transaction flow direction between the two nodes;
a generation module, configured to generate a node feature set based on the fund transaction graph, where the node feature set is composed of at least one node feature vector, and each node feature vector corresponds to one node in the fund transaction graph;
an input module, configured to input the feature vector of each node into a pre-trained graph attention network model, so as to obtain a prediction result corresponding to each node, where the prediction result is used to characterize whether an account corresponding to each node is an abnormal account for fund transaction, and the graph attention network model is obtained by training based on a graph different from the fund transaction graph.
10. A computer-readable storage medium, in which a computer program is stored, wherein the computer program is configured to execute the method for identifying an anomalous transaction account as claimed in any one of claims 1 to 8 when the computer program is run.
11. An electronic device comprising one or more processors and memory for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of identifying anomalous transaction accounts of any of claims 1 to 8.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210589196.7A CN114862587A (en) 2022-05-27 2022-05-27 Abnormal transaction account identification method and device and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210589196.7A CN114862587A (en) 2022-05-27 2022-05-27 Abnormal transaction account identification method and device and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN114862587A true CN114862587A (en) 2022-08-05

Family

ID=82641238

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210589196.7A Pending CN114862587A (en) 2022-05-27 2022-05-27 Abnormal transaction account identification method and device and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN114862587A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115456789A (en) * 2022-11-10 2022-12-09 杭州衡泰技术股份有限公司 Abnormal transaction detection method and system based on transaction pattern recognition
CN115456789B (en) * 2022-11-10 2023-04-07 杭州衡泰技术股份有限公司 Abnormal transaction detection method and system based on transaction pattern recognition
CN115982664A (en) * 2023-03-09 2023-04-18 北京芯盾时代科技有限公司 Abnormal account identification method, device, equipment and storage medium
CN115982664B (en) * 2023-03-09 2023-08-04 北京芯盾时代科技有限公司 Abnormal account identification method, device, equipment and storage medium
CN117421254A (en) * 2023-12-19 2024-01-19 杭银消费金融股份有限公司 Automatic test method and system for reconciliation business
CN117421254B (en) * 2023-12-19 2024-03-22 杭银消费金融股份有限公司 Automatic test method and system for reconciliation business

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination