CN116383708A - Transaction account identification method and device - Google Patents

Info

Publication number
CN116383708A
Authority
CN
China
Prior art keywords
node
transaction
matrix
transaction network
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310604862.4A
Other languages
Chinese (zh)
Other versions
CN116383708B (en
Inventor
栗位勋
孙悦
蔡准
郭晓鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Trusfort Technology Co ltd
Original Assignee
Beijing Trusfort Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Trusfort Technology Co ltd filed Critical Beijing Trusfort Technology Co ltd
Priority to CN202310604862.4A priority Critical patent/CN116383708B/en
Publication of CN116383708A publication Critical patent/CN116383708A/en
Application granted granted Critical
Publication of CN116383708B publication Critical patent/CN116383708B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/27Regression, e.g. linear or logistic regression
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q20/00Payment architectures, schemes or protocols
    • G06Q20/38Payment protocols; Details thereof
    • G06Q20/40Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
    • G06Q20/401Transaction verification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Security & Cryptography (AREA)
  • Finance (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The disclosure provides a method and a device for identifying a transaction account, relating to the field of data analysis. The method comprises: constructing a corresponding transaction network according to the transaction records contained in each transaction data set; updating the node attributes of all nodes in each transaction network through masks to obtain an initial node attribute matrix of the transaction network; performing first-stage propagation of node attributes according to the initial node attribute matrix and an adjacency matrix to obtain a target node attribute matrix; performing second-stage propagation of node attributes according to the target node attribute matrix and the adjacency matrix to obtain a first embedded feature matrix; and updating and denoising the first embedded feature matrix of the transaction data set with the longest specified time range, and determining the node types of the nodes in the corresponding transaction network. With this method, node attributes are updated based on masks, which further helps prevent overfitting and over-smoothing and improves the accuracy of analyzing abnormal transaction behavior.

Description

Transaction account identification method and device
Technical Field
The disclosure relates to the field of data analysis, and in particular relates to a method and a device for identifying a transaction account.
Background
With the continuous development of internet technology, the transaction behavior in the financial field is increasingly dependent on the internet, and electronic banking has become one of the main competitive means of banking channels and marketing.
At present, abnormal transaction behavior is usually detected by analyzing transaction behavior with a graph convolutional neural network model. To improve the embedding effect of the model and prevent overfitting and over-smoothing, node attributes are usually deleted; however, this operation only provides enhancement at the level of node attributes, so the accuracy of analyzing abnormal transaction behavior with the graph convolutional neural network model is not high.
Disclosure of Invention
The disclosure provides a method and a device for identifying a transaction account, which are used for at least solving the technical problems in the prior art.
According to a first aspect of the present disclosure, there is provided a method for identifying a transaction account, the method comprising:
acquiring a plurality of groups of transaction data sets with specified time ranges, wherein each transaction data set comprises a plurality of transaction records, each transaction record comprises transaction accounts of both transaction parties, the specified starting time of each group of transaction data sets is the same, and the specified ending time is different;
Constructing a transaction network of the transaction data set, wherein nodes of the transaction network are transaction accounts, and edges used for connecting two nodes in the transaction network represent transaction behaviors between the two transaction accounts;
determining node attributes of each node in the transaction network, updating the node attributes of all nodes in the transaction network through a mask, and generating an initial node attribute matrix of the transaction network;
constructing an adjacency matrix of the transaction network;
performing first-stage propagation of node attributes according to the initial node attribute matrix and the adjacency matrix of the transaction network to obtain a target node attribute matrix of the transaction network;
performing second-stage propagation of node attributes according to the target node attribute matrix and the adjacency matrix of the transaction network to obtain a first embedded feature matrix of the transaction network;
updating the first embedded feature matrix of the transaction data set with the longest appointed time range to obtain a second embedded feature matrix;
denoising the second embedded feature matrix to obtain a third embedded feature matrix;
and determining the characteristic value of each node in the transaction network corresponding to the transaction data set with the longest appointed time range according to the third embedded characteristic matrix, and determining the node type according to the characteristic value.
In an embodiment, the determining the node attribute of each node in the transaction network includes: acquiring an initial attribute of each node in the transaction network; performing linear regression analysis on the initial attributes, and determining an influence value of each initial attribute on the node type; and selecting a preset number of initial attributes as the node attributes according to the influence value.
In an embodiment, the updating node attributes of all nodes in the transaction network through the mask to generate an initial node attribute matrix of the transaction network includes: acquiring a mask corresponding to each node in the transaction network; updating node attributes of nodes in the transaction network by the following formula: $\tilde{x}_i = m_i \cdot x_i$, where $x_i$ is the node attribute of node $i$, $m_i$ is the mask corresponding to node $i$, and $\tilde{x}_i$ is the updated node attribute of node $i$; and generating an initial node attribute matrix of the transaction network according to the node attributes updated by all the nodes.
In an embodiment, the performing the first-stage propagation of node attributes according to the initial node attribute matrix and the adjacency matrix of the transaction network to obtain a target node attribute matrix of the transaction network includes: obtaining a target node attribute matrix of the transaction network according to the following formula: $X^{(k)} = A X^{(k-1)}$, where $k$ denotes the $k$-th propagation in the first-stage propagation, $X^{(k)}$ represents the target node attribute matrix obtained by the $k$-th propagation in the first-stage propagation, $A$ represents the adjacency matrix, $X^{(k-1)}$ represents the target node attribute matrix obtained by the $(k-1)$-th propagation in the first-stage propagation, and $X^{(0)}$ represents the initial node attribute matrix.
In an embodiment, after obtaining the target node attribute matrix of the transaction network, the method further includes: normalizing the target node attribute matrix by the following formula: [formula image SMS_17 of the original publication], where $\bar{X}$ represents the normalized target node attribute matrix, $K$ represents the total number of propagations in the first-stage propagation, $k$ denotes the $k$-th propagation in the first-stage propagation, and $X^{(k)}$ represents the target node attribute matrix obtained by the $k$-th propagation in the first-stage propagation.
In an embodiment, the determining, according to the third embedded feature matrix, a feature value of each node in the transaction network corresponding to the transaction data set with the longest specified time range, and determining a node type according to the feature value includes: performing inverse coding processing on the third embedded feature matrix, and determining a feature value of each node in the transaction network corresponding to the transaction data set with the longest appointed time range; and comparing the characteristic value with a preset characteristic threshold value to determine the node type.
According to a second aspect of the present disclosure, there is provided an identification device of a transaction account, the device comprising:
an acquisition module, configured to acquire multiple groups of transaction data sets with specified time ranges, wherein each transaction data set comprises a plurality of transaction records, each transaction record comprises the transaction accounts of both transaction parties, and the groups of transaction data sets have the same specified start time and different specified end times;
the construction module is used for constructing a transaction network of the transaction data set, wherein nodes of the transaction network are transaction accounts, and edges used for connecting two nodes in the transaction network represent transaction behaviors between the two transaction accounts;
the generation module is used for determining the node attribute of each node in the transaction network, updating the node attribute of all nodes in the transaction network through a mask, and generating an initial node attribute matrix of the transaction network;
the construction module is also used for constructing an adjacency matrix of the transaction network;
the first propagation module is used for performing first-stage propagation of node attributes according to the initial node attribute matrix and the adjacent matrix of the transaction network to obtain a target node attribute matrix of the transaction network;
The second propagation module is used for performing second-stage propagation of node attributes according to the target node attribute matrix and the adjacent matrix of the transaction network to obtain a first embedded feature matrix of the transaction network;
the first processing module is used for updating the first embedded feature matrix of the transaction data set with the longest appointed time range to obtain a second embedded feature matrix;
the first processing module is further configured to perform denoising processing on the second embedded feature matrix to obtain a third embedded feature matrix;
and the determining module is used for determining the characteristic value of each node in the transaction network corresponding to the transaction data set with the longest appointed time range according to the third embedded characteristic matrix, and determining the node type according to the characteristic value.
In an embodiment, the generating module further includes: the first acquisition sub-module is used for acquiring the initial attribute of each node in the transaction network; the analysis submodule is used for carrying out linear regression analysis on the initial attributes and determining an influence value of each initial attribute on the node type; and the selecting sub-module is also used for selecting a preset number of initial attributes as the node attributes according to the magnitude of the influence value.
In an embodiment, the generating module further includes: a second obtaining sub-module, configured to obtain a mask corresponding to each node in the transaction network; an updating sub-module, configured to update node attributes of nodes in the transaction network according to the following formula: $\tilde{x}_i = m_i \cdot x_i$, where $x_i$ is the node attribute of node $i$, $m_i$ is the mask corresponding to node $i$, and $\tilde{x}_i$ is the updated node attribute of node $i$; and a generation sub-module, configured to generate an initial node attribute matrix of the transaction network according to the node attributes updated by all the nodes.
In an embodiment, the first propagation module is specifically configured to obtain the target node attribute matrix of the transaction network according to the following formula: $X^{(k)} = A X^{(k-1)}$, where $k$ denotes the $k$-th propagation in the first-stage propagation, $X^{(k)}$ represents the target node attribute matrix obtained by the $k$-th propagation in the first-stage propagation, $A$ represents the adjacency matrix, $X^{(k-1)}$ represents the target node attribute matrix obtained by the $(k-1)$-th propagation in the first-stage propagation, and $X^{(0)}$ represents the initial node attribute matrix.
In an embodiment, the device further comprises: a second processing module, configured to normalize the target node attribute matrix of the transaction network, after the target node attribute matrix is obtained, by the following formula: [formula image SMS_40 of the original publication], where $\bar{X}$ represents the normalized target node attribute matrix, $K$ represents the total number of propagations in the first-stage propagation, $k$ denotes the $k$-th propagation in the first-stage propagation, and $X^{(k)}$ represents the target node attribute matrix obtained by the $k$-th propagation in the first-stage propagation.
In an embodiment, the determining module is specifically configured to perform an inverse encoding process on the third embedded feature matrix, and determine a feature value of each node in the transaction network corresponding to the transaction data set with the longest specified time range; and comparing the characteristic value with a preset characteristic threshold value to determine the node type.
According to the method and the device for identifying the transaction account, a plurality of groups of transaction data sets with specified time ranges are obtained, corresponding transaction networks are constructed according to transaction records contained in each group of transaction data sets, node attributes of all nodes in each transaction network are updated according to masks, an initial node attribute matrix of the transaction network is obtained, first-stage propagation of the node attributes is carried out according to the initial node attribute matrix and an adjacent matrix of the transaction network, a target node attribute matrix of the transaction network is obtained, second-stage propagation of the node attributes is carried out according to the target node attribute matrix and the adjacent matrix, a first embedded feature matrix of the transaction network is obtained, and after the first embedded feature matrix of the transaction data set with the longest specified time range is subjected to processing such as updating and denoising, the node type of each node in the transaction network corresponding to the transaction data set with the longest specified time range is determined.
By applying the method, the node attribute of the nodes in the transaction network is updated based on the mask, so that the phenomena of over fitting and over smoothing can be further prevented, the robustness of the model for determining the node type is enhanced, and the accuracy of analysis of abnormal transaction behaviors is higher.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The above, as well as additional purposes, features, and advantages of exemplary embodiments of the present disclosure will become readily apparent from the following detailed description when read in conjunction with the accompanying drawings. Several embodiments of the present disclosure are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings, in which:
in the drawings, the same or corresponding reference numerals indicate the same or corresponding parts.
Fig. 1 is a schematic implementation flow diagram of a method for identifying a transaction account according to an embodiment of the disclosure;
FIG. 2 is a schematic diagram illustrating an implementation flow of determining node attributes for each node in a transaction network according to an embodiment of the present disclosure;
FIG. 3 illustrates a schematic diagram of an implementation flow of determining a node type in an embodiment of the present disclosure;
Fig. 4 is a schematic block diagram of an identification device of a transaction account according to an embodiment of the disclosure;
fig. 5 shows a schematic diagram of a composition structure of an electronic device according to an embodiment of the present disclosure.
Detailed Description
In order to make the objects, features and advantages of the present disclosure more comprehensible, the technical solutions in the embodiments of the present disclosure will be clearly described in conjunction with the accompanying drawings in the embodiments of the present disclosure, and it is apparent that the described embodiments are only some embodiments of the present disclosure, but not all embodiments. Based on the embodiments in this disclosure, all other embodiments that a person skilled in the art would obtain without making any inventive effort are within the scope of protection of this disclosure.
Fig. 1 shows a schematic implementation flow diagram of a method for identifying a transaction account according to an embodiment of the disclosure, including:
step 101, obtaining a plurality of groups of transaction data sets with specified time ranges, wherein each transaction data set comprises a plurality of transaction records, each transaction record comprises transaction accounts of both transaction parties, the specified starting time of each group of transaction data sets is the same, and the specified ending time is different.
First, multiple groups of transaction data sets with specified time ranges are acquired. Each group contains a plurality of transaction records, and all groups share the same specified start time but have different specified end times, so that the groups are nested (each longer group contains the shorter ones). The difference between the specified time ranges of two adjacent groups should be kept as consistent as possible. For example, 7 groups of transaction data sets are acquired: the first group consists of the transaction records of January 1, the second group consists of the transaction records from January 1 to January 2, and so on, up to the seventh group, which consists of the transaction records from January 1 to January 7. That is, the specified time range of the first group is 1 day long, that of the second group is 2 days long, and that of the seventh group is 7 days long. The 7 groups have the same specified start time (January 1) and different specified end times; each later group contains all transaction records of the preceding group, and the lengths of the specified time ranges of two adjacent groups differ by 1 day.
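The nested time windows described above can be sketched as follows. This is a minimal illustration only; the record layout (`src`, `dst`, `date`, `amount`), the function name `build_nested_datasets`, and the one-day step are assumptions made for the example and are not specified by the disclosure.

```python
from datetime import date, timedelta

def build_nested_datasets(records, start, num_sets, step_days=1):
    """Return num_sets lists; the k-th list holds records dated in
    [start, start + k * step_days), so every list contains the previous one."""
    datasets = []
    for k in range(1, num_sets + 1):
        end = start + timedelta(days=k * step_days)  # exclusive end of window k
        datasets.append([r for r in records if start <= r["date"] < end])
    return datasets

# Hypothetical transaction records (initiator, receiver, date, amount).
records = [
    {"src": "A", "dst": "B", "date": date(2023, 1, 1), "amount": 100.0},
    {"src": "B", "dst": "C", "date": date(2023, 1, 2), "amount": 50.0},
    {"src": "C", "dst": "A", "date": date(2023, 1, 7), "amount": 20.0},
]

datasets = build_nested_datasets(records, date(2023, 1, 1), num_sets=7)
sizes = [len(d) for d in datasets]
```

Because every window starts on the same day, the last data set is the one with the longest specified time range, which is the data set the later steps operate on.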
The transaction records are used for recording transaction behaviors of transaction accounts, including transaction accounts of transaction parties, transaction time, transaction amount and the like, wherein the transaction accounts of the transaction parties are specifically transaction accounts of transaction originators and transaction accounts of transaction receivers. For example, a transaction record is that an account A transfers to an account B, the account A and the account B are transaction accounts, the account A is a transaction account of a transaction initiator, and the account B is a transaction account of a transaction receiver. The transaction account number has uniqueness, and a bank card number can be preferably selected as the transaction account number in the transaction record.
Step 102, constructing a transaction network of a transaction data set, wherein nodes of the transaction network are transaction accounts, and edges used for connecting two nodes in the transaction network represent that transaction behaviors exist between the two transaction accounts.
And aiming at the transaction records contained in each transaction data set, taking the transaction account numbers as nodes, and constructing a corresponding transaction network by taking the transaction behaviors between the two transaction account numbers as edges connecting the two nodes.
It will be appreciated that the transaction network may be an undirected transaction network or a directed transaction network. When the constructed transaction network is undirected, an edge connecting two nodes only indicates that transaction behavior exists between the nodes. When the constructed transaction network is directed, an edge connecting two nodes indicates not only that transaction behavior exists between the nodes but also its direction, which represents the transfer-in or transfer-out of the transaction amount; such an edge can be drawn as a line segment with an arrow pointing from the transaction account of the transaction initiator to the transaction account of the transaction receiver.
Step 103, determining the node attribute of each node in the transaction network, and updating the node attributes of all nodes in the transaction network through a mask to generate an initial node attribute matrix of the transaction network.
The node attribute is a transaction attribute related to the account corresponding to the node, such as the account opening duration, the cumulative number of transactions, the number of transaction counterparties, the transaction amount in a sensitive time period, and the like. The method for determining the attribute value can be selected according to the actual situation. For example, if the node attribute is the account opening duration, the actual account opening duration of each node can be determined first and then normalized to obtain the attribute value of that node attribute for each node. Alternatively, thresholds can be set on the account opening duration so that different durations correspond to different levels, each represented by a different representative value; for example, a duration of less than 1 year is assigned the representative value 0, a duration of 1 to 5 years is assigned the representative value 1, and so on, so that the representative value determined from the account opening duration is used as the attribute value of that node attribute.
For example, suppose the node attributes include the account opening duration and the cumulative number of transactions. When the account opening duration is longer than 5 years, the attribute value of the account-opening-duration attribute is 1, and when it is less than 5 years, the attribute value is 0; when the cumulative number of transactions is greater than 1000, the attribute value of the cumulative-transactions attribute is 1, and when it is less than 1000, the attribute value is 0. If the account opening duration of a transaction account is longer than 5 years and its cumulative number of transactions is less than 1000, the node attribute of the corresponding node is (1, 0).
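The thresholding in this example can be sketched as below. The function name `encode_attributes` is hypothetical, and the handling of the boundary cases (exactly 5 years, exactly 1000 transactions) is an assumption, since the text does not state which value they receive; here they map to 0.

```python
def encode_attributes(opening_years, cumulative_transactions):
    """Map raw account statistics to the binary node attributes of the example."""
    # 1 if the account has been open for more than 5 years, else 0 (assumed boundary).
    long_history = 1 if opening_years > 5 else 0
    # 1 if the account has more than 1000 cumulative transactions, else 0 (assumed boundary).
    high_volume = 1 if cumulative_transactions > 1000 else 0
    return (long_history, high_volume)

node_attribute = encode_attributes(opening_years=6, cumulative_transactions=800)
```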
The mask is a string of binary codes, and the node attributes are updated by performing an operation between the mask and the attribute value of each node attribute of the node. In the above example, the node attribute of the node is (1, 0). If the mask is 1, an AND operation between the mask and each attribute value leaves the updated node attribute as (1, 0); if the mask is 0, the AND operation yields the updated node attribute (0, 0).
And generating an initial node attribute matrix of the transaction network according to the node attributes updated by all the nodes, wherein one row of the initial node attribute matrix represents the node attribute updated by one node.
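A minimal sketch of the mask update, assuming one binary mask value per node and binary attribute values, for which an element-wise product is equivalent to the AND operation described above. The function name `apply_node_masks` and the sample values are illustrative; how the mask values are chosen is not covered here.

```python
import numpy as np

def apply_node_masks(attributes, masks):
    """Multiply each node's attribute row by that node's mask (0 or 1)."""
    attributes = np.asarray(attributes)
    masks = np.asarray(masks).reshape(-1, 1)  # one mask per row, broadcast over columns
    return attributes * masks

# Rows are nodes, columns are node attributes.
X = np.array([[1, 0],
              [1, 1],
              [0, 1]])
m = np.array([1, 0, 1])  # hypothetical masks: the second node is masked out

X0 = apply_node_masks(X, m)  # initial node attribute matrix
```

Each row of `X0` is the updated node attribute of one node, matching the row-per-node layout of the initial node attribute matrix.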
Step 104, constructing an adjacency matrix of the transaction network.
The adjacency matrix is a two-dimensional array that stores the relationships among nodes; the adjacency matrix of the transaction network is constructed according to the connection relationships among the nodes in the transaction network. The adjacency matrix is an $n \times n$ square matrix, where $n$ is the total number of nodes in the transaction network, and the value in row $i$, column $j$ characterizes the connection relationship between node $i$ and node $j$.
In one embodiment, if the transaction network is an undirected transaction network: when node $i$ and node $j$ have a connection relationship, the value in row $i$, column $j$ of the adjacency matrix is set to 1, and the value in row $j$, column $i$ is also set to 1; when node $i$ and node $j$ have no connection relationship, the value in row $i$, column $j$ is set to 0, and the value in row $j$, column $i$ is also set to 0. If the transaction network is a directed transaction network: when there is a directed edge pointing from node $i$ to node $j$ but no directed edge pointing from node $j$ to node $i$, the value in row $i$, column $j$ of the adjacency matrix is set to 1 and the value in row $j$, column $i$ is set to 0; conversely, when there is both a directed edge pointing from node $i$ to node $j$ and a directed edge pointing from node $j$ to node $i$, the value in row $i$, column $j$ is set to 1 and the value in row $j$, column $i$ is also set to 1.
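The construction rules for the undirected and directed cases can be sketched as follows. The function name `build_adjacency` and the sample accounts and edges are assumptions for illustration; each edge is written as (initiator, receiver).

```python
import numpy as np

def build_adjacency(accounts, edges, directed=False):
    """Build an n x n 0/1 adjacency matrix from (initiator, receiver) pairs."""
    index = {acc: i for i, acc in enumerate(accounts)}  # account -> row/column index
    n = len(accounts)
    A = np.zeros((n, n), dtype=int)
    for src, dst in edges:
        i, j = index[src], index[dst]
        A[i, j] = 1          # edge from node i to node j
        if not directed:
            A[j, i] = 1      # undirected network: the matrix is symmetric
    return A

accounts = ["A", "B", "C"]
edges = [("A", "B"), ("B", "A"), ("B", "C")]

A_undirected = build_adjacency(accounts, edges)
A_directed = build_adjacency(accounts, edges, directed=True)
```

In the directed result, accounts A and B point at each other, so both corresponding entries are 1, while the B to C transfer sets only the entry in B's row and C's column.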
Step 105, performing first-stage propagation of node attributes according to the initial node attribute matrix and the adjacency matrix of the transaction network, so as to obtain a target node attribute matrix of the transaction network.
Specifically, the first-stage propagation refers to random propagation of the updated node attributes among the nodes in the transaction network. A matrix multiplication may be performed on the adjacency matrix and the initial node attribute matrix to obtain the target node attribute matrix of the transaction network, where the adjacency matrix is an $n \times n$ square matrix, $n$ is the number of nodes in the transaction network, the initial node attribute matrix is an $n \times d$ matrix, $d$ is the number of node attributes, and the resulting target node attribute matrix is an $n \times d$ matrix.
Step 106, performing second-stage propagation of node attributes according to the target node attribute matrix and the adjacency matrix of the transaction network, to obtain a first embedded feature matrix of the transaction network.
The second-stage propagation of the node attributes is achieved through the target node attribute matrix and the adjacency matrix of the transaction network. In one implementation, the target node attribute matrix and the adjacency matrix of the transaction network are input into a convolutional neural network model to carry out the second-stage propagation of the node attributes and obtain the first embedded feature matrix of the transaction network. The convolutional neural network may be a graph convolutional network (Graph Convolutional Network, GCN), and the first embedded feature matrix of the transaction network is determined by the following formula:
$$H = D^{-\frac{1}{2}} A D^{-\frac{1}{2}} X W \qquad (1)$$
where $H$ is the first embedded feature matrix, $A$ is the adjacency matrix of the transaction network, $D$ is the degree matrix of $A$, $X$ is the target node attribute matrix, and $W$ is a learnable parameter matrix. The adjacency matrix $A$ is an $N \times N$ square matrix, where $N$ is the number of nodes in the transaction network, so $D$ is also an $N \times N$ square matrix; $X$ is an $N \times F$ matrix, where $F$ is the number of node attributes; and $W$ is an $F \times F'$ matrix, where $F'$ is a hyperparameter and $F'$ is less than $F$.
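A GCN-style second-stage propagation step can be sketched as below. The sketch assumes symmetric degree normalization of the adjacency matrix with no self-loops and no activation function; these choices, the helper names `matmul` and `gcn_embed`, and the toy matrices are assumptions for illustration, and in practice the parameter matrix would be learned rather than fixed.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def gcn_embed(adj, x, w):
    """Compute D^(-1/2) A D^(-1/2) X W for an N x N adjacency matrix adj,
    an N x F attribute matrix x and an F x F' parameter matrix w."""
    n = len(adj)
    degrees = [sum(row) for row in adj]
    # Isolated nodes (degree 0) get a scaling factor of 0 to avoid dividing by zero.
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in degrees]
    normalized = [[inv_sqrt[i] * adj[i][j] * inv_sqrt[j] for j in range(n)]
                  for i in range(n)]
    return matmul(matmul(normalized, x), w)


adj = [[0, 1], [1, 0]]   # two nodes joined by one undirected edge
x = [[1, 2], [3, 4]]     # N = 2 nodes, F = 2 attributes
w = [[1], [1]]           # F' = 1, smaller than F
h = gcn_embed(adj, x, w)
```

Here each node has degree 1, so the normalization leaves the adjacency matrix unchanged and each node's embedding is the sum of its neighbour's attributes.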
Step 107, updating the first embedded feature matrix of the transaction data set with the longest specified time range to obtain a second embedded feature matrix.
A transaction network is constructed for each group of transaction data sets, so the acquired groups of transaction data sets yield multiple transaction networks, and a corresponding first embedded feature matrix is determined for each transaction network; that is, each transaction data set corresponds to one first embedded feature matrix. The first embedded feature matrices of all the transaction data sets are input into a recurrent neural network to update the first embedded feature matrix of the transaction data set with the longest specified time range, thereby obtaining the second embedded feature matrix of that transaction data set. The rows of the second embedded feature matrix represent nodes, the columns represent attributes, and the values in the matrix are the attribute values of the nodes as updated by the recurrent neural network.
Since the transaction data sets share the same specified start time and differ in their specified end times, the transaction data set with the longest specified time range can be regarded as the latest transaction data set, and the transaction data sets of the other specified time ranges can be regarded as historical transaction data sets. The first embedded feature matrices of the historical transaction data sets and the first embedded feature matrix of the latest transaction data set are input into the recurrent neural network, so that the first embedded feature matrix of the latest transaction data set is updated by means of the first embedded feature matrices of the historical transaction data sets. The recurrent neural network may be a long short-term memory (Long Short Term Memory, LSTM) neural network, and is preferably a gated recurrent unit (Gated Recurrent Unit, GRU) network, which has fewer parameters and can reduce the risk of overfitting.
The number of transaction data sets can be selected according to actual conditions. For example, if earlier machine-learning results obtained when analyzing a bank's transaction data show that the second embedded feature matrix obtained by inputting the seven first embedded feature matrices corresponding to seven groups of transaction data sets into the recurrent neural network model gives the best effect, seven groups of transaction data sets can be selected when the method is applied to the analysis of that bank's transaction data.
Step 108, denoising the second embedded feature matrix to obtain a third embedded feature matrix.
The second embedded feature matrix is denoised by means of the node attributes of all nodes in the transaction network corresponding to the transaction data set with the longest specified time range.
An attribute matrix is determined according to the node attributes of all nodes included in the transaction network corresponding to the transaction data set with the longest specified time range, where the attribute values in the attribute matrix have not been updated by the mask. The rows of the attribute matrix represent nodes and the columns represent node attributes; the number of nodes in the transaction network and the numbers of rows and columns of the attribute matrix can be determined according to actual conditions. The value in row $i$, column $j$ of the attribute matrix is the attribute value corresponding to the $j$-th node attribute of node $v_i$. The attribute matrix and the second embedded feature matrix are input into a graph attention model to denoise the second embedded feature matrix; that is, with the help of the attribute matrix, the graph attention model reallocates the weights of the node attributes in the second embedded feature matrix, increasing the weight proportion of important node attributes and reducing the weight proportion of unimportant node attributes.
Step 109, determining, according to the third embedded feature matrix, the feature value of each node in the transaction network corresponding to the transaction data set with the longest specified time range, and determining the node type according to the feature value.
The third embedded feature matrix corresponds to the transaction data set with the longest specified time range, and its number of rows is the number of nodes in the transaction network corresponding to that transaction data set. Inverse coding processing is performed on the third embedded feature matrix, and the value corresponding to each row of the third embedded feature matrix is determined as the feature value of the corresponding node according to the result of the inverse coding processing. Each feature value is compared with a preset feature threshold, and the node type of each node is determined according to the relative size of the feature value and the preset feature threshold.
According to the identification method of the transaction account, a plurality of groups of transaction data sets with specified time ranges are obtained, and a corresponding transaction network is constructed according to the transaction records contained in each group of transaction data sets. The node attributes of all nodes in each transaction network are updated through masks to obtain an initial node attribute matrix of the transaction network. First-stage propagation of the node attributes is carried out according to the initial node attribute matrix and the adjacency matrix of the transaction network to obtain a target node attribute matrix of the transaction network, and second-stage propagation of the node attributes is carried out according to the target node attribute matrix and the adjacency matrix to obtain a first embedded feature matrix of the transaction network. After the first embedded feature matrix of the transaction data set with the longest specified time range is updated and denoised, the node type of each node in the corresponding transaction network is determined. By applying the method, the node attributes of the nodes in the transaction network are updated based on the mask, which further prevents overfitting and over-smoothing, enhances the robustness of the model that determines the node type, and makes the analysis of abnormal transaction behaviors more accurate.
In one embodiment, as shown in fig. 2, the determining the node attribute of each node in the transaction network includes the following steps:
step 201, obtain an initial attribute of each node in the transaction network.
Specifically, the initial attributes of a node are transaction attributes related to the account corresponding to the node, including: the account opening time, the number of transaction days in a preset time range, the proportion of the transaction days in the preset time range to the preset time range, accumulated account number in the preset time range, ratio of accumulated account number in the preset time range to accumulated account number in the preset time range, the number of transactions in sensitive time periods in the preset time range, the transaction amount in sensitive time periods in the preset time range, accumulated account amount in the preset time range, the number of transaction counterparties in the preset time range, account opponent number in the preset time range, the number of fast-in, fast-out occurrences in the preset time range, and the like. The length of the preset time range is greater than or equal to the length of the longest specified time range.
Step 202, performing linear regression analysis on the initial attributes, and determining an influence value of each initial attribute on the node type.
Because it can be determined whether the account number is an abnormal account number according to the transaction time of the account number, the transaction amount and the like, the obtained initial attribute is mostly related to the transaction behavior of the account number. The node type comprises abnormal nodes and normal nodes, the influence of different initial attributes on the node type is different, and the influence value of each initial attribute on node classification is determined by carrying out linear regression analysis on the initial attributes.
Step 203, selecting a preset number of initial attributes as node attributes according to the magnitude of the influence value.
A larger influence value indicates that the initial attribute has a greater influence on node classification. The influence values of the initial attributes on node classification are sorted from largest to smallest, and the initial attributes corresponding to the top $k$ influence values are selected as node attributes; the value of $k$ can be determined according to the actual situation.
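Steps 201 to 203 can be illustrated with the sketch below. The disclosure does not specify how the influence value is derived from the linear regression, so the example makes an explicit assumption: each attribute is standardized, a one-variable least-squares fit of the node label on that attribute is computed, and the absolute slope is used as the influence value. The function names and the toy data are hypothetical.

```python
def influence_values(rows, labels):
    """Absolute least-squares slope of the label on each standardized attribute.

    rows holds one list of initial attribute values per node; labels holds
    1 for an abnormal node and 0 for a normal node.
    """
    n = len(rows)
    mean_y = sum(labels) / n
    values = []
    for col in range(len(rows[0])):
        xs = [row[col] for row in rows]
        mean_x = sum(xs) / n
        var_x = sum((x - mean_x) ** 2 for x in xs) / n
        if var_x == 0:
            # A constant attribute cannot explain the label at all.
            values.append(0.0)
            continue
        cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, labels)) / n
        # Slope with respect to the standardized attribute: cov(x, y) / std(x).
        values.append(abs(cov / var_x ** 0.5))
    return values


def select_top_k(values, k):
    """Return the column indices of the k largest influence values."""
    order = sorted(range(len(values)), key=lambda c: values[c], reverse=True)
    return order[:k]


rows = [[1, 5, 0], [2, 5, 1], [3, 5, 0], [4, 5, 1]]
labels = [0, 0, 1, 1]
values = influence_values(rows, labels)
selected = select_top_k(values, 1)
```

In this toy data only the first attribute rises with the label, so it receives the largest influence value and is the one retained when one attribute is selected.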
In one embodiment, updating node attributes of all nodes in the transaction network by masking, generating an initial node attribute matrix of the transaction network includes:
acquiring a mask corresponding to each node in a transaction network;
updating node attributes of nodes in the transaction network by the following formula:
$$\tilde{x}_i = m_i \cdot x_i \qquad (2)$$
where $x_i$ is the node attribute of node $v_i$, $m_i$ is the mask corresponding to node $v_i$, and $\tilde{x}_i$ is the updated node attribute of node $v_i$;
and generating an initial node attribute matrix of the transaction network according to the node attributes updated by all the nodes.
Specifically, for each node in the transaction network, a mask corresponding to the node is randomly generated by a mask generation function. In this example, there are two kinds of mask: a mask of 0 and a mask of 1. The node attributes of each node in the transaction network are acquired, and each attribute value of the node is updated according to the generated mask, that is, the node attribute value is multiplied by the mask to obtain the updated attribute value. When the generated mask is 0, after the node attribute values are updated through formula (2), the node attribute values of the node are all set to 0; when the generated mask is 1, after the node attribute values are updated through formula (2), the updated node attribute values are unchanged from the original node attribute values. The initial node attribute matrix of the transaction network is generated according to the updated node attribute values of all nodes in the transaction network. The initial node attribute matrix is a matrix of $N$ rows and $F$ columns, in which rows represent nodes and columns represent node attributes, and each row contains the updated node attributes of one node.
Since the initial node attribute matrix is obtained after the node attributes of each node are updated through the mask, when the mask corresponding to a certain node is 0, the attribute values of that node's attributes are set to 0. When the node attributes are then propagated through the initial node attribute matrix and the adjacency matrix, the updated attributes of that node are effectively not propagated because their values are 0. This is equivalent to deleting the node from the transaction network, that is, to performing topology enhancement processing on the transaction network.
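The mask-based update can be sketched as follows. The disclosure only states that the mask is generated randomly; the probability parameter `drop_prob`, the function name, and the use of Python's `random.Random` are assumptions introduced for the example.

```python
import random


def mask_node_attributes(attrs, drop_prob, rng):
    """Draw one 0/1 mask per node and multiply every attribute of that node by it.

    attrs is an N x F list of lists (one row per node). drop_prob is a
    hypothetical probability that a node receives mask 0.
    """
    masks = [0 if rng.random() < drop_prob else 1 for _ in attrs]
    updated = [[m * value for value in row] for m, row in zip(masks, attrs)]
    return masks, updated


attrs = [[0.5, 2.0], [1.5, 3.0], [4.0, 0.1]]
masks, initial_matrix = mask_node_attributes(attrs, 0.5, random.Random(0))

# Extreme settings make the behaviour easy to see.
_, all_kept = mask_node_attributes(attrs, 0.0, random.Random(0))
_, all_dropped = mask_node_attributes(attrs, 1.0, random.Random(0))
```

A row whose mask is 0 becomes all zeros and therefore contributes nothing when multiplied by the adjacency matrix, which is the node-deletion effect described above.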
In one embodiment, performing the first-stage propagation of node attributes according to the initial node attribute matrix and the adjacency matrix of the transaction network to obtain the target node attribute matrix of the transaction network includes: obtaining the target node attribute matrix of the transaction network according to the following formula:
$$X^{(k)} = A X^{(k-1)} + X^{(0)} \qquad (3)$$
where $k$ denotes the $k$-th propagation in the first-stage propagation, $X^{(k)}$ denotes the target node attribute matrix obtained by the $k$-th propagation in the first-stage propagation, $A$ denotes the adjacency matrix, $X^{(k-1)}$ denotes the target node attribute matrix obtained by the $(k-1)$-th propagation in the first-stage propagation, and $X^{(0)}$ denotes the initial node attribute matrix.
Specifically, the first-stage propagation of the node attribute may include multiple propagation, and the specific propagation times of the first-stage propagation may be determined according to the influence degree of the propagation times on the final node classification, for example, if the accuracy of the final node classification is high when the first-stage propagation includes 5 propagation times, the propagation times of the first-stage propagation may be selected to be 5 times.
The result of each propagation in the first-stage propagation is determined by the formula $X^{(k)} = A X^{(k-1)}$, where $A$ is the adjacency matrix of the transaction network and $X^{(k-1)}$ is the target node attribute matrix obtained by the $(k-1)$-th propagation; that is, the target node attribute matrix of the $k$-th propagation is determined from the target node attribute matrix obtained by the $(k-1)$-th propagation and the adjacency matrix. If the first-stage propagation comprises 5 propagations, the target node attribute matrix obtained by the 5th propagation is used as the target node attribute matrix of the transaction network for subsequent processing.
In order to prevent the target node attribute matrix obtained after multiple propagations from becoming too abstract and losing key node attributes, a residual method may be adopted, namely adding the initial node attribute matrix $X^{(0)}$ at each propagation, to improve the propagation capability of the first-stage propagation of the node attributes.
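The repeated propagation, with and without the residual term, can be sketched as below. The function names, the `residual` flag, and the toy matrices are assumptions for illustration; the sketch keeps every intermediate result so that the last one, or all of them, can be used downstream.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def propagate(adj, x0, steps, residual=True):
    """Return the matrices obtained by the 1st to steps-th propagation.

    Each step multiplies the adjacency matrix by the previous result; with
    residual=True the initial node attribute matrix x0 is added at each step.
    """
    results = []
    current = x0
    for _ in range(steps):
        nxt = matmul(adj, current)
        if residual:
            nxt = [[p + q for p, q in zip(row, row0)]
                   for row, row0 in zip(nxt, x0)]
        results.append(nxt)
        current = nxt
    return results


adj = [[0, 1], [1, 0]]
x0 = [[1], [2]]
plain = propagate(adj, x0, 2, residual=False)   # [[[2], [1]], [[1], [2]]]
with_residual = propagate(adj, x0, 2)           # [[[3], [3]], [[4], [5]]]
```

Without the residual term the two nodes simply swap values at each step, whereas the residual variant keeps re-injecting each node's own initial attributes.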
After the target node attribute matrix of the transaction network is obtained, the target node attribute matrix is normalized by the following formula:
$$\bar{X} = \frac{1}{K} \sum_{k=1}^{K} X^{(k)} \qquad (4)$$
where $\bar{X}$ denotes the normalized target node attribute matrix, $K$ denotes the total number of propagations in the first-stage propagation, $k$ denotes the $k$-th propagation in the first-stage propagation, and $X^{(k)}$ denotes the target node attribute matrix obtained by the $k$-th propagation in the first-stage propagation.
Further, each propagation in the first-stage propagation yields a corresponding target node attribute matrix, so a first-stage propagation comprising several propagations yields several target node attribute matrices. To obtain a better target node attribute matrix of the transaction network from the first-stage propagation, these target node attribute matrices can be normalized through formula (4), and the matrix obtained after the normalization is used as the target node attribute matrix of the transaction network.
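One way to combine the per-propagation matrices is sketched below. It assumes that the normalization is an element-wise arithmetic mean over the matrices obtained by the individual propagations; the function name and the sample matrices are hypothetical.

```python
def average_matrices(matrices):
    """Element-wise mean of a list of equally sized matrices."""
    count = len(matrices)
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(m[r][c] for m in matrices) / count for c in range(cols)]
            for r in range(rows)]


# Two propagation results for a network of two nodes with one attribute each.
propagation_results = [[[3], [3]], [[4], [5]]]
normalized = average_matrices(propagation_results)   # [[3.5], [4.0]]
```

Averaging keeps the output on the same scale as a single propagation result regardless of how many propagations are performed.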
In an embodiment, as shown in fig. 3, determining, according to the third embedded feature matrix, a feature value of each node in the transaction network corresponding to the transaction data set with the longest specified time range, and determining the node type according to the feature value includes:
step 301, performing inverse coding processing on the third embedded feature matrix, and determining a feature value of each node in the transaction network corresponding to the transaction data set with the longest designated time range;
step 302, comparing the feature value with a preset feature threshold to determine the node type.
Inverse coding is performed on the third embedded feature matrix to determine the feature value of each node in the matrix. Specifically, a linear transformation is applied to the third embedded feature matrix, the value of each row of the linearly transformed matrix is determined, and the value of each row is passed through the sigmoid activation function to obtain the feature value of each node. The sigmoid activation function maps a real number to the interval (0, 1), so the feature value of a node ranges from 0 to 1. The feature value is compared with the preset feature threshold to determine the node type, the preset feature threshold being a specific value between 0 and 1.
The node types comprise normal nodes and abnormal nodes, wherein the normal nodes represent transaction accounts without abnormal transaction behaviors, and the abnormal nodes represent transaction accounts with abnormal transaction behaviors. When the characteristic value corresponding to the node is larger than the preset characteristic threshold, the node is indicated to be an abnormal node, and when the characteristic value of the node is smaller than or equal to the preset characteristic threshold, the node is indicated to be a normal node, and the setting of the preset characteristic threshold can be determined according to the application scene.
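The inverse coding and threshold comparison can be sketched as follows. The linear transformation is represented here by a hypothetical weight vector and bias, which in practice would be learned; the function names, the threshold of 0.5, and the toy embedding rows are assumptions for the example.

```python
import math


def sigmoid(z):
    """Map a real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def classify_nodes(embedding, weights, bias, threshold):
    """Return (feature value, node type) for each row of the embedding matrix."""
    results = []
    for row in embedding:
        # Linear transformation of the row followed by the sigmoid activation.
        score = sigmoid(sum(v * w for v, w in zip(row, weights)) + bias)
        node_type = "abnormal" if score > threshold else "normal"
        results.append((score, node_type))
    return results


embedding = [[2.0, 0.0], [-2.0, 0.0], [0.0, 0.0]]
results = classify_nodes(embedding, weights=[1.0, 1.0], bias=0.0, threshold=0.5)
node_types = [node_type for _, node_type in results]
```

The third row yields a feature value exactly equal to the threshold and is therefore classified as normal, matching the "smaller than or equal to" rule stated above.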
In one embodiment, if the transaction network is a directed transaction network, the transfer-in adjacency matrix and the transfer-out adjacency matrix of the transaction network may also be determined by the following formulas:
$$A_{in}(i,j) = \sum_{k=1}^{N} \frac{A_{k,i} A_{k,j}}{\sum_{v=1}^{N} A_{k,v}} \qquad (5)$$
$$A_{out}(i,j) = \sum_{k=1}^{N} \frac{A_{i,k} A_{j,k}}{\sum_{v=1}^{N} A_{v,k}} \qquad (6)$$
where $A_{in}$ is the transfer-in adjacency matrix; $i$, $j$, $k$ and $v$ denote nodes, each taking values from 1 to $N$, and $N$ is the number of nodes contained in the directed transaction network. $A_{k,i}$ represents the directed edge from node $k$ to node $i$, $A_{k,j}$ represents the directed edge from node $k$ to node $j$, and $A_{k,v}$ represents the directed edge from node $k$ to node $v$. If a directed edge from node $k$ to node $i$ exists in the directed transaction network, the value of $A_{k,i}$ is 1; if no directed edge from node $k$ to node $i$ exists, the value of $A_{k,i}$ is 0; the same principle applies to $A_{k,j}$. $A_{k,i} A_{k,j}$ is the product of $A_{k,i}$ and $A_{k,j}$ and equals 1 only when $A_{k,i}$ and $A_{k,j}$ are both 1; that is, when there is both a directed edge from node $k$ to node $i$ and a directed edge from node $k$ to node $j$, the product is 1, indicating that node $i$ and node $j$ have a transfer-in association relationship through node $k$. $\sum_{v=1}^{N} A_{k,v}$ is the sum of all directed edges pointing out from node $k$; that is, its value is the number of directed edges pointing out from node $k$.
$A_{out}$ is the transfer-out adjacency matrix. $A_{i,k}$ represents the directed edge from node $i$ to node $k$, $A_{j,k}$ represents the directed edge from node $j$ to node $k$, and $A_{v,k}$ represents the directed edge from node $v$ to node $k$. If a directed edge from node $i$ to node $k$ exists in the directed transaction network, the value of $A_{i,k}$ is 1; if no directed edge from node $i$ to node $k$ exists, the value of $A_{i,k}$ is 0; the same principle applies to $A_{j,k}$. $A_{i,k} A_{j,k}$ is the product of $A_{i,k}$ and $A_{j,k}$ and equals 1 only when $A_{i,k}$ and $A_{j,k}$ are both 1; that is, when there is both a directed edge from node $i$ to node $k$ and a directed edge from node $j$ to node $k$, the product is 1, indicating that node $i$ and node $j$ have a transfer-out association relationship through node $k$. $\sum_{v=1}^{N} A_{v,k}$ is the sum of all directed edges pointing to node $k$; that is, its value is the number of directed edges pointing to node $k$.
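The transfer-in and transfer-out adjacency matrices described above can be sketched as follows. The sketch assumes that a shared-neighbour term is skipped when the normalizing edge count is zero (its numerator is then zero as well); the function name, 0-based indices, and the three-node example network are hypothetical.

```python
def second_order_adjacency(adj):
    """Return (a_in, a_out) for a directed 0/1 adjacency matrix adj,
    where adj[p][q] == 1 means a directed edge from node p to node q."""
    n = len(adj)
    out_degree = [sum(adj[k]) for k in range(n)]
    in_degree = [sum(adj[v][k] for v in range(n)) for k in range(n)]
    a_in = [[0.0] * n for _ in range(n)]
    a_out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                # Transfer-in: k points to both i and j, scaled by k's outgoing edges.
                if out_degree[k] > 0:
                    a_in[i][j] += adj[k][i] * adj[k][j] / out_degree[k]
                # Transfer-out: i and j both point to k, scaled by k's incoming edges.
                if in_degree[k] > 0:
                    a_out[i][j] += adj[i][k] * adj[j][k] / in_degree[k]
    return a_in, a_out


# Directed edges: 0 -> 1, 0 -> 2, 1 -> 2.
adj = [[0, 1, 1],
       [0, 0, 1],
       [0, 0, 0]]
a_in, a_out = second_order_adjacency(adj)
```

In this example nodes 1 and 2 both receive from node 0, so they are linked in the transfer-in matrix, while nodes 0 and 1 both send to node 2, so they are linked in the transfer-out matrix. Both matrices are symmetric even though the underlying network is directed.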
Thus, in this embodiment, for a directed transaction network, the transfer-in target node attribute matrix and the transfer-out target node attribute matrix are determined by the following formulas:
$$X_{in}^{(k)} = A_{in} X_{in}^{(k-1)} + X^{(0)} \qquad (7)$$
$$X_{out}^{(k)} = A_{out} X_{out}^{(k-1)} + X^{(0)} \qquad (8)$$
where $X_{in}^{(k)}$ is the transfer-in target node attribute matrix obtained by the $k$-th propagation in the first-stage propagation, $A_{in}$ denotes the transfer-in adjacency matrix, $X_{in}^{(k-1)}$ denotes the transfer-in target node attribute matrix obtained by the $(k-1)$-th propagation in the first-stage propagation, and $X^{(0)}$ denotes the initial node attribute matrix. $X_{out}^{(k)}$ is the transfer-out target node attribute matrix obtained by the $k$-th propagation in the first-stage propagation, $A_{out}$ denotes the transfer-out adjacency matrix, and $X_{out}^{(k-1)}$ denotes the transfer-out target node attribute matrix obtained by the $(k-1)$-th propagation in the first-stage propagation.
A transfer-in embedded feature matrix and a transfer-out embedded feature matrix are determined according to the transfer-in target node attribute matrix, the transfer-in adjacency matrix, the transfer-out target node attribute matrix and the transfer-out adjacency matrix of the transaction network by the following formulas:
$$H_{in} = D_{in}^{-\frac{1}{2}} A_{in} D_{in}^{-\frac{1}{2}} X_{in} W \qquad (9)$$
$$H_{out} = D_{out}^{-\frac{1}{2}} A_{out} D_{out}^{-\frac{1}{2}} X_{out} W \qquad (10)$$
where $H_{in}$ is the transfer-in embedded feature matrix, $A_{in}$ is the transfer-in adjacency matrix, $D_{in}$ is the degree matrix of $A_{in}$, $X_{in}$ is the transfer-in target node attribute matrix, and $W$ is a learnable parameter matrix. $H_{out}$ is the transfer-out embedded feature matrix, $A_{out}$ is the transfer-out adjacency matrix, $D_{out}$ is the degree matrix of $A_{out}$, and $X_{out}$ is the transfer-out target node attribute matrix. The transfer-in embedded feature matrix and the transfer-out embedded feature matrix are combined to obtain the first embedded feature matrix.
Fig. 4 is a schematic block diagram of an identification device of a transaction account according to an embodiment of the disclosure.
Referring to fig. 4, according to a second aspect of an embodiment of the present disclosure, there is provided an identification apparatus for a transaction account, the apparatus including:
The acquiring module 401 is configured to acquire a plurality of sets of transaction data sets within a specified time range, where each transaction data set includes a plurality of transaction records, each transaction record includes a transaction account of both parties of a transaction, and each set of transaction data sets has a same specified start time and a different specified end time;
the construction module 402 is configured to construct a transaction network of the transaction data set, wherein a node of the transaction network is a transaction account, and an edge, which is used for connecting two nodes, in the transaction network represents that a transaction behavior exists between the two transaction accounts;
a generating module 403, configured to determine a node attribute of each node in the transaction network, and update node attributes of all nodes in the transaction network through a mask, to generate an initial node attribute matrix of the transaction network;
a construction module 402, configured to construct an adjacency matrix of the transaction network;
a first propagation module 404, configured to perform a first stage propagation of node attributes according to the initial node attribute matrix and the adjacency matrix of the transaction network, so as to obtain a target node attribute matrix of the transaction network;
the second propagation module 405 is configured to perform second-stage propagation of node attributes according to the target node attribute matrix and the adjacency matrix of the transaction network, so as to obtain a first embedded feature matrix of the transaction network;
A first processing module 406, configured to update a first embedded feature matrix of the transaction data set with the longest specified time range, to obtain a second embedded feature matrix;
the first processing module 406 is further configured to perform denoising processing on the second embedded feature matrix to obtain a third embedded feature matrix;
the determining module 407 is configured to determine, according to the third embedded feature matrix, a feature value of each node in the transaction network corresponding to the transaction data set with the longest specified time range, and determine a node type according to the feature value.
In an embodiment, the generating module 403 further includes: a first obtaining submodule 4031, configured to obtain an initial attribute of each node in the transaction network; an analysis submodule 4032, configured to perform linear regression analysis on the initial attributes, and determine an influence value of each initial attribute on the node type; the selection submodule 4033 is further configured to select a preset number of initial attributes as node attributes according to the magnitude of the influence value.
In an embodiment, the generating module 403 further includes: a second obtaining submodule 4034, configured to obtain a mask corresponding to each node in the transaction network; an updating submodule 4035, configured to update node attributes of nodes in the transaction network according to the following formula: $\tilde{x}_i = m_i \cdot x_i$, where $x_i$ is the node attribute of node $v_i$, $m_i$ is the mask corresponding to node $v_i$, and $\tilde{x}_i$ is the updated node attribute of node $v_i$; and a generating submodule 4036, configured to generate an initial node attribute matrix of the transaction network according to the node attributes updated by all the nodes.
In one embodiment, the first propagation module 404 is specifically configured to obtain the target node attribute matrix of the transaction network according to the following formula: $X^{(k)} = A X^{(k-1)} + X^{(0)}$, where $k$ denotes the $k$-th propagation in the first-stage propagation, $X^{(k)}$ denotes the target node attribute matrix obtained by the $k$-th propagation in the first-stage propagation, $A$ denotes the adjacency matrix, $X^{(k-1)}$ denotes the target node attribute matrix obtained by the $(k-1)$-th propagation in the first-stage propagation, and $X^{(0)}$ denotes the initial node attribute matrix.
In an embodiment, the apparatus further comprises: a second processing module 408, configured to normalize the target node attribute matrix of the transaction network by the following formula after the target node attribute matrix is obtained: $\bar{X} = \frac{1}{K} \sum_{k=1}^{K} X^{(k)}$, where $\bar{X}$ denotes the normalized target node attribute matrix, $K$ denotes the total number of propagations in the first-stage propagation, $k$ denotes the $k$-th propagation in the first-stage propagation, and $X^{(k)}$ denotes the target node attribute matrix obtained by the $k$-th propagation in the first-stage propagation.
In an embodiment, the determining module 407 is specifically configured to perform an inverse encoding process on the third embedded feature matrix, and determine a feature value of each node in the transaction network corresponding to the transaction data set with the longest specified time range; and comparing the characteristic value with a preset characteristic threshold value to determine the node type.
According to embodiments of the present disclosure, the present disclosure also provides an electronic device and a readable storage medium.
Fig. 5 illustrates a schematic block diagram of an example electronic device 500 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 5, the device 500 includes a computing unit 501 that can perform various suitable actions and processes according to a computer program stored in a Read Only Memory (ROM) 502 or a computer program loaded from a storage unit 508 into a Random Access Memory (RAM) 503. In the RAM 503, various programs and data required for the operation of the device 500 can also be stored. The computing unit 501, ROM 502, and RAM 503 are connected to each other by a bus 504. An input/output (I/O) interface 505 is also connected to bus 504.
Various components in the device 500 are connected to the I/O interface 505, including: an input unit 506 such as a keyboard, a mouse, etc.; an output unit 507 such as various types of displays, speakers, and the like; a storage unit 508 such as a magnetic disk, an optical disk, or the like; and a communication unit 509 such as a network card, modem, wireless communication transceiver, etc. The communication unit 509 allows the device 500 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 501 may be any of a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the computing unit 501 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 501 performs the various methods and processes described above, such as the transaction account identification method. For example, in some embodiments, the transaction account identification method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 508. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 500 via the ROM 502 and/or the communication unit 509. When the computer program is loaded into the RAM 503 and executed by the computing unit 501, one or more steps of the transaction account identification method described above may be performed. Alternatively, in other embodiments, the computing unit 501 may be configured to perform the transaction account identification method in any other suitable manner (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor and which may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include Local Area Networks (LANs), Wide Area Networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server incorporating a blockchain.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flows shown above. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions of the present disclosure are achieved; no limitation is imposed herein.
Furthermore, the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present disclosure, "a plurality" means two or more, unless explicitly defined otherwise.
The foregoing is merely specific embodiments of the disclosure, but the protection scope of the disclosure is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope of the disclosure shall fall within the protection scope of the disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (10)

1. A method for identifying a transaction account, the method comprising:
acquiring a plurality of groups of transaction data sets with specified time ranges, wherein each transaction data set comprises a plurality of transaction records, each transaction record comprises transaction accounts of both transaction parties, the specified starting time of each group of transaction data sets is the same, and the specified ending time is different;
constructing a transaction network of the transaction data set, wherein nodes of the transaction network are transaction accounts, and edges used for connecting two nodes in the transaction network represent transaction behaviors between the two transaction accounts;
determining node attributes of each node in the transaction network, updating the node attributes of all nodes in the transaction network through a mask, and generating an initial node attribute matrix of the transaction network;
constructing an adjacency matrix of the transaction network;
performing first-stage propagation of node attributes according to the initial node attribute matrix and the adjacency matrix of the transaction network to obtain a target node attribute matrix of the transaction network;
performing second-stage propagation of node attributes according to the target node attribute matrix and the adjacency matrix of the transaction network to obtain a first embedded feature matrix of the transaction network;
updating the first embedded feature matrix of the transaction data set with the longest specified time range to obtain a second embedded feature matrix;
denoising the second embedded feature matrix to obtain a third embedded feature matrix;
and determining a feature value of each node in the transaction network corresponding to the transaction data set with the longest specified time range according to the third embedded feature matrix, and determining the node type according to the feature value.
2. The method of claim 1, wherein said determining node attributes for each node in the transaction network comprises:
acquiring an initial attribute of each node in the transaction network;
performing linear regression analysis on the initial attributes, and determining an influence value of each initial attribute on the node type;
and selecting a preset number of initial attributes as the node attributes according to the influence value.
3. The method of claim 1, wherein updating the node attributes of all nodes in the transaction network through a mask and generating the initial node attribute matrix of the transaction network comprises:
acquiring a mask corresponding to each node in the transaction network;
updating node attributes of nodes in the transaction network by the following formula:
Figure QLYQS_1
wherein [Figure QLYQS_2] is the node attribute of node [Figure QLYQS_3], [Figure QLYQS_4] is the mask corresponding to node [Figure QLYQS_5], and [Figure QLYQS_6] is the updated node attribute of node [Figure QLYQS_7];
and generating an initial node attribute matrix of the transaction network according to the node attributes updated by all the nodes.
4. The method of claim 3, wherein performing the first-stage propagation of node attributes according to the initial node attribute matrix and the adjacency matrix of the transaction network to obtain the target node attribute matrix of the transaction network comprises:
obtaining a target node attribute matrix of the transaction network according to the following formula:
Figure QLYQS_8
wherein [Figure QLYQS_9] represents the [Figure QLYQS_12]-th propagation in the first-stage propagation, [Figure QLYQS_14] represents the target node attribute matrix obtained by the [Figure QLYQS_11]-th propagation in the first-stage propagation, [Figure QLYQS_13] represents the adjacency matrix, [Figure QLYQS_15] represents the target node attribute matrix obtained by the [Figure QLYQS_16]-th propagation in the first-stage propagation, and [Figure QLYQS_10] represents the initial node attribute matrix.
5. The method of claim 4, wherein after obtaining the target node attribute matrix for the transaction network, the method further comprises:
normalizing the target node attribute matrix by the following formula:
Figure QLYQS_17
wherein [Figure QLYQS_18] represents the normalized target node attribute matrix, [Figure QLYQS_19] represents the total number of propagations in the first-stage propagation, [Figure QLYQS_20] represents the [Figure QLYQS_21]-th propagation in the first-stage propagation, and [Figure QLYQS_22] represents the target node attribute matrix obtained by the [Figure QLYQS_23]-th propagation in the first-stage propagation.
6. The method according to claim 1, wherein determining the feature value of each node in the transaction network corresponding to the transaction data set with the longest specified time range according to the third embedded feature matrix, and determining the node type according to the feature value, includes:
performing inverse encoding processing on the third embedded feature matrix, and determining a feature value of each node in the transaction network corresponding to the transaction data set with the longest specified time range;
and comparing the feature value with a preset feature threshold to determine the node type.
7. An apparatus for identifying a transaction account, the apparatus comprising:
an acquisition module, configured to acquire a plurality of groups of transaction data sets with specified time ranges, wherein each transaction data set comprises a plurality of transaction records, each transaction record comprises transaction accounts of both transaction parties, the specified starting time of each group of transaction data sets is the same, and the specified ending time is different;
a construction module, configured to construct a transaction network of the transaction data set, wherein nodes of the transaction network are transaction accounts, and edges used for connecting two nodes in the transaction network represent transaction behaviors between the two transaction accounts;
a generation module, configured to determine the node attribute of each node in the transaction network, update the node attributes of all nodes in the transaction network through a mask, and generate an initial node attribute matrix of the transaction network;
the construction module being further configured to construct an adjacency matrix of the transaction network;
a first propagation module, configured to perform first-stage propagation of node attributes according to the initial node attribute matrix and the adjacency matrix of the transaction network to obtain a target node attribute matrix of the transaction network;
a second propagation module, configured to perform second-stage propagation of node attributes according to the target node attribute matrix and the adjacency matrix of the transaction network to obtain a first embedded feature matrix of the transaction network;
a first processing module, configured to update the first embedded feature matrix of the transaction data set with the longest specified time range to obtain a second embedded feature matrix;
the first processing module being further configured to perform denoising processing on the second embedded feature matrix to obtain a third embedded feature matrix;
and a determining module, configured to determine a feature value of each node in the transaction network corresponding to the transaction data set with the longest specified time range according to the third embedded feature matrix, and determine the node type according to the feature value.
8. The apparatus of claim 7, wherein the generation module comprises:
a first acquisition sub-module, configured to acquire the initial attribute of each node in the transaction network;
an analysis sub-module, configured to perform linear regression analysis on the initial attributes and determine an influence value of each initial attribute on the node type;
and a selecting sub-module, configured to select a preset number of initial attributes as the node attributes according to the magnitude of the influence value.
9. The apparatus of claim 7, wherein the generation module further comprises:
a second obtaining sub-module, configured to obtain a mask corresponding to each node in the transaction network;
an updating sub-module, configured to update node attributes of nodes in the transaction network according to the following formula:
Figure QLYQS_24
wherein [Figure QLYQS_25] is the node attribute of node [Figure QLYQS_26], [Figure QLYQS_27] is the mask corresponding to node [Figure QLYQS_28], and [Figure QLYQS_29] is the updated node attribute of node [Figure QLYQS_30];
and the generation sub-module is used for generating an initial node attribute matrix of the transaction network according to the node attributes updated by all the nodes.
10. The apparatus according to claim 9, wherein the first propagation module is specifically configured to obtain the target node attribute matrix of the transaction network according to the following formula:
Figure QLYQS_31
wherein [Figure QLYQS_33] represents the [Figure QLYQS_36]-th propagation in the first-stage propagation, [Figure QLYQS_38] represents the target node attribute matrix obtained by the [Figure QLYQS_34]-th propagation in the first-stage propagation, [Figure QLYQS_35] represents the adjacency matrix, [Figure QLYQS_37] represents the target node attribute matrix obtained by the [Figure QLYQS_39]-th propagation in the first-stage propagation, and [Figure QLYQS_32] represents the initial node attribute matrix.
CN202310604862.4A 2023-05-25 2023-05-25 Transaction account identification method and device Active CN116383708B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310604862.4A CN116383708B (en) 2023-05-25 2023-05-25 Transaction account identification method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310604862.4A CN116383708B (en) 2023-05-25 2023-05-25 Transaction account identification method and device

Publications (2)

Publication Number Publication Date
CN116383708A (en) 2023-07-04
CN116383708B (en) 2023-08-29

Family

ID=86971313

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310604862.4A Active CN116383708B (en) 2023-05-25 2023-05-25 Transaction account identification method and device

Country Status (1)

Country Link
CN (1) CN116383708B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109165950A (en) * 2018-08-10 2019-01-08 哈尔滨工业大学(威海) Abnormal transaction identification method based on financial time series features, device, and readable storage medium
CN110705996A (en) * 2019-10-17 2020-01-17 支付宝(杭州)信息技术有限公司 User behavior identification method, system and device based on feature mask
CN111475587A (en) * 2020-05-22 2020-07-31 支付宝(杭州)信息技术有限公司 Multi-precision data classification model, method and system
CN113420190A (en) * 2021-08-23 2021-09-21 连连(杭州)信息技术有限公司 Merchant risk identification method, device, equipment and storage medium
WO2021252677A1 (en) * 2020-06-10 2021-12-16 Nvidia Corporation Behavior modeling using client-hosted neural networks
CN115618926A (en) * 2022-11-11 2023-01-17 西安交通大学 Important factor extraction method and device for taxpayer enterprise classification
CN115953240A (en) * 2022-12-12 2023-04-11 电子科技大学 Method for identifying fraudulent user in network transaction

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117495571A (en) * 2023-12-28 2024-02-02 北京芯盾时代科技有限公司 Data processing method and device, electronic equipment and storage medium
CN117495571B (en) * 2023-12-28 2024-04-05 北京芯盾时代科技有限公司 Data processing method and device, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN116383708B (en) 2023-08-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant