CN111784502A - Abnormal transaction account group identification method and device - Google Patents

Abnormal transaction account group identification method and device

Publication number
CN111784502A
Authority
CN
China
Prior art keywords
account
risk
community
node
nodes
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010608903.3A
Other languages
Chinese (zh)
Inventor
纪耀宗
贾玉红
李晓萍
赖昂
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Application filed by Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202010608903.3A
Publication of CN111784502A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/02 Banking, e.g. interest calculation or account maintenance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/01 Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange

Abstract

Embodiments of the application provide a method and a device for identifying abnormal transaction account groups. The method comprises: inputting the attribute information of each account into a machine learning model for predicting account risk, and determining at least one account as a high-risk account; executing a local community mining step for each high-risk account, in which the current high-risk account is taken as the initiating node, the association values between the other nodes in the graph data and the initiating node are determined, and a risk account community corresponding to the high-risk account is determined; and, if a plurality of risk account communities are obtained and preset non-similar requirements are met among them, determining each risk account community as an abnormal transaction account group of the target financial institution. The method and device can improve the efficiency, reliability and accuracy of abnormal transaction account group identification, and thereby improve the operational safety and reliability of the financial institution in identifying abnormal transaction account groups.

Description

Abnormal transaction account group identification method and device
Technical Field
The application relates to the technical field of data processing, in particular to a method and a device for identifying abnormal transaction account groups.
Background
Abnormal financial activity has become an increasingly serious threat to the security of financial institutions and of regions. Although work against abnormal financial activity has received close attention from all parties, effectively detecting such activity remains a great challenge, because the means employed are complex and changeable, the activity is carried out by groups of accounts, and transfers between accounts within a group are frequent.
Currently, most methods for countering abnormal financial activity build recognition models based either on rules or on account characteristic information. On the one hand, although rule-based identification of abnormal financial accounts can help find some abnormal transaction behaviors, the rules are mostly summarized from historical data and depend heavily on manual experience, so omissions are inevitable. Moreover, criminals are more or less aware of the rules and deliberately evade detection, so rule-based identification can hardly meet the need for large-scale, efficient identification. On the other hand, machine learning or artificial neural network models built on account characteristic information, such as GBDT and fully-connected neural network models, greatly improve the accuracy of identifying abnormal financial accounts, but current abnormal financial activity often involves group crime. Existing models use only the characteristic information of individual accounts as training samples, so they are suitable only for identifying the abnormal financial behavior of a single account and cannot identify accounts that are hidden in a group and closely related to the other accounts in it, even though such accounts are often the final recipients of the funds.
In recent years, with the rapid development of new technologies such as electronic communication and social media, community discovery algorithms have attracted the attention of many scholars at home and abroad. In graph data, a community is a set of nodes that are densely connected to one another and sparsely connected to nodes outside the set. Community discovery means dividing the nodes of a graph into such sets. Abnormal financial behavior carried out by a group of accounts involves frequent transfers within the group and few transfers to accounts outside it, which matches the definition of a community. Community discovery is divided into local mining and full-graph mining. Although many community discovery algorithms exist, they cannot be applied directly to the identification of group-type abnormal financial transaction accounts. First, a bank has a huge number of accounts, so full-graph community discovery algorithms, which are computationally expensive, are unsuitable and lead to low identification efficiency. Second, local community discovery algorithms require less computation than full-graph mining, but they offer no clear way to choose the initiating node for community discovery, or the node they select has no business interpretation in terms of group-type abnormal financial behavior, which may affect the accuracy of identification. That is, conventional methods for identifying abnormal transaction account communities cannot satisfy the requirements of identification efficiency and identification accuracy at the same time.
Disclosure of Invention
Aiming at the problems in the prior art, the application provides the abnormal transaction account group identification method and device, which can effectively improve the efficiency, reliability and accuracy of abnormal transaction account group identification, and further improve the operation safety and reliability of the financial institution identifying the abnormal transaction account group.
In order to solve the technical problem, the application provides the following technical scheme:
in a first aspect, the present application provides a method for identifying an abnormal transaction account group, including:
respectively inputting the attribute information of each account corresponding to the target financial institution into a machine learning model for predicting the risk of the account, and determining at least one account as a high-risk account according to the output of the machine learning model;
respectively executing a local community mining step for each high-risk account, wherein the local community mining step comprises the following steps: respectively determining the association values between other nodes and the initiating node in graph data containing the initiating node by taking the current high-risk account as the initiating node, and determining a risk account community corresponding to the current high-risk account according to the association values between the nodes and the initiating node and a preset close association judgment rule;
and if a plurality of risk account communities are obtained through the local community mining step and preset non-similar requirements are met among the risk account communities, determining the risk account communities as abnormal transaction account groups of the target financial institution respectively.
Further, the step of inputting the attribute information of each account corresponding to the target financial institution into a machine learning model for predicting the risk of the account, and determining at least one account as a high-risk account according to the output of the machine learning model includes:
respectively inputting the attribute information of each account corresponding to the target financial institution into a LightGBM model, and determining at least one account as a high-risk account according to the output of the LightGBM model;
the LightGBM model is obtained by pre-training based on an attribute information training set, wherein the attribute information training set comprises attribute information of a plurality of historical accounts and labels corresponding to the historical accounts, and the labels are used for indicating whether the corresponding historical accounts are high-risk accounts or not.
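The selection of high-risk accounts from model output can be illustrated with a minimal sketch. It assumes the trained model exposes a per-account risk score in [0, 1] and that an account is treated as high-risk when the score reaches a threshold; the function `select_high_risk`, the `threshold` parameter, the feature name `night_transfers` and the stand-in scorer `toy_score` are all hypothetical and only take the place of a trained LightGBM model, which the text does not specify further.

```python
from typing import Callable, Dict, List

Features = Dict[str, float]  # hypothetical flat feature record for one account


def select_high_risk(
    accounts: Dict[str, Features],
    score: Callable[[Features], float],
    threshold: float = 0.5,
) -> List[str]:
    """Return the ids of accounts whose predicted risk score reaches the threshold."""
    return sorted(
        acc_id for acc_id, feats in accounts.items() if score(feats) >= threshold
    )


def toy_score(feats: Features) -> float:
    # Stand-in for a trained model's predicted probability (assumption for illustration).
    return min(1.0, feats["night_transfers"] / 10.0)


accounts = {
    "A": {"night_transfers": 9.0},
    "B": {"night_transfers": 1.0},
    "C": {"night_transfers": 6.0},
}
high_risk = select_high_risk(accounts, toy_score, threshold=0.5)
```

In practice the `score` callable would wrap the prediction call of whatever model was trained on the labeled historical accounts; the threshold is a tuning choice that the text leaves open.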
Further, before the step of inputting the attribute information of each account corresponding to the target financial institution into a machine learning model for predicting the risk of the account, and determining at least one account as a high-risk account according to the output of the machine learning model, the method further includes:
acquiring attribute information of each account of a target financial institution within a preset time period and transaction information for constructing graph data;
the attribute information comprises attribute characteristic information and transaction characteristic information, and the transaction information comprises transfer record information between accounts.
Further, before the step of performing local community mining on each high-risk account, the method further includes:
and constructing graph data for reflecting the incidence relation between the accounts by applying the transaction information of each account, wherein each node in the graph data is in one-to-one correspondence with each account, and an edge in the graph data is used for representing the transaction information between two adjacent nodes.
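As an illustration of this graph construction, the sketch below builds an adjacency map from transfer records. The record layout `(payer, payee, amount)` and the choice to use an undirected graph whose edge weight is the number of transfers between two accounts are assumptions made for the example; the text only requires that nodes correspond to accounts and that edges represent the transaction information between them.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Transfer = Tuple[str, str, float]  # (payer, payee, amount): hypothetical record layout
Graph = Dict[str, Dict[str, float]]


def build_graph(transfers: Iterable[Transfer]) -> Graph:
    """Build an undirected weighted graph: one node per account,
    edge weight = number of transfers between the two accounts."""
    graph = defaultdict(lambda: defaultdict(float))
    for payer, payee, _amount in transfers:
        if payer == payee:
            continue  # ignore self-transfers; they carry no relationship
        graph[payer][payee] += 1.0
        graph[payee][payer] += 1.0
    return {node: dict(neighbours) for node, neighbours in graph.items()}


transfers = [("A", "B", 100.0), ("B", "A", 50.0), ("B", "C", 10.0)]
graph = build_graph(transfers)
```

Weighting by transferred amount or keeping edge direction would be equally valid readings; the count-based undirected form is used here only because it keeps later steps simple.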
Further, the determining, with the current high-risk account as an initiating node, an association value between each other node and the initiating node in graph data including the initiating node, and determining a risk account community corresponding to the current high-risk account according to the association value between each node and the initiating node and a preset close association determination rule, includes:
respectively determining the association values between other nodes and the initiating node in the graph data containing the initiating node by taking the current high-risk account as the initiating node;
and screening the nodes of which the internal association relation meets the preset close association judgment rule from other nodes except the initiating node in the graph data, and generating a risk account community corresponding to the current high-risk account according to the nodes of which the internal association relation meets the preset close association judgment rule and the initiating node.
Further, the determining, by using the current high-risk account as an originating node, the association values between each of the other nodes and the originating node in the graph data including the originating node includes:
taking the current high-risk account as an initiating node, and performing approximate page-rank calculation on the graph data including the initiating node to obtain page-rank values of other nodes except the initiating node in the graph data, wherein the page-rank values are used for representing the association degree between the corresponding nodes and the initiating node.
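The text does not say which approximate page-rank procedure is used. One common choice is the push-based approximate personalized PageRank of Andersen, Chung and Lang, sketched below as an assumption rather than as the method of the application. The parameters `alpha` (teleport probability) and `eps` (residual tolerance) and the example graph are hypothetical.

```python
from collections import deque
from typing import Dict

Graph = Dict[str, Dict[str, float]]  # undirected weighted adjacency map


def approximate_pagerank(
    graph: Graph, seed: str, alpha: float = 0.15, eps: float = 1e-4
) -> Dict[str, float]:
    """Push-based approximate personalized PageRank started from `seed`."""
    degree = {u: sum(nbrs.values()) for u, nbrs in graph.items()}
    if degree.get(seed, 0.0) == 0.0:
        return {seed: 1.0}  # isolated seed: all mass stays on it
    p: Dict[str, float] = {}          # approximate PageRank values
    r: Dict[str, float] = {seed: 1.0}  # residual mass still to be pushed
    queue = deque([seed])
    while queue:
        u = queue.popleft()
        ru = r.get(u, 0.0)
        if ru < eps * degree[u]:
            continue  # residual already small enough at this node
        # Push: keep an alpha share, hold half of the rest, spread the other half.
        p[u] = p.get(u, 0.0) + alpha * ru
        r[u] = (1.0 - alpha) * ru / 2.0
        for v, w in graph[u].items():
            r[v] = r.get(v, 0.0) + (1.0 - alpha) * ru * w / (2.0 * degree[u])
            if r[v] >= eps * degree[v]:
                queue.append(v)
        if r[u] >= eps * degree[u]:
            queue.append(u)
    return p


# Two triangles (A, B, C) and (D, E, F) joined by the single edge C-D.
graph = {
    "A": {"B": 1.0, "C": 1.0},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"A": 1.0, "B": 1.0, "D": 1.0},
    "D": {"C": 1.0, "E": 1.0, "F": 1.0},
    "E": {"D": 1.0, "F": 1.0},
    "F": {"D": 1.0, "E": 1.0},
}
ppr = approximate_pagerank(graph, "A")
```

Because only nodes with non-negligible residual are ever touched, the work depends on the neighbourhood of the initiating node rather than on the size of the whole graph, which is the property that makes a local approach attractive for a bank-scale account graph.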
Further, the step of screening, among other nodes in the graph data except for the originating node, a node whose internal association satisfies a preset close association determination rule, and generating a risk account community corresponding to a current high risk account according to the node whose internal association satisfies the preset close association determination rule and the originating node includes:
sequencing the association values between each node and the initiating node according to a descending order to obtain a sequence formed by the sequenced nodes, and taking the initiating node as an initial account community;
conductance obtaining step: extracting the first node in the current sequence, adding the extracted node to the account community, and obtaining the current conductance value of the account community;
judging whether the conductance value of the account community has decreased continuously for a preset number of times; if so, determining all nodes except the initiating node in the current account community as nodes whose internal association relationship meets the preset close association judgment rule; if not, returning to execute the conductance obtaining step;
and the nodes with the internal association relation meeting the preset close association judgment rule and the initiating node form a risk account community corresponding to the current high-risk account.
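The screening steps above can be sketched as a sweep over the ranked nodes. The sketch follows the stopping rule as worded in the text (stop once the conductance has decreased for a preset number of consecutive additions) and makes two assumptions that the text leaves open: the conductance of the initiating node alone is used as the first baseline, and conductance is computed as cut weight divided by the smaller of the two side volumes. The names `sweep`, `patience` and the hard-coded `scores` are hypothetical.

```python
from typing import Dict, Set

Graph = Dict[str, Dict[str, float]]  # undirected weighted adjacency map


def conductance(graph: Graph, community: Set[str]) -> float:
    """Cut weight leaving the community divided by the smaller side's volume."""
    vol_in = sum(sum(graph[u].values()) for u in community)
    vol_total = sum(sum(nbrs.values()) for nbrs in graph.values())
    cut = sum(w for u in community for v, w in graph[u].items() if v not in community)
    denom = min(vol_in, vol_total - vol_in)
    # An empty side makes the ratio undefined; treat it as the worst case.
    return cut / denom if denom > 0 else 1.0


def sweep(graph: Graph, seed: str, scores: Dict[str, float], patience: int = 2) -> Set[str]:
    """Grow the community from `seed` in descending score order and stop after
    `patience` consecutive decreases of the conductance."""
    order = sorted((n for n in scores if n != seed), key=lambda n: scores[n], reverse=True)
    community = {seed}
    previous = conductance(graph, community)  # baseline: seed-only community (assumption)
    streak = 0
    for node in order:
        community.add(node)
        current = conductance(graph, community)
        streak = streak + 1 if current < previous else 0
        previous = current
        if streak >= patience:
            break
    return community


# Two triangles (A, B, C) and (D, E, F) joined by the single edge C-D.
graph = {
    "A": {"B": 1.0, "C": 1.0},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"A": 1.0, "B": 1.0, "D": 1.0},
    "D": {"C": 1.0, "E": 1.0, "F": 1.0},
    "E": {"D": 1.0, "F": 1.0},
    "F": {"D": 1.0, "E": 1.0},
}
# Illustrative association values relative to the initiating node "A".
scores = {"C": 0.20, "B": 0.18, "D": 0.05, "E": 0.01, "F": 0.01}
community = sweep(graph, "A", scores, patience=2)
```

In this toy graph the conductance falls from 1.0 to 0.6 when C joins and to 1/7 when B joins, so with `patience=2` the sweep stops at the triangle {A, B, C}.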
Further, the method further includes:
if a plurality of risk account communities are obtained through the local community mining step and the preset non-similar requirements are not met among the risk account communities, merging the risk account communities that have a similar relationship, wherein the non-similar requirements comprise: no two risk account communities include the same initiating node;
and executing the local community mining step simultaneously for the at least two high-risk accounts in each merged risk account community, wherein the at least two high-risk accounts in the merged risk account community all serve as initiating nodes in the local community mining step, until the remaining risk account communities all meet the preset non-similar requirements.
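The merging step can be sketched as follows, reading the non-similar requirement as "no initiating node appears in a community mined from a different initiating node". That reading, the function name `merge_similar` and the input layout (a map from each initiating node to its mined community) are assumptions for illustration; re-running the mining step with several initiating nodes at once is not shown.

```python
from typing import Dict, List, Set, Tuple


def merge_similar(communities: Dict[str, Set[str]]) -> List[Tuple[List[str], Set[str]]]:
    """Group communities that contain one another's initiating nodes.

    Returns, for each group, the initiating nodes to re-mine from together
    and the union of the merged communities' members."""
    parent = {seed: seed for seed in communities}

    def find(s: str) -> str:
        # Union-find lookup with path halving.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for seed, members in communities.items():
        for other in communities:
            if other != seed and other in members:
                parent[find(other)] = find(seed)

    groups: Dict[str, Tuple[List[str], Set[str]]] = {}
    for seed, members in communities.items():
        seeds, merged = groups.setdefault(find(seed), ([], set()))
        seeds.append(seed)
        merged |= members  # in-place union keeps the set stored in `groups`
    return [(sorted(seeds), merged) for seeds, merged in groups.values()]


communities = {
    "A": {"A", "B", "C"},  # contains initiating node "C", so A and C are similar
    "C": {"C", "B", "D"},
    "X": {"X", "Y"},
}
merged_groups = merge_similar(communities)
```

A group with a single initiating node already satisfies the requirement; a group with two or more is the case in which mining would be repeated with all of its initiating nodes together.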
Further, the method further includes:
and if one risk account community is obtained through the local community mining step, determining the risk account community as an abnormal transaction account group of the target financial institution.
In a second aspect, the present application further provides an abnormal transaction account group identification apparatus, including:
the high-risk account determining module is used for respectively inputting the attribute information of each account corresponding to the target financial institution into a machine learning model for predicting the risk of the account, and determining at least one account as a high-risk account according to the output of the machine learning model;
a local mining module, configured to perform a local community mining step for each high-risk account, where the local community mining step includes: respectively determining the association values between other nodes and the initiating node in graph data containing the initiating node by taking the current high-risk account as the initiating node, and determining a risk account community corresponding to the current high-risk account according to the association values between the nodes and the initiating node and a preset close association judgment rule;
and the first group determining module is used for determining each risk account community as an abnormal transaction account group of the target financial institution if a plurality of risk account communities are obtained through the local community mining step and each risk account community meets a preset non-similar requirement.
Further, the high-risk account determination module is configured to perform the following:
respectively inputting the attribute information of each account corresponding to the target financial institution into a LightGBM model, and determining at least one account as a high-risk account according to the output of the LightGBM model;
the LightGBM model is obtained by pre-training based on an attribute information training set, wherein the attribute information training set comprises attribute information of a plurality of historical accounts and labels corresponding to the historical accounts, and the labels are used for indicating whether the corresponding historical accounts are high-risk accounts or not.
Further, the apparatus further includes:
the account information acquisition module is used for acquiring attribute information of each account of the target financial institution within a preset time period and transaction information used for constructing graph data;
the attribute information comprises attribute characteristic information and transaction characteristic information, and the transaction information comprises transfer record information between accounts.
Further, the apparatus further includes:
the graph data construction module is used for constructing graph data for reflecting the incidence relation between the accounts by applying the transaction information of the accounts, wherein each node in the graph data is in one-to-one correspondence with each account, and an edge in the graph data is used for representing the transaction information between two adjacent nodes.
Further, the local mining module includes:
the initial relationship determining submodule is used for respectively determining the association values between each other node and the initiating node in the graph data containing the initiating node by taking the current high-risk account as the initiating node;
and the affinity determination submodule is used for screening, from the nodes other than the initiating node in the graph data, the nodes whose internal association relationship meets the preset close association judgment rule, and generating a risk account community corresponding to the current high-risk account according to those nodes and the initiating node.
Further, the initial relationship determination submodule includes: an approximate page-rank calculation unit configured to perform the following:
taking the current high-risk account as an initiating node, and performing approximate page-rank calculation on the graph data including the initiating node to obtain page-rank values of other nodes except the initiating node in the graph data, wherein the page-rank values are used for representing the association degree between the corresponding nodes and the initiating node.
Further, the affinity determination submodule includes: a conductance calculation unit configured to perform the following:
sequencing the association values between each node and the initiating node according to a descending order to obtain a sequence formed by the sequenced nodes, and taking the initiating node as an initial account community;
conductance obtaining step: extracting the first node in the current sequence, adding the extracted node to the account community, and obtaining the current conductance value of the account community;
judging whether the conductance value of the account community has decreased continuously for a preset number of times; if so, determining all nodes except the initiating node in the current account community as nodes whose internal association relationship meets the preset close association judgment rule; if not, returning to execute the conductance obtaining step;
and the nodes with the internal association relation meeting the preset close association judgment rule and the initiating node form a risk account community corresponding to the current high-risk account.
Further, the apparatus further includes a second group determining module configured to perform the following:
if a plurality of risk account communities are obtained through the local community mining step and the preset non-similar requirements are not met among the risk account communities, merging the risk account communities that have a similar relationship, wherein the non-similar requirements comprise: no two risk account communities include the same initiating node;
and executing the local community mining step simultaneously for the at least two high-risk accounts in each merged risk account community, wherein the at least two high-risk accounts in the merged risk account community all serve as initiating nodes in the local community mining step, until the remaining risk account communities all meet the preset non-similar requirements.
Further, the apparatus further includes a third group determining module configured to perform the following:
and if one risk account community is obtained through the local community mining step, determining the risk account community as an abnormal transaction account group of the target financial institution.
In a third aspect, the present application provides an electronic device, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor executes the computer program to implement the abnormal transaction account group identification method.
In a fourth aspect, the present application provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the abnormal transaction account group identification method.
According to the above technical solution, the abnormal transaction account group identification method and device provided by the application operate as follows: the attribute information of each account corresponding to the target financial institution is input into a machine learning model for predicting account risk, and at least one account is determined as a high-risk account according to the output of the machine learning model; a local community mining step is executed for each high-risk account, in which the current high-risk account is taken as the initiating node, the association values between the other nodes and the initiating node are determined in graph data containing the initiating node, and a risk account community corresponding to the current high-risk account is determined according to those association values and a preset close association judgment rule; and if a plurality of risk account communities are obtained through the local community mining step and the preset non-similar requirements are met among them, each risk account community is determined as an abnormal transaction account group of the target financial institution.
Because the initiating node for local community mining is determined by a machine learning model and mining starts from a high-risk account, the mined communities are targeted and consistent with group-type abnormal financial behavior. Mining from such an initiating node improves the efficiency, effectiveness and accuracy of local community mining, reduces the amount of computation needed to mine the risk account community corresponding to each high-risk account, and thereby improves the efficiency, reliability and accuracy of abnormal transaction account group identification. For group-type abnormal financial behavior, accounts that are hidden in a group and closely related to the other accounts in the group can also be identified, which meets the needs of work against abnormal financial activity. Since account community mining starts from a clearly identified high-risk account, it is more targeted and computationally inexpensive, so the bank account communities engaged in group-type abnormal transaction behavior can be mined quickly and accurately, which greatly saves manpower and improves the efficiency and probability of identifying abnormal financial accounts. The method can help banking staff detect group-type abnormal financial behavior more efficiently, greatly improve the efficiency of work against abnormal financial activity, and further improve the operational safety and reliability of financial institutions in identifying abnormal transaction account groups.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flow chart of an abnormal transaction account group identification method in an embodiment of the present application.
Fig. 2 is a flowchart illustrating an embodiment of a method for identifying an abnormal transaction account group including step 110.
Fig. 3 is a flowchart illustrating an abnormal transaction account group identification method including step 010 according to an embodiment of the present application.
Fig. 4 is a schematic flowchart of an abnormal transaction account group identification method including step 020 in the embodiment of the present application.
Fig. 5 is a schematic flowchart of a step 200 in the abnormal transaction account group identification method in the embodiment of the present application.
Fig. 6 is a flowchart illustrating a specific process of step 200 of the abnormal transaction account group identification method including step 211 according to an embodiment of the present invention.
Fig. 7 is a specific flowchart illustrating step 220 in the abnormal transaction account group identification method in the embodiment of the present application.
Fig. 8 is a flowchart illustrating an abnormal transaction account group identification method including step 410 and step 420 according to an embodiment of the present application.
Fig. 9 is a flowchart illustrating an abnormal transaction account group identification method including step 500 according to an embodiment of the present invention.
Fig. 10 is a schematic flowchart of an application example in which an abnormal transaction account group identification system is applied to implement the abnormal transaction account group identification method.
Fig. 11 is an exemplary diagram of graph data provided in an application example of the present application.
Fig. 12 is a schematic flowchart of a specific process of extracting an account community by the fourth module provided in the application example of the present application.
Fig. 13 is a schematic diagram of an account community segmentation process in step 2 provided by an application example of the present application.
Fig. 14 is a schematic diagram of a first structure of an abnormal transaction account group identification apparatus in an embodiment of the present application.
Fig. 15 is a schematic diagram of a second structure of the abnormal transaction account group identification apparatus in the embodiment of the present application.
Fig. 16 is a third structural diagram of the abnormal transaction account group identification apparatus in the embodiment of the present application.
Fig. 17 is a schematic structural diagram of a local mining module in the abnormal transaction account group identification apparatus in the embodiment of the present application.
Fig. 18 is a schematic structural diagram of an initial relationship determination submodule in the abnormal transaction account group identification apparatus in the embodiment of the present application.
Fig. 19 is a schematic structural diagram of an affinity determination submodule in the abnormal transaction account group identification apparatus in the embodiment of the present application.
Fig. 20 is a fourth structural diagram of the abnormal transaction account group identification apparatus in the embodiment of the present application.
Fig. 21 is a schematic diagram of a fifth structure of an abnormal transaction account group identification apparatus in an embodiment of the present application.
Fig. 22 is a schematic structural diagram of an electronic apparatus in the embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In view of the problem that existing identification approaches cannot achieve both efficiency and accuracy, the embodiments of the application provide an abnormal transaction account group identification method, an abnormal transaction account group identification device, an electronic device and a computer-readable storage medium. The attribute information of each account corresponding to the target financial institution is input into a machine learning model for predicting account risk, and at least one account is determined as a high-risk account according to the output of the machine learning model; a local community mining step is executed for each high-risk account, in which the current high-risk account is taken as the initiating node, the association values between the other nodes and the initiating node are determined in graph data containing the initiating node, and a risk account community corresponding to the current high-risk account is determined according to those association values and a preset close association judgment rule; and if a plurality of risk account communities are obtained through the local community mining step and the preset non-similar requirements are met among them, each risk account community is determined as an abnormal transaction account group of the target financial institution.
Because the initiating node for local community mining is determined by a machine learning model and mining starts from a high-risk account, the mined communities are targeted and consistent with group-type abnormal financial behavior. Mining from such an initiating node improves the efficiency, effectiveness and accuracy of local community mining, reduces the amount of computation needed to mine the risk account community corresponding to each high-risk account, and thereby improves the efficiency, reliability and accuracy of abnormal transaction account group identification. For group-type abnormal financial behavior, accounts that are hidden in a group and closely related to the other accounts in the group can also be identified, which meets the needs of work against abnormal financial activity. Since account community mining starts from a clearly identified high-risk account, it is more targeted and computationally inexpensive, so the bank account communities engaged in group-type abnormal financial behavior can be mined quickly and accurately, which greatly saves manpower and improves the efficiency and probability of identifying abnormal transaction accounts. The method can help banking staff detect group-type abnormal financial behavior more efficiently, greatly improve the efficiency of work against abnormal financial activity, and further improve the operational safety and reliability of financial institutions in identifying abnormal transaction account groups.
Specifically, the following examples are given to illustrate the respective embodiments.
In one or more embodiments of the present application, the account refers to a bank account set up by a user at a target financial institution, and the attribute information of the account is divided into attribute feature information and transaction feature information of the account. The attribute characteristic information of the account is attribute information of an account owner, such as the age of a legal person of a public account, the location of a business to which the legal person belongs and the like; the transaction characteristic information of the account is transaction attribute information of the account, such as the number of times of account transactions within a certain period of account opening. The transaction information of the accounts is transfer record information between the accounts.
In order to solve the problem that the existing identification method cannot give consideration to both efficiency and accuracy, the application provides an embodiment of an abnormal transaction account group identification method, and referring to fig. 1, the abnormal transaction account group identification method specifically includes the following contents:
step 100: and respectively inputting the attribute information of each account corresponding to the target financial institution into a machine learning model for predicting the risk of the account, and determining at least one account as a high-risk account according to the output of the machine learning model.
In step 100, the machine learning model may, for example, be a decision tree model that predicts account risk. The accounts corresponding to the target financial institution may be obtained in advance for a preset time period of the target financial institution. The output of the machine learning model includes high-risk identification results and low-risk identification results, and the accounts corresponding to high-risk identification results are determined to be high-risk accounts.
Step 200: respectively executing a local community mining step for each high-risk account, wherein the local community mining step comprises the following steps: and respectively determining the association values between other nodes and the initiating node in the graph data containing the initiating node by taking the current high-risk account as the initiating node, and determining the risk account community corresponding to the current high-risk account according to the association values between the nodes and the initiating node and a preset close association judgment rule.
It can be understood that the graph data refers to a node connection graph for representing relationships between the accounts, each node in the graph data corresponds to each account one to one, and an edge in the graph data is used for representing transaction information between two adjacent nodes.
Step 300: and if a plurality of risk account communities are obtained through the local community mining step and preset non-similar requirements are met among the risk account communities, determining the risk account communities as abnormal transaction account groups of the target financial institution respectively.
In step 300, after the abnormal transaction account group of the target financial institution is determined, the abnormal transaction account group of the target financial institution is output, so that the target institution timely performs operations such as historical data verification, real-time transaction monitoring evidence obtaining, risk control system reporting and the like on the abnormal transaction account group of the target financial institution, and the operation safety and reliability of the financial institution identifying the abnormal transaction account group are effectively improved.
From the above description, it can be seen that the abnormal transaction account group identification method provided in the embodiment of the present application determines the initiating node for local community mining by applying a machine learning model and performs local account community mining with a high-risk account as the starting point, so that the target community is targeted and consistent with group-type abnormal financial behavior. Performing local community mining from the initiating node effectively improves the efficiency, effectiveness and accuracy of local community mining and reduces the amount of calculation required to mine the risk account community corresponding to a high-risk account, which in turn improves the efficiency, reliability and accuracy of abnormal transaction account group identification. For group-type abnormal financial behavior, accounts that are hidden in the group and closely associated with other accounts in the group can also be identified, which meets the requirements of abnormal finance detection. Since mining starts from clearly identified high-risk accounts, the calculation cost is low, group-type bank account communities engaged in abnormal financial behavior can be mined quickly and accurately, manpower is greatly saved, and the efficiency and probability of identifying abnormal transaction accounts are improved. The method can help banking staff detect group-type abnormal financial behavior more efficiently, greatly improve the efficiency of abnormal finance work, and thereby improve the operational safety and reliability of the financial institution that identifies the abnormal transaction account groups.
In order to further effectively determine the initiating node, in an embodiment of the abnormal transaction account group identification method provided by the present application, referring to fig. 2, step 100 in the abnormal transaction account group identification method specifically includes the following contents:
step 110: the method comprises the steps of respectively inputting attribute information of each account corresponding to a target financial institution into a LightGBM model, and determining at least one account as a high-risk account according to the output of the LightGBM model, wherein the LightGBM model is obtained by pre-training based on an attribute information training set, the attribute information training set comprises attribute information of a plurality of historical accounts and labels corresponding to the historical accounts, and the labels are used for indicating whether the corresponding historical accounts are the high-risk accounts or not.
It can be understood that the LightGBM model is an evolution of the GBDT model and a newer member of the boosting family of ensemble models. It was released by Microsoft and, like XGBoost, is an efficient implementation of GBDT. In principle it is similar to GBDT and XGBoost: the negative gradient of the loss function is used as an approximation of the residual of the current decision tree in order to fit a new decision tree. The LightGBM model has the following features: a histogram-based decision tree algorithm, a leaf-wise leaf growth strategy with a depth limit, histogram subtraction for acceleration, direct support for categorical features, cache hit rate optimization, histogram-based sparse feature optimization, and multithreading optimization.
From the above description, the method for identifying the abnormal transaction account group provided by the embodiment of the application can effectively improve the accuracy and efficiency of the selection of the initiating node, and further can further improve the efficiency, effectiveness and accuracy of local community mining by applying the initiating node.
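As a minimal sketch of how step 100 and step 110 turn model output into high-risk accounts, the following Python example scores each account and keeps those at or above a cut-off. The function `select_high_risk`, the stand-in scorer `toy_score`, the feature layout and the `threshold` value are all hypothetical; the text only states that the model outputs high-risk and low-risk identification results, and a trained LightGBM or decision tree model would take the place of `toy_score`.

```python
from typing import Callable, Dict, List, Sequence


def select_high_risk(
    attributes: Dict[str, Sequence[float]],
    risk_score: Callable[[Sequence[float]], float],
    threshold: float = 0.5,
) -> List[str]:
    """Return the ids of accounts whose predicted risk score reaches the threshold."""
    return sorted(
        account_id
        for account_id, features in attributes.items()
        if risk_score(features) >= threshold
    )


def toy_score(features: Sequence[float]) -> float:
    """Hypothetical stand-in for a trained model: more early transactions, more risk."""
    _owner_age, early_transaction_count = features
    return min(1.0, early_transaction_count / 100.0)


# Each account: (age of the account owner, number of transactions shortly after opening).
accounts = {"acc_1": (45, 3), "acc_2": (23, 90), "acc_3": (31, 55)}
high_risk = select_high_risk(accounts, toy_score, threshold=0.5)
```

The accounts returned here would then serve as the initiating nodes for the local community mining step.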
In order to obtain account information in advance, in an embodiment of the abnormal transaction account group identification method provided by the present application, referring to fig. 3, before step 100 in the abnormal transaction account group identification method, the following contents are further specifically included:
step 010: the method comprises the steps of obtaining attribute information of each account of a target financial institution in a preset time period and transaction information used for constructing graph data, wherein the attribute information comprises attribute characteristic information and transaction characteristic information, and the transaction information comprises transfer record information between the accounts.
From the above description, the abnormal transaction account group identification method provided in the embodiment of the present application provides a reliable data basis for subsequently determining the initiating node from the attribute information, which further improves the efficiency and accuracy of obtaining the initiating node, and provides an accurate and reliable data basis for subsequently constructing the graph data, which further improves the efficiency and accuracy of local community mining.
In order to construct graph data in advance, in an embodiment of the abnormal transaction account group identification method provided by the present application, referring to fig. 4, the following contents are further specifically included after step 010 and before step 200 in the abnormal transaction account group identification method:
step 020: and constructing graph data for reflecting the incidence relation between the accounts by applying the transaction information of each account, wherein each node in the graph data is in one-to-one correspondence with each account, and an edge in the graph data is used for representing the transaction information between two adjacent nodes.
From the above description, the abnormal transaction account group identification method provided by the embodiment of the application can effectively improve the efficiency and reliability of graph data construction, and further can further improve the efficiency and accuracy of local community mining.
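The graph construction in step 020 can be sketched as follows, assuming transfer records arrive as (payer, payee) pairs. The function name `build_graph` is hypothetical, and treating the graph as undirected and unweighted, collapsing repeated transfers into one edge and skipping self-transfers are simplifying assumptions not stated in the text.

```python
from collections import defaultdict


def build_graph(transfers):
    """Build an undirected adjacency map from (payer, payee) transfer records."""
    graph = defaultdict(set)
    for payer, payee in transfers:
        if payer == payee:
            continue  # assumption: a self-transfer carries no association
        graph[payer].add(payee)  # one node per account, one edge per account pair
        graph[payee].add(payer)
    return dict(graph)


transfers = [("a", "b"), ("b", "c"), ("a", "b"), ("c", "d")]
graph = build_graph(transfers)
```

Storing neighbors in sets keeps the degree of a node equal to the number of distinct counterparties, which is the quantity used later for the diagonal matrix and the conductivity.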
In order to further filter communities, in an embodiment of the abnormal transaction account group identification method provided by the present application, referring to fig. 5, step 200 in the abnormal transaction account group identification method specifically includes the following contents:
step 210: and taking the current high-risk account as an initiating node, and respectively determining the association values between other nodes and the initiating node in the graph data containing the initiating node.
Step 220: and screening the nodes of which the internal association relation meets the preset close association judgment rule from other nodes except the initiating node in the graph data, and generating a risk account community corresponding to the current high-risk account according to the nodes of which the internal association relation meets the preset close association judgment rule and the initiating node.
As can be seen from the above description, the method for identifying abnormal transaction account groups provided in the embodiment of the present application can effectively improve the reliability, accuracy and effectiveness of local community mining by using an initiating node, and further can effectively ensure the reliability of the account community of the acquired high-risk account.
In order to further determine the correlation value, in an embodiment of the abnormal transaction account group identification method provided in the present application, referring to fig. 6, step 210 in the abnormal transaction account group identification method specifically includes the following contents:
step 211: taking the current high-risk account as an initiating node, and performing approximate page-rank calculation on the graph data including the initiating node to obtain page-rank values of other nodes except the initiating node in the graph data, wherein the page-rank values are used for representing the association degree between the corresponding nodes and the initiating node.
It can be understood that the approximate page-rank algorithm is derived from the page-rank algorithm, the web page ranking algorithm of the Google search engine. Page-rank builds a graph of all web pages in which each web page is a node, and if one web page links to another, a directed edge connects the two nodes. The calculation process of the page-rank algorithm is similar to a Markov chain, and the algorithm likewise has a probability transition matrix.
In step 211, the abnormal transaction account group identification apparatus may calculate the page-rank value of each node in the graph data other than the initiating node from the diagonal matrix corresponding to the nodes in the graph data and the link matrix between adjacent nodes. However, this matrix calculation is time-consuming and has a high memory requirement. In the preferred manner of step 211, the approximate page-rank algorithm is therefore selected, so that an approximation $\tilde{p}$ of $p$ in the page-rank iterative formula based on the lazy (inert) random walk, that is, the $p$ in formula (2) below, can be obtained quickly. The efficiency of obtaining the node page-rank values is thereby effectively improved while the accuracy requirement on those values is still met.
As can be seen from the above description, the abnormal transaction account group identification method provided in the embodiment of the present application can effectively improve the accuracy and reliability of obtaining the associated values between each node and the initiating node in the graph data, and further can effectively improve the reliability and accuracy of local community mining by using the initiating node.
In order to further perform node screening, in an embodiment of the abnormal transaction account group identification method provided in the present application, referring to fig. 7, step 220 in the abnormal transaction account group identification method specifically includes the following contents:
step 221: and sequencing the association values between each node and the initiating node according to the sequence from large to small to obtain a sequence consisting of the sequenced nodes, and taking the initiating node as an initial account community.
Step 222: conductivity obtaining step: extracting the first node in the current sequence, adding the extracted node to the account community, and obtaining the current conductivity value of the account community.
Step 223: judging whether the conductivity value of the account community has decreased consecutively for a preset number of times; if so, executing step 224; if not, returning to step 222.
Step 224: and determining all nodes except the initiating node in the current account community as nodes of which the internal association relation meets a preset close association judgment rule.
Step 225: and the nodes with the internal association relation meeting the preset close association judgment rule and the initiating node form a risk account community corresponding to the current high-risk account.
From the above description, the method for identifying the abnormal transaction account group provided by the embodiment of the application can effectively improve the accuracy and reliability of account community segmentation, and further can effectively improve the reliability and accuracy of local community mining by using the initiating node.
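Steps 221 to 225 can be sketched as a sweep over the nodes ordered by association value. The sketch below follows the stopping rule literally as stated in step 223 (stop once the conductivity has decreased a preset number of consecutive times). The names `conductance`, `sweep_community` and `patience`, the use of the seed-only community as the first comparison baseline, the value returned for degenerate sets, and the example graph and scores are assumptions made for illustration.

```python
def conductance(graph, community):
    """Edges leaving the community divided by the smaller side's total degree."""
    inside = set(community)
    cut = sum(1 for u in inside for v in graph[u] if v not in inside)
    vol_in = sum(len(graph[u]) for u in inside)
    vol_out = sum(len(nbrs) for nbrs in graph.values()) - vol_in
    smaller = min(vol_in, vol_out)
    return cut / smaller if smaller else 1.0  # assumed convention for degenerate sets


def sweep_community(graph, scores, seed, patience=2):
    """Grow a community from the seed, adding nodes by descending association value."""
    # Step 221: sort the other nodes by association value, largest first.
    order = sorted((n for n in scores if n != seed), key=scores.get, reverse=True)
    community = [seed]
    previous = conductance(graph, community)  # assumed baseline: seed alone
    decreases = 0
    for node in order:
        # Step 222: add the next node and measure the conductivity.
        community.append(node)
        current = conductance(graph, community)
        # Step 223: count consecutive decreases; reset on any non-decrease.
        decreases = decreases + 1 if current < previous else 0
        previous = current
        if decreases >= patience:
            break
    # Steps 224 and 225: the seed plus the added nodes form the community.
    return community


# Two triangles {0, 1, 2} and {3, 4, 5} joined by the edge (2, 3).
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
scores = {0: 0.5, 1: 0.3, 2: 0.25, 3: 0.1, 4: 0.05, 5: 0.04}
community = sweep_community(graph, scores, seed=0, patience=2)
```

In this example the conductivity falls from 1 to 0.5 to 1/7 as nodes 1 and 2 are added, so with a preset count of 2 the sweep stops at the first triangle.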
In order to merge similar communities and then perform re-mining, in an embodiment of the abnormal trading account group identification method provided by the present application, referring to fig. 8, the abnormal trading account group identification method further includes the following contents:
step 410: if a plurality of risk account communities are obtained through the local community mining step and preset non-similar requirements are not met among the risk account communities, merging the risk account communities with similar relationships, wherein the non-similar requirements comprise: the same initiating node is not included between the risk account communities.
It can be understood that the condition that the preset non-similarity requirement is not met among the risk account communities means that at least one pair of risk account communities contain the same initiating node.
Step 420: executing the local community mining step for the at least two high-risk accounts in each merged risk account community simultaneously, with all of these high-risk accounts serving as initiating nodes in the local community mining step, until all remaining risk account communities meet the preset non-similar requirements.
It can be understood that, in the first execution process of step 200, the local community mining step is executed only for one high-risk account each time, and in step 420, after two or more risk account communities with similarity are merged, at least two high-risk accounts appear in the merged community, and at this time, when step 200 is executed again, the local community mining step needs to be executed by taking the at least two high-risk accounts as current initiating nodes at the same time, so that not only risk account communities with overlapping relationships can be effectively merged, but also a more complete abnormal transaction account group can be found, and a more refined, complete and accurate abnormal transaction account group can be output. Therefore, the identification workload of the staff of the target financial institution can be reduced, the convenience and the efficiency of mining the incidence relation content of the staff, the accounts or the transaction relation and the like in the abnormal transaction account group by the staff of the target financial institution can be effectively improved, and the user experience of the staff of the target financial institution can be effectively improved.
In a specific example, if there are 1000 risk account communities obtained after step 200 and it is determined that similar risk account communities exist therein, after the processing in step 410, there are 403 combined risk account communities and only 3 non-combined risk account communities, then the 3 risk account communities are not processed for the moment, and step 420 is performed on the 403 combined risk account communities respectively. Then, for the unprocessed 3 risk account communities and the 403 risk account communities obtained through the step 420 and the re-executed step 200, whether similar risk account communities exist in the 3+403 risk account communities is judged again, and so on until the last remaining risk account communities all meet the preset non-similar requirement, then the step 300 is executed for the last remaining risk account communities.
Two risk account communities are similar in the following case: risk account community A1 contains node Z1, a high-risk account that is the initiating node of A1, and risk account community A2 also contains node Z1, as a non-initiating (ordinary) node, together with its own initiating node Z2. The merged risk account community A1+A2 then contains two high-risk accounts, node Z1 and node Z2, and step 200 must be executed with node Z1 and node Z2 as initiating nodes simultaneously.
As a further example, suppose risk account community A1 contains node Z1, a high-risk account that is the initiating node of A1, and node Z2 as a non-initiating (ordinary) node; risk account community A2 contains node Z3, a high-risk account that is the initiating node of A2, and node Z2 as a non-initiating (ordinary) node; and risk account community A3 contains node Z2, a high-risk account that is the initiating node of A3, and node Z4 as a non-initiating (ordinary) node. Because the ordinary node Z2 shared by A1 and A2 is the high-risk account of A3, the merged risk account community A1+A2+A3 contains three high-risk accounts, node Z1, node Z3 and node Z2, and step 200 must be executed with node Z1, node Z2 and node Z3 as initiating nodes simultaneously.
From the above description, the abnormal transaction account group identification method provided in the embodiment of the present application can further improve the accuracy of the finally obtained abnormal transaction account group identification result.
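The merging in step 410 can be sketched as follows, using the A1, A2, A3 example above plus one unrelated community. The function `merge_similar`, the (initiating nodes, all nodes) pair representation and the community containing Z8 and Z9 are hypothetical. Two communities are treated as similar when one contains an initiating node of the other, which is one reading of the non-similar requirement. The sketch only merges; re-running the local community mining step with all collected initiating nodes (step 420) is not shown.

```python
def merge_similar(communities):
    """Merge communities that contain each other's initiating nodes.

    communities: list of (initiating_nodes, all_nodes) pairs of sets, where
    every initiating node is also a member of all_nodes.
    """
    merged = [(set(seeds), set(nodes)) for seeds, nodes in communities]
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                seeds_i, nodes_i = merged[i]
                seeds_j, nodes_j = merged[j]
                if seeds_i & nodes_j or seeds_j & nodes_i:
                    # Keep every high-risk account as an initiating node of the union.
                    merged[i] = (seeds_i | seeds_j, nodes_i | nodes_j)
                    del merged[j]
                    changed = True
                    break
            if changed:
                break  # restart the scan, since the merged community may match others
    return merged


a1 = ({"Z1"}, {"Z1", "Z2"})
a2 = ({"Z3"}, {"Z3", "Z2"})
a3 = ({"Z2"}, {"Z2", "Z4"})
unrelated = ({"Z9"}, {"Z9", "Z8"})
result = merge_similar([a1, a2, a3, unrelated])
```

Here A1 and A2 are not directly similar, but both become similar to A3 through node Z2, so all three end up in one merged community with three initiating nodes.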
In order to provide a processing mode with only one community, in an embodiment of the abnormal transaction account group identification method provided by the present application, referring to fig. 9, the abnormal transaction account group identification method further includes the following steps:
step 500: and if one risk account community is obtained through the local community mining step, determining the risk account community as an abnormal transaction account group of the target financial institution.
From the above description, it can be known that the abnormal transaction account group identification method provided in the embodiment of the present application can comprehensively and highly adaptively improve the accuracy of the finally obtained abnormal transaction account group identification result.
To further explain the scheme, the application also provides a specific application example in which an abnormal transaction account group identification system implements the abnormal transaction account group identification method. The application example overcomes the shortcomings of existing abnormal transaction account identification methods and provides an abnormal transaction account group identification approach based on the page-rank algorithm and a local community mining algorithm: graph data is built from the attribute data of bank accounts and the transfer relations between accounts, high-risk accounts are screened out by a machine learning model, local community mining is performed with the high-risk accounts as starting points, and the mined communities are then merged to obtain account communities that risk identification experts can conveniently process further as abnormal transaction account groups. The aim is to make better use of the feature information (account attributes) and structural information (transfer relations) of bank accounts, provide a reference for personnel who detect abnormal financial behavior, reduce their workload, and improve the efficiency and probability of identifying abnormal transaction accounts.
The application example screens out high-risk account groups for further identification by experts, which improves the efficiency of identifying accounts that carry out abnormal transactions as a group. The construction and application of the whole abnormal transaction account group identification system can be summarized as follows: first, collect and organize the transaction data and attribute data of bank accounts; build the bank account graph data; apply page-rank to the whole account graph data; and perform community cutting to obtain risk account communities, which are then passed to risk detection experts for further detection.
Referring to fig. 10, a specific process for implementing the abnormal transaction account group identification method by using the abnormal transaction account group identification system is as follows:
(I) a first module: collecting transaction data and attribute data of bank accounts within a certain time period
The first module collects transaction data and attribute data of the bank account within a certain time period and constructs graph data for subsequent modules.
Transaction data for bank accounts are transfer records between accounts as edges for subsequent graph data. The attribute data of the bank account is divided into attribute characteristics of the account and transaction characteristics. The attribute characteristics of the account are attribute information of an account owner, such as the age of a legal person of a public account, the location of a business to which the business belongs, and the like; the transaction characteristics of the account are transaction attribute information of the account, such as the number of times of account transactions within a certain period of account opening.
(II) a second module: generating graph data
The second module constructs, from the data collected by the first module, graph data that has N nodes and reflects the associations between the accounts.
Graph data G consists of a set of vertices (or nodes) and a set of edges, and can be represented as
$$G = (V, E) \tag{1}$$
where $V$ and $E$ are the vertex set and the edge set respectively. See fig. 11, which shows graph data composed of bank account node 1 to bank account node 13, bank account node 15 and bank account node 16. All bank accounts form the set $V = \{v_1, v_2, v_3, \ldots, v_N\}$. All edges between accounts (that is, pairs of accounts with transfer records) form the set $E$. $d(v)$ is the degree of node $v$, namely the number of edges connected to node $v$. As shown in fig. 11, bank account node 2 is connected to bank account node 1 and bank account node 4, so bank account node 1 and bank account node 4 are the adjacent nodes of bank account node 2, and the degree of bank account node 2 is 2. $D$ is defined as a diagonal matrix with $D_{i,i} = d(v_i)$. $A$ is the link matrix of the graph data: if there is an edge (a transfer record) between account $i$ and account $j$, then $A_{ij} = 1$; otherwise $A_{ij} = 0$. The structure of the graph data represents its structural information.
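To make the definitions of the degree, the diagonal matrix D and the link matrix A concrete, the following sketch builds both matrices for a small undirected graph. The function name `degree_and_adjacency`, the zero-based node numbering and the example edges are hypothetical and do not reproduce fig. 11.

```python
def degree_and_adjacency(n, edges):
    """Return (D, A) as nested lists for an undirected graph on nodes 0..n-1."""
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = 1  # an edge means the two accounts have a transfer record
        A[j][i] = 1
    D = [[0] * n for _ in range(n)]
    for i in range(n):
        D[i][i] = sum(A[i])  # d(v_i): number of edges connected to node i
    return D, A


D, A = degree_and_adjacency(4, [(0, 1), (1, 3), (2, 3)])
degrees = [D[i][i] for i in range(4)]
```

Node 1 is connected to node 0 and node 3, so its degree is 2, mirroring the bank account node 2 example in the text.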
(III) a third module: identifying high risk accounts
And the third module predicts and obtains M high-risk accounts in the whole graph by training a machine learning model.
In group-type abnormal financial behavior, an abnormal transaction account group typically works as follows: some accounts carry out the abnormal financial behavior and move the money out, while other accounts act as members of the same group and receive the funds of unknown origin from those accounts, so that further operations (cash withdrawal, consumption and the like) can be performed without being traced. The accounts that carry out the abnormal financial behavior show anomalies and are easy to identify. The device selects a LightGBM model and trains it using, as samples, the attribute data of some accounts that have been identified by risk identification experts; the trained model then predicts and identifies M high-risk accounts, which serve as black seeds and as the starting points for the subsequent approximate page-rank calculation and local community discovery.
(IV) a fourth module: mining high-risk account communities
The fourth module performs page-rank calculation with each of the M high-risk accounts obtained by the third module as a starting point, so that each account obtains a page-rank value. The values are sorted, and M account communities are obtained using conductivity as the criterion. Communities are then merged according to their similarity, and after merging the page-rank calculation and account community division are repeated until the similarity between any communities is 0 or below a certain threshold, at which point the iterative calculation stops and K communities are obtained.
The high-risk community account mining module is a local community mining algorithm, and has the advantages that the association degree of the accounts only focuses on the predicted partial accounts (black seed nodes) during calculation, the whole graph is not calculated, the calculation amount is reduced, and meanwhile, the high-risk community account mining module is targeted and accords with the understanding of the behavior of the group-partner abnormal transaction accounts. Referring to fig. 12, the specific process of the fourth module extracting the account community is as follows:
(1) step 1: approximate page-rank calculation
And taking the given black seed node as an initial value, and carrying out approximate page-rank value calculation to obtain a page-rank value for each node in the graph.
For a given graph G, the page-rank algorithm based on a lazy (inert) random walk is computed iteratively as shown in formula (2):
$$p = a s + (1 - a)\, p W, \qquad W = (I + D^{-1} A) \tag{2}$$
Let the set of high-risk accounts used as starting points be $S$ and the set of other accounts be $S^c$. $s$ is a $1 \times N$ vector: if $i \in S$, then $s_i = 1$; if $i \in S^c$, then $s_i = 0$. $D$ is a diagonal matrix with $D_{i,i} = d(v_i)$. $A$ is the adjacency matrix: if there is an edge (a transfer record) between account $i$ and account $j$, then $A_{ij} = 1$. $I$ is the identity matrix. The structure of the graph data represents its structural information.
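The iteration in formula (2) can be sketched in plain Python as below. One assumption needs stating clearly: the sketch uses the standard lazy random walk matrix $W = \tfrac{1}{2}(I + D^{-1}A)$, because the iteration only converges when the rows of $W$ sum to 1; the factor $\tfrac{1}{2}$ does not appear in formula (2) as printed. The function name `lazy_pagerank`, the value of `a`, the fixed iteration count and the handling of isolated nodes are also hypothetical.

```python
def lazy_pagerank(A, seeds, a=0.15, iterations=200):
    """Iterate p = a*s + (1 - a)*p*W with the assumed W = (I + D^-1 A) / 2."""
    n = len(A)
    degree = [sum(row) for row in A]
    s = [1.0 if i in seeds else 0.0 for i in range(n)]
    p = s[:]
    for _ in range(iterations):
        new_p = [a * s[j] for j in range(n)]
        for i in range(n):
            if degree[i] == 0:
                new_p[i] += (1 - a) * p[i]  # assumption: an isolated node keeps its mass
                continue
            new_p[i] += (1 - a) * 0.5 * p[i]  # lazy half: stay on node i
            share = (1 - a) * 0.5 * p[i] / degree[i]
            for j in range(n):
                if A[i][j]:
                    new_p[j] += share  # walking half: spread evenly over neighbors
        p = new_p
    return p


# Path graph 0 - 1 - 2 - 3 with node 0 as the only starting point.
A = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
p = lazy_pagerank(A, seeds={0})
```

Every pass touches all N nodes, which is the cost that the approximate page-rank algorithm described next is meant to avoid.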
Matrix calculation is not only time-consuming but also has a high memory requirement, so researchers have proposed an approximate page-rank algorithm that can quickly compute an approximation of $p$ in formula (2). Specifically, the initial vectors are set to $\tilde{p} = 0$ and $r = s$. Both $r$ and $\tilde{p}$ are $1 \times N$ vectors: $r$ holds the page-rank residual value of each node, and $\tilde{p}$ is the approximation vector of $p$. Each node $v_i$ diffuses part of its residual $r_i$ onto the components $r_j$ of its neighboring nodes $v_j$. The approximate page-rank algorithm repeatedly searches for a node whose diffusion value is greater than a certain threshold and distributes that node's page-rank value to its neighbor nodes. When the page-rank residual values of all nodes are smaller than the threshold, the algorithm ends. See the pseudo code shown in table 1 for details.
TABLE 1: pseudo code of the approximate page-rank algorithm (Figure BDA0002561657150000191)
And taking part of the high-risk accounts predicted by the third module as starting points, and obtaining an individualized page-rank value for each account through approximate page-rank calculation. The page-rank value of each account reflects the degree of association with the high-risk account (black seed account).
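Since the pseudo code of table 1 is only available as an image, the following is a sketch of the well-known push-style approximate page-rank procedure of Andersen, Chung and Lang, which matches the description above (residual vector initialized to s, repeated diffusion from nodes whose residual per degree exceeds a threshold). It is not necessarily identical to table 1. The function name `approximate_pagerank`, the parameter values `a` and `eps`, the lazy-walk factor of one half and the example graph are assumptions.

```python
from collections import deque


def approximate_pagerank(graph, seeds, a=0.15, eps=1e-4):
    """Push residual mass outward from the seeds until every residual is small."""
    p = {}                             # approximation vector, zero by default
    r = {seed: 1.0 for seed in seeds}  # residual vector, initialized to s
    queue = deque(seeds)
    while queue:
        u = queue.popleft()
        degree = len(graph.get(u, ()))
        if degree == 0 or r.get(u, 0.0) < eps * degree:
            continue                   # residual per degree is below the threshold
        residual = r[u]
        p[u] = p.get(u, 0.0) + a * residual       # settle a fraction at u
        r[u] = (1 - a) * residual / 2             # lazy half stays as residual
        share = (1 - a) * residual / (2 * degree)
        for v in graph[u]:
            r[v] = r.get(v, 0.0) + share          # other half goes to the neighbors
            if r[v] >= eps * len(graph[v]):
                queue.append(v)
        if r[u] >= eps * degree:
            queue.append(u)
    return p, r


# Two triangles {0, 1, 2} and {3, 4, 5} joined by the edge (2, 3); node 0 is the seed.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
eps = 1e-4
p, r = approximate_pagerank(graph, seeds=[0], eps=eps)
```

Only nodes reachable through sufficiently large residuals are ever visited, so the work depends on the neighborhood of the seeds and not on the size of the whole graph.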
(2) Step 2: account community segmentation
Account community segmentation yields an account community that is closely associated with the black seed accounts and whose member accounts are closely associated with one another.
In the conventional community mining, the conductivity (conductivity) and the Modularity (modulation) are widely used indexes, and the community is divided by using the conductivity as the index in the system. Conductivity of set S is defined as:
Figure BDA0002561657150000192
where A_ij is the adjacency matrix of the graph, d_i and d_j are the degrees of nodes v_i and v_j respectively, and S^c is the set of nodes that are not in the set S. In general, (formula image BDA0002561657150000201) is far greater than (formula image BDA0002561657150000202), so the denominator of the formula can be written as (formula image BDA0002561657150000203).
Therefore, according to equation (3), the conductivity can be understood as the value obtained by dividing the degree of closeness between the nodes in the community by the degree of closeness between the nodes in the community and the nodes outside the community. On the other hand, the page-rank value of an account calculated by the approximate page-rank algorithm represents its degree of closeness to the high-risk seed accounts. Therefore, on the basis of selecting the accounts that are closely related to the black seed accounts, account community division needs to select as the community a set whose nodes are closely related internally and sparsely related to nodes outside the community.
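For illustration, the index can be computed as in the sketch below. Equation (3) is an image that is not reproduced here, so the sketch assumes the standard graph definition of conductance (edges leaving S divided by the smaller of the two degree sums), which may differ from the source formula in orientation. The function name `conductivity` and the toy graph are illustrative assumptions.

```python
def conductivity(adj, community):
    """Standard conductance of a node set (assumed form of equation (3)).

    adj       : dict mapping each node to a list of neighbours (symmetric)
    community : iterable of nodes forming the set S
    """
    s = set(community)
    # Number of edges with one endpoint in S and the other in S^c.
    cut = sum(1 for u in s for v in adj[u] if v not in s)
    vol_s = sum(len(adj[u]) for u in s)                # degree sum of S
    vol_rest = sum(len(adj[u]) for u in adj) - vol_s   # degree sum of S^c
    denom = min(vol_s, vol_rest)
    # Undefined when either side has no edges; 0.0 is an arbitrary choice.
    return cut / denom if denom > 0 else 0.0


# Toy graph: two triangles {0, 1, 2} and {3, 4, 5} joined by the edge 2-3.
toy_adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
           3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
phi_triangle = conductivity(toy_adj, {0, 1, 2})   # 1 cut edge / degree sum 7
phi_pair = conductivity(toy_adj, {0, 1})          # 2 cut edges / degree sum 4
```

Under this assumed definition, the full triangle scores lower than the pair of nodes, reflecting that it is more tightly connected internally and more sparsely connected to the rest of the graph.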
Referring to fig. 13, the account community segmentation process in step 2 may be subdivided into the following steps:
S21, sorting the accounts by page-rank value and setting the black seed accounts as the initial community: the accounts are sorted by page-rank value in descending order to obtain a sequence L, and the initial community is set to the black seed accounts, S = {v_1, v_2, ..., v_n}, where the nodes v_1, v_2, ..., v_n are the black seed accounts.
S22, expanding the community and calculating the community conductivity: the account with the largest page-rank value is taken out of the sequence L and added to the community S to obtain a new account community S', so that the sequence L loses one account and the community S gains one account. The conductivity Φ(S') of the account community is then calculated.
S23, judging whether the conductivity has decreased w consecutive times: if the conductivity has decreased fewer than w consecutive times, jump to S22; otherwise, jump to S24.
S24, outputting the account community: the account community (set) is output and the algorithm ends.
Step 2 finally yields K preliminary account communities.
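The loop S21 to S24 can be sketched as follows. The sketch follows the stopping rule literally as stated (stop once the index has decreased w consecutive times) and takes the index as a pluggable function, because equation (3) is not reproduced here; the helper `conductance`, the function name `sweep_community`, the page-rank values, and the choice not to compare against the index of the initial community are all illustrative assumptions.

```python
def conductance(adj, community):
    """Standard conductance, used here as an assumed stand-in for equation (3)."""
    s = set(community)
    cut = sum(1 for u in s for v in adj[u] if v not in s)
    vol_s = sum(len(adj[u]) for u in s)
    denom = min(vol_s, sum(len(adj[u]) for u in adj) - vol_s)
    return cut / denom if denom > 0 else 0.0


def sweep_community(adj, page_rank, seeds, w, score=conductance):
    """Grow a community from the seeds following steps S21 to S24."""
    # S21: sort the non-seed accounts by page-rank value, largest first.
    order = sorted((v for v in page_rank if v not in seeds),
                   key=lambda v: page_rank[v], reverse=True)
    community = set(seeds)
    previous, decreases = None, 0
    for node in order:
        # S22: move the top-ranked account into the community and score it.
        community.add(node)
        phi = score(adj, community)
        if previous is not None and phi < previous:
            decreases += 1
        else:
            decreases = 0
        previous = phi
        # S23: stop after w consecutive decreases, otherwise keep expanding.
        if decreases >= w:
            break
    # S24: output the account community.
    return community


# Toy graph: two triangles {0, 1, 2} and {3, 4, 5} joined by the edge 2-3.
toy_adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
           3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
# Hypothetical page-rank values for seed account 0.
toy_rank = {0: 0.40, 1: 0.20, 2: 0.18, 3: 0.05, 4: 0.02, 5: 0.02}
found = sweep_community(toy_adj, toy_rank, seeds={0}, w=1)
```

With these hypothetical values and w = 1, the loop stops after the triangle around the seed has been collected. Whether a decrease of the index signals a better or a worse community depends on the orientation of equation (3), so the stopping rule and `score` should be adjusted together.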
(3) Step 3: determining whether two different communities have the same black seed account
Judge whether two different communities contain the same black seed account at the same time. If so, take the coinciding black seed accounts of the two communities as starting points and perform the approximate page-rank calculation of step 1 again, iterating until no two different communities contain the same black seed account; then jump to step 4.
(4) Step 4: outputting the account communities
Output the Q account communities mined from the data of the whole graph.
(V) Fifth module: abstracting account communities
Output the high-risk account communities.
(VI) Sixth module: risk community scoring
From the above description, the abnormal transaction account group identification system and method provided in the embodiment of the present application mine bank abnormal transaction account communities using an approximate page-rank algorithm combined with a local community mining algorithm. Local account community mining starts from definite high-risk accounts and discovers account groups that are closely connected internally and sparsely connected externally, which conforms to group-type abnormal financial behavior and makes account community mining more targeted. Because mining starts from definite high-risk accounts, calculation consumption is low, and bank account communities engaged in group-type abnormal transactions can be mined quickly and accurately, which greatly saves manpower and improves the efficiency and probability of identifying abnormal transaction accounts. Practical application shows that the method helps banking staff detect group-type abnormal financial behavior more efficiently and greatly improves the efficiency of abnormal financial work. The method has the following advantages: account communities are obtained by mining, so that, for group-type abnormal financial behavior, accounts that are hidden in a group and closely related to other accounts in the group can be identified, which meets abnormal financial requirements; local account community mining takes high-risk accounts as starting points, so it is targeted and the resulting communities conform to group-type abnormal financial behavior; and applying a local community mining algorithm keeps calculation consumption low, so the calculation is efficient and fast.
In terms of software, in order to solve the problem that the existing identification method cannot achieve both efficiency and accuracy, the present application provides an embodiment of an abnormal transaction account group identification apparatus for executing all or part of the contents in the abnormal transaction account group identification method, and referring to fig. 14, the abnormal transaction account group identification apparatus specifically includes the following contents:
The high-risk account determining module 10 is configured to input the attribute information of each account corresponding to the target financial institution into a machine learning model for predicting account risk, and to determine at least one account as a high-risk account according to the output of the machine learning model.
The local mining module 20 is configured to perform a local community mining step for each high-risk account, where the local community mining step includes: taking the current high-risk account as an initiating node, determining, in the graph data containing the initiating node, the association value between each of the other nodes and the initiating node, and determining the risk account community corresponding to the current high-risk account according to these association values and a preset close association judgment rule.
The first group determining module 30 is configured to determine each risk account community as an abnormal transaction account group of the target financial institution if multiple risk account communities are obtained through the local community mining step and the preset non-similar requirements are met among the risk account communities.
From the above description, it can be seen that the abnormal transaction account group identification device provided in the embodiment of the present application uses a machine learning model to determine the initiating nodes for local community mining and performs local account community mining with high-risk accounts as starting points. The mining is therefore targeted, and the resulting communities conform to group-type abnormal financial behavior. Mining from the initiating nodes improves the efficiency, effectiveness, and accuracy of local community mining, reduces the amount of calculation required to mine the risk account community corresponding to each high-risk account, and thereby improves the efficiency, reliability, and accuracy of abnormal transaction account group identification. Accounts that are hidden in a group and closely associated with other accounts in the group can be identified, which meets abnormal financial requirements. Because a definite high-risk account is used as the starting point, calculation consumption is low, bank account communities engaged in group-type abnormal financial behavior can be mined quickly and accurately, manpower is greatly saved, and the efficiency and probability of identifying abnormal transaction accounts are improved. The device can help banking staff detect group-type abnormal financial behavior more efficiently, greatly improve the efficiency of abnormal financial work, and further improve the operational safety and reliability of the financial institution that identifies abnormal transaction account groups.
In order to further effectively determine the initiating node, in an embodiment of the abnormal transaction account group identification apparatus provided in the present application, the high-risk account determination module 10 in the abnormal transaction account group identification apparatus is configured to perform the following:
Step 110: input the attribute information of each account corresponding to the target financial institution into a LightGBM model, and determine at least one account as a high-risk account according to the output of the LightGBM model. The LightGBM model is obtained by pre-training on an attribute information training set, which comprises the attribute information of a plurality of historical accounts and the label corresponding to each historical account; each label indicates whether the corresponding historical account is a high-risk account.
It can be understood that the LightGBM model is an evolved version of the GBDT model and a newer member of the family of boosting ensemble models. It was released by Microsoft and, like XGBoost, is an efficient implementation of GBDT. In principle it is similar to GBDT and XGBoost: the negative gradient of the loss function is used as an approximation of the residual of the current decision tree in order to fit a new decision tree. The LightGBM model has the following features: a histogram-based decision tree algorithm, a leaf-wise leaf growth strategy with a depth limit, histogram subtraction acceleration, direct support for categorical features, cache hit rate optimization, histogram-based sparse feature optimization, and multithreading optimization.
From the above description, the abnormal transaction account group identification device provided in the embodiment of the present application can effectively improve the accuracy and efficiency of initiating node selection, and can thereby further improve the efficiency, effectiveness, and accuracy of local community mining from the initiating node.
In order to obtain account information in advance, in an embodiment of the abnormal transaction account group identification apparatus provided in the present application, referring to fig. 15, the abnormal transaction account group identification apparatus further includes the following contents:
The account information acquisition module 01 is configured to acquire, for each account of the target financial institution within a preset time period, the attribute information and the transaction information used for constructing the graph data.
The attribute information comprises attribute characteristic information and transaction characteristic information, and the transaction information comprises transfer record information between accounts.
As can be seen from the above description, the abnormal transaction account group identification device provided in the embodiment of the present application provides a reliable data basis for subsequently determining the initiating node from the attribute information, which further improves the efficiency and accuracy of obtaining the initiating node, and provides an accurate and reliable data basis for subsequently constructing the graph data, which further improves the efficiency and accuracy of local community mining.
In order to construct graph data in advance, in an embodiment of the abnormal transaction account group identification apparatus provided in the present application, referring to fig. 16, the abnormal transaction account group identification apparatus further includes the following contents:
The graph data construction module 02 is configured to use the transaction information of each account to construct graph data that reflects the association relationships between the accounts, where the nodes in the graph data correspond one to one to the accounts, and an edge in the graph data represents the transaction information between two adjacent nodes.
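A minimal sketch of such a construction is given below. It assumes an unweighted, undirected graph in which two accounts are connected if at least one transfer record exists between them, in line with the adjacency matrix described earlier; the function name `build_graph`, the record layout `(payer, payee)`, and the account identifiers are illustrative assumptions.

```python
def build_graph(accounts, transfers):
    """Build an undirected account graph from transfer records.

    accounts  : iterable of account identifiers (one node per account)
    transfers : iterable of (payer, payee) pairs, one per transfer record
    Returns a dict mapping each account to the set of its neighbours.
    """
    adj = {account: set() for account in accounts}
    for payer, payee in transfers:
        if payer == payee:
            continue  # ignore self-transfers in this sketch
        # Repeated transfers between the same pair collapse into one edge.
        adj.setdefault(payer, set()).add(payee)
        adj.setdefault(payee, set()).add(payer)
    return adj


graph = build_graph(
    accounts=["A", "B", "C", "D"],
    transfers=[("A", "B"), ("A", "B"), ("B", "C")],
)
```

Account "D" has no transfer records in this example, so it remains an isolated node. Edge attributes such as amounts or transfer counts could be stored alongside the neighbour sets if the association values are meant to be weighted.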
From the above description, the abnormal transaction account group identification device provided in the embodiment of the present application can effectively improve the efficiency and reliability of graph data construction, and can thereby further improve the efficiency and accuracy of local community mining.
In order to further filter communities, in an embodiment of the abnormal transaction account group identification apparatus provided in the present application, referring to fig. 17, a local mining module 20 in the abnormal transaction account group identification apparatus specifically includes the following contents:
The initial relationship determining submodule 21 is configured to take the current high-risk account as an initiating node and determine, in the graph data containing the initiating node, the association value between each of the other nodes and the initiating node.
The affinity determining submodule 22 is configured to screen, among the nodes in the graph data other than the initiating node, the nodes whose internal association relationship satisfies a preset close association judgment rule, and to generate the risk account community corresponding to the current high-risk account from these nodes and the initiating node.
As can be seen from the above description, the abnormal transaction account group identification device provided in the embodiment of the present application can effectively improve the reliability, accuracy and effectiveness of local community mining performed by the application initiating node, and further can effectively ensure the reliability of the account community of the acquired high-risk account.
In order to further determine the correlation value, in an embodiment of the abnormal transaction account group identification apparatus provided in the present application, referring to fig. 18, the initial relationship determining submodule 21 in the abnormal transaction account group identification apparatus specifically includes the following contents:
An approximate page-rank calculation unit 2101, the approximate page-rank calculation unit 2101 being configured to perform the following:
Step 211: take the current high-risk account as an initiating node, and perform approximate page-rank calculation on the graph data containing the initiating node to obtain the page-rank values of the nodes in the graph data other than the initiating node, where a page-rank value represents the degree of association between the corresponding node and the initiating node.
It can be understood that the approximate page-rank algorithm is derived from page-rank, the web page ranking algorithm of the Google search engine, which constructs a graph of all web pages: each web page is a node, and if one web page links to another, a directed edge connects the two nodes. The calculation process of the page-rank algorithm is similar to a Markov chain, and the algorithm likewise has a probability transition matrix.
As can be seen from the above description, the abnormal transaction account group identification device provided in the embodiment of the present application can effectively improve the accuracy and reliability of obtaining the association values between each node and the originating node in the graph data, and further can effectively improve the reliability and accuracy of local community mining by using the originating node.
In order to further perform node screening, in an embodiment of the abnormal transaction account group identification apparatus provided in the present application, referring to fig. 19, an affinity determination submodule 22 in the abnormal transaction account group identification apparatus specifically includes the following contents:
A conductivity calculation unit 2201, the conductivity calculation unit 2201 being configured to perform the following:
Step 221: sort the nodes by their association values with the initiating node in descending order to obtain a sequence of the sorted nodes, and take the initiating node as the initial account community.
Step 222 (conductivity obtaining step): extract the first node in the current sequence, add the extracted node to the account community, and obtain the current conductivity value of the account community.
Step 223: judge whether the conductivity value of the account community has decreased a preset number of consecutive times; if so, execute step 224; if not, return to step 222.
Step 224: determine all nodes in the current account community other than the initiating node as the nodes whose internal association relationship satisfies the preset close association judgment rule.
Step 225: form the risk account community corresponding to the current high-risk account from the initiating node and the nodes whose internal association relationship satisfies the preset close association judgment rule.
From the above description, the abnormal transaction account group identification device provided in the embodiment of the present application can effectively improve the accuracy and reliability of account community segmentation, and further can effectively improve the reliability and accuracy of local community mining by using the originating node.
In order to merge similar communities and then perform re-mining, in an embodiment of the abnormal transaction account group identification apparatus provided in the present application, referring to fig. 20, the abnormal transaction account group identification apparatus further includes the following contents:
A second group determining module 40, the second group determining module 40 being configured to perform the following:
Step 410: if a plurality of risk account communities are obtained through the local community mining step and the preset non-similar requirements are not met among the risk account communities, merge the risk account communities that have a similar relationship, where the non-similar requirements comprise: no two risk account communities contain the same initiating node.
Step 420: execute the local community mining step simultaneously for the at least two high-risk accounts in each merged risk account community, with all of these high-risk accounts serving as initiating nodes in the local community mining step, and repeat until the remaining risk account communities all meet the preset non-similar requirements.
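The check and merge described in steps 410 and 420 can be sketched as follows. The sketch only determines which initiating nodes must be mined again together; it assumes that each community is stored under the initiating node it was mined from, and the function name `group_seeds_for_remining` and the account identifiers are illustrative assumptions.

```python
def group_seeds_for_remining(communities):
    """Group initiating nodes whose communities share an initiating node.

    communities : dict mapping each initiating node to the set of accounts
                  in the community mined from it
    Returns a sorted list of sorted seed lists. A list with a single seed
    already meets the non-similar requirement; a longer list is a merged
    community whose seeds are to be mined again together.
    """
    seeds = list(communities)
    seed_set = set(seeds)
    parent = {s: s for s in seeds}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, a in enumerate(seeds):
        for b in seeds[i + 1:]:
            # Two communities are similar if both contain the same seed.
            if communities[a] & communities[b] & seed_set:
                parent[find(a)] = find(b)

    groups = {}
    for s in seeds:
        groups.setdefault(find(s), []).append(s)
    return sorted(sorted(group) for group in groups.values())


mined = {
    "s1": {"s1", "x", "s2"},   # contains the initiating node s2 as well
    "s2": {"s2", "y"},
    "s3": {"s3", "z"},
}
remine_groups = group_seeds_for_remining(mined)
```

Here the communities of s1 and s2 both contain the initiating node s2, so they are merged and s1 and s2 would be used together as initiating nodes in the next round of local community mining, while the community of s3 is kept as it is.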
As can be seen from the above description, the abnormal transaction account group identification apparatus provided in the embodiment of the present application can further improve the accuracy of the finally obtained abnormal transaction account group identification result.
In order to provide a processing mode for the case where only one community is obtained, in an embodiment of the abnormal transaction account group identification apparatus provided in the present application, referring to fig. 21, the abnormal transaction account group identification apparatus further includes the following contents:
A third group determining module 50, the third group determining module 50 being configured to perform the following:
Step 500: if only one risk account community is obtained through the local community mining step, determine that risk account community as the abnormal transaction account group of the target financial institution.
As can be seen from the above description, the abnormal transaction account group identification apparatus provided in the embodiment of the present application can comprehensively and highly adaptively improve the accuracy of the finally obtained abnormal transaction account group identification result.
In terms of hardware, in order to solve the problem that the existing identification method cannot achieve both efficiency and accuracy, the present application provides an embodiment of an electronic device for implementing all or part of the contents in the abnormal transaction account group identification method, where the electronic device specifically includes the following contents:
Fig. 22 is a schematic block diagram of a system configuration of an electronic device 9600 according to an embodiment of the present application. As shown in fig. 22, the electronic device 9600 can include a central processor 9100 and a memory 9140; the memory 9140 is coupled to the central processor 9100. Notably, fig. 22 is exemplary; other types of structures may also be used in addition to or in place of this structure to implement telecommunications or other functions.
In one embodiment, the anomalous transaction account group identification function may be integrated into the central processor. Wherein the central processor may be configured to control:
Step 100: input the attribute information of each account corresponding to the target financial institution into a machine learning model for predicting account risk, and determine at least one account as a high-risk account according to the output of the machine learning model.
In step 100, the machine learning model may, for example, be a decision tree model that predicts account risk, and the accounts corresponding to the target financial institution may be obtained in advance for a preset time period. The output of the machine learning model includes high-risk identification results and low-risk identification results, and each account corresponding to a high-risk identification result is determined to be a high-risk account.
Step 200: execute a local community mining step for each high-risk account, where the local community mining step includes: taking the current high-risk account as an initiating node, determining, in the graph data containing the initiating node, the association value between each of the other nodes and the initiating node, and determining the risk account community corresponding to the current high-risk account according to these association values and a preset close association judgment rule.
It can be understood that the graph data refers to a node connection graph for representing relationships between the accounts, each node in the graph data corresponds to each account one to one, and an edge in the graph data is used for representing transaction information between two adjacent nodes.
Step 300: if a plurality of risk account communities are obtained through the local community mining step and the preset non-similar requirements are met among the risk account communities, determine each risk account community as an abnormal transaction account group of the target financial institution.
In step 300, after the abnormal transaction account groups of the target financial institution are determined, they are output so that the target financial institution can promptly perform operations on them such as historical data verification, real-time transaction monitoring and evidence collection, and reporting to the risk control system, which effectively improves the operational safety and reliability of the financial institution that identifies abnormal transaction account groups.
From the above description, it can be seen that the electronic device provided in the embodiment of the present application uses a machine learning model to determine the initiating nodes for local community mining and performs local account community mining with high-risk accounts as starting points. The mining is therefore targeted, and the resulting communities conform to group-type abnormal financial behavior. Mining from the initiating nodes improves the efficiency, effectiveness, and accuracy of local community mining, reduces the amount of calculation required to mine the risk account community corresponding to each high-risk account, and thereby improves the efficiency, reliability, and accuracy of abnormal transaction account group identification. Accounts that are hidden in a group and closely associated with other accounts in the group can be identified, which meets abnormal financial requirements. Because a definite high-risk account is used as the starting point, calculation consumption is low, bank account communities engaged in group-type abnormal financial behavior can be mined quickly and accurately, manpower is greatly saved, and the efficiency and probability of identifying abnormal transaction accounts are improved. The electronic device can help banking staff detect group-type abnormal financial behavior more efficiently, greatly improve the efficiency of abnormal financial work, and further improve the operational safety and reliability of the financial institution that identifies abnormal transaction account groups.
In another embodiment, the abnormal transaction account group recognition device may be configured separately from the central processor 9100, for example, the abnormal transaction account group recognition device may be configured as a chip connected to the central processor 9100, and the abnormal transaction account group recognition function is realized by the control of the central processor.
As shown in fig. 22, the electronic device 9600 may further include: a communication module 9110, an input unit 9120, an audio processor 9130, a display 9160, and a power supply 9170. It is noted that the electronic device 9600 also does not necessarily include all of the components shown in fig. 22; in addition, the electronic device 9600 may further include components not shown in fig. 22, which can be referred to in the related art.
As shown in fig. 22, a central processor 9100, sometimes referred to as a controller or operational control, can include a microprocessor or other processor device and/or logic device, which central processor 9100 receives input and controls the operation of the various components of the electronic device 9600.
The memory 9140 can be, for example, one or more of a buffer, a flash memory, a hard drive, a removable media, a volatile memory, a non-volatile memory, or other suitable device. The information relating to the failure may be stored, and a program for executing the information may be stored. And the central processing unit 9100 can execute the program stored in the memory 9140 to realize information storage or processing, or the like.
The input unit 9120 provides input to the central processor 9100. The input unit 9120 is, for example, a key or a touch input device. Power supply 9170 is used to provide power to electronic device 9600. The display 9160 is used for displaying display objects such as images and characters. The display may be, for example, an LCD display, but is not limited thereto.
The memory 9140 can be a solid state memory, e.g., Read Only Memory (ROM), Random Access Memory (RAM), a SIM card, or the like. There may also be a memory that holds information even when power is off, can be selectively erased, and is provided with more data, an example of which is sometimes called an EPROM or the like. The memory 9140 could also be some other type of device. Memory 9140 includes a buffer memory 9141 (sometimes referred to as a buffer). The memory 9140 may include an application/function storage portion 9142, the application/function storage portion 9142 being used for storing application programs and function programs or for executing a flow of operations of the electronic device 9600 by the central processor 9100.
The memory 9140 can also include a data store 9143, the data store 9143 being used to store data, such as contacts, digital data, pictures, sounds, and/or any other data used by an electronic device. The driver storage portion 9144 of the memory 9140 may include various drivers for the electronic device for communication functions and/or for performing other functions of the electronic device (e.g., messaging applications, contact book applications, etc.).
The communication module 9110 is a transmitter/receiver 9110 that transmits and receives signals via an antenna 9111. The communication module (transmitter/receiver) 9110 is coupled to the central processor 9100 to provide input signals and receive output signals, which may be the same as in the case of a conventional mobile communication terminal.
Based on different communication technologies, a plurality of communication modules 9110, such as a cellular network module, a bluetooth module, and/or a wireless local area network module, may be provided in the same electronic device. The communication module (transmitter/receiver) 9110 is also coupled to a speaker 9131 and a microphone 9132 via an audio processor 9130 to provide audio output via the speaker 9131 and receive audio input from the microphone 9132, thereby implementing ordinary telecommunications functions. The audio processor 9130 may include any suitable buffers, decoders, amplifiers and so forth. In addition, the audio processor 9130 is also coupled to the central processor 9100, thereby enabling recording locally through the microphone 9132 and enabling locally stored sounds to be played through the speaker 9131.
An embodiment of the present application further provides a computer-readable storage medium capable of implementing all the steps in the abnormal transaction account group identification method in the foregoing embodiment, where the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the computer program implements all the steps of the abnormal transaction account group identification method in the foregoing embodiment, where the execution subject is a server or a client, for example, when the processor executes the computer program, the processor implements the following steps:
Step 100: input the attribute information of each account corresponding to the target financial institution into a machine learning model for predicting account risk, and determine at least one account as a high-risk account according to the output of the machine learning model.
In step 100, the machine learning model may, for example, be a decision tree model that predicts account risk, and the accounts corresponding to the target financial institution may be obtained in advance for a preset time period. The output of the machine learning model includes high-risk identification results and low-risk identification results, and each account corresponding to a high-risk identification result is determined to be a high-risk account.
Step 200: execute a local community mining step for each high-risk account, where the local community mining step includes: taking the current high-risk account as an initiating node, determining, in the graph data containing the initiating node, the association value between each of the other nodes and the initiating node, and determining the risk account community corresponding to the current high-risk account according to these association values and a preset close association judgment rule.
It can be understood that the graph data refers to a node connection graph representing the relationships between the accounts: the nodes in the graph data correspond one-to-one to the accounts, and each edge in the graph data represents the transaction information between the two adjacent nodes it connects.
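One way to build such graph data from transfer records is sketched below. The record layout `(payer, payee, amount)` and the choice to aggregate all transfers between two accounts into a single undirected edge weight are assumptions made for illustration; the application only states that an edge represents the transaction information between two adjacent nodes.

```python
from collections import defaultdict


def build_graph(transfers):
    """Build an undirected weighted adjacency map from transfer records.

    Each record is a hypothetical (payer, payee, amount) tuple. The edge
    weight is the total amount transferred between the two accounts in
    either direction.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for payer, payee, amount in transfers:
        if payer == payee:
            continue  # a self-transfer creates no relationship between accounts
        graph[payer][payee] += amount
        graph[payee][payer] += amount
    # Convert to plain dicts so that lookups of unknown accounts raise KeyError.
    return {node: dict(nbrs) for node, nbrs in graph.items()}


transfers = [("A", "B", 100.0), ("B", "A", 50.0), ("B", "C", 20.0)]
graph = build_graph(transfers)
```

Here `graph["A"]["B"]` holds the aggregated amount between accounts A and B, and every account that appears in at least one transfer becomes exactly one node.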
Step 300: if a plurality of risk account communities are obtained through the local community mining step and the preset non-similar requirements are met among the risk account communities, determining each of the risk account communities as an abnormal transaction account group of the target financial institution.
In step 300, after the abnormal transaction account groups of the target financial institution are determined, they are output so that the target financial institution can promptly perform operations such as historical data verification, real-time transaction monitoring and evidence collection, and reporting to the risk control system on those groups, which effectively improves the operational safety and reliability of the financial institution in identifying abnormal transaction account groups.
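The non-similar check in step 300 might look like the sketch below. It reads the requirement as "no risk account community contains the initiating node of another community", which is one interpretation of the wording in the claims and should be treated as an assumption. Communities that violate it are grouped so that their initiating nodes can be mined again together; the function name and data layout are hypothetical.

```python
def group_similar_communities(communities):
    """Group initiating nodes whose communities are similar to each other.

    `communities` maps each initiating node to the set of its community
    members. Two communities are treated as similar when either one
    contains the other's initiating node (an assumed interpretation).
    Returns a list of sets of initiating nodes.
    """
    groups = []
    for seed, members in communities.items():
        linked = [
            g for g in groups
            if any(seed in communities[s] or s in members for s in g)
        ]
        merged = {seed}.union(*linked)
        groups = [g for g in groups if g not in linked] + [merged]
    return groups


communities = {
    "A": {"A", "B", "C"},
    "C": {"C", "D"},
    "X": {"X", "Y"},
}
groups = group_similar_communities(communities)
# The non-similar requirement holds only if every group has a single seed.
meets_non_similar = all(len(g) == 1 for g in groups)
```

In this example the communities of A and C fall into one group because A's community contains C, so they would be merged and mined again with both A and C as initiating nodes, while X's community already stands alone.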
As can be seen from the above description, the computer-readable storage medium provided in the embodiment of the present application uses a machine learning model to determine the initiating nodes for local community mining and then mines local account communities with each high-risk account as the starting point. Because the mining starts from a clearly identified high-risk account, it is targeted, and the resulting community conforms to group-type abnormal financial behavior. Mining locally from the initiating node effectively improves the efficiency, effectiveness, and accuracy of local community mining and reduces the amount of calculation required to mine the group-type abnormal financial behavior corresponding to a high-risk account, which in turn improves the efficiency, reliability, and accuracy of abnormal transaction account group identification. The method can also identify accounts that are hidden within a group and closely associated with the other accounts in the group, which meets the requirements of abnormal financial work. With low computational cost, group-type bank account communities engaged in abnormal financial behavior can be mined quickly and accurately, which greatly saves manpower and improves the efficiency and probability of identifying abnormal transaction accounts. The method can help banking staff detect group-type abnormal financial behavior more efficiently, greatly improve the efficiency of abnormal financial work, and thereby effectively improve the operational safety and reliability of financial institutions in identifying abnormal transaction account groups.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, apparatus, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (devices), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Specific embodiments are used herein to explain the principles and implementations of the invention, and the description of the embodiments is intended only to help in understanding the method and core idea of the invention. Meanwhile, a person skilled in the art may, following the idea of the invention, vary the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as limiting the invention.

Claims (20)

1. A method for identifying abnormal transaction account groups is characterized by comprising the following steps:
respectively inputting the attribute information of each account corresponding to the target financial institution into a machine learning model for predicting the risk of the account, and determining at least one account as a high-risk account according to the output of the machine learning model;
respectively executing a local community mining step for each high-risk account, wherein the local community mining step comprises the following steps: respectively determining the association values between other nodes and the initiating node in graph data containing the initiating node by taking the current high-risk account as the initiating node, and determining a risk account community corresponding to the current high-risk account according to the association values between the nodes and the initiating node and a preset close association judgment rule;
and if a plurality of risk account communities are obtained through the local community mining step and preset non-similar requirements are met among the risk account communities, determining the risk account communities as abnormal transaction account groups of the target financial institution respectively.
2. The abnormal transaction account group identification method according to claim 1, wherein the step of inputting the attribute information of each account corresponding to the target financial institution into a machine learning model for predicting the risk of the account, and determining at least one account as a high-risk account according to the output of the machine learning model comprises:
respectively inputting the attribute information of each account corresponding to the target financial institution into a LightGBM model, and determining at least one account as a high-risk account according to the output of the LightGBM model;
the LightGBM model is obtained by pre-training based on an attribute information training set, wherein the attribute information training set comprises attribute information of a plurality of historical accounts and labels corresponding to the historical accounts, and the labels are used for indicating whether the corresponding historical accounts are high-risk accounts or not.
3. The abnormal transaction account group identification method according to claim 1, wherein before the step of inputting the attribute information of each account corresponding to the target financial institution into a machine learning model for predicting the risk of the account, and determining at least one account as a high risk account according to the output of the machine learning model, the method further comprises:
acquiring attribute information of each account of a target financial institution within a preset time period and transaction information for constructing graph data;
the attribute information comprises attribute characteristic information and transaction characteristic information, and the transaction information comprises transfer record information between accounts.
4. The abnormal transaction account group identification method according to claim 3, further comprising, before the performing the local community mining step for each of the high-risk accounts, respectively:
and constructing, from the transaction information of each account, graph data reflecting the association relationships between the accounts, wherein the nodes in the graph data correspond one-to-one to the accounts, and an edge in the graph data is used for representing the transaction information between two adjacent nodes.
5. The abnormal transaction account group identification method according to claim 1, wherein the step of determining the association values between other nodes and the originating node in the graph data including the originating node by using the current high-risk account as the originating node, and determining the risk account community corresponding to the current high-risk account according to the association values between the nodes and the originating node and a preset close association determination rule comprises:
respectively determining the association values between other nodes and the initiating node in the graph data containing the initiating node by taking the current high-risk account as the initiating node;
and screening the nodes of which the internal association relation meets the preset close association judgment rule from other nodes except the initiating node in the graph data, and generating a risk account community corresponding to the current high-risk account according to the nodes of which the internal association relation meets the preset close association judgment rule and the initiating node.
6. The abnormal transaction account group identification method according to claim 5, wherein the determining, by using the current high-risk account as an initiating node, the association values between the other nodes and the initiating node in the graph data containing the initiating node respectively comprises:
taking the current high-risk account as an initiating node, and performing approximate page-rank calculation on the graph data including the initiating node to obtain page-rank values of other nodes except the initiating node in the graph data, wherein the page-rank values are used for representing the association degree between the corresponding nodes and the initiating node.
7. The abnormal transaction account group identification method according to claim 5, wherein the step of screening out, from the nodes except the originating node in the graph data, a node whose internal association satisfies a preset close association determination rule, and generating a risk account community corresponding to a current high risk account according to the node whose internal association satisfies the preset close association determination rule and the originating node comprises:
sorting the nodes in descending order of their association values with the initiating node to obtain a sequence of sorted nodes, and taking the initiating node as the initial account community;
conductance obtaining step: extracting the first node in the current sequence, adding the currently extracted node to the account community, and obtaining the current conductance value of the account community;
judging whether the current conductance value of the account community has decreased continuously for a preset number of times; if so, determining all nodes except the initiating node in the current account community as the nodes whose internal association relationship meets the preset close association judgment rule; if not, returning to execute the conductance obtaining step;
and the nodes with the internal association relation meeting the preset close association judgment rule and the initiating node form a risk account community corresponding to the current high-risk account.
8. The abnormal transaction account group identification method of claim 1, further comprising:
if a plurality of risk account communities are obtained through the local community mining step and preset non-similar requirements are not met among the risk account communities, merging the risk account communities with similar relationships, wherein the non-similar requirements comprise: the same initiating node is not included among the risk account communities;
and executing the local community mining step simultaneously for at least two high-risk accounts in the merged risk account communities, wherein the at least two high-risk accounts in the merged risk account communities all serve as initiating nodes in the local community mining step, until the remaining risk account communities all meet the preset non-similar requirements.
9. The abnormal transaction account group identification method of claim 1, further comprising:
and if one risk account community is obtained through the local community mining step, determining the risk account community as an abnormal transaction account group of the target financial institution.
10. An abnormal transaction account group identification device, comprising:
the high-risk account determining module is used for respectively inputting the attribute information of each account corresponding to the target financial institution into a machine learning model for predicting the risk of the account, and determining at least one account as a high-risk account according to the output of the machine learning model;
a local mining module, configured to perform a local community mining step for each high-risk account, where the local community mining step includes: respectively determining the association values between other nodes and the initiating node in graph data containing the initiating node by taking the current high-risk account as the initiating node, and determining a risk account community corresponding to the current high-risk account according to the association values between the nodes and the initiating node and a preset close association judgment rule;
and the first group determining module is used for determining each risk account community as an abnormal transaction account group of the target financial institution if a plurality of risk account communities are obtained through the local community mining step and each risk account community meets a preset non-similar requirement.
11. The anomalous transaction account group identification device of claim 10, wherein the high-risk account determination module is configured to perform the following:
respectively inputting the attribute information of each account corresponding to the target financial institution into a LightGBM model, and determining at least one account as a high-risk account according to the output of the LightGBM model;
the LightGBM model is obtained by pre-training based on an attribute information training set, wherein the attribute information training set comprises attribute information of a plurality of historical accounts and labels corresponding to the historical accounts, and the labels are used for indicating whether the corresponding historical accounts are high-risk accounts or not.
12. The anomalous transaction account group identification device of claim 10, further comprising:
the account information acquisition module is used for acquiring attribute information of each account of the target financial institution within a preset time period and transaction information used for constructing graph data;
the attribute information comprises attribute characteristic information and transaction characteristic information, and the transaction information comprises transfer record information between accounts.
13. The anomalous transaction account group identification device of claim 12, further comprising:
the graph data construction module is used for constructing, from the transaction information of the accounts, graph data reflecting the association relationships between the accounts, wherein the nodes in the graph data correspond one-to-one to the accounts, and an edge in the graph data is used for representing the transaction information between two adjacent nodes.
14. The anomalous transaction account group identification device of claim 10, wherein the local mining module comprises:
the initial relationship determining submodule is used for respectively determining the association values between each other node and the initiating node in the graph data containing the initiating node by taking the current high-risk account as the initiating node;
and the affinity determination submodule is used for screening, from the nodes other than the initiating node in the graph data, the nodes whose internal association relationship meets the preset close association judgment rule, and generating a risk account community corresponding to the current high-risk account according to those nodes and the initiating node.
15. The anomalous transaction account group identification device of claim 14, wherein said initial relationship determination submodule includes: an approximate page-rank calculation unit configured to perform the following:
taking the current high-risk account as an initiating node, and performing approximate page-rank calculation on the graph data including the initiating node to obtain page-rank values of other nodes except the initiating node in the graph data, wherein the page-rank values are used for representing the association degree between the corresponding nodes and the initiating node.
16. The anomalous transaction account group identification device of claim 14, wherein said affinity determination submodule includes: a conductance calculation unit for performing the following:
sorting the nodes in descending order of their association values with the initiating node to obtain a sequence of sorted nodes, and taking the initiating node as the initial account community;
conductance obtaining step: extracting the first node in the current sequence, adding the currently extracted node to the account community, and obtaining the current conductance value of the account community;
judging whether the current conductance value of the account community has decreased continuously for a preset number of times; if so, determining all nodes except the initiating node in the current account community as the nodes whose internal association relationship meets the preset close association judgment rule; if not, returning to execute the conductance obtaining step;
and the nodes with the internal association relation meeting the preset close association judgment rule and the initiating node form a risk account community corresponding to the current high-risk account.
17. The anomalous transaction account group identification device of claim 10, further comprising: a second group determining module configured to perform the following:
if a plurality of risk account communities are obtained through the local community mining step and preset non-similar requirements are not met among the risk account communities, merging the risk account communities with similar relationships, wherein the non-similar requirements comprise: the same initiating node is not included among the risk account communities;
and executing the local community mining step simultaneously for at least two high-risk accounts in the merged risk account communities, wherein the at least two high-risk accounts in the merged risk account communities all serve as initiating nodes in the local community mining step, until the remaining risk account communities all meet the preset non-similar requirements.
18. The anomalous transaction account group identification device of claim 10, further comprising: a third group determining module configured to perform the following:
and if one risk account community is obtained through the local community mining step, determining the risk account community as an abnormal transaction account group of the target financial institution.
19. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of identifying a group of anomalous transaction accounts of any one of claims 1 to 9 when executing the program.
20. A computer-readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the abnormal transaction account group identification method of any one of claims 1 to 9.
CN202010608903.3A 2020-06-30 2020-06-30 Abnormal transaction account group identification method and device Pending CN111784502A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010608903.3A CN111784502A (en) 2020-06-30 2020-06-30 Abnormal transaction account group identification method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010608903.3A CN111784502A (en) 2020-06-30 2020-06-30 Abnormal transaction account group identification method and device

Publications (1)

Publication Number Publication Date
CN111784502A true CN111784502A (en) 2020-10-16

Family

ID=72761113

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010608903.3A Pending CN111784502A (en) 2020-06-30 2020-06-30 Abnormal transaction account group identification method and device

Country Status (1)

Country Link
CN (1) CN111784502A (en)

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112215616A (en) * 2020-11-30 2021-01-12 四川新网银行股份有限公司 Method and system for automatically identifying abnormal fund transaction based on network
CN112330373A (en) * 2020-11-30 2021-02-05 中国银联股份有限公司 User behavior analysis method and device and computer readable storage medium
CN112435126A (en) * 2021-01-26 2021-03-02 深圳华锐金融技术股份有限公司 Account identification method and device, computer equipment and storage medium
CN113159793A (en) * 2020-12-09 2021-07-23 同盾控股有限公司 Data processing method and device, electronic equipment and computer storage medium
CN113159778A (en) * 2020-12-24 2021-07-23 西安四叶草信息技术有限公司 Financial fraud detection method and device
CN113222738A (en) * 2021-05-25 2021-08-06 山东小葱数字科技有限公司 Cash register card identification method and device, electronic equipment and computer readable storage medium
CN113362157A (en) * 2021-05-27 2021-09-07 中国银联股份有限公司 Abnormal node identification method, model training method, device and storage medium
CN113409139A (en) * 2021-07-27 2021-09-17 深圳前海微众银行股份有限公司 Credit risk identification method, apparatus, device, and program
CN113420190A (en) * 2021-08-23 2021-09-21 连连(杭州)信息技术有限公司 Merchant risk identification method, device, equipment and storage medium
CN113570379A (en) * 2021-08-04 2021-10-29 工银科技有限公司 Abnormal transaction group partner identification method and device
CN113689218A (en) * 2021-08-06 2021-11-23 上海浦东发展银行股份有限公司 Risk account identification method and device, computer equipment and storage medium
CN114723554A (en) * 2022-06-09 2022-07-08 中国工商银行股份有限公司 Abnormal account identification method and device
WO2022226910A1 (en) * 2021-04-29 2022-11-03 Paypal, Inc. Systems and methods for presenting and analyzing transaction flows using tube map format
WO2022237194A1 (en) * 2021-05-10 2022-11-17 深圳前海微众银行股份有限公司 Abnormality detection method and apparatus for accounts in federal learning system, and electronic device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109272378A (en) * 2018-08-23 2019-01-25 阿里巴巴集团控股有限公司 A kind of discovery method and apparatus of risk group
CN110046929A (en) * 2019-03-12 2019-07-23 平安科技(深圳)有限公司 A kind of recognition methods of fraud clique, device, readable storage medium storing program for executing and terminal device

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112330373A (en) * 2020-11-30 2021-02-05 中国银联股份有限公司 User behavior analysis method and device and computer readable storage medium
CN112215616B (en) * 2020-11-30 2021-04-30 四川新网银行股份有限公司 Method and system for automatically identifying abnormal fund transaction based on network
CN112215616A (en) * 2020-11-30 2021-01-12 四川新网银行股份有限公司 Method and system for automatically identifying abnormal fund transaction based on network
CN113159793A (en) * 2020-12-09 2021-07-23 同盾控股有限公司 Data processing method and device, electronic equipment and computer storage medium
CN113159778A (en) * 2020-12-24 2021-07-23 西安四叶草信息技术有限公司 Financial fraud detection method and device
CN113159778B (en) * 2020-12-24 2023-11-24 西安四叶草信息技术有限公司 Financial fraud detection method and device
CN112435126A (en) * 2021-01-26 2021-03-02 深圳华锐金融技术股份有限公司 Account identification method and device, computer equipment and storage medium
CN112435126B (en) * 2021-01-26 2021-06-18 深圳华锐金融技术股份有限公司 Account identification method and device, computer equipment and storage medium
WO2022226910A1 (en) * 2021-04-29 2022-11-03 Paypal, Inc. Systems and methods for presenting and analyzing transaction flows using tube map format
WO2022237194A1 (en) * 2021-05-10 2022-11-17 深圳前海微众银行股份有限公司 Abnormality detection method and apparatus for accounts in federal learning system, and electronic device
CN113222738A (en) * 2021-05-25 2021-08-06 山东小葱数字科技有限公司 Cash register card identification method and device, electronic equipment and computer readable storage medium
CN113362157A (en) * 2021-05-27 2021-09-07 中国银联股份有限公司 Abnormal node identification method, model training method, device and storage medium
CN113362157B (en) * 2021-05-27 2024-02-09 中国银联股份有限公司 Abnormal node identification method, model training method, device and storage medium
CN113409139A (en) * 2021-07-27 2021-09-17 深圳前海微众银行股份有限公司 Credit risk identification method, apparatus, device, and program
CN113570379A (en) * 2021-08-04 2021-10-29 工银科技有限公司 Abnormal transaction group partner identification method and device
CN113570379B (en) * 2021-08-04 2024-02-13 工银科技有限公司 Abnormal transaction group partner identification method and device
CN113689218A (en) * 2021-08-06 2021-11-23 上海浦东发展银行股份有限公司 Risk account identification method and device, computer equipment and storage medium
CN113420190A (en) * 2021-08-23 2021-09-21 连连(杭州)信息技术有限公司 Merchant risk identification method, device, equipment and storage medium
CN114723554A (en) * 2022-06-09 2022-07-08 中国工商银行股份有限公司 Abnormal account identification method and device

Similar Documents

Publication Publication Date Title
CN111784502A (en) Abnormal transaction account group identification method and device
CN111476662A (en) Anti-money laundering identification method and device
CN111275546B (en) Financial customer fraud risk identification method and device
CN113344562B (en) Method and device for detecting Ethereum phishing accounts based on deep neural network
CN112785086A (en) Credit overdue risk prediction method and device
CN110826609B (en) Double-current feature fusion image identification method based on reinforcement learning
CN110378575B (en) Overdue event refund collection method and device and computer readable storage medium
US20150262184A1 (en) Two stage risk model building and evaluation
CN111340240A (en) Method and device for realizing automatic machine learning
CN108268785A (en) A kind of sensitive data identification and the device and method of desensitization
KR20200075120A (en) Business default prediction system and operation method thereof
CN110634060A (en) User credit risk assessment method, system, device and storage medium
CN114881775B (en) Fraud detection method and system based on semi-supervised ensemble learning
CN113282623A (en) Data processing method and device
CN112884569A (en) Credit assessment model training method, device and equipment
CN110020196B (en) User analysis method and device based on different data sources and computing equipment
Zhu et al. Loan default prediction based on convolutional neural network and LightGBM
CN111523604A (en) User classification method and related device
CN112927719B (en) Risk information evaluation method, apparatus, device and storage medium
CN114998001A (en) Service class identification method, device, equipment, storage medium and program product
CN117058432B (en) Image duplicate checking method and device, electronic equipment and readable storage medium
CN111178535B (en) Method and apparatus for implementing automatic machine learning
CN112101952B (en) Bank suspicious transaction evaluation and data processing method and device
CN113344581A (en) Service data processing method and device
CN116611923A (en) Knowledge graph-based risk data acquisition method, system, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination