CN111400614A - Case-related cluster searching method based on fund transaction data - Google Patents

Case-related cluster searching method based on fund transaction data

Publication number
CN111400614A
Authority
CN
China
Prior art keywords
transaction
cluster
clusters
account
modularity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010192729.9A
Other languages
Chinese (zh)
Inventor
裴贵军
唐海龙
崔世鹏
张岩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Information and Data Security Solutions Co Ltd
Original Assignee
Information and Data Security Solutions Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Information and Data Security Solutions Co Ltd

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9536Search customisation based on social or collaborative filtering
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/26Government or public services

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Educational Administration (AREA)
  • Technology Law (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Primary Health Care (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The invention discloses a case-involved cluster searching method based on fund transaction data, comprising the following steps: A. preprocessing the fund transaction data; B. cluster judgment based on modularity; C. analyzing the closeness of the related clusters to find the case-involved clusters. The invention can quickly, accurately and effectively find case-involved groups in a very large fund transaction network, and provides a simple, practical solution to the cluster-finding problem in economic investigation work.

Description

Case-related cluster searching method based on fund transaction data
Technical Field
The invention relates to the field of information technology, and in particular to a case-involved cluster searching method based on fund transaction data.
Background
Economic crimes are occurring more and more frequently and have taken on new characteristics, such as a large number of involved persons and complex relationships among them. How to make better use of fund transaction information to analyze and trace illegal funds is a problem of wide concern to investigators.
Cluster searching (group discovery) is an important task in the field of economic investigation, but accurately finding the related clusters is not easy. At present, when the data volume is small, investigators still compare transaction relationships manually, or draw a simple fund relationship network and then search it for clusters by hand. As the number of involved persons and the amounts of money grow, these methods can no longer find the case-involved clusters effectively within a short time.
Disclosure of Invention
The invention aims to provide a case-related cluster searching method based on fund transaction data, so as to solve the problems described in the Background.
To achieve this purpose, the invention provides the following technical scheme:
A case-involved cluster searching method based on fund transaction data comprises the following steps:
A. preprocessing fund transaction data;
B. cluster judgment based on modularity;
C. analyzing the closeness of the related clusters to find the case-involved clusters.
As a further scheme of the invention: step A is specifically as follows: first, the multiple transaction records between each pair of accounts are merged, and their transaction amounts are aggregated into a single total transaction amount and a transaction count; then, multiple account numbers belonging to the same natural person are merged into one entity.
As a further scheme of the invention: the transaction record comprises a transaction account number, a transaction time, a transaction amount, a receipt/payment flag, a counterparty account number and a counterparty account name.
As a further scheme of the invention: step B is specifically as follows: the preprocessed transaction record data is imported into a graph database to form a network graph of the fund transactions, and the clusters in the fund transaction network are then divided according to modularity using the Louvain algorithm.
As a further scheme of the invention: the modularity-based cluster determination includes three steps. Step one, initialization: each account is assigned to its own cluster. Step two: the accounts in the transaction network are traversed repeatedly, and each single account is tentatively moved into the cluster that yields the largest increase in modularity, until no account can be moved. Step three: the result of step two is processed by merging each cluster into a new account (node) to reconstruct the transaction network, where the edge weight between two clusters is the sum of the edge weights between the original accounts they contain, and the edges inside a cluster become a self-loop. Steps two and three are iterated until the algorithm stabilizes.
As a further scheme of the invention: step C is specifically as follows: a closeness analysis based on transaction amount is further carried out on the result of the modularity-based cluster determination.
As a further scheme of the invention: step C comprises the following steps: first, the total transaction amount between each cluster and every other cluster is computed, forming a two-dimensional matrix of inter-cluster transaction amounts; second, a threshold on the transaction amount is set, and clusters whose total transaction amount with each other exceeds the threshold are regarded as clusters with a close transaction relationship; finally, clusters with a high degree of closeness are regarded as clusters with the same attribute, namely the case-involved clusters.
Compared with the prior art, the invention has the beneficial effects that: the invention can quickly, accurately and effectively find the case-involved group in the fund transaction network, provides a simple and easy solution for economic investigation work, and effectively solves the problem of finding the case-involved cluster.
Drawings
FIG. 1 is a flowchart of the case-involved cluster searching method.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to FIG. 1, in an embodiment of the present invention, a case-involved cluster searching method based on fund transaction data includes the following steps:
A. preprocessing fund transaction data;
In economic investigation cases, the raw data mostly exists in the form of account transaction records, and each transaction record contains at least: a transaction account number, a transaction time, a transaction amount, a receipt/payment flag, a counterparty account number, a counterparty account name and the like. The transaction records may contain multiple transactions between the same two accounts, so the data must be processed before the cluster search is performed. The preprocessing is as follows: first, the multiple transaction records between the same two accounts are merged, and their transaction amounts are aggregated into a single total transaction amount and a transaction count, so that the many transaction records between two accounts are combined into one summary record; second, entity merging is carried out on multiple accounts belonging to the same natural person;
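The preprocessing in step A can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the field names `account`, `counterparty` and `amount`, the `owner_of` mapping from account numbers to persons, and the choice to ignore transfer direction and to drop transfers between two accounts of the same person are all assumptions made for the example. The sketch applies the entity mapping first and then aggregates, which yields the same summary records as merging pairs first and entities second.

```python
from collections import defaultdict

def merge_transactions(records, owner_of=None):
    """Collapse raw records into one summary record per unordered pair of entities.

    records:  iterable of dicts with the (hypothetical) keys
              'account', 'counterparty' and 'amount'.
    owner_of: optional dict mapping account number -> person id; accounts
              without an entry are kept under their own account number.
    Returns {(a, b): {'amount': total, 'count': n}} with a < b.
    """
    owner_of = owner_of or {}
    merged = defaultdict(lambda: {"amount": 0.0, "count": 0})
    for rec in records:
        # Entity merging: replace each account by the person who owns it.
        a = owner_of.get(rec["account"], rec["account"])
        b = owner_of.get(rec["counterparty"], rec["counterparty"])
        if a == b:
            continue  # assumption: transfers within one entity are dropped
        key = (a, b) if a < b else (b, a)  # assumption: direction is ignored
        merged[key]["amount"] += abs(rec["amount"])
        merged[key]["count"] += 1
    return dict(merged)

records = [
    {"account": "A1", "counterparty": "B1", "amount": 100.0},
    {"account": "B1", "counterparty": "A1", "amount": 50.0},
    {"account": "A2", "counterparty": "B1", "amount": 25.0},
    {"account": "A1", "counterparty": "A2", "amount": 10.0},
]
owners = {"A1": "person_A", "A2": "person_A"}  # A1 and A2 belong to one person
edges = merge_transactions(records, owners)
print(edges)
```

Here the three records between person_A's accounts and B1 collapse into a single summary record, and the transfer between A1 and A2 disappears because both accounts map to the same entity.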
B. cluster judgment based on modularity;
The preprocessed transaction record data is imported into a graph database to form a network graph of the fund transactions, and the clusters in the fund transaction network are then divided according to modularity using the Louvain algorithm.
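Wherein the modularity may be defined as:
$$Q = \frac{1}{2m}\sum_{i,j}\left[A_{ij} - \frac{k_i k_j}{2m}\right]\delta(c_i, c_j)$$
where $m$ represents the number of transaction relationships (edges) in the network, or their total weight in a weighted network, $A$ is the adjacency matrix, $k_i$ indicates the number of accounts having a transaction relationship with account $i$ (its degree), $c_i$ is the cluster to which account $i$ belongs, and $\delta(c_i, c_j) = 1$ if $c_i$ and $c_j$ are the same and $0$ otherwise.
The gain in modularity obtained by moving an isolated account $i$ into cluster $c$ may be expressed as follows:
$$\Delta Q = \frac{k_{i,in}}{m} - \frac{\Sigma_{tot}\,k_i}{2m^2}$$
where $k_{i,in}$ represents the sum of the weights of the edges from node $i$ to the nodes in cluster $c$, $\Sigma_{tot}$ represents the total weight of the edges incident to the nodes in cluster $c$, and $k_i$ represents the total weight of the edges incident to account $i$.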
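The modularity of a given partition can be checked numerically. The sketch below assumes the standard definition Q = (1/2m) Σ_ij [A_ij − k_i·k_j/(2m)] δ(c_i, c_j) for an undirected weighted graph without self-loops, and evaluates it cluster by cluster; the function name `modularity` and the edge-dictionary layout are choices made for this example only.

```python
def modularity(edges, cluster_of):
    """Modularity Q of a partition of an undirected weighted graph.

    edges:      {(u, v): weight}, each undirected edge listed once, no self-loops.
    cluster_of: {node: cluster label}.
    Uses the equivalent per-cluster form
        Q = sum_c [ in_c / (2m) - (tot_c / (2m))**2 ]
    where in_c sums A_ij over ordered pairs inside c (each internal edge
    counted twice) and tot_c sums the degrees k_i of the nodes in c.
    """
    two_m = 2.0 * sum(edges.values())
    inside, total = {}, {}
    for (u, v), w in edges.items():
        cu, cv = cluster_of[u], cluster_of[v]
        total[cu] = total.get(cu, 0.0) + w  # contribution to k_u
        total[cv] = total.get(cv, 0.0) + w  # contribution to k_v
        if cu == cv:
            inside[cu] = inside.get(cu, 0.0) + 2.0 * w
    return sum(inside.get(c, 0.0) / two_m - (total[c] / two_m) ** 2 for c in total)

# Two triangles (a, b, c) and (d, e, f) joined by the single edge c-d.
edges = {("a", "b"): 1, ("b", "c"): 1, ("c", "a"): 1,
         ("d", "e"): 1, ("e", "f"): 1, ("f", "d"): 1,
         ("c", "d"): 1}
split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
single = {node: 0 for node in split}
q_split = modularity(edges, split)    # 5/14, about 0.357
q_single = modularity(edges, single)  # 0.0
print(q_split, q_single)
```

Splitting the graph into its two triangles scores higher than putting every account into one cluster, which is the behaviour the modularity criterion is meant to reward.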
The modularity-based cluster determination includes two phases. In phase one, the accounts in the transaction network are traversed repeatedly, and each single account is tentatively moved into the cluster that yields the largest increase in modularity, until the cluster assignment of every account no longer changes. In phase two, the result of phase one is processed: each small cluster is merged into a single super node to reconstruct the network, and the weight of the edge between two super nodes is the sum of the weights of the edges between the original nodes they contain. The two phases are iterated until the algorithm stabilizes.
The resulting partition is the modularity-based cluster determination result.
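The two phases can be illustrated with a compact, pure-Python sketch of a Louvain-style procedure. It is a simplified illustration under stated assumptions, not the patent's implementation: the graph is held in an in-memory adjacency dictionary instead of a graph database, the input has no self-loops, and candidate clusters are ranked by k_{i,in} − Σ_tot·k_i/(2m), which is proportional to the modularity gain of inserting an isolated node. The helper names `build_adj`, `local_moving`, `aggregate` and `louvain` are hypothetical.

```python
from collections import defaultdict

def build_adj(edges):
    """Symmetric adjacency dict from {(u, v): weight} (no self-loops)."""
    adj = defaultdict(dict)
    for (u, v), w in edges.items():
        adj[u][v] = adj[u].get(v, 0.0) + w
        adj[v][u] = adj[v].get(u, 0.0) + w
    return dict(adj)

def local_moving(adj):
    """Phase one: move single nodes to the neighbouring cluster with the best gain."""
    degree = {u: sum(nbrs.values()) for u, nbrs in adj.items()}
    two_m = sum(degree.values())
    cluster_of = {u: u for u in adj}  # every node starts in its own cluster
    if two_m == 0:
        return cluster_of, False
    tot = dict(degree)                # total degree of each cluster
    moved_any, improved = False, True
    while improved:
        improved = False
        for i in adj:
            old = cluster_of[i]
            tot[old] -= degree[i]     # take i out of its current cluster
            links = defaultdict(float)  # k_{i,in} per neighbouring cluster
            for j, w in adj[i].items():
                if j != i:
                    links[cluster_of[j]] += w

            def gain(c):
                # Proportional to the modularity gain of putting i into c.
                return links.get(c, 0.0) - tot[c] * degree[i] / two_m

            best, best_gain = old, gain(old)
            for c in links:
                g = gain(c)
                if g > best_gain + 1e-12:
                    best, best_gain = c, g
            tot[best] += degree[i]
            if best != old:
                cluster_of[i] = best
                improved = moved_any = True
    return cluster_of, moved_any

def aggregate(adj, cluster_of):
    """Phase two: merge each cluster into a super node; internal edges become a self-loop."""
    new_adj = defaultdict(lambda: defaultdict(float))
    for u, nbrs in adj.items():
        row = new_adj[cluster_of[u]]  # also keeps clusters that have no edges
        for v, w in nbrs.items():
            row[cluster_of[v]] += w
    return {c: dict(row) for c, row in new_adj.items()}

def louvain(adj):
    """Alternate the two phases until no node moves; returns {node: cluster label}."""
    membership = {u: u for u in adj}
    while True:
        cluster_of, moved = local_moving(adj)
        if not moved:
            return membership
        membership = {u: cluster_of[s] for u, s in membership.items()}
        adj = aggregate(adj, cluster_of)

edges = {("a", "b"): 1, ("b", "c"): 1, ("c", "a"): 1,
         ("d", "e"): 1, ("e", "f"): 1, ("f", "d"): 1,
         ("c", "d"): 1}
membership = louvain(build_adj(edges))
print(membership)
```

On this toy network of two triangles joined by one edge, the sketch places each triangle in its own cluster. In practice a graph database or a graph library would normally supply this step; the sketch only makes the two phases concrete.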
C. analyzing the closeness of the related clusters;
A closeness analysis based on transaction amount is further carried out on the result of the modularity-based cluster determination. First, the total transaction amount between each cluster and every other cluster is computed, forming a two-dimensional matrix of inter-cluster transaction amounts; second, a threshold on the transaction amount is set, for example 1 billion RMB, and clusters whose total transaction amount with each other exceeds the threshold are regarded as clusters with a close transaction relationship; finally, clusters with a high degree of closeness are regarded as clusters with the same attribute, namely the case-involved clusters.
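Step C can be sketched as below. The inter-cluster matrix is stored sparsely as a dictionary keyed by cluster pair, and the threshold test follows the text. The final grouping step, which joins clusters connected by above-threshold pairs using a small union-find, is one possible reading of "clusters with the same attribute" and is an assumption of this example, as are all function names and the sample amounts.

```python
from collections import defaultdict

def inter_cluster_amounts(edges, cluster_of):
    """Total transaction amount between each unordered pair of distinct clusters."""
    totals = defaultdict(float)
    for (u, v), amount in edges.items():
        cu, cv = cluster_of[u], cluster_of[v]
        if cu != cv:
            key = (cu, cv) if cu < cv else (cv, cu)
            totals[key] += amount
    return dict(totals)

def close_pairs(totals, threshold):
    """Cluster pairs whose total transaction amount exceeds the threshold."""
    return {pair for pair, amount in totals.items() if amount > threshold}

def group_clusters(clusters, pairs):
    """Assumed interpretation: join clusters linked by close pairs (union-find)."""
    parent = {c: c for c in clusters}

    def find(c):
        while parent[c] != c:
            parent[c] = parent[parent[c]]  # path halving
            c = parent[c]
        return c

    for a, b in pairs:
        parent[find(a)] = find(b)
    groups = defaultdict(set)
    for c in clusters:
        groups[find(c)].add(c)
    return list(groups.values())

# Hypothetical summary edges (amounts in RMB) and a cluster assignment.
edges = {("a", "c"): 6e8, ("b", "c"): 5e8, ("c", "e"): 1e8}
cluster_of = {"a": 0, "b": 0, "c": 1, "e": 2}
totals = inter_cluster_amounts(edges, cluster_of)
pairs = close_pairs(totals, threshold=1e9)  # 1 billion RMB, as in the example
groups = group_clusters(set(cluster_of.values()), pairs)
print(totals, pairs, groups)
```

With these sample amounts, clusters 0 and 1 exchange 1.1 billion RMB and are flagged as closely related, while cluster 2 stays separate.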
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned.
Furthermore, it should be understood that although this description is organized by embodiments, not every embodiment contains only a single independent technical solution; the description is written this way only for clarity. Those skilled in the art should take the description as a whole, and the technical solutions in the embodiments may be combined as appropriate to form other embodiments that those skilled in the art can understand.

Claims (7)

1. A method for case-involved cluster search based on fund transaction data is characterized by comprising the following steps:
A. preprocessing fund transaction data;
B. cluster judgment based on modularity;
C. analyzing the closeness of the related clusters to find the case-involved clusters.
2. The method of claim 1, wherein step A is specifically as follows: first, the multiple transaction records between each pair of accounts are merged, and their transaction amounts are aggregated into a single total transaction amount and a transaction count; then, multiple account numbers belonging to the same natural person are merged into one entity.
3. The method of claim 2, wherein the transaction record comprises a transaction account number, a transaction time, a transaction amount, a receipt/payment flag, a counterparty account number and a counterparty account name.
4. The method of claim 1, wherein step B is specifically as follows: the preprocessed transaction record data is imported into a graph database to form a network graph of the fund transactions, and the clusters in the fund transaction network are then divided according to modularity using the Louvain algorithm.
5. The method of claim 4, wherein the modularity-based cluster determination comprises three steps: step one, initialization, in which each account is assigned to its own cluster; step two, repeatedly traversing the accounts in the transaction network and tentatively moving each single account into the cluster that yields the largest increase in modularity, until no account can be moved; step three, processing the result of step two by merging each cluster into a new account to reconstruct the transaction network, wherein the edge weight between two clusters is the sum of the edge weights between the original accounts they contain and the edges inside a cluster become a self-loop, and then returning to step two, steps two and three being iterated until the algorithm stabilizes.
6. The method of claim 4, wherein step C comprises: further carrying out a closeness analysis based on transaction amount on the result of the modularity-based cluster determination.
7. The method of claim 6, wherein step C comprises the following steps: first, computing the total transaction amount between each cluster and every other cluster to form a two-dimensional matrix of inter-cluster transaction amounts; second, setting a threshold on the transaction amount, and regarding clusters whose total transaction amount with each other exceeds the threshold as clusters with a close transaction relationship; and finally, regarding clusters with a high degree of closeness as clusters with the same attribute, namely the case-involved clusters.
CN202010192729.9A 2020-01-08 2020-03-18 Case-related cluster searching method based on fund transaction data Pending CN111400614A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2020100166888 2020-01-08
CN202010016688 2020-01-08

Publications (1)

Publication Number Publication Date
CN111400614A true CN111400614A (en) 2020-07-10

Family

ID=71428883

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010192729.9A Pending CN111400614A (en) 2020-01-08 2020-03-18 Case-related cluster searching method based on fund transaction data

Country Status (1)

Country Link
CN (1) CN111400614A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112163096A (en) * 2020-09-18 2021-01-01 中国建设银行股份有限公司 Malicious group determination method and device, electronic equipment and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100257006A1 (en) * 2009-04-07 2010-10-07 The Boeing Company Associate memory learning for analyzing financial transactions
CN104199832A (en) * 2014-08-01 2014-12-10 西安理工大学 Financial network unusual transaction community finding method based on information entropy
CN104867055A (en) * 2015-06-16 2015-08-26 咸宁市公安局 Financial network doubtable money tracking and identifying method
CN106547838A (en) * 2016-10-14 2017-03-29 北京银丰新融科技开发有限公司 Method based on the suspicious funds transaction of fund network monitor
CN109102151A (en) * 2018-07-03 2018-12-28 阿里巴巴集团控股有限公司 A kind of suspicious group identification method and apparatus
CN109559230A (en) * 2018-12-13 2019-04-02 中科曙光南京研究院有限公司 Bank transaction group based on overlapping community discovery algorithm finds method and system
CN110647590A (en) * 2019-09-23 2020-01-03 税友软件集团股份有限公司 Target community data identification method and related device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200710