CN116167865A - Community discovery-based bill abnormal customer identification method, system, terminal equipment and storage medium - Google Patents
- Publication number
- CN116167865A (CN202211549331.1A)
- Authority
- CN
- China
- Prior art keywords
- risk
- community
- bill
- node
- client
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/08—Insurance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/04—Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/01—Social networking
Abstract
The invention provides a community-discovery-based method, system, terminal device and storage medium for identifying abnormal bill clients, and relates to the field of computer systems. According to feedback from business staff at each branch, the issued list is highly accurate and greatly reduces the time, manpower and material cost of on-site investigation. The approach has been well received and adopted across branches, and concrete measures such as suspending business or strengthening risk monitoring are taken against the abnormal clients confirmed from the list.
Description
Technical Field
The invention relates to the field of computer systems, and in particular to a method, a system, a terminal device and a storage medium for identifying abnormal bill clients based on community discovery.
Background
The bill market is an important part of China's financial market and an important settlement and financing market for the real economy. In recent years, as macroeconomic growth has slowed and economic development has entered a "new normal", the bill business, a traditional asset business of commercial banks, has played a growing role in serving the real economy. As small and medium-sized banks and non-bank financial institutions have become widely involved in the bill business, and private bill intermediaries in particular have entered the market, bill market participants have gradually diversified. These institutions often differ considerably in risk appetite, internal risk control, management systems and staffing, which introduces new risk factors into the bill market. In particular, the penetration of private bill intermediaries into the bill market increases the likelihood of moral hazard and transaction fraud, and the risk spreads to other market participants along complex transaction chains, making risk prevention and control in the bill business more difficult. At present, bill business conducted by abnormal clients such as bill intermediaries and shell companies has become an important factor affecting compliance risk management at commercial banks. Analysis shows that the profit models and main features of bill intermediaries and shell companies are as follows:
1. Profit models
(1) Earning the buy-sell spread: first, a company is registered, with the actual operator controlling its daily operations. Second, bill source information is collected in the bill market and bill sources are continuously expanded; bill-holding enterprises are identified, a discount price is negotiated with them, and the bills are acquired by endorsement transfer against payment of the agreed price. Third, trade background materials, including value-added tax invoices, are forged, and a cooperating bank is found through the controlled company to discount the bills.
(2) Earning intermediary service fees: by establishing a "cooperative relationship" with individual banks, the intermediary provides a channel for buying and selling bills, uses the bank channel to offer quotation and matching services to bill buyers and sellers, helps enterprises obtain working capital in violation of regulations, and charges an intermediary service fee.
2. Main features
(1) Bill sources must be continuously expanded and are not constrained by the authenticity of the trade background, so bill volumes are large (the number of bills, the bill amounts and the number of enterprises from which bills are sourced for discounting are all large).
(2) To avoid tying up the fund pool, bills are discounted or transferred by endorsement immediately after they are received, so the average number of bills held is low.
(3) To seize market share, business does not "stop", and transactions are relatively frequent.
(4) The most typical feature distinguishing them from other suspicious transactions: a large number of "integer-like" bill transactions.
The prior art mainly relies on traditional manual review to identify bill intermediaries and shell companies, which consumes manpower and material resources and yields poor results in practice. Client admission checks and subsequent business reviews depend on experience.
Some prior art analyzes the risks of the bill business at the business level and notes that shell companies and bill intermediaries are risks that commercial banks must guard against. However, it only offers management suggestions at the level of business operations and still relies on traditional manual review; it does not exploit the inherent big-data advantage of electronic bills to free up manpower and create more value.
Other prior art uses the Louvain algorithm to search for case-related clusters in transaction data, mainly for economic crime investigation. Although it uses the Louvain algorithm, its field of application is economic cases, and it implements only a search function.
The bill business department of a commercial bank needs to identify abnormal clients among all of the bank's existing bill clients. The main purpose is to screen out shell companies and bill intermediaries, to detect fraud and compliance risks in the bill business promptly, and to flag and guard against those risks. This improves client quality, reduces the negative impact on the bank's bill business of abnormal clients mixed in among real enterprises, ensures that bank funds reach real enterprises with real needs, and improves the ability of finance to serve the real economy.
Commercial banks' identification of abnormal bill clients still relies on experience, subjective judgment and on-site manual review. The growing number of clients in the electronic bill era poses a great challenge to these traditional methods, which consume large amounts of time, manpower and material resources. It has therefore become urgent to use the inherent big-data advantage of the electronic bill era and apply model algorithms to the off-site investigation of abnormal bill clients.
Disclosure of Invention
The embodiment of the invention provides a method, a system, a terminal device and a storage medium for identifying abnormal bill clients based on community discovery.
A method for identifying abnormal bill clients based on community discovery specifically comprises the following steps:
step 1, designing the node-edge relationships of bill clients: analyzing the risk indicators related to bill clients and designing the nodes and edges of the bill discount client graph;
step 2, processing node and edge data: processing the node-edge design of step 1 into three node files, one attribute file and two edge files;
step 3, building the bill client knowledge graph: building the graph in Neo4j from the node, attribute and edge files produced in step 2, with different node types marked in different colors;
step 4, dividing communities with the Louvain algorithm, which is a modularity-based community discovery algorithm;
step 5, dividing community groups according to community risk probability: setting three risk probability thresholds $p_{low}$, $p_{mid}$, $p_{high}$; communities with a risk probability less than or equal to $p_{low}$ form the low-risk-probability group, communities with a risk probability greater than $p_{low}$ and less than or equal to $p_{mid}$ form the medium-risk-probability group, and communities with a risk probability greater than $p_{mid}$ and less than or equal to $p_{high}$ form the high-risk-probability group;
step 6, calculating the risk score of each bill client: taking the community risk grade obtained in step 5 as one dimension of the client risk evaluation and combining it with the other risk indicators of the bill discount client and their corresponding weights to obtain the composite risk score of the client;
step 7, calculating the risk level of each bill client: converting each client's risk score into a score from 0 to 100 according to a mapping relationship and dividing it into 10 risk levels $level_1, level_2, \ldots, level_{10}$;
step 8, generating the abnormal client list: selecting the clients whose risk score is higher than the set risk threshold $level_{risk}$ to form the abnormal client list;
step 9, issuing the abnormal client list to the operating institutions for verification and confirmation;
and step 10, calculating the identification performance of the model from the feedback of the operating institutions.
Further: the nodes in step 1 are of three types, namely bill client nodes, associated person nodes and reconciliation IP nodes, wherein the associated person nodes are natural persons related to the bill client and its bill business, such as legal representatives, shareholders, senior managers, actual controllers and client managers.
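Further: the modularity is calculated as:
$$Q = \frac{1}{2m} \sum_{v,w} \left[ A_{vw} - \frac{k_v k_w}{2m} \right] \delta(c_v, c_w)$$
where $m$ is the number of connections (edges) in the network; $v$ and $w$ are any two nodes in the network, with $A_{vw} = 1$ when they are connected and $A_{vw} = 0$ otherwise; $k_v$ and $k_w$ are the degrees of nodes $v$ and $w$; $\delta(c_v, c_w)$ indicates whether nodes $v$ and $w$ are in the same community, with $\delta(c_v, c_w) = 1$ if they are and $\delta(c_v, c_w) = 0$ otherwise;
The simplified form is:
$$Q = \sum_{c} \left[ \frac{\Sigma_{in}}{m} - \left( \frac{\Sigma_{tot}}{2m} \right)^2 \right]$$
where $\Sigma_{in}$ is the number of edges inside community $c$ and $\Sigma_{tot}$ is the sum of the degrees of the nodes in community $c$;
the modularity gain obtained by moving an isolated node $i$ into community $c$ is:
$$\Delta Q = \left[ \frac{\Sigma_{in} + k_{i,in}}{m} - \left( \frac{\Sigma_{tot} + k_i}{2m} \right)^2 \right] - \left[ \frac{\Sigma_{in}}{m} - \left( \frac{\Sigma_{tot}}{2m} \right)^2 - \left( \frac{k_i}{2m} \right)^2 \right] = \frac{k_{i,in}}{m} - \frac{\Sigma_{tot} \, k_i}{2m^2}$$
where $\Sigma_{in}$ is the number of edges inside community $c$; $\Sigma_{tot}$ is the sum of the degrees of the nodes in community $c$; $k_i$ is the degree of node $i$; $k_{i,in}$ is the number of connections between node $i$ and the nodes in community $c$.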
Further: the Louvain algorithm is divided into three phases, namely:
step 1, each node is initially assigned to its own community, so that a network of n nodes has n communities, and the modularity $Q_0$ at this point is calculated; node i is then removed from its community $c_i$ and placed in the community of a neighboring node j, the modularity $Q_1$ is calculated, and the modularity gain $\Delta Q = Q_1 - Q_0$ is obtained; if $\Delta Q > 0$, node i is moved into the community of j, otherwise it is not;
step 2, each community obtained in step 1 is aggregated into a single node, and the whole network is rebuilt;
and step 3, the iteration stops automatically when the modularity no longer increases.
Further: the risk probability of a community is calculated as:
$$P(c_i) = \frac{n_{risk}}{n_{risk} + n_{norisk}}$$
where $c_i$ is community i, $n_{risk}$ is the number of clients in community $c_i$ that have been marked as abnormal clients, and $n_{norisk}$ is the number of clients in community $c_i$ that have not been marked as abnormal clients.
Further: the composite risk score is calculated as:
$$score = w_0 r_0 + \sum_{j=1}^{k} w_j r_j$$
where $r_0$ is the community risk grade and $w_0$ is its weight; $r_1, r_2, \ldots, r_k$ are the other risk indicators of interest in the bill business, and $w_1, w_2, \ldots, w_k$ are the weights corresponding to these risk indicators.
Further: the node attributes in the step 1 are two types, namely the bill client attribute and the bill risk index.
Further: the edges in step 1 are of two types: one points from an associated person node to a bill client node, and the other points from a reconciliation IP node to a bill client node.
Further: the system comprises a data acquisition module, a data processing module, an algorithm module, a logic module and a display module;
the data acquisition module is used for acquiring bill client information;
the algorithm module applies the corresponding algorithms to the acquired bill client information to obtain the corresponding indicator information;
the logic module is used for performing logical judgment on the indicator information and for screening and rejecting it;
the display module is used for displaying the judged indicator information to the operating institutions.
Further: the terminal device may include a processor, a storage medium and a bus, wherein the storage medium stores machine-readable instructions executable by the processor; when the terminal device is running, the processor communicates with the storage medium through the bus and executes the machine-readable instructions to perform the steps of the method described in the foregoing embodiments.
Further: a storage medium storing a computer program which, when executed by a processor, performs the steps of the method described above.
The invention has the following beneficial effects: a relationship graph of bill discount clients is built; the bill client network is divided into communities with the Louvain algorithm; the risk probability of each community is calculated; the risk score and risk level of each client are calculated in combination with the client's other risk indicators; clients with higher risk scores and risk levels are entered, in order, into an abnormal client investigation list; and the list is issued to the operating institutions for manual verification. According to feedback from business staff at each branch, the list is highly accurate and greatly reduces the time, manpower and material cost of on-site investigation. The approach has been well received and adopted across branches, and concrete measures such as suspending business or strengthening risk monitoring are taken against the abnormal clients confirmed from the list.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 shows a schematic flow chart of the method of the present invention;
FIG. 2 shows a schematic diagram of the composition of the system of the present invention;
FIG. 3 shows a schematic diagram of the composition of the terminal device of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described with reference to the accompanying drawings in the embodiments of the present invention, and it should be understood that the drawings in the present invention are for the purpose of illustration and description only and are not intended to limit the scope of the present invention. In addition, it should be understood that the schematic drawings are not drawn to scale. A flowchart, as used in this disclosure, illustrates operations implemented according to some embodiments of the present invention. It should be understood that the operations of the flow diagrams may be implemented out of order and that steps without logical context may be performed in reverse order or concurrently. Moreover, one or more other operations may be added to or removed from the flow diagrams by those skilled in the art under the direction of the present disclosure.
In addition, the described embodiments of the invention are only some, but not all, embodiments of the invention. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be made by a person skilled in the art without making any inventive effort, are intended to be within the scope of the present invention.
It should be noted that the term "comprising" will be used in embodiments of the invention to indicate the presence of the features stated hereafter, but not to exclude the addition of other features. It should also be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures. In the description of the present invention, it should also be noted that the terms "first," "second," "third," and the like are used merely to distinguish between descriptions and should not be construed as indicating or implying relative importance.
Figure 1 shows a flow chart of the steps of the method of the invention.
The method for identifying abnormal bill clients based on community discovery specifically comprises the following steps:
step 1, designing the node-edge relationships of bill clients: analyzing the risk indicators related to bill clients and designing the nodes and edges of the bill discount client graph. There are three types of nodes: bill client nodes, associated person nodes and reconciliation IP nodes, where the associated person nodes are natural persons related to the bill client and its bill business, such as legal representatives, shareholders, senior managers, actual controllers and client managers. There are two types of node attributes: bill client attributes and bill risk indicators. The bill client attributes include the client name, the client number, the branch to which the client belongs, the institution to which the client belongs and the client's secondary industry; the bill risk indicators include the endorsement out-of-limit count, the single-day maximum endorsement out-of-limit count, whether the client is on the abnormal list and whether the client is in a sensitive industry. There are two types of edges: one points from an associated person node to a bill client node, and the other points from a reconciliation IP node to a bill client node; both types of edge represent an affiliation relationship.
step 2, processing node and edge data: the node-edge design of step 1 is processed into three node files, one attribute file and two edge files.
step 3, building the bill client knowledge graph: the graph is built in Neo4j from the node, attribute and edge files produced in step 2, with different node types marked in different colors.
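Steps 1 to 3 can be illustrated with a minimal Python sketch that keeps the graph in memory rather than loading it into Neo4j, so it is a stand-in for the graph database and not the procedure of the embodiment. The record layout, the identifiers (`C001`, `P01`, `IP1`) and the helper `build_graph` are hypothetical and chosen only for illustration; the edge direction follows the text, with associated persons and reconciliation IPs pointing to bill clients.

```python
from collections import defaultdict

# Hypothetical node records: three node types, as in step 1.
clients = [
    {"id": "C001", "name": "Client A", "on_abnormal_list": True},
    {"id": "C002", "name": "Client B", "on_abnormal_list": False},
    {"id": "C003", "name": "Client C", "on_abnormal_list": False},
]
persons = [{"id": "P01", "role": "legal representative"}]
ips = [{"id": "IP1"}]

# Hypothetical edge records (source, target); every edge points to a client node.
person_edges = [("P01", "C001"), ("P01", "C002")]
ip_edges = [("IP1", "C002"), ("IP1", "C003")]


def build_graph(node_groups, edge_groups):
    """Return (node_type, adjacency) for an undirected view of the graph."""
    node_type = {}
    for label, records in node_groups.items():
        for rec in records:
            node_type[rec["id"]] = label
    adjacency = defaultdict(set)
    for edges in edge_groups:
        for src, dst in edges:
            if src not in node_type or dst not in node_type:
                raise KeyError(f"edge ({src}, {dst}) references an unknown node")
            # Community discovery ignores direction, so store both ways.
            adjacency[src].add(dst)
            adjacency[dst].add(src)
    return node_type, dict(adjacency)


node_type, adjacency = build_graph(
    {"client": clients, "person": persons, "ip": ips},
    [person_edges, ip_edges],
)
```

In this toy data, client `C002` is linked to both the shared person and the shared IP, which is the kind of indirect connection that community discovery is meant to surface.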
step 4, dividing communities with the Louvain algorithm. The Louvain algorithm, from the article "Fast unfolding of communities in large networks" published by Vincent D. Blondel et al., is a modularity-based community discovery algorithm. Modularity is a quantitative index of the quality of a community division. If a community division places densely connected nodes in the same community while keeping the connections between communities sparse, the modularity of the resulting network is larger, so a community division with larger modularity is better.
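The modularity is calculated as:
$$Q = \frac{1}{2m} \sum_{v,w} \left[ A_{vw} - \frac{k_v k_w}{2m} \right] \delta(c_v, c_w)$$
where $m$ is the number of connections (edges) in the network; $v$ and $w$ are any two nodes in the network, with $A_{vw} = 1$ when they are connected and $A_{vw} = 0$ otherwise; $k_v$ and $k_w$ are the degrees of nodes $v$ and $w$; $\delta(c_v, c_w)$ indicates whether nodes $v$ and $w$ are in the same community, with $\delta(c_v, c_w) = 1$ if they are and $\delta(c_v, c_w) = 0$ otherwise;
The simplified form is:
$$Q = \sum_{c} \left[ \frac{\Sigma_{in}}{m} - \left( \frac{\Sigma_{tot}}{2m} \right)^2 \right]$$
where $\Sigma_{in}$ is the number of edges inside community $c$ and $\Sigma_{tot}$ is the sum of the degrees of the nodes in community $c$;
the modularity gain obtained by moving an isolated node $i$ into community $c$ is:
$$\Delta Q = \left[ \frac{\Sigma_{in} + k_{i,in}}{m} - \left( \frac{\Sigma_{tot} + k_i}{2m} \right)^2 \right] - \left[ \frac{\Sigma_{in}}{m} - \left( \frac{\Sigma_{tot}}{2m} \right)^2 - \left( \frac{k_i}{2m} \right)^2 \right] = \frac{k_{i,in}}{m} - \frac{\Sigma_{tot} \, k_i}{2m^2}$$
where $\Sigma_{in}$ is the number of edges inside community $c$; $\Sigma_{tot}$ is the sum of the degrees of the nodes in community $c$; $k_i$ is the degree of node $i$; $k_{i,in}$ is the number of connections between node $i$ and the nodes in community $c$.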
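As a check on the two ways of writing modularity, the following sketch computes Q for a small graph both from the pairwise definition and from the per-community form. It assumes a simple undirected, unweighted graph (no self-loops or duplicate edges), with the number of edges inside a community counted once per edge; the toy graph and the function names are hypothetical.

```python
from itertools import product

# Hypothetical toy graph: two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
community = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}


def degrees(edges):
    """Degree of every node in a simple undirected graph."""
    deg = {}
    for v, w in edges:
        deg[v] = deg.get(v, 0) + 1
        deg[w] = deg.get(w, 0) + 1
    return deg


def modularity_pairwise(edges, community):
    """Q from the pairwise definition, summing over ordered pairs (v, w)."""
    m = len(edges)
    deg = degrees(edges)
    adj = {(v, w) for v, w in edges} | {(w, v) for v, w in edges}
    total = 0.0
    for v, w in product(deg, repeat=2):
        if community[v] == community[w]:
            a_vw = 1 if (v, w) in adj else 0
            total += a_vw - deg[v] * deg[w] / (2 * m)
    return total / (2 * m)


def modularity_by_community(edges, community):
    """Q from the per-community form using sigma_in and sigma_tot."""
    m = len(edges)
    deg = degrees(edges)
    sigma_in, sigma_tot = {}, {}
    for v, w in edges:
        if community[v] == community[w]:
            sigma_in[community[v]] = sigma_in.get(community[v], 0) + 1
    for v, k in deg.items():
        sigma_tot[community[v]] = sigma_tot.get(community[v], 0) + k
    return sum(
        sigma_in.get(c, 0) / m - (sigma_tot[c] / (2 * m)) ** 2 for c in sigma_tot
    )


q_pair = modularity_pairwise(edges, community)
q_comm = modularity_by_community(edges, community)
```

For this graph each community has 3 internal edges and a degree sum of 7, with m = 7, so both functions give 2 × (3/7 − 1/4) = 5/14.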
The Louvain algorithm is divided into three phases, namely:
step 1, each node is initially assigned to its own community, so that a network of n nodes has n communities, and the modularity $Q_0$ at this point is calculated; node i is then removed from its community $c_i$ and placed in the community of a neighboring node j, the modularity $Q_1$ is calculated, and the modularity gain $\Delta Q = Q_1 - Q_0$ is obtained; if $\Delta Q > 0$, node i is moved into the community of j, otherwise it is not;
step 2, each community obtained in step 1 is aggregated into a single node, and the whole network is rebuilt;
and step 3, the iteration stops automatically when the modularity no longer increases.
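The first phase described above can be sketched in plain Python as follows. This is a simplified illustration and not a full Louvain implementation: it covers only the local-moving phase on a simple undirected, unweighted graph, omits the aggregation of step 2, visits nodes in sorted order and compares candidate communities by the gain of inserting node i after it has been taken out of its own community. The function name `louvain_phase_one` and the toy edge list are hypothetical.

```python
def louvain_phase_one(edges):
    """Local-moving phase of Louvain on a simple undirected graph."""
    m = len(edges)
    neighbors = {}
    for v, w in edges:
        neighbors.setdefault(v, set()).add(w)
        neighbors.setdefault(w, set()).add(v)
    degree = {v: len(nbrs) for v, nbrs in neighbors.items()}
    community = {v: v for v in neighbors}  # every node starts alone
    sigma_tot = dict(degree)               # degree sum of each community

    improved = True
    while improved:
        improved = False
        for i in sorted(neighbors):
            old = community[i]
            sigma_tot[old] -= degree[i]    # take i out of its community
            # links[c] is k_i,in: edges between i and nodes currently in c
            links = {}
            for j in neighbors[i]:
                links[community[j]] = links.get(community[j], 0) + 1

            def gain(c):
                # Gain of inserting the isolated node i into community c
                return links.get(c, 0) / m - sigma_tot[c] * degree[i] / (2 * m * m)

            best, best_gain = old, gain(old)
            for c in sorted(links):
                g = gain(c)
                if g > best_gain:          # ties keep the current community
                    best, best_gain = c, g
            sigma_tot[best] += degree[i]
            community[i] = best
            if best != old:
                improved = True
    return community


# Hypothetical toy graph: two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
community = louvain_phase_one(edges)
```

On this toy graph the loop settles with the two triangles as separate communities. A node moves only when the move strictly increases modularity, which is why the loop terminates.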
The risk probability of each community is then calculated. Let $c_i$ be community i, $n_{risk}$ the number of clients in community $c_i$ that have been marked as abnormal clients, and $n_{norisk}$ the number of clients in community $c_i$ that have not been marked as abnormal clients.
The risk probability of the community is calculated as:
$$P(c_i) = \frac{n_{risk}}{n_{risk} + n_{norisk}}$$
step 5, setting three risk probability thresholds $p_{low}$, $p_{mid}$, $p_{high}$: communities with a risk probability less than or equal to $p_{low}$ form the low-risk-probability group, communities with a risk probability greater than $p_{low}$ and less than or equal to $p_{mid}$ form the medium-risk-probability group, and communities with a risk probability greater than $p_{mid}$ and less than or equal to $p_{high}$ form the high-risk-probability group.
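The community risk probability and the grouping of step 5 can be sketched as below. The sketch assumes the risk probability is the share of already-marked abnormal clients among all clients of the community; the threshold values `P_LOW`, `P_MID` and `P_HIGH` are hypothetical, since the text does not give numeric values, and the function names are illustrative.

```python
def community_risk_probability(n_risk, n_norisk):
    """Share of clients in the community already marked as abnormal."""
    total = n_risk + n_norisk
    if total == 0:
        raise ValueError("community has no clients")
    return n_risk / total


def risk_group(p, p_low, p_mid, p_high):
    """Map a community risk probability to the low, medium or high group."""
    if not p_low < p_mid < p_high:
        raise ValueError("thresholds must satisfy p_low < p_mid < p_high")
    if p <= p_low:
        return "low"
    if p <= p_mid:
        return "medium"
    if p <= p_high:
        return "high"
    raise ValueError("probability exceeds p_high")


# Hypothetical thresholds, for illustration only.
P_LOW, P_MID, P_HIGH = 0.1, 0.3, 1.0

p = community_risk_probability(n_risk=2, n_norisk=6)
group = risk_group(p, P_LOW, P_MID, P_HIGH)
```

With 2 marked clients out of 8, the probability is 0.25, which falls in the medium group under these assumed thresholds.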
step 6, calculating the risk score of each bill client: the community risk grade obtained in step 5 is taken as one dimension of the client risk evaluation and combined with the other risk indicators of the bill discount client and their corresponding weights to obtain the composite risk score of the client. The composite risk score is calculated as:
$$score = w_0 r_0 + \sum_{j=1}^{k} w_j r_j$$
where $r_0$ is the community risk grade and $w_0$ is its weight; $r_1, r_2, \ldots, r_k$ are the other risk indicators of interest in the bill business, and $w_1, w_2, \ldots, w_k$ are the weights corresponding to these risk indicators.
step 7, calculating the risk level of each bill client: each client's risk score is converted into a score from 0 to 100 according to a mapping relationship and divided into 10 risk levels $level_1, level_2, \ldots, level_{10}$. The higher the score, the higher the risk level and the more likely the client is an abnormal client; conversely, the lower the score, the lower the risk level and the less likely the client is an abnormal client.
step 8, generating the abnormal client list: the clients whose risk score is higher than the set risk threshold $level_{risk}$ are selected to form the abnormal client list.
step 9, issuing the abnormal client list to the operating institutions for verification and confirmation.
step 10, calculating the identification performance of the model from the feedback of the operating institutions.
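Steps 6 to 8 can be sketched as follows. Several details are assumptions made only for illustration, because the text does not fix them: the composite score is taken as a plain weighted sum, the mapping to 0-100 is min-max scaling across clients, the 10 levels are equal-width bands of 10 points, and the threshold is applied to the risk level. All numeric values, client identifiers and function names are hypothetical.

```python
def composite_score(indicators, weights):
    """Weighted sum w_0*r_0 + w_1*r_1 + ... + w_k*r_k."""
    if len(indicators) != len(weights):
        raise ValueError("one weight is needed per indicator")
    return sum(w * r for w, r in zip(weights, indicators))


def to_percent(scores):
    """Assumed mapping: min-max scaling of raw scores onto 0-100."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {c: 0.0 for c in scores}
    return {c: 100.0 * (s - lo) / (hi - lo) for c, s in scores.items()}


def risk_level(score):
    """Assumed binning: ten equal-width bands, level 1 (lowest) to 10."""
    return min(10, int(score // 10) + 1)


# Hypothetical inputs: r_0 is the community risk grade (1 low, 2 medium,
# 3 high); r_1 and r_2 stand for two other risk indicators scaled to [0, 1].
weights = [0.5, 0.3, 0.2]
indicators = {
    "C001": [3, 1.0, 1.0],
    "C002": [2, 0.5, 0.0],
    "C003": [1, 0.0, 0.0],
}
LEVEL_RISK = 8  # hypothetical threshold

raw = {c: composite_score(r, weights) for c, r in indicators.items()}
scaled = to_percent(raw)
levels = {c: risk_level(s) for c, s in scaled.items()}
# Keep clients above the threshold, highest score first.
abnormal_list = sorted(
    (c for c in levels if levels[c] > LEVEL_RISK),
    key=lambda c: scaled[c],
    reverse=True,
)
```

Min-max scaling makes the scores relative to the current client population, so a different mapping would be needed if scores must be comparable across reporting periods.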
Suppose that N clients in total have been issued in abnormal client lists, of which M were confirmed as abnormal on verification, and that the operating institutions have found K abnormal clients in total, of which S appeared in the abnormal client lists issued by the model. The precision and recall of the model are then calculated as:
$$precision = \frac{M}{N}, \qquad recall = \frac{S}{K}$$
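The evaluation in step 10 can be sketched as below, taking precision as M/N and recall as S/K in the notation of the paragraph above. The function name and the feedback figures are hypothetical and serve only to illustrate the arithmetic.

```python
def precision_recall(n_issued, m_hits, k_found, s_in_list):
    """Return (precision, recall) = (M / N, S / K)."""
    if n_issued <= 0 or k_found <= 0:
        raise ValueError("N and K must be positive")
    return m_hits / n_issued, s_in_list / k_found


# Hypothetical feedback figures, for illustration only.
precision, recall = precision_recall(
    n_issued=200, m_hits=150, k_found=180, s_in_list=150
)
```

With these figures, precision is 150/200 = 0.75 and recall is 150/180, about 0.83.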
As shown in FIG. 2, the system corresponding to the method comprises a data acquisition module, a data processing module, an algorithm module, a logic module and a display module;
the data acquisition module is used for acquiring bill client information;
the algorithm module applies the corresponding algorithms to the acquired bill client information to obtain the corresponding indicator information;
the logic module is used for performing logical judgment on the indicator information and for screening and rejecting it;
the display module is used for displaying the judged indicator information to the operating institutions.
As shown in FIG. 3, the terminal device 6 may include a processor 601, a storage medium 602 and a bus 603. The storage medium 602 stores machine-readable instructions executable by the processor 601; when the terminal device is running, the processor 601 communicates with the storage medium 602 via the bus 603 and executes the machine-readable instructions to perform the steps of the method described in the foregoing embodiments. The specific implementation and technical effects are similar and are not repeated here.
For ease of illustration, only one processor is described in the above terminal device. It should be noted, however, that in some embodiments, the terminal device of the present invention may also include multiple processors, and thus, the steps performed by one processor described in the present invention may also be performed jointly by multiple processors or separately.
The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto, and any person skilled in the art will readily appreciate variations or alternatives within the scope of the present invention. Therefore, the protection scope of the invention is subject to the protection scope of the claims.
Claims (9)
1. A method for identifying abnormal bill clients based on community discovery is characterized by comprising the following steps:
step 1, designing the point-edge relationship of bill clients, wherein the risk indexes related to the bill clients are analyzed and the nodes and edges of the bill client graph are designed;
step 2, processing node and edge data, wherein the point-edge design scheme of step 1 is processed into three node files, one attribute file and two edge files;
step 3, building a bill client knowledge graph, wherein the bill client knowledge graph is built in Neo4j by using the node, attribute and edge files processed in step 2, and different nodes are marked by using different colors;
step 4, dividing communities by using a Louvain algorithm, wherein the Louvain algorithm is an algorithm for community discovery based on modularity;
step 5, dividing community groups according to the community risk probability, wherein three risk probability level thresholds $p_{low}$, $p_{mid}$ and $p_{high}$ are set: a community whose risk probability is less than or equal to $p_{low}$ belongs to the low risk probability community group; a community whose risk probability is greater than $p_{low}$ and less than or equal to $p_{mid}$ belongs to the medium risk probability community group; and a community whose risk probability is greater than $p_{mid}$ and less than or equal to $p_{high}$ belongs to the high risk probability community group;
step 6, calculating the risk score of each bill client, wherein the community risk level obtained in step 5 is taken as one dimension of the client risk evaluation and is combined with the other risk indexes of the bill client and their corresponding weights to obtain the comprehensive risk score of the client;
step 7, calculating the risk level of each bill client, wherein the risk score of each bill client is converted into a score of 0-100 according to the mapping relation and divided into 10 risk levels $level_1, level_2, \dots, level_{10}$;
step 8, selecting clients whose risk scores are higher than a limit value to generate an abnormal client list, wherein the clients higher than the set risk threshold $level_{risk}$ form the abnormal client list;
step 9, issuing an abnormal client list to the operation institution for verification and confirmation;
and step 10, calculating a model identification effect according to the feedback result of the operation institution.
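Steps 5 to 8 can be sketched as follows. This is a minimal illustration rather than the claimed implementation: the threshold values, the weighted-sum form of the comprehensive score, the linear 0-100 mapping, the equal-width split into 10 levels and all function names are assumptions, since the claim does not fix them.

```python
from typing import Dict, List

# Hypothetical thresholds; the claim names p_low, p_mid, p_high but gives no values.
P_LOW, P_MID, P_HIGH = 0.2, 0.5, 1.0


def community_group(risk_probability: float) -> str:
    """Step 5: assign a community to the low, medium or high risk group."""
    if risk_probability <= P_LOW:
        return "low"
    if risk_probability <= P_MID:
        return "medium"
    if risk_probability <= P_HIGH:
        return "high"
    raise ValueError("risk probability exceeds p_high")


def composite_score(indicators: Dict[str, float], weights: Dict[str, float]) -> float:
    """Step 6: weighted sum of risk indicators (assumed form of the score)."""
    return sum(weights[name] * value for name, value in indicators.items())


def to_percent(score: float, lowest: float, highest: float) -> float:
    """Step 7a: linear mapping of a raw score onto 0-100 (assumed mapping)."""
    if highest == lowest:
        return 0.0
    return 100.0 * (score - lowest) / (highest - lowest)


def risk_level(score_0_100: float) -> int:
    """Step 7b: equal-width levels 1..10; a score of exactly 100 falls in level 10."""
    return min(int(score_0_100 // 10) + 1, 10)


def abnormal_clients(levels: Dict[str, int], level_risk: int) -> List[str]:
    """Step 8: clients whose level is strictly above the threshold level_risk."""
    return sorted(client for client, level in levels.items() if level > level_risk)


# Illustrative levels only, not data from the document.
levels = {"client_a": 9, "client_b": 3, "client_c": 8}
flagged = abnormal_clients(levels, level_risk=7)
```

In practice the thresholds and weights would be tuned against the verification feedback gathered in steps 9 and 10.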
2. The method of claim 1, wherein the nodes in step 1 are of three types, namely bill client nodes, correspondent nodes and reconciliation IP nodes.
3. The method of claim 1, wherein the modularity is calculated as:
$$Q = \frac{1}{2m}\sum_{v,w}\left[A_{vw} - \frac{k_v k_w}{2m}\right]\delta(c_v, c_w)$$
where m is the number of connections in the network; v and w are any two nodes in the network, with $A_{vw}=1$ when there is a connection between them and $A_{vw}=0$ otherwise; $k_w$ represents the degree of node w; and $\delta(c_v, c_w)$ indicates whether nodes v and w are in the same community, with $\delta(c_v, c_w)=1$ if so and $\delta(c_v, c_w)=0$ otherwise;
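The calculation formula of the modularity increment is as follows:
$$\Delta Q = \left[\frac{\Sigma_{in} + k_{i,in}}{2m} - \left(\frac{\Sigma_{tot} + k_i}{2m}\right)^2\right] - \left[\frac{\Sigma_{in}}{2m} - \left(\frac{\Sigma_{tot}}{2m}\right)^2 - \left(\frac{k_i}{2m}\right)^2\right]$$
wherein $\Sigma_{in}$ is the number of edges in community c; $\Sigma_{tot}$ is the sum of the degrees of the nodes within community c; $k_i$ is the degree of node i; and $k_{i,in}$ is the number of connections between node i and the nodes within community c.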
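The modularity of a given partition can be evaluated with a short sketch such as the one below. It assumes the standard pairwise definition of modularity for an unweighted, undirected graph without self-loops or duplicate edges, and it sums over all ordered node pairs in the same community; the function name and the example graph are hypothetical.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]


def modularity(edges: List[Edge], community: Dict[int, int]) -> float:
    """Q = (1/2m) * sum over ordered pairs (v, w) of [A_vw - k_v*k_w/(2m)] * delta(c_v, c_w).

    `community` maps every node to its community label.
    """
    m = len(edges)
    linked = set()
    degree = {v: 0 for v in community}
    for v, w in edges:
        linked.add((v, w))
        linked.add((w, v))
        degree[v] += 1
        degree[w] += 1
    total = 0.0
    for v in community:
        for w in community:
            if community[v] == community[w]:  # delta(c_v, c_w) = 1
                a_vw = 1 if (v, w) in linked else 0
                total += a_vw - degree[v] * degree[w] / (2 * m)
    return total / (2 * m)


# Two triangles joined by the single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
one_group = {v: 0 for v in range(6)}
q_two = modularity(edges, two_groups)
q_one = modularity(edges, one_group)
```

Splitting the example graph into its two triangles gives a higher modularity than placing all six nodes in one community, which is the signal that modularity-based community discovery relies on.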
4. The method according to claim 1, characterized in that the Louvain algorithm is divided into three stages, respectively:
stage 1, each node is first assigned to its own community, so that a network with n nodes has n communities, and the modularity $Q_0$ at this moment is calculated; node i is then removed from its community $c_i$ and placed in the community of node j, the modularity $Q_1$ at this moment is calculated, and the modularity gain $\Delta Q = Q_1 - Q_0$ is obtained; if $\Delta Q > 0$, node i is assigned to the community where j is located, and otherwise it is not;
stage 2, the communities obtained in stage 1 are each aggregated into a single node, and the whole network is reconstructed;
stage 3, the iteration stops automatically when the modularity no longer increases.
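Stage 1 can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the claimed implementation: node j is restricted to neighbors of node i, the move with the largest positive gain is taken, the gain is obtained by recomputing the modularity from scratch ($\Delta Q = Q_1 - Q_0$), and the aggregation of stage 2 is omitted. All names are hypothetical.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int]


def modularity(edges: List[Edge], community: Dict[int, int]) -> float:
    """Pairwise modularity of an unweighted, undirected graph."""
    m = len(edges)
    linked = set()
    degree = {v: 0 for v in community}
    for v, w in edges:
        linked.add((v, w))
        linked.add((w, v))
        degree[v] += 1
        degree[w] += 1
    total = 0.0
    for v in community:
        for w in community:
            if community[v] == community[w]:
                total += (1 if (v, w) in linked else 0) - degree[v] * degree[w] / (2 * m)
    return total / (2 * m)


def local_moving(edges: List[Edge], nodes: List[int]) -> Dict[int, int]:
    """Stage 1: move nodes between communities while the modularity increases."""
    community = {v: v for v in nodes}  # n nodes start in n communities
    neighbors = {v: set() for v in nodes}
    for v, w in edges:
        neighbors[v].add(w)
        neighbors[w].add(v)
    improved = True
    while improved:
        improved = False
        for i in nodes:
            q0 = modularity(edges, community)
            best_gain, best_label = 0.0, community[i]
            for j in sorted(neighbors[i]):
                if community[j] == community[i]:
                    continue
                trial = dict(community)
                trial[i] = community[j]
                gain = modularity(edges, trial) - q0  # delta Q = Q1 - Q0
                if gain > best_gain + 1e-12:
                    best_gain, best_label = gain, community[j]
            if best_label != community[i]:
                community[i] = best_label
                improved = True
    return community


# Two triangles joined by the single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
partition = local_moving(edges, nodes=list(range(6)))
```

Recomputing the full modularity for every trial move keeps the sketch close to the wording of the claim but is slow on large graphs; production implementations evaluate the gain incrementally instead.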
5. The method of claim 1, wherein the risk probability calculation formula for the community is as follows:
$$P(c_i) = \frac{n_{risk}}{n_{risk} + n_{norisk}}$$
wherein $c_i$ is community i, $n_{risk}$ is the number of clients in community $c_i$ that have been marked as anomalous clients, and $n_{norisk}$ is the number of clients in community $c_i$ that have not been marked as anomalous clients.
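A minimal sketch of the community risk probability follows. It assumes the probability is the share of marked clients among all clients of a community, that is n_risk / (n_risk + n_norisk); the function name, the input layout and the example data are hypothetical.

```python
from collections import Counter
from typing import Dict, Set


def community_risk_probability(community: Dict[str, int],
                               marked_abnormal: Set[str]) -> Dict[int, float]:
    """Return n_risk / (n_risk + n_norisk) for every community label.

    `community` maps each client to its community label, and
    `marked_abnormal` holds the clients already marked as anomalous.
    """
    n_risk: Counter = Counter()
    n_norisk: Counter = Counter()
    for client, label in community.items():
        if client in marked_abnormal:
            n_risk[label] += 1
        else:
            n_norisk[label] += 1
    labels = set(n_risk) | set(n_norisk)
    return {c: n_risk[c] / (n_risk[c] + n_norisk[c]) for c in labels}


# Illustrative data only, not data from the document.
community = {"a": 1, "b": 1, "c": 1, "d": 1, "e": 2, "f": 2}
probabilities = community_risk_probability(community, marked_abnormal={"a", "e"})
```

Every community that appears in the input has at least one client, so the denominator is never zero.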
7. A system for identifying abnormal bill clients based on community discovery, characterized by comprising a data acquisition module, a data processing module, an algorithm module, a logic module and a display module;
the data acquisition module is used for acquiring bill client information;
the algorithm module obtains the corresponding index information by applying the matching algorithm to the acquired bill client information;
the logic module is used for performing logic judgment on the index information and for screening and rejecting it;
the display module is used for displaying the judged index information to the management institution.
8. A terminal device, comprising a processor, a storage medium and a bus, wherein the storage medium stores machine-readable instructions executable by the processor; when the terminal device is operating, the processor and the storage medium communicate over the bus, and the processor executes the machine-readable instructions to perform the steps of the method of any of claims 1 to 6.
9. A storage medium, characterized in that the storage medium has a computer program stored thereon,
the computer program is executed by a processor to perform the steps of the method according to any of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211549331.1A CN116167865A (en) | 2022-12-05 | 2022-12-05 | Community discovery-based bill abnormal customer identification method, system, terminal equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211549331.1A CN116167865A (en) | 2022-12-05 | 2022-12-05 | Community discovery-based bill abnormal customer identification method, system, terminal equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116167865A (en) | 2023-05-26 |
Family
ID=86410095
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211549331.1A Pending CN116167865A (en) | 2022-12-05 | 2022-12-05 | Community discovery-based bill abnormal customer identification method, system, terminal equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116167865A (en) |
2022
- 2022-12-05 CN CN202211549331.1A patent/CN116167865A/en active Pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||