CA3150593A1 - Method for identifying underground industry entities and system thereof - Google Patents
Method for identifying underground industry entities and system thereof
- Publication number
- CA3150593A1
- Authority
- CA
- Canada
- Prior art keywords
- data
- underground industry
- underground
- industry entity
- enterprise customer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/03—Credit; Loans; Processing thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Business, Economics & Management (AREA)
- Physics & Mathematics (AREA)
- Accounting & Taxation (AREA)
- Finance (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- General Business, Economics & Management (AREA)
- Artificial Intelligence (AREA)
- Strategic Management (AREA)
- Marketing (AREA)
- Economics (AREA)
- Development Economics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Technology Law (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The present invention discloses a method for identifying underground industry entities and a system thereof, which belong to the technical field of Internet-based fintech and are designed to enhance accuracy and efficiency in identifying underground industry entities. The method includes: collecting underground industry data, and cleaning the underground industry data to obtain valid data containing underground industry entity information; tagging the valid data according to an underground industry entity classification table, so as to obtain tag data; associating and matching the underground industry entity information in the valid data to and with enterprise customer data, and outputting an underground industry entity recognition result from the enterprise customer data, wherein the underground industry entity recognition result includes an underground industry entity recorded in the enterprise customer data and said tag data corresponding thereto. The system is to be used in the method.
Description
METHOD FOR IDENTIFYING UNDERGROUND INDUSTRY ENTITIES AND SYSTEM THEREOF
BACKGROUND OF THE INVENTION
Technical Field
[0001] The present invention relates to the technical field of Internet-based fintech, and more particularly to a method for identifying underground industry entities and a system thereof.
Description of Related Art
[0002] The Internet-based underground financial industry is built on customers of financial credit products who have adverse credit classifications. These people, such as migrant workers and students, usually have difficulty getting loans through banks or other regular channels. Some of them lack credit awareness and are keen on petty advantages; they tend to obtain loans and cash out fraudulently with no intention to repay. To serve this group, underground dealers have sprung up on the Internet to provide services related to malicious acts such as cashing out, arbitrage, and identity tampering. The underground dealers usually advertise these services on the Internet through various forums, online communities, Weibo, official accounts, and so on, in an attempt to attract the attention of their target customers. Therefore, by scraping web content that reflects the dynamics of the underground industry and monitoring posts from the related dealers, it is possible to warn banks and financial service providers of these suspected customers so that they can reject the loan applications in time and reduce losses. Hence, how to use technical means to identify underground industry entities effectively has become a pressing need in the credit industry.
Date Recue/Date Received 2022-03-01
SUMMARY OF THE INVENTION
[0003] The objective of the present invention is to provide a method for identifying underground industry entities and a system thereof for enhancing accuracy and efficiency of identifying underground industry entities. In the context of the present disclosure, the underground industry entities can also be referred to as suspected fraudulent entities.
[0004] To achieve the foregoing objective, in a first aspect, the present invention provides an anti-fraud method for identifying underground industry entities, comprising:
[0005] collecting underground industry data, and cleaning the underground industry data to obtain valid data containing underground industry entity information;
[0006] classifying and tagging the valid data according to an underground industry entity classification table, so as to obtain tag data; and
[0007] associating and matching the underground industry entity information in the valid data to and with enterprise customer data, and outputting an underground industry entity recognition result in the enterprise customer data, wherein the underground industry entity recognition result includes potential risk entities recorded in the enterprise customer data and said tag data corresponding thereto.
[0008] Preferably, the step of collecting underground industry data, and cleaning the underground industry data to obtain valid data containing underground industry entity information comprises:
[0009] having the collected underground industry data include user IDs, content details, data sources, link addresses, and publication times, in which the content details include the underground industry entity information, and optionally include terminal identification numbers and/or login IP addresses; and
[0010] cleaning the underground industry data using a predetermined regular expression, and extracting the valid data containing the underground industry entity information.
[0011] More preferably, the step of classifying and tagging the valid data according to an underground industry entity classification table, so as to obtain tag data comprises:
[0012] having the underground industry entity classification table include plural entries of said tag data, and having each said entry of the tag data include plural keywords;
[0013] performing word segmentation on the valid data and matching segments of the valid data with keywords corresponding to respective tag data in a one-to-one manner; and
[0014] counting the number of matchings between segments in the valid data and the keywords corresponding to the tag data, and selecting the entry of the tag data with the greatest number of matchings as the tag data of the valid data.
[0015] Further, the step of associating and matching the underground industry entity information in the valid data to and with enterprise customer data, and outputting an underground industry entity recognition result from the enterprise customer data comprises:
[0016] associating and matching the underground industry entity information to and with the enterprise customer data using a knowledge graph, so as to identify association relationship between a loaning entity recorded in the enterprise customer data and the underground industry entity information, in which the association relationship includes an association level and an associated node count; and
[0017] identifying the potential risk entities according to the association relationship through matching, and outputting the risk entities and corresponding said tag data in an associated manner, so as to obtain the underground industry entity recognition result.
[0018] Preferably, the method further comprises:
[0019] constructing mapping relationship between the tag data and risk levels, in which fraud probability levels of the risk levels correspond to grey accounts, high-risk accounts, black accounts and highly-black accounts from low to high; and
[0020] when the underground industry entity recognition result is output, outputting the corresponding risk level.
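The mapping between tag data and risk levels can be pictured as a simple lookup. The sketch below is illustrative only: the four level names come from the text, but which tag maps to which level is not disclosed, so the assignments in `TAG_TO_RISK`, the result dictionary layout, and the function name `with_risk_level` are hypothetical.

```python
from enum import IntEnum


class RiskLevel(IntEnum):
    """Risk levels ordered by fraud probability, from low to high."""
    GREY = 1
    HIGH_RISK = 2
    BLACK = 3
    HIGHLY_BLACK = 4


# Hypothetical assignment of tags to levels; the source does not specify it.
TAG_TO_RISK = {
    "deal hunting": RiskLevel.GREY,
    "capital thirsty": RiskLevel.HIGH_RISK,
    "gambling": RiskLevel.BLACK,
    "credit cultivation intermediary": RiskLevel.HIGHLY_BLACK,
}


def with_risk_level(result):
    """Return a copy of a recognition result with its risk level attached.

    `result` is assumed to be a dict with at least a "tag" key.
    Unknown tags yield a risk level of None rather than a guess.
    """
    level = TAG_TO_RISK.get(result["tag"])
    return {**result, "risk_level": level.name if level is not None else None}
```

Using an ordered enumeration keeps the "low to high" ordering explicit, so two results can be compared by level if needed.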
[0021] Preferably, the method further comprises:
[0022] training a risk rating model using a PageRank algorithm according to the association relationship between the loaning entity in the enterprise customer data and the underground industry entity information;
[0023] when the underground industry entity recognition result is output, using the risk rating model to perform risk rating.
[0024] As compared to the prior art, the anti-fraud method for identifying underground industry entities of the present invention has the following beneficial effects:
[0025] The method for identifying underground industry entities of the present invention uses data collection technology to collect underground industry data, namely the contents of posts or replies about cashing out, arbitrage, and identity tampering on social media such as popular forums and online communities, and cleans the underground industry data so as to obtain valid data containing underground industry entity information. The valid data are subsequently classified and tagged according to a pre-configured underground industry entity classification table, so as to generate corresponding tag data. Finally, the underground industry entity information is associated to and matched with enterprise customer data, so as to generate an underground industry entity recognition result from the enterprise customer data.
[0026] It is thus clear that, as compared to the manual webscraping approach in the prior art, the method of the present invention collects underground industry data automatically in a real-time manner, thereby ensuring timeliness and efficiency in collecting the underground industry data. Additionally, the foregoing procedures make identification of underground industry entities programmable and automated, thereby enhancing accuracy and efficiency in identifying underground industry entities.
[0027] In a second aspect, the present invention provides an anti-fraud system for identifying underground industry entities to be used in the anti-fraud method for identifying underground industry entities of the previous technical scheme. The system comprises:
[0028] a collecting unit, for collecting underground industry data, and cleaning the underground industry data to obtain valid data containing underground industry entity information;
[0029] a processing unit, for classifying and tagging the valid data according to an underground industry entity classification table, so as to obtain tag data; and
[0030] an identifying unit, for associating and matching the underground industry entity information in the valid data to and with enterprise customer data, and outputting an underground industry entity recognition result from the enterprise customer data, wherein the underground industry entity recognition result includes potential risk entities recorded in the enterprise customer data and said tag data corresponding thereto.
[0031] Preferably, the processing unit comprises:
[0032] a table-constructing module, for constructing the underground industry entity classification table, wherein the underground industry entity classification table includes plural entries of said tag data, and each said entry of the tag data includes plural keywords;
[0033] a matching module, for performing word segmentation on the valid data and matching segments of the valid data with corresponding keywords in a one-to-one manner; and
[0034] a selecting module, for counting the number of matchings between segments in the valid data and the keywords corresponding to the tag data, and selecting the entry of the tag data with the greatest number of matchings as the tag data of the valid data.
[0035] More preferably, the identifying unit comprises:
[0036] a managing module, for associating and matching the underground industry entity information to and with the enterprise customer data using a knowledge graph, so as to identify association relationship between a loaning entity recorded in the enterprise customer data and the underground industry entity information, in which the association relationship includes an association level and an associated node count; and
[0037] an identifying module, for identifying the potential risk entities in the enterprise customer data according to the association relationship through matching, and outputting the risk entities and corresponding said tag data in an associated manner, so as to obtain the underground industry entity recognition result.
[0038] As compared to the prior art, the anti-fraud system for identifying underground industry entities of the present invention provides beneficial effects that are similar to those provided by the disclosed anti-fraud method for identifying underground industry entities as enumerated above, and thus no repetitions are made herein.
[0039] In a third aspect, the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, performs the steps of the method for identifying underground industry entities as described above.
[0040] As compared to the prior art, the computer-readable storage medium of the present invention provides beneficial effects that are similar to those provided by the disclosed method for identifying underground industry entities as enumerated above, and thus no repetitions are made herein.
BRIEF DESCRIPTION OF THE DRAWINGS
[0041] The accompanying drawings are provided herein for better understanding of the present invention and form a part of this disclosure. The illustrative embodiments and their descriptions are for explaining the present invention and by no means form any undue limitation to the present invention, wherein:
[0042] FIG. 1 is a flowchart of a method for identifying underground industry entities according to one embodiment of the present invention; and
[0043] FIG. 2 is a processing sequence diagram of a method for identifying underground industry entities according to one embodiment of the present invention.
DETAILED DESCRIPTION OF THE INVENTION
[0044] To make the foregoing objectives, features, and advantages of the present invention clearer and more understandable, the following description will be directed to some embodiments as depicted in the accompanying drawings to detail the technical schemes disclosed in these embodiments. It is, however, to be understood that the embodiments referred to herein are only a part of all possible embodiments and thus not exhaustive. Based on the embodiments of the present invention, all the other embodiments can be conceived without creative labor by people of ordinary skill in the art, and all these and other embodiments shall be embraced in the scope of the present invention.
[0045] Embodiment 1
[0046] Referring to FIG. 1, the present embodiment provides a method for identifying underground industry entities, comprising:
[0047] collecting underground industry data, and cleaning the underground industry data to obtain valid data containing underground industry entity information; tagging the valid data according to an underground industry entity classification table, so as to obtain tag data; associating and matching the underground industry entity information in the valid data to and with enterprise customer data, and outputting an underground industry entity recognition result from the enterprise customer data.
[0048] The method for identifying underground industry entities of the present embodiment uses data collection technology to collect underground industry data, namely the contents of posts or replies about cashing out, arbitrage, and identity tampering from social media such as popular forums and online communities, and cleans the underground industry data so as to obtain valid data containing underground industry entity information. The valid data are subsequently classified and tagged according to a pre-configured underground industry entity classification table, so as to generate corresponding tag data. Finally, the underground industry entity information is associated to and matched with enterprise customer data, so as to generate an underground industry entity recognition result from the enterprise customer data. The underground industry entity recognition result includes potential risk entities in the enterprise customer data and the corresponding tag data.
[0049] It is thus clear that, as compared to the manual webscraping approach in the prior art, the method of the present embodiment collects underground industry data automatically in a real-time manner, thereby ensuring timeliness and efficiency in collecting the underground industry data. Additionally, the foregoing procedures make identification of underground industry entities programmable and automated, thereby enhancing accuracy and efficiency in identifying underground industry entities.
[0050] In the embodiment, the step of collecting underground industry data, and cleaning the underground industry data to obtain valid data containing underground industry entity information comprises:
[0051] having the collected underground industry data include user IDs, content details, data sources, link addresses, and publication times, in which the content details include the underground industry entity information, and optionally include terminal identification numbers and/or login IP addresses; and cleaning the underground industry data using a predetermined regular expression, and extracting the valid data containing the underground industry entity information.
[0052] In particular implementations, the underground industry data sources include contents of posts and replies on social media such as popular forums and online communities. Collection of the underground industry data shall be made on a real-time basis. Popular online communities may include quit gambling communities, financial intermediation communities, credit cultivation communities, credit expansion communities, clinical volunteer communities, Internet loan communities, and deal hunting communities. Popular forums may include Zuanke8, Chia-He-Jun Forums, Card God Net, 51 Credit Card Forum, and Card Farmer Forum. Fields in the collected underground industry data may include user IDs, content details, data sources, link addresses, publication times, terminal identification numbers, login IP addresses, or other valid data that reflect identities of underground industry entities and contents published by the underground dealers. Specifically, data cleaning is achieved by using a predetermined regular expression to clean the collected underground industry text data. The extracted underground industry entity information may include mobile phone numbers, WeChat accounts, QQ accounts, QQ groups, email addresses and the like. In practical use, the present embodiment further provides a function for business personnel to edit or process the underground industry text data again so as to manually repair or correct data not extracted at all or not correctly extracted by the algorithm.
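The regular-expression cleaning described above might look roughly like the following sketch. The source only says "a predetermined regular expression", so every pattern here is an assumption: an 11-digit mainland-style mobile number, a generic email address, and QQ or WeChat identifiers that are explicitly labeled with a colon. The record layout (a dict with a `content` field) and the function names are likewise hypothetical.

```python
import re

# Hypothetical patterns; the actual expressions are not disclosed in the text.
PATTERNS = {
    "mobile": re.compile(r"(?<!\d)1[3-9]\d{9}(?!\d)"),
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "qq": re.compile(r"QQ\s*[::]\s*([1-9]\d{4,10})(?!\d)", re.IGNORECASE),
    "wechat": re.compile(
        r"(?:WeChat|VX)\s*[::]\s*([A-Za-z][A-Za-z0-9_-]{5,19})", re.IGNORECASE
    ),
}


def extract_entities(text):
    """Return {kind: [values]} for every pattern that matches `text`."""
    found = {}
    for kind, pattern in PATTERNS.items():
        values = []
        for match in pattern.finditer(text):
            # Labeled patterns capture the identifier in group 1;
            # unlabeled patterns use the whole match.
            value = match.group(1) if pattern.groups else match.group(0)
            if value not in values:
                values.append(value)
        if values:
            found[kind] = values
    return found


def clean(records):
    """Keep only records whose content yields at least one entity."""
    valid = []
    for record in records:
        entities = extract_entities(record["content"])
        if entities:
            valid.append({**record, "entities": entities})
    return valid
```

Records that yield nothing are dropped here; in the workflow described above they could instead be queued for manual review by business personnel.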
[0053] In the embodiment, the step of tagging the valid data according to an underground industry entity classification table, so as to obtain tag data comprises:
[0054] having the underground industry entity classification table include plural entries of said tag data, and having each said entry of the tag data include plural keywords; performing word segmentation on the valid data and matching segments of the valid data with keywords corresponding to respective tag data in a one-to-one manner; and counting the number of matchings between segments in the valid data and the keywords corresponding to the tag data, and selecting the entry of the tag data with the greatest number of matchings as the tag data of the valid data.
[0055] In particular implementations, the underground industry entity classification table may be manually configured by business personnel. The table contains plural tag data classes, each corresponding to plural keywords. For example, the tag data classes may include "capital thirsty," "loan intermediary," "credit cultivation intermediary," "gambling," and "deal hunting." The keywords may include "repaid some," "credit checking," "sports gambling," "passing immediately," "white account," "get rid of debt," "cornered," "blacklist," and "overdue debt." The keywords are grouped into corresponding tag data classes and form the underground industry entity classification table. In addition, the embodiment further provides a function for business personnel to edit and process the underground industry entity classification table again so as to manually repair and correct the tag data classes and keywords not accurately grouped.
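The tagging rule (count keyword matches per tag, then pick the tag with the most matches) can be sketched as follows. This is a minimal illustration under stated assumptions: the grouping of keywords under tags in `CLASSIFICATION_TABLE` is hypothetical because the text lists tags and keywords without saying which belongs to which, the input is assumed to be already segmented by some external word segmenter, and ties are broken by table order, which the text does not specify.

```python
from collections import Counter

# Hypothetical grouping of the example keywords under the example tags.
CLASSIFICATION_TABLE = {
    "capital thirsty": {"cornered", "overdue debt", "get rid of debt", "blacklist"},
    "credit cultivation intermediary": {
        "credit checking", "white account", "passing immediately",
    },
    "gambling": {"sports gambling"},
}


def count_matches(segments, table):
    """Count, per tag, how many segments equal one of that tag's keywords."""
    counts = Counter()
    for segment in segments:
        for tag, keywords in table.items():
            if segment in keywords:
                counts[tag] += 1
    return counts


def select_tag(segments, table):
    """Return the tag with the most keyword matches, or None if none match."""
    counts = count_matches(segments, table)
    if not counts:
        return None
    # max() returns the first maximal item, so ties follow table order.
    return max(table, key=lambda tag: counts[tag])
```

For example, `select_tag(["cornered", "overdue debt", "sports gambling"], CLASSIFICATION_TABLE)` yields "capital thirsty", since two segments match that tag and only one matches "gambling".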
[0056] In the embodiment, the step of associating and matching the underground industry entity information in the valid data to and with enterprise customer data, and outputting an underground industry entity recognition result from the enterprise customer data comprises:
[0057] associating and matching the underground industry entity information to and with the enterprise customer data using a knowledge graph, so as to identify association relationship between a loaning entity recorded in the enterprise customer data and the underground industry entity information, in which the association relationship includes an association level and an associated node count; and identifying the potential risk entities according to the association relationship through matching, and outputting the risk entities and corresponding said tag data in an associated manner, so as to obtain the underground industry entity recognition result.
[0058] In particular implementations, the underground industry entity information and the enterprise customer data are associated and matched using a knowledge graph. The basic analysis capability provided by the knowledge graph platform can sort and integrate the data and find association relationships within the data. Common association relationships may include sharing a WeChat account, logging in with the same device, logging in with the same IP address, and having the same mobile phone number. The internal knowledge graph platform integrates, associates and fuses the underground industry data, so as to identify customers with potential risks from the enterprise customer data.
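One way to picture the association step is a breadth-first search over an undirected graph whose nodes are customers and shared identifiers (phone numbers, devices, IP addresses, WeChat accounts). The text does not define "association level" or "associated node count", so the sketch assumes the former is the hop distance to the nearest underground identifier and the latter is the number of underground identifiers reachable within a hop limit. Node naming, the `max_level` parameter, and the function names are all hypothetical.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def associate(graph, customer, underground_nodes, max_level=3):
    """Return (association_level, associated_node_count) for a customer.

    association_level: hops to the nearest underground node, or None.
    associated_node_count: underground nodes reachable within max_level hops.
    """
    seen = {customer}
    queue = deque([(customer, 0)])
    level = None
    hits = 0
    while queue:
        node, depth = queue.popleft()
        if node in underground_nodes and node != customer:
            hits += 1
            if level is None:
                level = depth  # BFS order makes the first hit the nearest
        if depth == max_level:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, depth + 1))
    return level, hits
```

A customer whose own phone number appears in the underground data gets level 1, while a customer who merely shares an IP address with such a customer sits further away; a rule on level and count can then decide who is flagged as a potential risk entity.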
[0059] The embodiment further comprises: constructing mapping relationship between the tag data and risk levels, in which fraud probability levels of the risk levels correspond to grey accounts, high-risk accounts, black accounts and highly-black accounts from low to high; and when the underground industry entity recognition result is output, outputting the corresponding risk level. Additionally, or alternatively, the embodiment further comprises: training a risk rating model using a PageRank algorithm according to the association relationship between the loaning entity and the underground industry entity information in the enterprise customer data; when the underground industry entity recognition result is output, using the risk rating model to score the risk level.
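The text names "a PageRank algorithm" without giving details. As one plausible reading, the sketch below uses personalized PageRank, a variant in which the random walk restarts only at known underground nodes, so that scores decay with graph distance from them. The choice of this variant, the damping factor, the iteration count, and the graph representation (a dict mapping every node to the set of its neighbours) are assumptions, not the disclosed model.

```python
def personalized_pagerank(graph, seeds, damping=0.85, iterations=50):
    """Score nodes by proximity to `seeds` using power iteration.

    graph: dict mapping each node to a set of neighbours (undirected).
    seeds: nodes known to belong to the underground industry.
    Returns a dict of scores that sum to 1.
    """
    nodes = list(graph)
    seeds = [s for s in seeds if s in graph]
    if not seeds:
        raise ValueError("at least one seed must be present in the graph")
    seed_set = set(seeds)
    teleport = {n: (1.0 / len(seeds) if n in seed_set else 0.0) for n in nodes}
    rank = dict(teleport)
    for _ in range(iterations):
        # Restart mass flows back to the seeds on every step.
        nxt = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for n in nodes:
            neighbours = graph[n]
            if neighbours:
                share = damping * rank[n] / len(neighbours)
                for m in neighbours:
                    nxt[m] += share
            else:
                # Isolated node: return its mass to the seeds.
                for s in seeds:
                    nxt[s] += damping * rank[n] / len(seeds)
        rank = nxt
    return rank
```

The resulting scores could then be bucketed into the four risk levels; the thresholds for doing so are not given in the text.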
[0060] Referring to FIG. 2, for easy understanding, an example of the foregoing embodiment is described below:
[0061] Step 1 is about starting collection of underground industry data by activating a web-crawler program to collect information of contents of posts and replies posted in social media, wherein the collected data include the following fields: user IDs, content details, data sources, link addresses, and publication times.
[0062] Step 2 is about cleaning contents of the replies so as to obtain the valid data containing underground industry entity information.
[0063] Step 2-1 is about performing segmentation on the full-volume underground industry data text, and computing and sorting word frequencies from high to low.
[0064] Step 2-2 is about calling, at a front-end data service module, the word frequency computing service through a micro-service interface, and displaying the segments and corresponding word frequencies to business personnel so that they can update the underground industry entity classification table regularly.
[0065] Step 2-3 is about picking out posts whose contents contain combinations of numbers and letters that are suspected to be contact information, and synchronizing the picked data to a data processing module.
[0066] Step 3 is about processing the underground industry data at the processing module, which includes pre-processing the underground industry entity data using an algorithm and verifying and/or correcting the pre-processed data by business personnel.
[0067] Step 3-1 is about preliminarily extracting the contact information from the text of the underground industry entity information using an algorithm, and displaying the information in an underground industry entity examination page simultaneously.
[0068] Step 3-2 is about pre-processing identities in the underground industry entity information by classifying the valid data in the text using an algorithm, tagging the valid data, and displaying the data in the underground industry entity examination page simultaneously.
[0069] Step 3-3 is about verifying the underground industry valid data information. Business personnel enter the underground industry entity examination page of the system to verify and confirm the underground industry entity information that has been extracted by the program, such as mobile phone numbers, WeChat accounts, QQ accounts, QQ groups, and email addresses. If the extraction work of the system is correct, the business personnel can directly confirm the result by pressing a corresponding button in the page. If the extraction work of the system is not correct, the business personnel may edit and correct the data in the page.
[0070] For example, there is a post titled "Use WeChat Mini Program to search for YOU WANT TO LEND MONEY to get 3000," which has attracted the following replies:
[0071] User A: If you have a card from Bank of Communications, I can help you to get cash 100 for free. Come to 287765737 if you possess resources and personal connections;
[0072] User B: I will never touch this anymore. I have come all this way to get rid of debt and I swear that I will live my life well. If I get trapped next time, I doubt I will get help again. If you are now at a corner as I was, try to contact Brother Lo. Here is his contact information 9956252. Hopefully Brother Lo can help more people like me; and
[0073] User C: If you need cash, contact v1503391949.
[0074] In the above replies, the underground industry entity information is as below: User A, contact channel: QQ, No. 287765737, corresponding tag: deal-hunter; User B, contact channel: QQ, No. 9956252, corresponding tag: gambler; and User C, contact channel: WeChat/mobile phone number, No. 1503391949, corresponding tag: deal-hunter.
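The preliminary extraction applied to the three replies above might look like the sketch below. The channel rule is a guess made only to reproduce this example (a leading "v" is read as WeChat/mobile, a bare number as QQ); the disclosure does not specify the rule and leaves the final decision to business personnel. The names `Contact` and `extract_contacts` are hypothetical.

```python
import re
from typing import NamedTuple


class Contact(NamedTuple):
    channel: str
    number: str


# Assumed pattern: an optional single-letter prefix followed by 5 to 11 digits.
PATTERN = re.compile(r"(?<![A-Za-z0-9])([A-Za-z]?)(\d{5,11})(?![A-Za-z0-9])")


def extract_contacts(text):
    """Pre-extract contact candidates; the result still needs manual review."""
    contacts = []
    for prefix, digits in PATTERN.findall(text):
        if prefix.lower() == "v":
            channel = "WeChat/mobile"  # assumption: "v" hints at WeChat
        elif prefix == "":
            channel = "QQ"             # assumption: a bare number is a QQ number
        else:
            channel = "unknown"
        contacts.append(Contact(channel, digits))
    return contacts
```

Tags such as deal-hunter or gambler are not derived here; they come from the classification step.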
[0075] Step 4 is about associating the valid data to existing enterprise customer data for analysis.
[0076] Step 4-1 is about introducing the valid data into a knowledge graph.
[0077] Step 4-2 is about associating the valid data to enterprise customer data.
[0078] Step 5 is about providing underground industry data services according to the existing data.
[0079] Step 5-1 is about associating the data to an underground industry list.
Three essential factors of a user in the enterprise customer data, namely name, identity card number, and mobile phone number, are associated to the underground industry entity information, so as to identify potential risk customers from the enterprise customer data and to report the number and the levels of the associated underground industry entities.
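A direct match on the mobile phone number is the simplest form of the association in Step 5-1 and can be sketched as below. The record types, field names, and sample customers are assumptions; the disclosed system performs the association through a knowledge graph, which this sketch does not model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    name: str      # three essential factors of the enterprise customer data
    id_card: str
    mobile: str


@dataclass(frozen=True)
class Entity:
    channel: str
    number: str
    tag: str


def match_customers(customers, entities):
    """Return (customer, entity) pairs whose mobile number equals an entity number."""
    by_number = {entity.number: entity for entity in entities}
    hits = []
    for customer in customers:
        entity = by_number.get(customer.mobile)
        if entity is not None:
            hits.append((customer, entity))
    return hits


# Hypothetical data; only the number and tag are taken from the example replies.
entities = [Entity("WeChat/mobile", "1503391949", "deal-hunter")]
customers = [
    Customer("Customer X", "ID-0001", "1503391949"),
    Customer("Customer Y", "ID-0002", "1300000000"),
]
matches = match_customers(customers, entities)
```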
[0080] Step 5-2 is about calculating risk scores of the underground industry entities. According to the reported number and levels of the associated underground industry entities, a proper algorithm is selected to calculate the risk scores of the underground industry entities. Then the risk scores are output.
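The text does not name the scoring algorithm, so the following is only one possible choice, shown to make the inputs concrete: each associated entity contributes the reciprocal of its association level, so that closer and more numerous associations raise the score. The function name and the weighting are assumptions.

```python
def risk_score(association_levels):
    """Score from a list holding one association level per associated entity.

    Level 1 means a direct association; higher levels are more distant.
    The reciprocal weighting is an illustrative assumption.
    """
    if any(level < 1 for level in association_levels):
        raise ValueError("association levels must be >= 1")
    return float(sum(1.0 / level for level in association_levels))
```

For instance, one direct association and two second-level associations give 1 + 0.5 + 0.5 = 2.0, while a customer with no associations scores 0.0.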
[0081] To sum up, the present implementation has the following innovations:
[0082] 1. Full-process procedure automation for underground industry entity identification
[0083] The disclosed underground industry monitoring system provides a complete link from acquisition of data from public data sources to provision of services, which allows the process of identifying underground industry entities to be configurable. Therein, by properly configuring the tag data and the keywords, development costs can be significantly reduced and use efficiency of the system can be improved; and
[0084] 2. Risk analysis based on social relationship
[0085] Association between the underground industry data and the enterprise customer data can be built using the knowledge graph platform. By using the three essential factors of the enterprise customer data as parameters, the levels and number of underground industry entities associated with a user, as well as risk scores, can be reported and output as an underground industry entity recognition result from the enterprise customer data.
[0086] Embodiment 2
[0087] The present embodiment provides a system for identifying underground industry entities, comprising:
[0088] a collecting unit, for collecting underground industry data, and cleaning the underground industry data to obtain valid data containing underground industry entity information;
[0089] a processing unit, for classifying and tagging the valid data according to an underground industry entity classification table, so as to obtain tag data; and
[0090] an identifying unit, for associating and matching the underground industry entity information in the valid data to and with enterprise customer data, and outputting an underground industry entity recognition result from the enterprise customer data, wherein the underground industry entity recognition result includes potential risk entities recorded in the enterprise customer data and said tag data corresponding thereto.
[0091] Preferably, the processing unit comprises:
[0092] a table-constructing module, for constructing the underground industry entity classification table, having the underground industry entity classification table include plural entries of said tag data, and having each said entry of the tag data include plural keywords;
[0093] a matching module, for performing word segmentation on the valid data and matching segments of the valid data with corresponding keywords in a one-to-one manner; and
[0094] a selecting module, for counting the number of matchings between segments in the valid data and the keywords corresponding to the tag data, and selecting the entry of the tag data with the greatest number of matchings as the tag data of the valid data.
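The three modules of the processing unit can be sketched together as follows. The tags deal-hunter and gambler are taken from the example in Embodiment 1, but the keywords assigned to them, the stand-in tokenizer, and the function names are assumptions made for illustration.

```python
import re

# Hypothetical classification table: each tag entry holds plural keywords.
CLASSIFICATION_TABLE = {
    "deal-hunter": {"cash", "free", "card", "coupon"},
    "gambler": {"debt", "bet", "trapped", "casino"},
}


def segment(text):
    # Stand-in word segmentation (assumption): lowercase alphabetic runs.
    return re.findall(r"[a-z]+", text.lower())


def classify(text, table):
    """Return the tag with the greatest number of keyword matches, or None."""
    segments = segment(text)
    counts = {
        tag: sum(1 for seg in segments if seg in keywords)
        for tag, keywords in table.items()
    }
    best_tag = max(counts, key=counts.get)  # ties resolve to the earlier entry
    return best_tag if counts[best_tag] > 0 else None
```

Returning `None` when nothing matches is a design choice of this sketch; such data could simply be left for manual tagging.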
[0095] Preferably, the identifying unit comprises:
[0096] a managing module, for associating and matching the underground industry entity information to and with the enterprise customer data using a knowledge graph, so as to identify association relationship between a loaning entity recorded in the enterprise customer data and the underground industry entity information, in which the association relationship includes an association level and an associated node count;
[0097] an identifying module, for identifying the potential risk entities in the enterprise customer data according to the association relationship through matching, and outputting the risk entities and corresponding said tag data in an associated manner, so as to obtain the underground industry entity recognition result.
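One way to derive an association level and an associated node count from a graph is a breadth-first search, sketched below. Here the association level is taken to be the smallest hop count from a customer node to any underground industry entity node, and the node count is the number of such entities reachable within a maximum level. Both readings, along with the adjacency-list graph, the node labels, and the `max_level` parameter, are assumptions rather than definitions from the disclosure.

```python
from collections import deque


def associate(graph, customer, entity_nodes, max_level=3):
    """Return (association level, associated node count) for one customer node."""
    distances = {customer: 0}
    queue = deque([customer])
    while queue:
        node = queue.popleft()
        if distances[node] == max_level:
            continue  # do not expand beyond the maximum level
        for neighbour in graph.get(node, ()):
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    hits = [dist for node, dist in distances.items() if node in entity_nodes]
    if not hits:
        return None, 0
    return min(hits), len(hits)


# Hypothetical graph: a customer linked to entities through shared identifiers.
graph = {
    "cust:X": ["phone:1503391949"],
    "phone:1503391949": ["cust:X", "entity:C"],
    "entity:C": ["phone:1503391949", "qq:287765737"],
    "qq:287765737": ["entity:C", "entity:A"],
    "entity:A": ["qq:287765737"],
}
entity_nodes = {"entity:C", "entity:A"}
```

With this graph, `associate(graph, "cust:X", entity_nodes)` finds one entity at level 2, and raising `max_level` to 4 also reaches the second entity.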
[0098] As compared to the prior art, the disclosed anti-fraud system for identifying underground industry entities provides beneficial effects that are similar to those provided by the disclosed method for identifying underground industry entities as enumerated above, and thus no repetitions are made herein.
[0099] Embodiment 3
[0100] The present embodiment provides a computer-readable storage medium storing therein a computer program which, when executed by a processor, performs the steps of the method for identifying underground industry entities as described previously.
[0101] As compared to the prior art, the disclosed computer-readable storage medium provides beneficial effects that are similar to those provided by the disclosed method for identifying underground industry entities as enumerated above, and thus no repetitions are made herein.
[0102] As will be appreciated by people of ordinary skill in the art, implementation of all or a part of the steps of the method of the present invention as described previously may be realized by having a program instruct related hardware components. The program may be stored in a computer-readable storage medium and, when executed, performs the individual steps of the methods described in the foregoing embodiments. The storage medium may be a ROM/RAM, a hard drive, an optical disk, a memory card or the like.
[0103] The present invention has been described with reference to the preferred embodiments and it is understood that the embodiments are not intended to limit the scope of the present invention. Moreover, as the contents disclosed herein should be readily understood and can be implemented by a person skilled in the art, all equivalent changes or modifications which do not depart from the concept of the present invention should be encompassed by the appended claims. Hence, the scope of the present invention shall only be defined by the appended claims.
Claims (10)
1. An anti-fraud method for identifying underground industry entities, comprising:
collecting underground industry data, and cleaning the underground industry data to obtain valid data containing underground industry entity information;
classifying and tagging the valid data according to an underground industry entity classification table, so as to obtain tag data; and associating and matching the underground industry entity information in the valid data to and with enterprise customer data, and outputting an underground industry entity recognition result in the enterprise customer data, wherein the underground industry entity recognition result includes potential risk entities in the enterprise customer data and the corresponding tag data.
2. The method of claim 1, wherein the step of collecting underground industry data, and cleaning the underground industry data to obtain valid data containing underground industry entity information comprises:
the collected underground industry data include user IDs, content details, data sources, link addresses, and publication times, in which the content details include the underground industry entity information, and optionally include terminal identification numbers and/or login IP addresses; and
cleaning the underground industry data using a predetermined regular expression, and extracting the valid data containing the underground industry entity information.
3. The method of claim 1 or 2, wherein the step of classifying and tagging the valid data according to an underground industry entity classification table, so as to obtain tag data comprises:
the underground industry entity classification table includes plural entries of said tag data, and each said entry of the tag data includes plural keywords;
performing word segmentation on the valid data and matching segments of the valid data with keywords corresponding to respective tag data in a one-to-one manner; and counting the number of matchings between segments in the valid data and the keywords corresponding to the tag data, and selecting the entry of the tag data with the greatest number of matchings as the tag data of the valid data.
4. The method of claim 3, wherein the step of associating and matching the underground industry entity information in the valid data to and with enterprise customer data, and outputting an underground industry entity recognition result from the enterprise customer data comprises:
associating and matching the underground industry entity information to and with the enterprise customer data using a knowledge graph, so as to identify association relationship between a loaning entity recorded in the enterprise customer data and the underground industry entity information, in which the association relationship includes an association level and an associated node count; and identifying the potential risk entities according to the association relationship through matching, and outputting the risk entities and corresponding said tag data in an associated manner, so as to obtain the underground industry entity recognition result.
5. The method of claim 4, further comprising:
constructing mapping relationship between the tag data and risk levels, in which fraud probability of the risk levels correspond to grey accounts, high-risk accounts, black accounts and highly-black accounts from low to high; and when the underground industry entity recognition result is output, outputting the corresponding risk level simultaneously.
6. The method of claim 4, further comprising:
training a risk rating model using a PageRank algorithm according to the association relationship between the loaning entity in the enterprise customer data and the underground industry entity information;
when the underground industry entity recognition result is output, using the risk rating model to perform risk rating simultaneously.
7. An anti-fraud underground industry entity identification system, comprising:
a collecting unit, for collecting underground industry data, and cleaning the underground industry data to obtain valid data containing underground industry entity information;
a processing unit, for classifying and tagging the valid data according to an underground industry entity classification table, so as to obtain tag data; and an identifying unit, for associating and matching the underground industry entity information in the valid data to and with enterprise customer data, and outputting an underground industry entity recognition result from the enterprise customer data, wherein the underground industry entity recognition result includes potential risk entities recorded in the enterprise customer data and said tag data corresponding thereto.
8. The system of claim 7, wherein the processing unit comprises:
a table-constructing module, for constructing the underground industry entity classification table, wherein the underground industry entity classification table includes plural entries of said tag data, and each said entry of the tag data include plural keywords;
a matching module, for performing word segmentation on the valid data and matching segments of the valid data with corresponding keywords in a one-to-one manner; and a selecting module, for counting the number of matchings between segments in the valid data and the keywords corresponding to the tag data, and selecting the entry of the tag data with the greatest number of matchings as the tag data of the valid data.
9. The system of claim 7, wherein the identifying unit comprises:
a managing module, for associating and matching the underground industry entity information to and with the enterprise customer data using a knowledge graph, so as to identify association relationship between a loaning entity recorded in the enterprise customer data and the underground industry entity information, in which the association relationship includes an association level and an associated node count; and an identifying module, for identifying the potential risk entities in the enterprise customer data according to the association relationship through matching, and outputting the risk entities and corresponding said tag data in an associated manner, so as to obtain the underground industry entity recognition result.
10. A computer-readable storage medium, storing therein a computer program, wherein when executed by a processor, the computer program performs the steps of any one of claims 1 through 7.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110231361.7A CN113065943A (en) | 2021-03-02 | 2021-03-02 | Anti-fraud black product entity identification method and system |
CN202110231361.7 | 2021-03-02 |
Publications (1)
Publication Number | Publication Date |
---|---|
CA3150593A1 (en) | 2022-09-02 |
Family
ID=76559522
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CA3150593A Pending CA3150593A1 (en) | 2021-03-02 | 2022-03-01 | Method for identifying underground industry entities and system thereof |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN113065943A (en) |
CA (1) | CA3150593A1 (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115277045A (en) * | 2022-05-17 | 2022-11-01 | 广东申立信息工程股份有限公司 | IDC safety management system |
CN117688055B (en) * | 2023-11-08 | 2024-06-14 | 亿保创元(北京)信息科技有限公司 | Insurance black product identification and response system based on correlation network analysis technology |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112131275B (en) * | 2020-09-23 | 2023-07-25 | 长三角信息智能创新研究院 | Enterprise portrait construction method of holographic city big data model and knowledge graph |
CN112380531A (en) * | 2020-11-11 | 2021-02-19 | 平安科技(深圳)有限公司 | Black product group partner identification method, device, equipment and storage medium |
-
2021
- 2021-03-02 CN CN202110231361.7A patent/CN113065943A/en active Pending
-
2022
- 2022-03-01 CA CA3150593A patent/CA3150593A1/en active Pending
Also Published As
Publication number | Publication date |
---|---|
CN113065943A (en) | 2021-07-02 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
EEER | Examination request |
Effective date: 20220916 |
|