CA3150593A1 - Method for identifying underground industry entities and system thereof - Google Patents

Method for identifying underground industry entities and system thereof

Info

Publication number
CA3150593A1
CA3150593A1 CA3150593A CA3150593A CA3150593A1 CA 3150593 A1 CA3150593 A1 CA 3150593A1 CA 3150593 A CA3150593 A CA 3150593A CA 3150593 A CA3150593 A CA 3150593A CA 3150593 A1 CA3150593 A1 CA 3150593A1
Authority
CA
Canada
Prior art keywords
data
underground industry
underground
industry entity
enterprise customer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CA3150593A
Other languages
French (fr)
Inventor
Peibin Liu
Lei XIONG
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
1035744 Canada Ltd
Original Assignee
1035744 Canada Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 1035744 Canada Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Artificial Intelligence (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Technology Law (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention discloses a method for identifying underground industry entities and a system thereof, which belong to the technical field of Internet-based fintech and are designed to enhance accuracy and efficiency in identifying underground industry entities. The method includes: collecting underground industry data, and cleaning the underground industry data to obtain valid data containing underground industry entity information; tagging the valid data according to an underground industry entity classification table, so as to obtain tag data; associating and matching the underground industry entity information in the valid data to and with enterprise customer data, and outputting an underground industry entity recognition result from the enterprise customer data, wherein the underground industry entity recognition result includes an underground industry entity recorded in the enterprise customer data and said tag data corresponding thereto. The system is to be used in the method.

Description

METHOD FOR IDENTIFYING UNDERGROUND INDUSTRY ENTITIES AND SYSTEM THEREOF
BACKGROUND OF THE INVENTION
Technical Field
[0001] The present invention relates to the technical field of Internet-based fintech, and more particularly to a method for identifying underground industry entities and a system thereof.
Description of Related Art
[0002] The Internet-based underground financial industry is built on customers of financial credit products who have adverse credit classifications. These people, such as migrant workers and students, usually have difficulty getting loans through banks or other regular channels. Some of them lack credit consciousness and are keen on petty advantages; they tend to obtain loans and cash by deception, with no willingness to repay. To serve this group, underground dealers have sprung up on the Internet to provide services related to malicious acts such as cashing out, arbitrage, and identity tampering. The underground dealers usually advertise these services on the Internet through various forums, online communities, Weibo, official accounts, and so on, in an attempt to attract the attention of their target customers. Therefore, by crawling the web to track the activity of the underground industry and monitoring posts from the related dealers, it is possible to warn banks and financial service providers of these suspected customers so that they can reject the loan applications in time and reduce losses. Hence, how to use technical means to identify underground industry entities effectively has become a pressing need in the credit industry.

Date Recue/Date Received 2022-03-01
SUMMARY OF THE INVENTION
[0003] The objective of the present invention is to provide a method for identifying underground industry entities and a system thereof for enhancing accuracy and efficiency of identifying underground industry entities. In the context of the present disclosure, the underground industry entities can also be referred to as suspected fraudulent entities.
[0004] To achieve the foregoing objective, in a first aspect, the present invention provides an anti-fraud method for identifying underground industry entities, comprising:
[0005] collecting underground industry data, and cleaning the underground industry data to obtain valid data containing underground industry entity information;
[0006] classifying and tagging the valid data according to an underground industry entity classification table, so as to obtain tag data; and
[0007] associating and matching the underground industry entity information in the valid data to and with enterprise customer data, and outputting an underground industry entity recognition result in the enterprise customer data, wherein the underground industry entity recognition result includes potential risk entities recorded in the enterprise customer data and said tag data corresponding thereto.
[0008] Preferably, the step of collecting underground industry data, and cleaning the underground industry data to obtain valid data containing underground industry entity information comprises:
[0009] having the collected underground industry data include user IDs, content details, data sources, link addresses, and publication times, in which the content details include the underground industry entity information, and optionally include terminal identification numbers and/or login IP addresses; and
[0010] cleaning the underground industry data using a predetermined regular expression, and extracting the valid data containing the underground industry entity information.

[0011] More preferably, the step of classifying and tagging the valid data according to an underground industry entity classification table, so as to obtain tag data comprises:
[0012] having the underground industry entity classification table include plural entries of said tag data, and having each said entry of the tag data include plural keywords;
[0013] performing word segmentation on the valid data and matching segments of the valid data with keywords corresponding to respective tag data in a one-to-one manner; and
[0014] counting the number of matchings between segments in the valid data and the keywords corresponding to the tag data, and selecting the entry of the tag data with the greatest number of matchings as the tag data of the valid data.
[0015] Further, the step of associating and matching the underground industry entity information in the valid data to and with enterprise customer data, and outputting an underground industry entity recognition result from the enterprise customer data comprises:
[0016] associating and matching the underground industry entity information to and with the enterprise customer data using a knowledge graph, so as to identify association relationship between a loaning entity recorded in the enterprise customer data and the underground industry entity information, in which the association relationship includes an association level and an associated node count; and
[0017] identifying the potential risk entities according to the association relationship through matching, and outputting the risk entities and corresponding said tag data in an associated manner, so as to obtain the underground industry entity recognition result.
[0018] Preferably, the method further comprises:
[0019] constructing mapping relationship between the tag data and risk levels, in which fraud probability levels of the risk levels correspond to grey accounts, high-risk accounts, black accounts and highly-black accounts from low to high; and
[0020] when the underground industry entity recognition result is output, outputting the corresponding risk level.

[0021] Preferably, the method further comprises:
[0022] training a risk rating model using a PageRank algorithm according to the association relationship between the loaning entity in the enterprise customer data and the underground industry entity information;
[0023] when the underground industry entity recognition result is output, using the risk rating model to perform risk rating.
[0024] As compared to the prior art, the anti-fraud method for identifying underground industry entities of the present invention has the following beneficial effects:
[0025] The method for identifying underground industry entities of the present invention uses data collection technology to collect underground industry data, namely contents of posts or replies about cashing out, arbitrage, and identity tampering in social media such as popular forums and online communities, and cleans the underground industry data so as to obtain valid data containing underground industry entity information. The valid data are subsequently classified and tagged according to a pre-configured underground industry entity classification table, so as to generate corresponding tag data. Finally, the underground industry entity information is associated to and matched with enterprise customer data, so as to generate an underground industry entity recognition result from the enterprise customer data.
[0026] It is thus clear that, as compared to the manual webscraping approach in the prior art, the method of the present invention collects underground industry data automatically in a real-time manner, thereby ensuring timeliness and efficiency in collecting the underground industry data. Additionally, the foregoing procedures make identification of underground industry entities programmable and automated, thereby enhancing accuracy and efficiency in identifying underground industry entities.
[0027] In a second aspect, the present invention provides an anti-fraud system for identifying underground industry entities to be used in the anti-fraud method for identifying underground industry entities of the previous technical scheme. The system comprises:
[0028] a collecting unit, for collecting underground industry data, and cleaning the underground industry data to obtain valid data containing underground industry entity information;
[0029] a processing unit, for classifying and tagging the valid data according to an underground industry entity classification table, so as to obtain tag data; and
[0030] an identifying unit, for associating and matching the underground industry entity information in the valid data to and with enterprise customer data, and outputting an underground industry entity recognition result from the enterprise customer data, wherein the underground industry entity recognition result includes potential risk entities recorded in the enterprise customer data and said tag data corresponding thereto.
[0031] Preferably, the processing unit comprises:
[0032] a table-constructing module, for constructing the underground industry entity classification table, wherein the underground industry entity classification table includes plural entries of said tag data, and each said entry of the tag data includes plural keywords;
[0033] a matching module, for performing word segmentation on the valid data and matching segments of the valid data with corresponding keywords in a one-to-one manner; and
[0034] a selecting module, for counting the number of matchings between segments in the valid data and the keywords corresponding to the tag data, and selecting the entry of the tag data with the greatest number of matchings as the tag data of the valid data.
[0035] More preferably, the identifying unit comprises:
[0036] a managing module, for associating and matching the underground industry entity information to and with the enterprise customer data using a knowledge graph, so as to identify association relationship between a loaning entity recorded in the enterprise customer data and the underground industry entity information, in which the association relationship includes an association level and an associated node count; and
[0037] an identifying module, for identifying the potential risk entities in the enterprise customer data according to the association relationship through matching, and outputting the risk entities and corresponding said tag data in an associated manner, so as to obtain the underground industry entity recognition result.
[0038] As compared to the prior art, the anti-fraud system for identifying underground industry entities of the present invention provides beneficial effects that are similar to those provided by the disclosed anti-fraud method for identifying underground industry entities as enumerated above, and thus no repetitions are made herein.
[0039] In a third aspect, the present invention provides a computer-readable storage medium storing therein a computer program which, when executed by a processor, performs the steps of the method for identifying underground industry entities as described above.
[0040] As compared to the prior art, the computer-readable storage medium of the present invention provides beneficial effects that are similar to those provided by the disclosed method for identifying underground industry entities as enumerated above, and thus no repetitions are made herein.
BRIEF DESCRIPTION OF THE DRAWINGS
[0041] The accompanying drawings are provided herein for better understanding of the present invention and form a part of this disclosure. The illustrative embodiments and their descriptions are for explaining the present invention and by no means form any undue limitation to the present invention, wherein:
[0042] FIG. 1 is a flowchart of a method for identifying underground industry entities according to one embodiment of the present invention; and
[0043] FIG. 2 is a processing sequence diagram of a method for identifying underground industry entities according to one embodiment of the present invention.

DETAILED DESCRIPTION OF THE INVENTION
[0044] To make the foregoing objectives, features, and advantages of the present invention clearer and more understandable, the following description is directed to some embodiments as depicted in the accompanying drawings to detail the technical schemes disclosed in these embodiments. It is, however, to be understood that the embodiments referred to herein are only a part of all possible embodiments and are thus not exhaustive. All other embodiments that people of ordinary skill in the art can conceive without creative labor on the basis of the embodiments of the present invention shall be embraced in the scope of the present invention.
[0045] Embodiment 1
[0046] Referring to FIG. 1, the present embodiment provides a method for identifying underground industry entities, comprising:
[0047] collecting underground industry data, and cleaning the underground industry data to obtain valid data containing underground industry entity information; tagging the valid data according to an underground industry entity classification table, so as to obtain tag data; associating and matching the underground industry entity information in the valid data to and with enterprise customer data, and outputting an underground industry entity recognition result from the enterprise customer data.
[0048] The method for identifying underground industry entities of the present embodiment uses data collection technology to collect underground industry data, namely contents of posts or replies about cashing out, arbitrage, and identity tampering from social media such as popular forums and online communities, and cleans the underground industry data so as to obtain valid data containing underground industry entity information. The valid data are subsequently classified and tagged according to a pre-configured underground industry entity classification table, so as to generate corresponding tag data. Finally, the underground industry entity information is associated to and matched with enterprise customer data, so as to generate an underground industry entity recognition result from the enterprise customer data. The underground industry entity recognition result includes potential risk entities in the enterprise customer data and the corresponding tag data.
[0049] It is thus clear that, as compared to the manual webscraping approach in the prior art, the method of the present embodiment collects underground industry data automatically in a real-time manner, thereby ensuring timeliness and efficiency in collecting the underground industry data. Additionally, the foregoing procedures make identification of underground industry entities programmable and automated, thereby enhancing accuracy and efficiency in identifying underground industry entities.
[0050] In the embodiment, the step of collecting underground industry data, and cleaning the underground industry data to obtain valid data containing underground industry entity information comprises:
[0051] having the collected underground industry data include user IDs, content details, data sources, link addresses, and publication times, in which the content details include the underground industry entity information, and optionally include terminal identification numbers and/or login IP addresses; and cleaning the underground industry data using a predetermined regular expression, and extracting the valid data containing the underground industry entity information.
[0052] In particular implementations, the underground industry data sources include contents of posts and replies in social media like popular forums and online communities. Collection of the underground industry data shall be made on a real-time basis. Popular online communities may include quit gambling communities, financial intermediation communities, credit cultivation communities, credit expansion communities, clinical volunteer communities, Internet loan communities, and deal hunting communities. Popular forums may include Zuanke8, Chia-He-Jun Forums, Card God Net, 51 Credit Card Forum, and Card Farmer Forum. Fields in the collected underground industry data may include user IDs, content details, data sources, link addresses, publication times, terminal identification numbers, login IP addresses, or other valid data that reflect identities of underground industry entities and contents published by the underground dealers. Specifically, data cleaning is achieved by using a predetermined regular expression to clean the collected underground industry text data. The extracted underground industry entity information may include mobile phone numbers, WeChat accounts, QQ accounts, QQ groups, email addresses and the like. In practical use, the present embodiment further provides a function for business personnel to edit or process the underground industry text data again so as to manually repair or correct data not extracted at all or not correctly extracted by the algorithm.
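The regular-expression cleaning described in [0052] might be sketched as follows. This is a minimal Python illustration under stated assumptions, not the disclosed implementation: the source says only that "a predetermined regular expression" is used, so the patterns in PATTERNS, the record field name content, and the helper names extract_contacts and clean are all hypothetical. WeChat accounts are left out because their format is not described in the text.

```python
import re

# Hypothetical patterns (assumptions, not taken from the source).
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # Assumed shape of an 11-digit mobile number starting with 1.
    "mobile": re.compile(r"(?<!\d)1[3-9]\d{9}(?!\d)"),
    # Assumed shape of a QQ number: 5 to 10 digits, no leading zero.
    "qq": re.compile(r"(?<!\d)[1-9]\d{4,9}(?!\d)"),
}


def extract_contacts(text):
    """Return a dict mapping contact kind to the list of matches in text."""
    found = {}
    remaining = text
    # Most specific patterns first; matched spans are blanked out so that
    # an email or mobile number is not counted again as a QQ number.
    for kind in ("email", "mobile", "qq"):
        hits = PATTERNS[kind].findall(remaining)
        if hits:
            found[kind] = hits
            remaining = PATTERNS[kind].sub(" ", remaining)
    return found


def clean(records):
    """Keep only records whose content yields contact information."""
    valid = []
    for rec in records:
        contacts = extract_contacts(rec["content"])
        if contacts:
            valid.append({**rec, "contacts": contacts})
    return valid
```

Records that yield no match are dropped here; in the workflow described above they would instead be candidates for manual correction by business personnel.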
[0053] In the embodiment, the step of tagging the valid data according to an underground industry entity classification table, so as to obtain tag data comprises:
[0054] having the underground industry entity classification table include plural entries of said tag data, and having each said entry of the tag data include plural keywords; performing word segmentation on the valid data and matching segments of the valid data with keywords corresponding to respective tag data in a one-to-one manner; and counting the number of matchings between segments in the valid data and the keywords corresponding to the tag data, and selecting the entry of the tag data with the greatest number of matchings as the tag data of the valid data.
[0055] In particular implementations, the underground industry entity classification table may be manually configured by business personnel. The table contains plural tag data classes, each corresponding to plural keywords. For example, the tag data classes may include "capital thirsty," "loan intermediary," "credit cultivation intermediary," "gambling," and "deal hunting." The keywords may include "repaid some," "credit checking," "sports gambling," "passing immediately," "white account," "get rid of debt," "cornered," "blacklist," and "overdue debt." The keywords are grouped into corresponding tag data classes and form the underground industry entity classification table. Besides, the embodiment further provides a function for business personnel to edit and process the underground industry entity classification table again so as to manually repair and correct the tag data classes and keywords not accurately grouped.
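The tagging rule of [0054] and [0055], which counts keyword matches per tag and keeps the tag with the most matches, might be sketched as below. Two assumptions are made loudly: the grouping of keywords under tags in CLASSIFICATION_TABLE is illustrative only (the source lists tags and keywords but not which keyword belongs to which tag), and plain substring counting stands in for the word segmentation step, for which the source names no tool.

```python
# Illustrative grouping (an assumption): the source does not state which
# keyword belongs to which tag data class.
CLASSIFICATION_TABLE = {
    "gambling": ["sports gambling", "get rid of debt", "cornered"],
    "credit cultivation intermediary": ["credit checking", "white account", "blacklist"],
    "capital thirsty": ["repaid some", "overdue debt", "passing immediately"],
}


def tag_text(text, table=CLASSIFICATION_TABLE):
    """Return the tag whose keywords match the text most often, or None.

    Substring counting is a stand-in for segmenting the text and matching
    segments against keywords one by one.
    """
    lowered = text.lower()
    best_tag, best_count = None, 0
    for tag, keywords in table.items():
        count = sum(lowered.count(keyword) for keyword in keywords)
        # Strict comparison: on a tie the earlier table entry wins.
        if count > best_count:
            best_tag, best_count = tag, count
    return best_tag
```

Returning None when nothing matches leaves the record untagged, which fits the manual review function described above; how ties should be broken is not stated in the source, so the first-entry rule here is a design choice of the sketch.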
[0056] In the embodiment, the step of associating and matching the underground industry entity information in the valid data to and with enterprise customer data, and outputting an underground industry entity recognition result from the enterprise customer data comprises:
[0057] associating and matching the underground industry entity information to and with the enterprise customer data using a knowledge graph, so as to identify association relationship between a loaning entity recorded in the enterprise customer data and the underground industry entity information, in which the association relationship includes an association level and an associated node count; and identifying the potential risk entities according to the association relationship through matching, and outputting the risk entities and corresponding said tag data in an associated manner, so as to obtain the underground industry entity recognition result.
[0058] In particular implementations, the underground industry entity information and the enterprise customer data are associated and matched using a knowledge graph. The basic analysis ability provided by the knowledge graph platform can sort and integrate the data, and find association relationships in the data. Common association relationships may include sharing the same WeChat account, logging in with the same device, logging in with the same IP, and having the same mobile phone number. The internal knowledge graph platform integrates, associates and fuses the underground industry data, so as to identify customers with potential risks from the enterprise customer data.
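One way to picture the association step of [0057] and [0058] is a graph in which entities are linked through the identifiers they share. The sketch below is an assumption-laden illustration, not the knowledge graph platform of the source: the node naming scheme ("customer:", "dealer:", "phone:"), the helper names build_graph and associate, and the reading of "association level" as the number of shared-identifier hops and "associated node count" as the number of underground entities reached are all hypothetical interpretations.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Build an undirected adjacency map from (entity, identifier) pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def associate(graph, customer, underground, max_level=2):
    """Return (association_level, associated_node_count) for one customer.

    Breadth-first search from the customer. Each entity-to-entity hop goes
    through one identifier node, so the level is half the path length.
    """
    seen = {customer}
    queue = deque([(customer, 0)])
    best_level = None
    count = 0
    while queue:
        node, dist = queue.popleft()
        if node in underground and node != customer:
            count += 1
            if best_level is None:  # BFS reaches the nearest entity first
                best_level = dist // 2
        if dist >= 2 * max_level:
            continue  # do not expand beyond the requested level
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return best_level, count
```

A customer sharing a phone number with a known dealer would come back at level 1; a customer who only shares a device with such a customer would come back at level 2.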
[0059] The embodiment further comprises: constructing mapping relationship between the tag data and risk levels, in which fraud probability levels of the risk levels correspond to grey accounts, high-risk accounts, black accounts and highly-black accounts from low to high; and when the underground industry entity recognition result is output, outputting the corresponding risk level. Additionally, or alternatively, the embodiment further comprises: training a risk rating model using a PageRank algorithm according to the association relationship between the loaning entity and the underground industry entity information in the enterprise customer data; when the underground industry entity recognition result is output, using the risk rating model to score the risk level.
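The two options in [0059] might look roughly like this in Python. Both parts rest on assumptions: the source gives the four risk levels but not which tag maps to which level, so TAG_TO_RISK is hypothetical; and the source names only "a PageRank algorithm", so the personalized variant below, which restarts the random walk at known underground entities so that score spreads outward from them, is a swapped-in choice rather than the disclosed model. No training step is shown.

```python
RISK_LEVELS = ["grey", "high-risk", "black", "highly-black"]  # low to high, per the text

# Hypothetical mapping: the source does not assign tags to levels.
TAG_TO_RISK = {
    "deal hunting": "grey",
    "capital thirsty": "high-risk",
    "credit cultivation intermediary": "black",
    "gambling": "highly-black",
}


def personalized_pagerank(graph, seeds, damping=0.85, iterations=50):
    """Power iteration for PageRank with restarts at the seed nodes.

    graph maps every node to a list of its neighbours; seeds are the known
    underground entities. Returns a dict of node scores summing to 1.
    """
    nodes = list(graph)
    seeds = sorted(s for s in set(seeds) if s in graph)
    if not seeds:
        raise ValueError("at least one seed must be a node of the graph")
    teleport = {n: 0.0 for n in nodes}
    for s in seeds:
        teleport[s] = 1.0 / len(seeds)
    rank = dict(teleport)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for n in nodes:
            neighbours = graph[n]
            if neighbours:
                share = damping * rank[n] / len(neighbours)
                for m in neighbours:
                    nxt[m] += share
            else:
                # Dangling node: hand its mass back to the seeds.
                for s in seeds:
                    nxt[s] += damping * rank[n] / len(seeds)
        rank = nxt
    return rank
```

On a simple chain dealer, alice, bob, carol with the dealer as the only seed, the scores of the customers fall off with distance from the dealer, which is the behaviour a risk rating would need; how such scores are thresholded into the four levels is not specified in the source.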
[0060] Referring to FIG. 2, for easy understanding, an example of the foregoing embodiment is described below:
[0061] Step 1 is about starting collection of underground industry data by activating a web-crawler program to collect information of contents of posts and replies posted in social media, wherein the collected data include the following fields: user IDs, content details, data sources, link addresses, and publication times.
[0062] Step 2 is about cleaning contents of the replies so as to obtain the valid data containing underground industry entity information.
[0063] Step 2-1 is about performing segmentation on the full-volume underground industry data text, and computing and sorting word frequencies from high to low.
[0064] Step 2-2 is about calling, at a front-end data service module, the word frequency computing service through a micro-service interface, and displaying the segments and corresponding word frequencies to business personnel so that they can update the underground industry entity classification table regularly.
[0065] Step 2-3 is about picking out posts whose contents contain combinations of numbers and letters that are suspected to be contact information, and synchronizing the picked data to a data processing module.
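Steps 2-1 and 2-3 above can be sketched together in a few lines of Python. The sketch makes two assumptions that are not in the source: a simple regular-expression tokenizer (TOKEN) stands in for the segmentation tool, which the text does not name, and the SUSPECT pattern treats any token containing a run of five or more digits as possible contact information; the threshold of five and the field name content are hypothetical.

```python
import re
from collections import Counter

# Stand-in tokenizer (assumption): lowercase alphanumeric runs.
TOKEN = re.compile(r"[a-z0-9']+")

# Assumed heuristic: a token with at least five consecutive digits,
# optionally mixed with letters, may be contact information.
SUSPECT = re.compile(r"\b[A-Za-z0-9_-]*\d{5,}[A-Za-z0-9_-]*\b")


def word_frequencies(posts):
    """Step 2-1: count tokens over all posts, sorted from high to low."""
    counts = Counter()
    for post in posts:
        counts.update(TOKEN.findall(post["content"].lower()))
    return counts.most_common()


def pick_suspect_posts(posts):
    """Step 2-3: keep posts whose content contains a suspect token."""
    return [post for post in posts if SUSPECT.search(post["content"])]
```

The frequency list is what Step 2-2 would display to business personnel; the filtered posts are what would be synchronized to the data processing module.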
[0066] Step 3 is about processing the underground industry data at the processing module, which includes pre-processing the underground industry entity data using an algorithm and verifying and/or correcting the pre-processed data by business personnel.
[0067] Step 3-1 is about preliminarily extracting the contact information from the text of the underground industry entity information using an algorithm, and displaying the information in an underground industry entity examination page simultaneously.
[0068] Step 3-2 is about pre-processing identities in the underground industry entity information by classifying the valid data in the text using an algorithm, tagging the valid data, and displaying the data in the underground industry entity examination page simultaneously.
[0069] Step 3-3 is about verifying the underground industry valid data information. Business personnel enter the underground industry entity examination page of the system to verify and confirm the underground industry entity information that has been extracted by the program, such as mobile phone numbers, WeChat accounts, QQ accounts, QQ groups, and email addresses. If the extraction work of the system is correct, the business personnel can directly confirm the result by pressing a corresponding button in the page. If the extraction work of the system is not correct, the business personnel may edit and correct the data in the page.
[0070] For example, a post titled "Use WeChat Mini Program to search for YOU WANT TO LEND MONEY to get 3000" has attracted the following replies:
[0071] User A: If you have a card from Bank of Communications, I can help you to get cash 100 for free. Come to 287765737 if you possess resources and personal connections;
[0072] User B: I will never touch this anymore. I have come all this way to get rid of debt and I swear that I will live my life well. If I get trapped next time, I doubt I will get help again. If you are now at a corner as I was, try to contact Brother Lo. Here is his contact information 9956252. Hopefully Brother Lo can help more people like me; and
[0073] User C: If you need cash, contact v1503391949.
[0074] In the above replies, underground industry entity information is as below: User A, contact channel: QQ, No. 287765737, corresponding tag: deal-hunter; User B, contact channel: QQ, No. 9956252, corresponding tag: gambler; and User C, contact channel: WeChat/mobile phone number, No. 1503391949, corresponding tag: deal-hunter.

[0075] Step 4 is about associating the valid data to existing enterprise customer data for analysis.
[0076] Step 4-1 is about introducing the valid data into a knowledge graph.
[0077] Step 4-2 is about associating the valid data to enterprise customer data.
[0078] Step 5 is about providing underground industry data services according to the existing data.
[0079] Step 5-1 is about associating the data to an underground industry list. Three essential factors of a user in the enterprise customer data, namely the name, identity card number, and mobile phone number, are associated to the underground industry entity information, so as to identify potential risk customers from the enterprise customer data and to report the number and the level of the associated underground industry entities.
[0080] Step 5-2 is about calculating risk scores of the underground industry entities. According to the reported number and levels of the associated underground industry entities, a proper algorithm is selected to calculate the risk scores of the underground industry entities. Then the risk scores are output.
[0081] To sum up, the present implementation has the following innovations:
[0082] 1. Full-process procedure automation for underground industry entity identification
[0083] With the disclosed underground industry monitoring system, a complete link is established from the acquisition of data from public data sources to the provision of services, which allows the process of identifying underground industry entities to be configurable. Therein, by properly configuring the tag data and the keywords, development costs can be significantly reduced and use efficiency of the system can be improved; and
[0084] 2. Risk analysis based on social relationship
[0085] Association between the underground industry data and the enterprise customer data can be built using the knowledge graph platform. By using the three essential factors of the enterprise customer data as parameters, the levels and number of underground industry entities associated with a user, as well as risk scores, can be reported and output as an underground industry entity recognition result from the enterprise customer data.

[0086] Embodiment 2
[0087] The present embodiment provides a system for identifying underground industry entities, comprising:
[0088] a collecting unit, for collecting underground industry data, and cleaning the underground industry data to obtain valid data containing underground industry entity information;
[0089] a processing unit, for classifying and tagging the valid data according to an underground industry entity classification table, so as to obtain tag data; and
[0090] an identifying unit, for associating and matching the underground industry entity information in the valid data to and with enterprise customer data, and outputting an underground industry entity recognition result from the enterprise customer data, wherein the underground industry entity recognition result includes potential risk entities recorded in the enterprise customer data and said tag data corresponding thereto.
[0091] Preferably, the processing unit comprises:
[0092] a table-constructing module, for constructing the underground industry entity classification table, having the underground industry entity classification table include plural entries of said tag data, and having each said entry of the tag data include plural keywords;
[0093] a matching module, for performing word segmentation on the valid data and matching segments of the valid data with corresponding keywords in a one-to-one manner; and
[0094] a selecting module, for counting the number of matchings between segments in the valid data and the keywords corresponding to the tag data, and selecting the entry of the tag data with the greatest number of matchings as the tag data of the valid data.
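The three modules above can be sketched together as a keyword-count classifier. The table contents, the tag names, and the function name `select_tag` are hypothetical, and whitespace splitting stands in for the word segmentation step, whose actual implementation is not specified here.

```python
from collections import Counter

# Hypothetical classification table: each tag entry maps to plural keywords.
CLASSIFICATION_TABLE = {
    "account_trading": {"account", "sell", "verified"},
    "sms_verification": {"sms", "code", "receive"},
}


def select_tag(text, table=CLASSIFICATION_TABLE):
    """Return the tag whose keywords match the most segments, or None."""
    # Stand-in for word segmentation: lower-case and split on whitespace.
    segments = text.lower().split()
    counts = Counter()
    for tag, keywords in table.items():
        # One-to-one matching: a segment counts when it equals a keyword.
        counts[tag] = sum(1 for segment in segments if segment in keywords)
    tag, hits = max(counts.items(), key=lambda item: item[1])
    return tag if hits > 0 else None


tag = select_tag("Sell verified account cheap")
```

When two tags tie, this sketch returns the one listed first in the table; the disclosure does not say how ties are resolved.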
[0095] Preferably, the identifying unit comprises:
[0096] a managing module, for associating and matching the underground industry entity information to and with the enterprise customer data using a knowledge graph, so as to identify an association relationship between a loaning entity recorded in the enterprise customer data and the underground industry entity information, in which the association relationship includes an association level and an associated node count; and
[0097] an identifying module, for identifying the potential risk entities in the enterprise customer data according to the association relationship through matching, and outputting the risk entities and corresponding said tag data in an associated manner, so as to obtain the underground industry entity recognition result.
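The association relationship can be illustrated with a breadth-first search over an adjacency-list graph. This sketch assumes that the association level is the hop distance to the nearest underground industry entity and that the associated node count is the number of such entities reachable within `max_level` hops; both interpretations, the graph format, and the name `association` are assumptions rather than details from the disclosure.

```python
from collections import deque


def association(graph, start, underground, max_level=3):
    """Return (association level, associated node count) for `start`.

    graph: dict mapping a node to an iterable of neighbouring nodes.
    underground: set of nodes known to be underground industry entities.
    """
    seen = {start}
    queue = deque([(start, 0)])
    level, count = None, 0
    while queue:
        node, depth = queue.popleft()
        if node in underground and node != start:
            count += 1
            if level is None:
                level = depth  # BFS reaches the closest entity first
        if depth < max_level:
            for neighbor in graph.get(node, ()):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append((neighbor, depth + 1))
    return level, count


graph = {
    "loan_1": ["phone_1"],
    "phone_1": ["loan_1", "fraud_1"],
    "fraud_1": ["phone_1", "fraud_2"],
    "fraud_2": ["fraud_1"],
}
result = association(graph, "loan_1", {"fraud_1", "fraud_2"})
```

A loaning entity with no underground industry entity within `max_level` hops yields `(None, 0)` and would not be reported as a potential risk entity under this reading.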
[0098] As compared to the prior art, the disclosed anti-fraud system for identifying underground industry entities provides beneficial effects that are similar to those provided by the disclosed method for identifying underground industry entities as enumerated above, and thus no repetitions are made herein.
[0099] Embodiment 3
[0100] The present embodiment provides a computer-readable storage medium storing therein a computer program, wherein, when executed by a processor, the computer program performs the steps of the method for identifying underground industry entities as described previously.
[0101] As compared to the prior art, the disclosed computer-readable storage medium provides beneficial effects that are similar to those provided by the disclosed method for identifying underground industry entities as enumerated above, and thus no repetitions are made herein.
[0102] As will be appreciated by people of ordinary skill in the art, implementation of all or a part of the steps of the method of the present invention as described previously may be realized by having a program instruct related hardware components. The program may be stored in a computer-readable storage medium and, when executed, performs the individual steps of the methods described in the foregoing embodiments. The storage medium may be a ROM/RAM, a hard drive, an optical disk, a memory card or the like.
[0103] The present invention has been described with reference to the preferred embodiments and it is understood that the embodiments are not intended to limit the scope of the present invention. Moreover, as the contents disclosed herein should be readily understood and can be implemented by a person skilled in the art, all equivalent changes or modifications which do not depart from the concept of the present invention should be encompassed by the appended claims. Hence, the scope of the present invention shall only be defined by the appended claims.


Claims (10)

What is claimed is:
1. An anti-fraud method for identifying underground industry entities, comprising:
collecting underground industry data, and cleaning the underground industry data to obtain valid data containing underground industry entity information;
classifying and tagging the valid data according to an underground industry entity classification table, so as to obtain tag data; and
associating and matching the underground industry entity information in the valid data to and with enterprise customer data, and outputting an underground industry entity recognition result in the enterprise customer data, wherein the underground industry entity recognition result includes potential risk entities in the enterprise customer data and the corresponding tag data.
2. The method of claim 1, wherein the step of collecting underground industry data, and cleaning the underground industry data to obtain valid data containing underground industry entity information comprises:
the collected underground industry data include user IDs, content details, data sources, link addresses, and publication times, in which the content details include the underground industry entity information, and optionally include terminal identification numbers and/or login IP addresses; and
cleaning the underground industry data using a predetermined regular expression, and extracting the valid data containing the underground industry entity information.
3. The method of claim 1 or 2, wherein the step of classifying and tagging the valid data according to an underground industry entity classification table, so as to obtain tag data comprises:
the underground industry entity classification table includes plural entries of said tag data, and each said entry of the tag data includes plural keywords;
performing word segmentation on the valid data and matching segments of the valid data with keywords corresponding to respective tag data in a one-to-one manner; and
counting the number of matchings between segments in the valid data and the keywords corresponding to the tag data, and selecting the entry of the tag data with the greatest number of matchings as the tag data of the valid data.
4. The method of claim 3, wherein the step of associating and matching the underground industry entity information in the valid data to and with enterprise customer data, and outputting an underground industry entity recognition result from the enterprise customer data comprises:
associating and matching the underground industry entity information to and with the enterprise customer data using a knowledge graph, so as to identify association relationship between a loaning entity recorded in the enterprise customer data and the underground industry entity information, in which the association relationship includes an association level and an associated node count; and
identifying the potential risk entities according to the association relationship through matching, and outputting the risk entities and corresponding said tag data in an associated manner, so as to obtain the underground industry entity recognition result.
5. The method of claim 4, further comprising:
constructing mapping relationship between the tag data and risk levels, in which fraud probability of the risk levels correspond to grey accounts, high-risk accounts, black accounts and highly-black accounts from low to high; and
when the underground industry entity recognition result is output, outputting the corresponding risk level simultaneously.
6. The method of claim 4, further comprising:
training a risk rating model using a PageRank algorithm according to the association relationship between the loaning entity in the enterprise customer data and the underground industry entity information;
when the underground industry entity recognition result is output, using the risk rating model to perform risk rating simultaneously.
7. An anti-fraud underground industry entity identification system, comprising:
a collecting unit, for collecting underground industry data, and cleaning the underground industry data to obtain valid data containing underground industry entity information;
a processing unit, for classifying and tagging the valid data according to an underground industry entity classification table, so as to obtain tag data; and
an identifying unit, for associating and matching the underground industry entity information in the valid data to and with enterprise customer data, and outputting an underground industry entity recognition result from the enterprise customer data, wherein the underground industry entity recognition result includes potential risk entities recorded in the enterprise customer data and said tag data corresponding thereto.
8. The system of claim 7, wherein the processing unit comprises:
a table-constructing module, for constructing the underground industry entity classification table, wherein the underground industry entity classification table includes plural entries of said tag data, and each said entry of the tag data include plural keywords;
a matching module, for performing word segmentation on the valid data and matching segments of the valid data with corresponding keywords in a one-to-one manner; and
a selecting module, for counting the number of matchings between segments in the valid data and the keywords corresponding to the tag data, and selecting the entry of the tag data with the greatest number of matchings as the tag data of the valid data.
9. The system of claim 7, wherein the identifying unit comprises:
a managing module, for associating and matching the underground industry entity information to and with the enterprise customer data using a knowledge graph, so as to identify association relationship between a loaning entity recorded in the enterprise customer data and the underground industry entity information, in which the association relationship includes an association level and an associated node count; and
an identifying module, for identifying the potential risk entities in the enterprise customer data according to the association relationship through matching, and outputting the risk entities and corresponding said tag data in an associated manner, so as to obtain the underground industry entity recognition result.
10. A computer-readable storage medium, storing therein a computer program, wherein when executed by a processor, the computer program performs the steps of any one of claims 1 through 7.
CA3150593A 2021-03-02 2022-03-01 Method for identifying underground industry entities and system thereof Pending CA3150593A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110231361.7A CN113065943A (en) 2021-03-02 2021-03-02 Anti-fraud black product entity identification method and system
CN202110231361.7 2021-03-02

Publications (1)

Publication Number Publication Date
CA3150593A1 (en) 2022-09-02

Family

ID=76559522

Family Applications (1)

Application Number Title Priority Date Filing Date
CA3150593A Pending CA3150593A1 (en) 2021-03-02 2022-03-01 Method for identifying underground industry entities and system thereof

Country Status (2)

Country Link
CN (1) CN113065943A (en)
CA (1) CA3150593A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115277045A (en) * 2022-05-17 2022-11-01 广东申立信息工程股份有限公司 IDC safety management system
CN117688055B (en) * 2023-11-08 2024-06-14 亿保创元(北京)信息科技有限公司 Insurance black product identification and response system based on correlation network analysis technology

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112131275B (en) * 2020-09-23 2023-07-25 长三角信息智能创新研究院 Enterprise portrait construction method of holographic city big data model and knowledge graph
CN112380531A (en) * 2020-11-11 2021-02-19 平安科技(深圳)有限公司 Black product group partner identification method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN113065943A (en) 2021-07-02

Legal Events

Date Code Title Description
EEER Examination request

Effective date: 20220916
