CN113159937A - Method and device for identifying risks and electronic equipment - Google Patents

Method and device for identifying risks and electronic equipment

Info

Publication number
CN113159937A
Authority
CN
China
Prior art keywords
risk
resource transfer
media
medium
class
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110583340.1A
Other languages
Chinese (zh)
Inventor
付小桐
丁盘苹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC filed Critical Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202110583340.1A priority Critical patent/CN113159937A/en
Publication of CN113159937A publication Critical patent/CN113159937A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering

Abstract

The disclosure provides a risk identification method, a risk identification device and electronic equipment, which are applied to the fields of artificial intelligence or finance and the like. The method comprises the following steps: obtaining at least one resource transfer data, the at least one resource transfer data being data between the determined risk media and the associated media for the risk media; constructing a knowledge graph, wherein risk media and associated media are used as nodes of the knowledge graph, and resource transfer data are used as edges of the knowledge graph; acquiring resource transfer characteristics based on the knowledge graph, wherein the resource transfer characteristics comprise resource transfer associated data for the risk media and/or for the associated media; clustering the resource transfer characteristics to obtain a first risk class, wherein a first proportion of risk media in the first risk class in the media included in the first risk class is larger than a second proportion of risk media in the second risk class in the media included in the second risk class; and obtaining suspicious risk media from media other than the risk media in the first risk class.

Description

Method and device for identifying risks and electronic equipment
Technical Field
The present disclosure relates to the field of artificial intelligence and financial technology, and more particularly, to a method and an apparatus for risk identification, and an electronic device.
Background
Telecom network fraud is becoming increasingly severe, and fraud shows trends toward industrialization, specialization, and cross-platform operation. Machine learning and knowledge graphs have been used to discover associations that indicate potential risks and to issue corresponding warnings.
In implementing the concept of the present disclosure, the applicant found at least the following problems in the related art. The information in a knowledge graph is often complicated and redundant, so a large number of customers unrelated to fraud may be mined, and the accuracy is low. In addition, labeled data on fraudsters is scarce, so it is difficult to improve the accuracy of machine learning output through a large amount of training data.
Disclosure of Invention
In view of the above, the present disclosure provides a method, an apparatus, and an electronic device for identifying risks, which are intended to improve the accuracy of risk identification.
One aspect of the present disclosure provides a method performed by a server for identifying a risk, comprising: obtaining at least one resource transfer data, the at least one resource transfer data being data between the determined risk media and the associated media for the risk media; constructing a knowledge graph, wherein risk media and associated media are used as nodes of the knowledge graph, and resource transfer data are used as edges of the knowledge graph; acquiring resource transfer characteristics based on the knowledge graph, wherein the resource transfer characteristics comprise resource transfer associated data for the risk media and/or for the associated media; clustering the resource transfer characteristics to obtain a first risk class, wherein a first proportion of risk media in the first risk class in the media included in the first risk class is larger than a second proportion of risk media in the second risk class in the media included in the second risk class; and obtaining suspicious risk media from media other than the risk media in the first risk class.
According to an embodiment of the present disclosure, acquiring resource transfer characteristics based on the knowledge graph includes: acquiring, from the knowledge graph, first resource transfer data for a risk medium and second resource transfer data for a directly associated medium of the risk medium; and acquiring the resource transfer characteristics based on the first resource transfer data and the second resource transfer data.
According to an embodiment of the present disclosure, the account of the risk medium and the account of the directly associated medium of the risk medium are assigned by the server.
According to an embodiment of the present disclosure, obtaining suspicious risk media from media other than risk media in the first risk class includes: acquiring suspicious risk media from the associated media of the risk media in the first risk class based on a preset rule.
According to an embodiment of the present disclosure, acquiring suspicious risk media from the associated media of the risk media in the first risk class based on the preset rule includes: acquiring suspicious risk media from the indirectly associated media of the risk media in the first risk class based on the preset rule.
According to an embodiment of the present disclosure, the preset rule includes at least one of the following: if, within a preset time interval, the risk medium and an indirectly associated medium both have resource transfer data directed to the same directly associated medium of the risk medium, the indirectly associated medium is taken as a suspicious risk medium; and if, within the preset time interval, the same directly associated medium of the risk medium has resource transfer data directed to both the risk medium and the indirectly associated medium, the indirectly associated medium is taken as a suspicious risk medium.
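The two preset rules can be illustrated with a minimal Python sketch. It rests on several assumptions that the disclosure does not state: transfers are given as hypothetical `(payer, payee, timestamp)` tuples, "within a preset time interval" is read as the two transfers lying no more than `window` time units apart, and the function name `suspicious_by_rules` is invented for illustration.

```python
def suspicious_by_rules(transfers, risk_medium, window):
    """transfers: (src, dst, timestamp) tuples; window: largest allowed time gap."""
    suspects = set()
    for s1, d1, t1 in transfers:
        for s2, d2, t2 in transfers:
            if abs(t1 - t2) > window:
                continue
            # Rule 1 (assumed reading): the risk medium and another medium
            # both transfer resources to the same directly associated medium.
            if s1 == risk_medium and d1 == d2 and s2 != risk_medium:
                suspects.add(s2)
            # Rule 2 (assumed reading): the same directly associated medium
            # transfers resources to both the risk medium and another medium.
            if d1 == risk_medium and s1 == s2 and d2 != risk_medium:
                suspects.add(d2)
    # Keep only indirectly associated media: drop direct neighbours.
    direct = {d for s, d, _ in transfers if s == risk_medium}
    direct |= {s for s, d, _ in transfers if d == risk_medium}
    return suspects - direct

# Hypothetical data: R is the confirmed risk medium.
demo = [("R", "D", 10), ("X", "D", 12), ("Y", "D", 100), ("E", "R", 50), ("E", "Z", 51)]
suspects = suspicious_by_rules(demo, "R", window=5)
```

In this toy data, X is flagged by the first rule and Z by the second, while Y is not flagged because its transfer falls outside the window.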
According to an embodiment of the present disclosure, clustering resource transfer features includes: carrying out normalization processing on the resource transfer characteristics to obtain a resource transfer characteristic vector; and clustering the resource transfer characteristic vectors to obtain at least two risk classes.
According to an embodiment of the present disclosure, the resource transfer characteristics include an amount characteristic; normalizing the resource transfer characteristics to obtain the resource transfer feature vector includes: smoothing the resource transfer characteristics with a logarithmic function to obtain smoothed resource transfer characteristics; and normalizing the smoothed resource transfer characteristics to map them into a designated space.
According to an embodiment of the present disclosure, the method further includes: storing the suspicious risk medium; and/or sending the suspicious risk medium to the client.
One aspect of the present disclosure provides an apparatus for identifying risk, comprising a resource transfer data acquisition module, a knowledge graph construction module, a resource transfer characteristic acquisition module, a characteristic clustering module, and a suspicious risk medium acquisition module. The resource transfer data acquisition module is used for acquiring at least one resource transfer data, the resource transfer data being data between a determined risk medium and its associated media; the knowledge graph construction module is used for constructing a knowledge graph, with the risk media and the associated media as nodes of the knowledge graph and the resource transfer data as edges of the knowledge graph; the resource transfer characteristic acquisition module is used for acquiring resource transfer characteristics based on the knowledge graph, the resource transfer characteristics comprising resource transfer associated data for the risk media and/or for the associated media; the characteristic clustering module is used for clustering the resource transfer characteristics to obtain a first risk class, wherein a first proportion of risk media in the first risk class among the media included in the first risk class is larger than a second proportion of risk media in a second risk class among the media included in the second risk class; and the suspicious risk medium acquisition module is used for acquiring suspicious risk media from media other than the risk media in the first risk class.
Another aspect of the present disclosure provides an electronic device comprising one or more processors and a storage device, wherein the storage device is configured to store executable instructions, which when executed by the processors, implement the method as above.
Another aspect of the present disclosure provides a computer-readable storage medium storing computer-executable instructions for implementing the above method when executed.
Another aspect of the disclosure provides a computer program comprising computer executable instructions for implementing the method as above when executed.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent from the following description of embodiments of the present disclosure with reference to the accompanying drawings, in which:
FIG. 1 schematically illustrates an exemplary system architecture to which the method, apparatus, and electronic device for identifying risk may be applied, according to an embodiment of the present disclosure;
FIG. 2 schematically illustrates a flow chart of a method of identifying risk according to an embodiment of the present disclosure;
FIG. 3 schematically shows a schematic diagram of a knowledge-graph according to an embodiment of the present disclosure;
FIG. 4 schematically illustrates a flow diagram for knowledge-graph-based resource transfer feature acquisition in accordance with an embodiment of the present disclosure;
FIG. 5 schematically shows a media schematic according to an embodiment of the disclosure;
FIG. 6 schematically illustrates a schematic diagram of a personal risk transfer relationship map provided according to an embodiment of the disclosure;
FIG. 7 schematically illustrates a diagram of preset rules, in accordance with an embodiment of the present disclosure;
FIG. 8 schematically illustrates a block diagram of an apparatus for identifying risk according to an embodiment of the present disclosure;
FIG. 9 schematically illustrates a logical view of identifying risk according to an embodiment of the present disclosure; and
FIG. 10 schematically shows a block diagram of an electronic device according to an embodiment of the disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is illustrative only and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.
Where a convention analogous to "at least one of A, B and C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B and C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, C together, etc.). The terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more features.
In recent years, telecommunication network fraud has become increasingly severe, and fraud shows trends toward industrialization, specialization, and cross-platform operation. Fraudsters often use social engineering means such as deception and threats to manipulate account holders into making remittances or transfers. Telecommunication network fraud has seriously damaged the financial and social order of the country, endangers the healthy development of the economy, and affects the safety and well-being of users.
To prevent financial crimes such as telecommunication network fraud, conventional approaches usually rely on manual investigation or on methods with good interpretability, such as expert rules. New artificial intelligence technologies such as machine learning and knowledge graphs are also gradually being applied to financial risk control. Knowledge graphs have been used to discover associations that indicate potential risks and to issue corresponding warnings. Existing methods screen possible fraudsters directly with knowledge graph rules. Although such schemes can mine a large number of suspected fraudsters, the information associated in the graph is often redundant, so a large number of customers unrelated to fraud are also mined, and the accuracy is low. Traditional machine learning techniques, in turn, have little labeled data on fraudsters, which makes it difficult to improve accuracy.
The embodiments of the present disclosure provide a method and an apparatus for identifying risks, and an electronic device. The risk identification method comprises a resource transfer characteristic acquisition process and a suspicious medium acquisition process. In the resource transfer characteristic acquisition process, at least one resource transfer data is first obtained, the at least one resource transfer data being data between a determined risk medium and an associated medium of the risk medium. A knowledge graph is then constructed, with the risk media and the associated media as nodes and the resource transfer data as edges. Resource transfer characteristics are then acquired based on the knowledge graph, the resource transfer characteristics comprising resource transfer associated data for the risk media and/or for the associated media. After the resource transfer characteristic acquisition process is finished, the suspicious medium acquisition process begins: the resource transfer characteristics are clustered to obtain a first risk class, wherein a first proportion of risk media in the first risk class among the media included in the first risk class is larger than a second proportion of risk media in a second risk class among the media included in the second risk class, and suspicious risk media are then acquired from the media other than the risk media in the first risk class.
The method, apparatus, and electronic device provided by the embodiments of the present disclosure implement a risk identification method that combines a knowledge graph with machine learning. A knowledge graph is constructed from the resource transfer data, associated transaction features related to fraud cards are extracted, and K-means clustering and rule screening are used to output a fraud card list. Compared with existing methods, this can improve the accuracy of identifying fraud cards, help combat telecommunication fraud and other illegal and criminal activities, and reduce losses for customers and banks.
The method, apparatus, and electronic device for identifying risks provided by the embodiments of the present disclosure can be used for risk identification in the field of artificial intelligence, and can also be used in fields other than artificial intelligence, such as the financial field.
Fig. 1 schematically illustrates an exemplary system architecture to which the method, apparatus, and electronic device for identifying risk may be applied, according to an embodiment of the present disclosure. It should be noted that fig. 1 is only an example of a system architecture to which the embodiments of the present disclosure may be applied to help those skilled in the art understand the technical content of the present disclosure, and does not mean that the embodiments of the present disclosure may not be applied to other devices, systems, environments or scenarios.
As shown in fig. 1, the system architecture 100 according to this embodiment may include terminal devices 101, 102, 103, a network 104, and servers 105, 106, 107. The network 104 may include a plurality of gateways, routers, hubs, network wires, etc. to provide a medium of communication links between the terminal devices 101, 102, 103 and the servers 105, 106, 107. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 101, 102, 103 to interact with other terminal devices and the servers 105, 106, 107 via the network 104 to receive or send information, such as sending a risk identification request, displaying a knowledge graph, or displaying suspicious risks. The terminal devices 101, 102, 103 may be installed with various communication client applications, such as an operation and maintenance application, an asset management application, a software development application, a banking application, a government affairs application, a monitoring application, a web browser application, a search application, an office application, an instant messaging tool, a mailbox client, social platform software, and the like (for example only). For example, the user may view a suspicious account using the terminal device 101, maintain the knowledge graph using the terminal device 102, or perform algorithm optimization using the terminal device 103.
The terminal devices 101, 102, 103 include, but are not limited to, smart phones, virtual reality devices, augmented reality devices, tablets, laptop portable computers, desktop computers, and the like.
The servers 105, 106, and 107 may receive the request and process the request, and may specifically be a storage server, a background management server, a server cluster, and the like. For example, server 105 may store a knowledge graph, server 106 may obtain resource transfer data, and server 107 may be a server for identifying risks. The background management server can analyze and process the received risk identification request and the like, and feed back a processing result (such as request risk interception and the like) to the terminal equipment.
It should be noted that the method for identifying risks provided by the embodiments of the present disclosure may be generally performed by the servers 105, 106, 107. Accordingly, the risk identification means provided by the embodiments of the present disclosure may be generally disposed in the servers 105, 106, 107. The method of identifying risk provided by embodiments of the present disclosure may also be performed by a server or a cluster of servers different from the servers 105, 106, 107 and capable of communicating with the terminal devices 101, 102, 103 and/or the servers 105, 106, 107.
It should be understood that the number of terminal devices, networks, and servers are merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Fig. 2 schematically shows a flow chart of a method of identifying risk according to an embodiment of the present disclosure. The method for identifying risks is executed by a server side.
As shown in fig. 2, the method of identifying a risk may include operations S210 to S250.
In operation S210, at least one resource transfer data is obtained, the at least one resource transfer data being data between the determined risk medium and the associated medium for the risk medium.
In this embodiment, the resource transfer data includes, but is not limited to, at least one of the following: transaction data, payment data, loan data, etc. for a resource. For example, a resource is transferred from account 1 to account 2. The resource may be a virtual resource such as electronic money, or physical currency, securities, tradable goods, or the like. For example, the resource transfer data may include, but is not limited to, a fraud card list composed of the fraud cards involved in confirmed fraudulent transactions and in fraudulent transactions reported in customer complaints, together with the corresponding personal transfer transaction data of customers.
The medium may be a physical medium, such as a bank card or a passbook. The medium may also be a virtual medium, such as an account, an account number, a mobile phone number, an identification number, or a user identifier. The medium may characterize the identity of a certain user.
To facilitate understanding of the embodiments of the present disclosure, the following description takes as an example a medium that is a bank card with a corresponding bank account at its issuer. The risk medium may be a bank card with a risk of fraudulent transactions, meaning that the funds transacted through the bank account of that card include funds obtained by illegal means such as fraud.
In operation S220, a knowledge graph is constructed, risk media and associated media are used as nodes of the knowledge graph, and resource transfer data is used as edges of the knowledge graph.
A knowledge graph is a technique that uses a graph model to describe knowledge and to model the association relationships between things in the world. It consists of nodes and edges: the nodes may be entities or abstract concepts, and the edges are attributes of entities, usually expressed as "relationships".
In this embodiment, the bank cards may be used as nodes, and the transaction relationships between the bank cards may be used as edges of the knowledge graph. For example, if bank card 1 transfers money to bank card 2, this may be represented by an edge pointing from node 1, which represents bank card 1, to node 2, which represents bank card 2.
FIG. 3 schematically shows a schematic diagram of a knowledge-graph according to an embodiment of the disclosure.
As shown in FIG. 3, a portion of the knowledge graph shows 5 media. There is an association between medium 1 and medium 3. There is an association between medium 3 and each of media 2 and 4. There is also an association between medium 4 and medium 5. The association may be directional. Taking the transfer relationship as an example, if the account of medium 1 transfers funds to the account of medium 3, the line representing the transfer relationship may be directed from the node of medium 1 to the node of medium 3.
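As a minimal sketch of this construction, the following Python snippet builds a directed graph from transfer records, with media as nodes and transfers as edges. The record layout `(payer, payee, amount)`, the card names, the amounts, and every edge direction other than medium 1 to medium 3 are assumptions made for illustration; the disclosure does not prescribe a storage format.

```python
from collections import defaultdict

# Hypothetical transfer records: (payer card, payee card, amount).
transfers = [
    ("card1", "card3", 500.0),
    ("card3", "card2", 120.0),
    ("card3", "card4", 80.0),
    ("card4", "card5", 60.0),
]

def build_graph(records):
    """Return a directed multigraph: payer -> payee -> list of amounts."""
    graph = defaultdict(lambda: defaultdict(list))
    for src, dst, amount in records:
        graph[src][dst].append(amount)
        graph[dst]  # touching the key registers payee-only cards as nodes
    return graph

graph = build_graph(transfers)
nodes = sorted(graph)
edges = sorted((s, d) for s in graph for d in graph[s])
```

Keeping a list of amounts per edge preserves repeated transfers between the same pair of cards, which the later feature statistics need.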
In operation S230, resource transfer characteristics are obtained based on the knowledge-graph, the resource transfer characteristics including resource transfer associated data for the risk media and/or for the associated media.
In this embodiment, taking the resource transfer being a transaction as an example, the transaction characteristics include, but are not limited to, at least one of the following: the number of outgoing transactions, the number of cards transferred to, the number of incoming transactions, the number of cards transferred from, the number of transactions transferred out to fraudulent cards, the number of fraudulent cards transferred to, the amount transferred out to fraudulent cards, the number of transactions transferred in from fraudulent cards, and the amount transferred in from fraudulent cards.
The transaction characteristics may be derived statistically from historical transaction data. For example, the number of transactions transferred out to fraudulent cards may be the number of transactions made to bank cards that have been determined to be involved in fraudulent transactions.
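The statistics listed above can be computed in a single pass over historical transfer records. The sketch below is illustrative only: the record layout `(payer, payee, amount)`, the feature names, and the function `transaction_features` are assumptions, not part of the disclosure.

```python
from collections import defaultdict

def transaction_features(transfers, fraud_cards):
    """Compute per-card transfer statistics from (src, dst, amount) records."""
    feats = defaultdict(lambda: {
        "out_txns": 0, "in_txns": 0,
        "to_fraud_txns": 0, "to_fraud_amount": 0.0,
        "from_fraud_txns": 0, "from_fraud_amount": 0.0,
    })
    out_cards = defaultdict(set)   # distinct payees per card
    in_cards = defaultdict(set)    # distinct payers per card
    for src, dst, amount in transfers:
        feats[src]["out_txns"] += 1
        feats[dst]["in_txns"] += 1
        out_cards[src].add(dst)
        in_cards[dst].add(src)
        if dst in fraud_cards:     # transfer out to a confirmed fraud card
            feats[src]["to_fraud_txns"] += 1
            feats[src]["to_fraud_amount"] += amount
        if src in fraud_cards:     # transfer in from a confirmed fraud card
            feats[dst]["from_fraud_txns"] += 1
            feats[dst]["from_fraud_amount"] += amount
    result = {}
    for card, f in feats.items():
        row = dict(f)
        row["out_cards"] = len(out_cards[card])
        row["in_cards"] = len(in_cards[card])
        row["to_fraud_cards"] = len(out_cards[card] & fraud_cards)
        result[card] = row
    return result

# Hypothetical data: F is a confirmed fraud card.
sample = [("A", "F", 100.0), ("A", "F", 50.0), ("A", "B", 20.0), ("F", "B", 70.0)]
features = transaction_features(sample, fraud_cards={"F"})
```

Here card A ends up with three outgoing transactions to two distinct cards, two of which (150.0 in total) go to the fraud card.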
In operation S240, the resource transfer features are clustered to obtain a first risk class, where a first proportion of the risk media in the first risk class to the media included in the first risk class is greater than a second proportion of the risk media in the second risk class to the media included in the second risk class.
In this embodiment, a plurality of classes can be obtained by clustering the resource transfer characteristics in a clustering manner. Because the similarity between the object features with the same property is greater than the similarity between the object features with different properties, the object features with the same property can be clustered into the same class as much as possible in a clustering mode. The clustering can adopt an unsupervised clustering mode or a supervised clustering mode. For example, K-Means clustering may be used for clustering.
The K-Means clustering method is taken as an example.
First, k samples are selected as the initial cluster centers a = a_1, a_2, ..., a_k, where k is a positive integer greater than 1.
Then, the following two operations are repeated until a stopping condition (number of iterations, minimum error change, etc.) is reached:
Operation 1: for each sample x_i in the data set, calculate its distance to each of the k cluster centers and assign it to the class corresponding to the nearest cluster center, where i is a positive integer greater than or equal to 1.
Operation 2: for each class c_i, recalculate its cluster center
$$a_i = \frac{1}{|c_i|} \sum_{x \in c_i} x$$
(i.e., the centroid of all samples belonging to this class).
It should be noted that, since K-Means clustering requires the number K of cluster centers to be specified in advance, K can be set according to the volume of the resource transfer data and experience. For example, values of K include, but are not limited to: 2, 3, 4, 8, 10, 20, 30, 50, or more.
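The two K-Means operations described above can be sketched in plain Python as follows. This is an illustrative sketch, not the implementation used in the disclosure: the squared Euclidean distance, the random choice of initial centers, the stopping rule (assignments no longer change, or `max_iter` is reached), and the handling of empty classes are all assumptions.

```python
import random

def kmeans(samples, k, max_iter=100, seed=0):
    """Plain K-Means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centers = rng.sample(samples, k)   # k samples as the initial centers
    labels = None
    for _ in range(max_iter):
        # Operation 1: assign each sample to its nearest cluster center.
        new_labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(x, centers[j])))
            for x in samples
        ]
        if new_labels == labels:       # assignments settled: stop
            break
        labels = new_labels
        # Operation 2: move each center to the centroid of its class.
        for j in range(k):
            members = [x for x, lab in zip(samples, labels) if lab == j]
            if members:                # keep the old center for an empty class
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers

# Hypothetical two-dimensional feature vectors forming two obvious groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = kmeans(points, k=2)
```

In practice a library implementation would normally be preferred; the point here is only to make the assignment and centroid-update steps concrete.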
In operation S250, suspicious risk media are obtained from media other than the risk media in the first risk class.
Through clustering, the probability that high-risk media fall into at least one particular class is higher than the probability that they fall into the other classes, which makes it convenient to obtain high-risk media from that class.
In the risk identification scheme that combines a knowledge graph with machine learning, the machine learning method predicts the future from historical data and experience, but traditional features have difficulty capturing association relationships. The associated features provided by the knowledge graph enrich the data features and thereby improve the accuracy of the machine learning model. Combining the knowledge graph with machine learning thus improves on existing methods and allows risk media to be obtained more accurately.
In some embodiments, after the suspicious risk media are obtained, they may be stored for subsequent processing, for example being sent in batches to reviewers for review. The suspicious risk media may also be output, for example sent to the client so that the user can view them at the client.
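Selecting the first risk class and the candidate media can be sketched as below. The sketch assumes cluster labels are already available as a mapping from medium to cluster id and simply picks the class with the highest proportion of confirmed risk media; the function name `pick_first_risk_class` and the sample data are hypothetical.

```python
from collections import Counter

def pick_first_risk_class(labels, risk_media):
    """labels: medium -> cluster id. Returns (cluster id, candidate media)."""
    totals = Counter(labels.values())
    risky = Counter(c for m, c in labels.items() if m in risk_media)
    # Proportion of confirmed risk media within each class.
    ratio = {c: risky[c] / totals[c] for c in totals}
    first = max(ratio, key=ratio.get)
    # Candidates are the media in that class not already confirmed as risky.
    candidates = sorted(
        m for m, c in labels.items() if c == first and m not in risk_media
    )
    return first, candidates

cluster_of = {"card1": 0, "card2": 0, "card3": 0, "card4": 1, "card5": 1, "card6": 1}
first, candidates = pick_first_risk_class(cluster_of, risk_media={"card1", "card2", "card4"})
```

With this data, class 0 has a risk proportion of 2/3 against 1/3 for class 1, so card3 is the only candidate.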
In certain embodiments, acquiring resource transfer characteristics based on the knowledge-graph may include operations S401 to S402.
In operation S401, first resource transfer data for a risky medium and second resource transfer data for a directly associated medium of the risky medium are acquired from a knowledge graph.
In this embodiment, to facilitate obtaining multidimensional resource transfer characteristics, feature extraction may be performed separately on the risk medium and on the directly associated media of the risk medium. Examples for the risk medium include the number of times it receives resources, the number of times it outputs resources, the amount of resources received each time, the amount of resources output each time, the total amount of resources received, and the total amount of resources output. Examples for a directly associated medium include the number of times it receives resources, the number of times it outputs resources, the amount of resources received each time, the amount of resources output each time, the total amount of resources received, the total amount of resources output, and the like.
In operation S402, a resource transfer characteristic is acquired based on the first resource transfer data and the second resource transfer data.
Counts and resource amounts differ greatly in magnitude: counts may be in the ones, tens or hundreds, whereas resource amounts may be in the thousands, tens of thousands, millions and so on. The first resource transfer data and the second resource transfer data can therefore be processed to obtain smoother features, so that small values are not swamped by large ones.
In some embodiments, the account number of the risk medium and the account number of the directly associated medium of the risk medium are assigned by the server. For example, the media may be bank cards issued by the same banking institution, each of which has a corresponding account at that institution.
In some embodiments, the media selected to participate in clustering are fraud bank cards with complete transaction information and those first-degree transaction cards that have complete transaction information. For example, bank cards issued by the same bank may be selected, which makes it easier to obtain complete transaction data.
For example, a banking institution may select only the bank cards that it issued itself, because its transaction data for cards issued by other banks is incomplete. Introducing those cards into the model would treat them unfairly and make the result inaccurate.
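The counts and amounts described for operations S401 and S402 can be illustrated with a minimal sketch. It assumes that each edge read from the knowledge graph is a (payer, payee, amount) tuple; the function name `transfer_features` and the feature keys are hypothetical and are not taken from the disclosure.

```python
from collections import defaultdict

def transfer_features(transfers):
    """Aggregate per-card counts and amounts from (payer, payee, amount) edges."""
    feats = defaultdict(lambda: {"out_count": 0, "out_total": 0.0,
                                 "in_count": 0, "in_total": 0.0})
    for payer, payee, amount in transfers:
        feats[payer]["out_count"] += 1
        feats[payer]["out_total"] += amount
        feats[payee]["in_count"] += 1
        feats[payee]["in_total"] += amount
    # Average amount per transfer; 0.0 when a card has no transfer in that direction.
    for f in feats.values():
        f["out_avg"] = f["out_total"] / f["out_count"] if f["out_count"] else 0.0
        f["in_avg"] = f["in_total"] / f["in_count"] if f["in_count"] else 0.0
    return dict(feats)

# Illustrative data only (assumed, not from the disclosure).
transfers = [("card1", "card3", 1000.0),
             ("card1", "card3", 3000.0),
             ("card3", "card4", 500.0)]
features = transfer_features(transfers)
```

The same aggregation would be applied to the risk medium and to each of its directly associated media, giving one feature row per card.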
Fig. 5 schematically shows a media schematic according to an embodiment of the disclosure.
As shown in fig. 5, card 1, card 3 and card 4 are issued by institution 1, and card 2 and card 5 are issued by institution 2. If card 1 is the determined risk medium, only card 1, card 3, card 4 and the like may be clustered. Deliberately excluding card 2 and card 5 improves the accuracy of the clustering result.
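The selection shown in fig. 5 can be sketched as a simple filter. The mapping `issuer_by_card` and the function name `same_issuer_cards` are assumptions made for illustration; the disclosure does not specify how issuer information is stored.

```python
def same_issuer_cards(issuer_by_card, risk_card):
    """Return the cards issued by the same institution as the risk card."""
    issuer = issuer_by_card[risk_card]
    return sorted(card for card, inst in issuer_by_card.items() if inst == issuer)

# Card-to-issuer mapping mirroring the example of fig. 5.
issuer_by_card = {
    "card1": "institution1",
    "card2": "institution2",
    "card3": "institution1",
    "card4": "institution1",
    "card5": "institution2",
}
kept = same_issuer_cards(issuer_by_card, "card1")
```

Only the cards in `kept` would then be passed on to feature extraction and clustering.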
In some embodiments, obtaining the suspected risk media from the media other than the risk media in the first risk class may include obtaining the suspected risk media from the associated media of the risk media in the first risk class based on a preset rule.
A medium that has an associated transaction with a risk medium is more likely to be a risk medium itself. The associated media that have associated transactions with the risk medium can therefore be found within the class first, and the suspicious risk media can then be screened from them, which effectively narrows the screening range and improves processing efficiency. The associated transaction may be a direct associated transaction or an indirect associated transaction.
In some embodiments, obtaining the suspicious risk media from the associated media of the risk media in the first risk class based on the preset rule comprises: obtaining the suspicious risk media from the indirectly associated media of the risk media in the first risk class based on the preset rule. This further narrows the screening range and improves processing efficiency.
Fig. 6 schematically illustrates a schematic diagram of a personal risk transfer relationship map provided according to an embodiment of the present disclosure.
As shown in fig. 6, a card directly associated with a confirmed fraud card may be referred to as a first-degree transaction card. Among the cards directly associated with a first-degree transaction card, those other than the fraud card may be referred to as second-degree transaction cards. By analogy, there may also be third-degree transaction cards, fourth-degree transaction cards and so on.
The number of transaction cards of the second degree and above is very large, and introducing that many cards brings in many transactions unrelated to fraud. The analysis therefore focuses on fraud cards and first-degree transaction cards. After K-means clustering, the proportion of fraud cards in each class is calculated; if the proportion of fraud cards in a class is high, the other cards in that class are also more likely to be involved in fraud. In some embodiments, clustering is performed only on fraud cards, first-degree transaction cards and second-degree transaction cards, which can improve prediction efficiency while also improving risk prediction accuracy.
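One possible way to assign these degrees is a breadth-first search outward from the confirmed fraud cards. The sketch below assumes that association is undirected and that a card's degree is its shortest distance to any fraud card; both are simplifying assumptions, and the name `transaction_degrees` is hypothetical.

```python
from collections import deque

def transaction_degrees(edges, fraud_cards, max_degree=2):
    """Label each card with its shortest distance to a fraud card, up to max_degree."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    degree = {card: 0 for card in fraud_cards}  # fraud cards are the starting points
    queue = deque(fraud_cards)
    while queue:
        card = queue.popleft()
        if degree[card] == max_degree:
            continue  # do not expand beyond the requested degree
        for neighbour in adjacency.get(card, ()):
            if neighbour not in degree:
                degree[neighbour] = degree[card] + 1
                queue.append(neighbour)
    return degree

# Illustrative edges only: "a" and "b" trade with the fraud card, "c" with "a", "d" with "c".
edges = [("fraud", "a"), ("fraud", "b"), ("a", "c"), ("c", "d")]
degrees = transaction_degrees(edges, ["fraud"])
```

With `max_degree=2`, the third-degree card "d" is never labelled, which mirrors limiting the analysis to nearby cards.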
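The per-class fraud proportion can be computed as in the following sketch. It assumes that the clustering step has already produced a mapping from card to class label; the names `fraud_ratio_by_class`, `labels` and `known_fraud` are illustrative only.

```python
def fraud_ratio_by_class(labels, fraud_cards):
    """Return, for each class label, the share of its cards that are known fraud cards."""
    totals, frauds = {}, {}
    for card, label in labels.items():
        totals[label] = totals.get(label, 0) + 1
        if card in fraud_cards:
            frauds[label] = frauds.get(label, 0) + 1
    return {label: frauds.get(label, 0) / totals[label] for label in totals}

# Assumed clustering output: class 0 holds 2 fraud cards out of 3, class 1 holds 1 out of 4.
labels = {"f1": 0, "f2": 0, "a": 0, "b": 1, "c": 1, "f3": 1, "d": 1}
known_fraud = {"f1", "f2", "f3"}

ratios = fraud_ratio_by_class(labels, known_fraud)
first_risk_class = max(ratios, key=ratios.get)
# Cards in the highest-ratio class that are not already known fraud cards.
candidates = sorted(card for card, label in labels.items()
                    if label == first_risk_class and card not in known_fraud)
```

Here the class with the highest ratio plays the role of the first risk class, and `candidates` are the remaining media from which suspicious risk media would be screened.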
In some embodiments, the preset rules include at least one of the following.
For example, if the risk medium and an indirectly associated medium both have resource transfer data directed to the same directly associated medium of the risk medium within a preset time interval, the indirectly associated medium is taken as a suspicious risk medium. The preset time interval may be determined based on expert experience or the like. For example, the preset time interval includes, but is not limited to: 1 minute, 3 minutes, 10 minutes, 30 minutes, 1 hour, 3 hours, 8 hours, 12 hours, 20 hours, 1 day, 3 days, 7 days, 15 days, 27 days, 1 month, 3 months, half a year, 1 year or more, and the like.
For example, if the same directly associated medium of the risk medium has resource transfer data directed to both the risk medium and an indirectly associated medium within a preset time interval, the indirectly associated medium is taken as a suspicious risk medium.
Fig. 7 schematically shows a schematic diagram of preset rules according to an embodiment of the present disclosure.
As shown in fig. 7, the rules screen out a suspicious card that, within a certain period of time, either transferred resources to the same intermediate card as a fraud card or received resources from the same intermediate card as a fraud card. Because the suspicious card is found by the model and the rules together, it is more likely to be a fraud card, and it is stored in a suspicious fraud card list. Suspicious cards are selected from the second-degree transaction cards; first-degree transaction cards are not directly treated as suspicious cards, which effectively narrows the screening range. It should be noted that this embodiment does not exclude the possibility that an intermediate card is a suspicious card. Fig. 7 shows only part of the transaction relationships and some of the fraud cards. Under the preset rules, the intermediate card in fig. 7 is not a suspicious card with respect to the fraud card shown in fig. 7, but because a class obtained by clustering may include multiple fraud cards, the intermediate card may still be a suspicious card with respect to a fraud card not shown in fig. 7. By selecting suspicious cards from the second-degree transaction cards, the present disclosure narrows the screening range while taking into account the concealed, network-like distribution of fraudulent behavior, which improves risk identification efficiency.
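The two preset rules can be sketched as follows. The sketch assumes that each transfer is a (payer, payee, timestamp) tuple with timestamps in seconds, and it interprets "within a preset time interval" as the two transfers lying at most `window` seconds apart; that interpretation, the function name `suspicious_cards` and the data are assumptions for illustration.

```python
def suspicious_cards(transfers, fraud_card, window):
    """Apply both rules for one fraud card; transfers are (payer, payee, timestamp)."""
    # Transfers from the fraud card to an intermediate card (used by rule 1).
    fraud_out = [(payee, t) for payer, payee, t in transfers if payer == fraud_card]
    # Transfers from an intermediate card to the fraud card (used by rule 2).
    fraud_in = [(payer, t) for payer, payee, t in transfers if payee == fraud_card]
    # Intermediate (first-degree) cards are not themselves taken as suspicious here.
    intermediates = {c for c, _ in fraud_out} | {c for c, _ in fraud_in}
    found = set()
    for payer, payee, t in transfers:
        # Rule 1: the candidate paid an intermediate card that the fraud card also paid.
        if payer != fraud_card and payer not in intermediates:
            if any(mid == payee and abs(t - ft) <= window for mid, ft in fraud_out):
                found.add(payer)
        # Rule 2: the candidate was paid by an intermediate card that also paid the fraud card.
        if payee != fraud_card and payee not in intermediates:
            if any(mid == payer and abs(t - ft) <= window for mid, ft in fraud_in):
                found.add(payee)
    return found

# Illustrative transfers only.
transfers = [
    ("fraud", "mid1", 100), ("x", "mid1", 160),   # rule 1 match, 60 s apart
    ("mid2", "fraud", 200), ("mid2", "y", 230),   # rule 2 match, 30 s apart
    ("z", "mid1", 5000),                          # same intermediate card, but too late
]
suspects = suspicious_cards(transfers, "fraud", window=300)
```

In practice the function would be run once per fraud card in the first risk class, and the results merged into the suspicious fraud card list.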
In some embodiments, clustering resource transfer features may include the following operations. Firstly, the resource transfer characteristics are normalized to obtain a resource transfer characteristic vector. Then, clustering is carried out on the resource transfer characteristic vectors to obtain at least two risk classes.
In particular, the resource transfer feature may include a monetary feature.
Accordingly, performing normalization processing on the resource transfer characteristics to obtain a resource transfer characteristic vector may include the following operations. Firstly, performing characteristic smoothing processing on the resource transfer characteristic through a logarithmic function to obtain the resource transfer smooth characteristic. Then, the resource transfer smooth feature is normalized to map the resource transfer smooth feature to a specified space. Through the characteristic smoothing processing, the problem that the small numbers are possibly submerged by the large numbers due to overlarge numerical difference between the money amount class characteristic and the frequency class characteristic is effectively solved.
For example, the features are first smoothed with a logarithmic function: monetary features are processed with the base-10 logarithm log10(x + 1), and the other, count-type features are processed with the natural logarithm ln(x + 1). Min-max normalization, (x - min)/(max - min), is then applied to map each feature into the range [0, 1] for subsequent clustering.
By constructing a knowledge graph for personal risk transfers, extracting the transaction features associated with fraud cards, and outputting a fraud card list through K-means clustering and rule screening, the present disclosure can identify fraud cards more accurately than existing methods, which helps combat telecommunications fraud and other illegal activity and reduces losses for customers and institutions.
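The smoothing and normalization steps can be sketched directly from the formulas above. The helper names `smooth` and `min_max` and the sample values are assumptions; returning 0.0 for a constant feature is also a choice made here, since the disclosure does not say how that case is handled.

```python
import math

def smooth(value, is_amount):
    """log10(x + 1) for monetary features, ln(x + 1) for count-type features."""
    return math.log10(value + 1) if is_amount else math.log(value + 1)

def min_max(values):
    """Map values into [0, 1] with (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature: assumed convention
    return [(v - lo) / (hi - lo) for v in values]

# Illustrative raw features for three cards.
amounts = [0, 99, 9999]   # total amount transferred out
counts = [0, 3, 12]       # number of outgoing transfers

amount_feature = min_max([smooth(a, True) for a in amounts])
count_feature = min_max([smooth(c, False) for c in counts])
```

After smoothing, the amounts 0, 99 and 9999 become 0, 2 and 4, so the normalized amount feature is evenly spaced even though the raw values span four orders of magnitude.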
Another aspect of the present disclosure provides an apparatus for identifying a risk.
Fig. 8 schematically shows a block diagram of an apparatus for identifying risk according to an embodiment of the present disclosure.
As shown in fig. 8, the apparatus 800 for identifying risk may include a resource transfer data acquisition module 810, a knowledge graph construction module 820, a resource transfer characteristic acquisition module 830, a feature clustering module 840 and a suspicious risk medium acquisition module 850.
The resource transfer data acquisition module 810 is configured to acquire at least one piece of resource transfer data, where the resource transfer data is data between a determined risk medium and an associated medium of the risk medium.
The knowledge graph building module 820 is used for building a knowledge graph, the risk media and the associated media are used as nodes of the knowledge graph, and the resource transfer data is used as edges of the knowledge graph.
The resource transfer characteristic acquisition module 830 is configured to acquire resource transfer characteristics based on the knowledge-graph, the resource transfer characteristics including resource transfer associated data for the risk media and/or for the associated media.
The feature clustering module 840 is configured to cluster the resource transfer features to obtain a first risk class, where a first proportion of risk media in the first risk class to media included in the first risk class is greater than a second proportion of risk media in the second risk class to media included in the second risk class.
The suspected risk media retrieval module 850 is configured to retrieve suspected risk media from media other than the risk media in the first risk class.
FIG. 9 schematically shows a logical schematic of identifying risk according to an embodiment of the disclosure.
As shown in fig. 9, the risk identifying means 800 may perform the operations as follows.
First, the resource transfer data required for constructing the graph is acquired. For example, the resource transfer data may include the transfer transaction data of the individual customers corresponding to fraud cards involved in fraudulent transactions that were flagged by transaction rules or reported in customer complaints.
Then, a personal risk transfer relationship graph is constructed from the resource transfer data obtained in the previous operation. A knowledge graph can be understood simply as a multi-relational graph composed of points and edges, in which entities are usually represented by points and relationships by edges. The present disclosure takes existing fraud cards as the starting points of the graph and the transaction data over a period of time as the edges, so as to construct a knowledge graph that includes the transaction relationships.
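A minimal in-memory sketch of such a graph is shown below. It assumes that each transfer record is a (payer, payee, amount) tuple and stores cards as nodes and transfers as directed edges; the function name `build_graph` and the record layout are hypothetical, and a production system would more likely use a graph database.

```python
def build_graph(transfers, fraud_cards):
    """Build a node set and a directed edge map from (payer, payee, amount) records."""
    nodes = set(fraud_cards)  # fraud cards are the starting points of the graph
    edges = {}
    for payer, payee, amount in transfers:
        nodes.update((payer, payee))
        # Several transfers between the same pair are kept as a list on one edge.
        edges.setdefault((payer, payee), []).append(amount)
    return nodes, edges

# Illustrative transfer records only.
records = [("fraud1", "card_a", 500.0),
           ("fraud1", "card_a", 1500.0),
           ("card_a", "card_b", 200.0)]
nodes, edges = build_graph(records, fraud_cards=["fraud1"])
```

Keeping every transfer on its edge preserves the raw data needed later for computing counts, per-transfer amounts and totals.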
Next, resource transfer characteristics are calculated based on the knowledge-graph for personal risk transfers constructed by the above operations.
Then, the resource transfer characteristics calculated by the previous operation are processed so as to facilitate the subsequent clustering operation.
Next, all fraud cards and first-degree transaction cards are clustered using the K-means algorithm. Fraud cards and non-fraud cards differ in their transaction behavior, so a clustering method can be used to find more cards whose transaction behavior resembles that of fraud cards. The K-means clustering used in the embodiments of the present disclosure is an unsupervised machine learning algorithm whose basic idea is to divide a given sample set into K classes according to the distances between samples, so that the points within a class are as close together as possible while the distances between classes are as large as possible.
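The clustering step can be illustrated with a small, dependency-free version of Lloyd's algorithm, the usual iterative procedure for K-means. Taking the first k points as the initial centroids is a simplification chosen here for reproducibility and is not stated in the disclosure; the function name `kmeans` and the sample vectors are likewise assumptions.

```python
import math

def kmeans(points, k, iterations=100):
    """Lloyd's algorithm; the first k points serve as initial centroids (assumed)."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid (Euclidean distance).
        new_labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the centroids will not move either
        labels = new_labels
        # Update step: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Illustrative normalized feature vectors for four cards.
points = [(0.0, 0.1), (0.9, 1.0), (0.1, 0.0), (1.0, 0.9)]
labels, centroids = kmeans(points, k=2)
```

In practice a library implementation with better initialization would normally be preferred; the sketch only shows the assignment and update steps that the description refers to.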
Then, the cards in the class with the highest fraud rate obtained from the previous operation are subjected to rule screening, and the rules used are shown in fig. 7.
By the method, the suspicious risk medium with high suspicious degree can be screened out quickly and accurately.
It should be noted that the implementation, solved technical problems, implemented functions, and achieved technical effects of each module/unit and the like in the apparatus part embodiment are respectively the same as or similar to the implementation, solved technical problems, implemented functions, and achieved technical effects of each corresponding step in the method part embodiment, and are not described in detail herein.
Any of the modules, units, or at least part of the functionality of any of them according to embodiments of the present disclosure may be implemented in one module. Any one or more of the modules and units according to the embodiments of the present disclosure may be implemented by being split into a plurality of modules. Any one or more of the modules, units according to the embodiments of the present disclosure may be implemented at least partially as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented by any other reasonable means of hardware or firmware by integrating or packaging the circuits, or in any one of three implementations of software, hardware and firmware, or in any suitable combination of any of them. Alternatively, one or more of the modules, units according to embodiments of the present disclosure may be implemented at least partly as computer program modules, which, when executed, may perform the respective functions.
For example, any of the resource transfer data acquisition module 810, the knowledge graph construction module 820, the resource transfer feature acquisition module 830, the feature clustering module 840, and the suspected risk media acquisition module 850 may be combined and implemented in one module, or any one of the modules may be split into multiple modules. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of the other modules and implemented in one module. According to an embodiment of the present disclosure, at least one of the resource transfer data obtaining module 810, the knowledge graph constructing module 820, the resource transfer feature obtaining module 830, the feature clustering module 840, and the suspected risk media obtaining module 850 may be at least partially implemented as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented by hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or implemented by any one of three implementations of software, hardware, and firmware, or in a suitable combination of any of them. Alternatively, at least one of the resource transfer data acquisition module 810, the knowledge graph construction module 820, the resource transfer feature acquisition module 830, the feature clustering module 840, and the suspected risk media acquisition module 850 may be implemented, at least in part, as a computer program module that, when executed, may perform corresponding functions.
FIG. 10 schematically shows a block diagram of an electronic device according to an embodiment of the disclosure. The electronic device shown in fig. 10 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 10, an electronic device 1000 according to an embodiment of the present disclosure includes a processor 1001 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM)1002 or a program loaded from a storage section 1008 into a Random Access Memory (RAM) 1003. Processor 1001 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or associated chipset, and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), among others. The processor 1001 may also include onboard memory for caching purposes. The processor 1001 may include a single processing unit or multiple processing units for performing different actions of a method flow according to embodiments of the present disclosure.
In the RAM 1003, various programs and data necessary for the operation of the electronic apparatus 1000 are stored. The processor 1001, the ROM 1002, and the RAM 1003 are communicatively connected to each other by a bus 1004. The processor 1001 performs various operations of the method flow according to the embodiments of the present disclosure by executing programs in the ROM 1002 and/or the RAM 1003. Note that the program may also be stored in one or more memories other than the ROM 1002 and the RAM 1003. The processor 1001 may also perform various operations of method flows according to embodiments of the present disclosure by executing programs stored in one or more memories.
According to an embodiment of the present disclosure, the electronic device 1000 may also include an input/output (I/O) interface 1005, which is also connected to the bus 1004. The electronic device 1000 may also include one or more of the following components connected to the I/O interface 1005: an input section 1006 including a keyboard, a mouse, and the like; an output section 1007 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 1008 including a hard disk and the like; and a communication section 1009 including a network interface card such as a LAN card, a modem, or the like. The communication section 1009 performs communication processing via a network such as the internet. A drive 1010 is also connected to the I/O interface 1005 as necessary. A removable medium 1011 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 1010 as necessary, so that a computer program read from it is installed into the storage section 1008 as necessary.
According to embodiments of the present disclosure, method flows according to embodiments of the present disclosure may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable storage medium, the computer program containing program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication part 1009 and/or installed from the removable medium 1011. The computer program performs the above-described functions defined in the system of the embodiment of the present disclosure when executed by the processor 1001. The systems, devices, apparatuses, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the present disclosure.
The present disclosure also provides a computer-readable storage medium, which may be contained in the apparatus/device/system described in the above embodiments; or may exist separately and not be assembled into the device/apparatus/system. The computer-readable storage medium carries one or more programs which, when executed, implement the method according to an embodiment of the disclosure.
According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to embodiments of the present disclosure, a computer-readable storage medium may include the ROM 1002 and/or the RAM 1003 described above and/or one or more memories other than the ROM 1002 and the RAM 1003.
Embodiments of the present disclosure also include a computer program product comprising a computer program that contains program code for performing the method provided by the embodiments of the present disclosure. When the computer program product is run on an electronic device, the program code causes the electronic device to implement the method for identifying risk provided by the embodiments of the present disclosure.
The computer program, when executed by the processor 1001, performs the above-described functions defined in the system/apparatus of the embodiments of the present disclosure. The systems, apparatuses, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the present disclosure.
In one embodiment, the computer program may be hosted on a tangible storage medium such as an optical storage device, a magnetic storage device, or the like. In another embodiment, the computer program may also be transmitted in the form of a signal on a network medium, distributed, downloaded and installed via the communication part 1009, and/or installed from the removable medium 1011. The computer program containing program code may be transmitted using any suitable network medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.
In accordance with embodiments of the present disclosure, program code for executing the computer programs provided by embodiments of the present disclosure may be written in any combination of one or more programming languages. In particular, these computer programs may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. The programming languages include, but are not limited to, Java, C++, Python, the "C" language, and the like. The program code may execute entirely on the user computing device, partly on the user device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).
Those skilled in the art will appreciate that the features recited in the various embodiments and/or claims of the present disclosure can be combined and/or incorporated in various ways, even if such combinations or incorporations are not expressly recited in the present disclosure. These examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described separately above, this does not mean that the measures in the embodiments cannot be used in advantageous combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be devised by those skilled in the art without departing from the scope of the present disclosure, and such alternatives and modifications are intended to be within the scope of the present disclosure.

Claims (12)

1. A method for identifying risk, performed by a server, comprising:
obtaining at least one piece of resource transfer data, the resource transfer data being data between a determined risk medium and an associated medium of the risk medium;
constructing a knowledge graph, wherein the risk medium and the association medium are used as nodes of the knowledge graph, and the resource transfer data is used as edges of the knowledge graph;
obtaining resource transfer characteristics based on the knowledge-graph, the resource transfer characteristics including resource transfer associated data for the risk media and/or for the associated media;
clustering the resource transfer characteristics to obtain a first risk class, wherein a first proportion of risk media among the media included in the first risk class is greater than a second proportion of risk media among the media included in a second risk class; and
obtaining suspicious risk media from the media in the first risk class other than the risk media.
2. The method of claim 1, wherein the obtaining resource transfer features based on the knowledge-graph comprises:
obtaining first resource transfer data for the risk media and second resource transfer data for a directly associated media of the risk media from the knowledge graph; and
obtaining the resource transfer characteristics based on the first resource transfer data and the second resource transfer data.
3. The method of claim 2, wherein the account number of the risky medium and the account number of the directly associated medium for the risky medium are assigned by the server.
4. The method of claim 1, wherein said obtaining suspicious risk media from media other than said risk media in said first risk class comprises:
obtaining the suspicious risk media from the associated media of the risk media in the first risk class based on a preset rule.
5. The method according to claim 4, wherein the obtaining suspicious risk media from the associated media of the risk media in the first risk class based on the preset rule comprises:
obtaining the suspicious risk media from the indirectly associated media of the risk media in the first risk class based on the preset rule.
6. The method of claim 5, wherein the preset rules include at least one of:
if the risk medium and the indirectly associated medium both have resource transfer data directed to the same directly associated medium of the risk medium within a preset time interval, taking the indirectly associated medium as the suspicious risk medium; or
if the same directly associated medium of the risk medium has resource transfer data directed to both the risk medium and the indirectly associated medium within a preset time interval, taking the indirectly associated medium as the suspicious risk medium.
7. The method of claim 1, wherein the clustering the resource transfer features comprises:
normalizing the resource transfer characteristics to obtain a resource transfer characteristic vector; and
clustering the resource transfer characteristic vectors to obtain at least two risk classes.
8. The method of claim 7, wherein the resource transfer characteristic comprises a monetary characteristic;
the normalizing the resource transfer characteristics to obtain the resource transfer characteristic vector comprises:
performing characteristic smoothing processing on the resource transfer characteristic through a logarithmic function to obtain a resource transfer smoothing characteristic; and
normalizing the resource transfer smoothing characteristic to map the resource transfer smoothing characteristic to a specified space.
9. The method of any of claims 1-8, further comprising:
storing the suspicious risk media; and/or
sending the suspicious risk media to a client.
10. An apparatus for identifying risks, disposed in a server, the apparatus comprising:
a resource transfer data acquisition module, configured to acquire at least one piece of resource transfer data, where the resource transfer data is data between a determined risk medium and an associated medium of the risk medium;
a knowledge graph construction module, configured to construct a knowledge graph, wherein the risk medium and the associated medium are used as nodes of the knowledge graph, and the resource transfer data is used as edges of the knowledge graph;
a resource transfer characteristic acquisition module, configured to acquire resource transfer characteristics based on the knowledge graph, the resource transfer characteristics including resource transfer associated data for the risk media and/or for the associated media;
a feature clustering module, configured to cluster the resource transfer characteristics to obtain a first risk class, wherein a first proportion of risk media among the media included in the first risk class is greater than a second proportion of risk media among the media included in a second risk class; and
a suspicious risk medium acquisition module, configured to obtain suspicious risk media from the media in the first risk class other than the risk media.
11. An electronic device, comprising:
one or more processors;
a storage device for storing executable instructions which, when executed by the processor, implement the method of any one of claims 1 to 9.
12. A computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the method of any one of claims 1 to 9.
CN202110583340.1A 2021-05-27 2021-05-27 Method and device for identifying risks and electronic equipment Pending CN113159937A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110583340.1A CN113159937A (en) 2021-05-27 2021-05-27 Method and device for identifying risks and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110583340.1A CN113159937A (en) 2021-05-27 2021-05-27 Method and device for identifying risks and electronic equipment

Publications (1)

Publication Number Publication Date
CN113159937A true CN113159937A (en) 2021-07-23

Family

ID=76877768

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110583340.1A Pending CN113159937A (en) 2021-05-27 2021-05-27 Method and device for identifying risks and electronic equipment

Country Status (1)

Country Link
CN (1) CN113159937A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023065870A1 (en) * 2021-10-20 2023-04-27 腾讯科技(深圳)有限公司 Resource transfer information detection method and apparatus, device, and storage medium

Similar Documents

Publication Publication Date Title
CN108734380B (en) Risk account determination method and device and computing equipment
US20110166979A1 (en) Connecting decisions through customer transaction profiles
US11593811B2 (en) Fraud detection based on community change analysis using a machine learning model
US11574360B2 (en) Fraud detection based on community change analysis
US20220207295A1 (en) Predicting occurrences of temporally separated events using adaptively trained artificial intelligence processes
CN111666346A (en) Information merging method, transaction query method, device, computer and storage medium
CN110197426B (en) Credit scoring model building method, device and readable storage medium
CN111861487A (en) Financial transaction data processing method, and fraud monitoring method and device
CN111292090A (en) Method and device for detecting abnormal account
CN114187112A (en) Training method of account risk model and determination method of risk user group
CN114693192A (en) Wind control decision method and device, computer equipment and storage medium
CN115545886A (en) Overdue risk identification method, overdue risk identification device, overdue risk identification equipment and storage medium
CN113159937A (en) Method and device for identifying risks and electronic equipment
CN113094595A (en) Object recognition method, device, computer system and readable storage medium
US20210049687A1 (en) Systems and methods of generating resource allocation insights based on datasets
US20170148098A1 (en) Data creating, sourcing, and agregating real estate tool
CN111429257A (en) Transaction monitoring method and device
Siegenthaler Blockchain Clustering with Machine Learning
CN116028880B (en) Method for training behavior intention recognition model, behavior intention recognition method and device
CN112967134B (en) Network training method, risk user identification method, device, equipment and medium
CN113095805A (en) Object recognition method, device, computer system and readable storage medium
CN117934154A (en) Transaction risk prediction method, model training method, device, equipment, medium and program product
CN114065050A (en) Method, system, electronic device and storage medium for product recommendation
CN116797024A (en) Service processing method, device, electronic equipment and storage medium
CN117876089A (en) Asset processing method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination