CN110070364A

CN110070364A - Method, apparatus, and storage medium for detecting clique fraud based on a graph model

Info

Publication number: CN110070364A
Application number: CN201910239821.3A
Authority: CN
Inventors: 黄剑飞; 陈振
Original assignee: Beijing Sankuai Online Technology Co Ltd
Current assignee: Beijing Sankuai Online Technology Co Ltd
Priority date: 2019-03-27
Filing date: 2019-03-27
Publication date: 2019-07-30
Also published as: WO2020192184A1

Abstract

The present disclosure relates to a method, an apparatus, and a storage medium for detecting clique fraud based on a graph model, which address the technical problem in the related art that clique fraud is difficult to identify. The method includes: acquiring basic user data and historical suspect user data; generating a user association graph according to the acquired data, where the nodes of the user association graph are user association subgraphs generated according to data features and the edge weights of the user association graph include the similarity between nodes; generating to-be-determined clique sets based on the user association graph using a community partitioning algorithm; calculating the suspicion degree of the to-be-determined clique sets; and outputting a determination result for the to-be-determined clique according to the calculation result.

Description

Method, apparatus, and storage medium for detecting clique fraud based on a graph model

Technical field

The present disclosure relates to the field of network technology, and in particular to a method, an apparatus, and a storage medium for detecting clique fraud based on a graph model.

Background art

The financial field places high demands on transaction risk control and needs to guarantee the security of fund transactions. In practice, fraud may occur. For example, a fraudster induces many ordinary consumers to transfer money to the fraudster but does not deliver the promised return, and profits in this way. To identify such fraud, flag high-risk fraudsters, and take measures that limit consumers' financial losses as far as possible, a transaction model can be used to identify fraudsters. For example, a payment account is classified as a fraudster account, and the fund transactions carried out by that account are classified as risky transactions.

Summary of the invention

The present disclosure provides a method, an apparatus, and a storage medium for detecting clique fraud based on a graph model, to solve the technical problem in the related art that clique fraud is difficult to identify.

To achieve the above object, a first aspect of the embodiments of the present disclosure provides a method for detecting clique fraud based on a graph model, the method comprising:

Acquiring basic user data and historical suspect user data;

Generating a user association graph according to the acquired data, wherein the nodes of the user association graph are user association subgraphs generated according to data features, and the edge weights of the user association graph include the similarity between nodes;

Generating to-be-determined clique sets based on the user association graph using a community partitioning algorithm;

Calculating the suspicion degree of the to-be-determined clique sets;

Outputting a determination result for the to-be-determined clique according to the calculation result.

Optionally, generating the user association graph comprises:

Selecting feature combinations, and the number of groups, from the basic user data and the historical suspect user data;

Generating the corresponding user association subgraphs using exact feature equality or fuzzy feature equality, and splicing the user association subgraphs as nodes to generate an unweighted user association graph;

Generating a similarity-weighted user association graph by using the similarity between nodes of the unweighted user association graph as the edge weights.

Optionally, generating the to-be-determined clique sets using the community partitioning algorithm comprises:

Generating n clique sets based on the similarity-weighted user association graph using the community partitioning algorithm, where n is a positive integer;

Confirming that the number of users in each clique set is less than or equal to a maximum threshold;

Confirming that the number of clique sets whose number of users is less than a minimum threshold is less than or equal to a preset threshold;

Determining the clique sets as the to-be-determined clique sets.

Optionally, the method further comprises:

For a clique set whose number of users is greater than the maximum threshold, calling the community partitioning algorithm to partition it so that the number of users in each clique set is less than or equal to the maximum threshold;

If the number of clique sets whose number of users is less than the minimum threshold is greater than the preset threshold, calling a hierarchical clustering algorithm to agglomerate the clique sets whose number of users is less than the minimum threshold.

Optionally, the community partitioning algorithm includes a graph label propagation algorithm or the GN (Girvan-Newman) algorithm; the hierarchical clustering algorithm includes an agglomerative algorithm or a divisive algorithm.

Optionally, calculating the suspicion degree score of the to-be-determined clique set comprises:

Selecting a target data feature from the data features, where the difference between the distribution of the target data feature in the to-be-determined clique set and its distribution in the overall data exceeds a target threshold;

Calculating the suspicion degree score of the to-be-determined clique set according to the proportion of the target data feature in the to-be-determined clique set.

Optionally, calculating the suspicion degree score of the to-be-determined clique set comprises:

Extracting clique features of each to-be-determined clique set;

Inputting the clique features into a trained regression model so that the regression model outputs the suspicion degree score of the to-be-determined clique set.

Optionally, calculating the suspicion degree score of the to-be-determined clique set comprises:

Selecting a target data feature from the data features, where the difference between the distribution of the target data feature in the to-be-determined clique set and its distribution in the overall data exceeds a target threshold;

Calculating a first suspicion degree score of the to-be-determined clique set according to the proportion of the target data feature in the to-be-determined clique set;

Extracting clique features of each to-be-determined clique set;

Inputting the clique features into a trained regression model so that the regression model outputs a second suspicion degree score of the to-be-determined clique set;

Calculating a comprehensive suspicion degree score of the to-be-determined clique set according to the first suspicion degree score and the second suspicion degree score.

A second aspect of the embodiments of the present disclosure provides an apparatus for detecting clique fraud based on a graph model, the apparatus comprising:

An acquisition module, configured to acquire basic user data and historical suspect user data;

A first generation module, configured to generate a user association graph according to the acquired data, wherein the nodes of the user association graph are user association subgraphs generated according to data features, and the edge weights of the user association graph include the similarity between nodes;

A second generation module, configured to generate to-be-determined clique sets based on the user association graph using a community partitioning algorithm;

A computing module, configured to calculate the suspicion degree of the to-be-determined clique sets;

An output module, configured to output a determination result for the to-be-determined clique according to the calculation result.

Optionally, the first generation module includes:

A first selection submodule, configured to select feature combinations, and the number of groups, from the basic user data and the historical suspect user data;

A first generation submodule, configured to generate the corresponding user association subgraphs using exact feature equality or fuzzy feature equality, and to splice the user association subgraphs as nodes to generate an unweighted user association graph;

A second generation submodule, configured to generate a similarity-weighted user association graph by using the similarity between nodes of the unweighted user association graph as the edge weights.

Optionally, the second generation module includes:

A third generation submodule, configured to generate n clique sets based on the similarity-weighted user association graph using the community partitioning algorithm, where n is a positive integer;

A first confirmation submodule, configured to confirm that the number of users in each clique set is less than or equal to a maximum threshold;

A second confirmation submodule, configured to confirm that the number of clique sets whose number of users is less than a minimum threshold is less than or equal to a preset threshold;

A third confirmation submodule, configured to determine the clique sets as the to-be-determined clique sets.

Optionally, the apparatus further includes:

A division module, configured to call the community partitioning algorithm to partition a clique set whose number of users is greater than the maximum threshold, so that the number of users in each clique set is less than or equal to the maximum threshold;

An agglomeration module, configured to call a hierarchical clustering algorithm to agglomerate the clique sets whose number of users is less than the minimum threshold if the number of such clique sets is greater than the preset threshold.

Optionally, the computing module includes:

A second selection submodule, configured to select a target data feature from the data features, where the difference between the distribution of the target data feature in the to-be-determined clique set and its distribution in the overall data exceeds a target threshold;

A first computing submodule, configured to calculate the suspicion degree score of the to-be-determined clique set according to the proportion of the target data feature in the to-be-determined clique set.

Optionally, the computing module includes:

A first extraction submodule, configured to extract clique features of each to-be-determined clique set;

A first input submodule, configured to input the clique features into a trained regression model so that the regression model outputs the suspicion degree score of the to-be-determined clique set.

Optionally, the computing module includes:

A third selection submodule, configured to select a target data feature from the data features, where the difference between the distribution of the target data feature in the to-be-determined clique set and its distribution in the overall data exceeds a target threshold;

A second computing submodule, configured to calculate a first suspicion degree score of the to-be-determined clique set according to the proportion of the target data feature in the to-be-determined clique set;

A second extraction submodule, configured to extract clique features of each to-be-determined clique set;

A second input submodule, configured to input the clique features into a trained regression model so that the regression model outputs a second suspicion degree score of the to-be-determined clique set;

A third computing submodule, configured to calculate a comprehensive suspicion degree score of the to-be-determined clique set according to the first suspicion degree score and the second suspicion degree score.

A third aspect of the embodiments of the present disclosure provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method according to any one of the first aspect.

A fourth aspect of the embodiments of the present disclosure provides an apparatus for detecting clique fraud based on a graph model, comprising:

A memory storing a computer program; and

A processor configured to execute the computer program in the memory to implement the steps of the method according to any one of the first aspect.

By adopting the above technical solution, at least the following technical effects can be achieved:

The present disclosure generates a user association graph according to the acquired user data and generates to-be-determined clique sets using a community partitioning algorithm. By calculating the suspicion degree of a to-be-determined clique set, it can determine whether that set belongs to a fraud clique, which solves the technical problem in the related art that clique fraud is difficult to identify. In addition, the disclosure combines the community partitioning algorithm with a hierarchical clustering algorithm, which addresses the problems that some cliques in the partitioning result are too large and that there are too many small cliques. Furthermore, the disclosure improves the data-processing capacity of the graph model by means of similarity indexing, and generates the similarity-weighted user association graph through subgraph assembly with configurable similarity edge weights. This approach is more flexible and can be parallelized, which further improves large-scale data processing in fraud scenarios.

Other features and advantages of the present disclosure are described in detail in the detailed description below.

Brief description of the drawings

The accompanying drawings provide a further understanding of the present disclosure and constitute a part of the specification. Together with the following detailed description, they serve to explain the disclosure but do not limit it. In the drawings:

Fig. 1 is a flowchart of a method for detecting clique fraud based on a graph model according to an exemplary embodiment of the present disclosure.

Fig. 2 is a flowchart of generating the user association graph, one of the steps of a method for detecting clique fraud based on a graph model according to an exemplary embodiment.

Fig. 3 is a flowchart of generating the to-be-determined clique sets, one of the steps of a method for detecting clique fraud based on a graph model according to an exemplary embodiment.

Fig. 4 is a flowchart of calculating the suspicion degree score, one of the steps of a method for detecting clique fraud based on a graph model according to an exemplary embodiment.

Fig. 5 is a flowchart of calculating the suspicion degree score, one of the steps of another method for detecting clique fraud based on a graph model according to an exemplary embodiment.

Fig. 6 is a flowchart of calculating the suspicion degree score, one of the steps of another method for detecting clique fraud based on a graph model according to an exemplary embodiment.

Fig. 7 is a block diagram of an apparatus for detecting clique fraud based on a graph model according to an exemplary embodiment of the present disclosure.

Fig. 8 is a block diagram of the first generation module of an apparatus for detecting clique fraud based on a graph model according to an exemplary embodiment of the present disclosure.

Fig. 9 is a block diagram of the second generation module of an apparatus for detecting clique fraud based on a graph model according to an exemplary embodiment of the present disclosure.

Fig. 10 is a block diagram of another apparatus for detecting clique fraud based on a graph model according to an exemplary embodiment of the present disclosure.

Fig. 11 is a block diagram of the computing module of an apparatus for detecting clique fraud based on a graph model according to an exemplary embodiment of the present disclosure.

Fig. 12 is a block diagram of the computing module of another apparatus for detecting clique fraud based on a graph model according to an exemplary embodiment of the present disclosure.

Fig. 13 is a block diagram of the computing module of another apparatus for detecting clique fraud based on a graph model according to an exemplary embodiment of the present disclosure.

Fig. 14 is a block diagram of an apparatus for detecting clique fraud based on a graph model according to an exemplary embodiment of the present disclosure.

Detailed description

Specific embodiments of the present disclosure are described in detail below with reference to the accompanying drawings. It should be understood that the specific embodiments described here only illustrate and explain the disclosure and do not limit it.

To cope with ubiquitous attacks, fraud detection has become critically important. A survey of the related art shows that financial fraud detection mainly relies on the following approaches, each of which has shortcomings, summarized as follows:

Methods based on blacklists, whitelists, and reputation-database lookups. New blacklist, whitelist, or reputation-database entries must be added through irregular maintenance, which is relatively costly (for example, purchasing paid third-party data), and the responsiveness and coverage of the method are limited.

Methods based on rule engines. Online financial fraud techniques change constantly. Once fraudsters change their techniques, rule-engine-based methods often fail, and substantial operational and financial resources are needed to update the rule engine.

Methods based on supervised machine learning. Supervised machine learning is the most widely used learning approach in fraud detection. Using algorithms such as decision trees, random forests, support vector machines (SVM), and naive Bayes, a machine learning model can perform complex computation over hundreds of variables (a high-dimensional space) and pinpoint fraudulent behavior. However, supervised machine learning depends on labeled data, and in financial fraud scenarios labeled data is hard to obtain and positive and negative samples are imbalanced: positive samples become available only once fraud has occurred and been labeled, and because fraud techniques change constantly there are few samples, which makes labeling difficult. Without enough labeled fraud data, the capability of supervised machine learning is limited.

Methods based on unsupervised learning. Unsupervised learning is an exploratory branch of current fraud detection research, based mainly on clustering and graph methods. Current unsupervised techniques are relatively immature and difficult to apply, and there is no ready-made solution that effectively applies unsupervised machine learning to fraud detection. The main difficulties include handling large-scale data and quantifying suspicion for determination.

Fig. 1 is a flowchart of a method for detecting clique fraud based on a graph model according to an exemplary embodiment of the present disclosure, which addresses the technical problem in the related art that clique fraud is difficult to identify. As shown in Fig. 1, the method for detecting clique fraud based on a graph model includes:

S11: Acquire basic user data and historical suspect user data.

S12: Generate a user association graph according to the acquired data, where the nodes of the user association graph are user association subgraphs generated according to data features, and the edge weights of the user association graph include the similarity between nodes.

S13: Generate to-be-determined clique sets based on the user association graph using a community partitioning algorithm.

S14: Calculate the suspicion degree of the to-be-determined clique sets.

S15: Output a determination result for the to-be-determined clique according to the calculation result.

In step S11, the user data may be data from the various client accounts that a user applies for, such as the user data submitted when applying for a Meituan account, an Alipay account, or a WeChat account. The account may also be a bank card number applied for by the user, such as a savings card or a credit card. The user data may also be data on users who make payments through a payment platform, such as users paying with Meituan, Alipay, or WeChat. The basic user data includes the application form data filled in by the applicant, People's Bank of China credit report query information, mobile terminal behavior data, e-commerce data, and social data authorized by the applicant. The historical suspect user data may include blacklist and whitelist information; a blacklist or whitelist entry can be any entity type in the network, such as an account, an address, or a telephone number. The blacklist includes fraud and serious-overdue records accumulated within the bank as well as exchanged blacklists; the whitelist includes the phone numbers, addresses, and so on of VIP customers or of customers manually marked as risk-free.

After the basic user data and the historical suspect user data are acquired, step S12 is executed: a user association graph is generated according to the acquired data, where the nodes of the user association graph are user association subgraphs generated according to data features, and the edge weights of the user association graph include the similarity between nodes.

Referring to Fig. 2, generating the user association graph according to the acquired data may include the following steps:

S121: Select feature combinations, and the number of groups, from the basic user data and the historical suspect user data.

S122: Generate the corresponding user association subgraphs using exact feature equality or fuzzy feature equality, and splice the user association subgraphs as nodes to generate an unweighted user association graph.

S123: Generate a similarity-weighted user association graph by using the similarity between nodes of the unweighted user association graph as the edge weights.

In step S121, the features in the data may include the device ID, IP address, IMSI (International Mobile Subscriber Identity), IMEI (International Mobile Equipment Identity), geographic information, login time, and so on. A feature combination is a group of at least one feature selected from the features in the data, and the number of groups is at least one.

After the feature combinations and the number of groups are selected, the different feature combinations are associated, using exact feature equality or fuzzy feature equality, to form user association subgraphs. For example, if different accounts log in with the same device ID, the accounts can be associated by exact feature equality; if the IP addresses from which different accounts log in are partially identical, that is, the accounts logged in under the same local area network, the two accounts can be associated by fuzzy feature equality. After the user association subgraphs are generated, they are spliced as nodes to generate the unweighted user association graph. Then the similarity between nodes of the unweighted user association graph is used as the edge weight to generate the similarity-weighted user association graph. The similarity can be calculated with a similarity measure function, and the graph can optionally be pruned based on the weight values.
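The graph-generation steps S121 to S123 can be sketched roughly as follows. This is a minimal illustration under simplifying assumptions, not the patented implementation: accounts are used directly as vertices, each feature group contributes one subgraph, fuzzy equality is assumed to mean "same first three IP octets", and the similarity function (fraction of feature groups on which two accounts agree) is a stand-in. All names (`ACCOUNTS`, `exact_key`, `ip_prefix_key`, `build_weighted_graph`, `min_weight`) are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical account records; the field names are illustrative assumptions.
ACCOUNTS = {
    "a1": {"device_id": "d1", "ip": "10.0.0.5", "imei": "i1"},
    "a2": {"device_id": "d1", "ip": "10.0.0.9", "imei": "i1"},
    "a3": {"device_id": "d2", "ip": "10.0.0.7", "imei": "i3"},
    "a4": {"device_id": "d9", "ip": "172.16.4.2", "imei": "i4"},
}


def exact_key(field):
    """Exact feature equality: accounts match when the field values are identical."""
    return lambda rec: rec[field]


def ip_prefix_key(rec):
    """Fuzzy equality (assumed form): match on the first three octets of the IP."""
    return ".".join(rec["ip"].split(".")[:3])


def build_subgraph(accounts, key_fn):
    """Link every pair of accounts sharing the same key; returns a set of edges."""
    buckets = defaultdict(list)
    for acc_id, rec in accounts.items():
        buckets[key_fn(rec)].append(acc_id)
    edges = set()
    for members in buckets.values():
        edges.update(combinations(sorted(members), 2))
    return edges


def agreement_similarity(rec_a, rec_b, key_fns):
    """Stand-in similarity: fraction of feature groups on which two accounts agree."""
    hits = sum(1 for fn in key_fns if fn(rec_a) == fn(rec_b))
    return hits / len(key_fns)


def build_weighted_graph(accounts, key_fns, min_weight=0.0):
    """Union the per-feature subgraphs, weight each edge, and prune light edges."""
    unweighted = set()
    for fn in key_fns:
        unweighted |= build_subgraph(accounts, fn)
    weighted = {}
    for u, v in unweighted:
        w = agreement_similarity(accounts[u], accounts[v], key_fns)
        if w >= min_weight:  # optional pruning by weight
            weighted[(u, v)] = w
    return weighted


KEY_FNS = [exact_key("device_id"), exact_key("imei"), ip_prefix_key]
FULL_GRAPH = build_weighted_graph(ACCOUNTS, KEY_FNS)
PRUNED_GRAPH = build_weighted_graph(ACCOUNTS, KEY_FNS, min_weight=0.5)
```

Because each feature group yields its own subgraph, the subgraphs could in principle be built independently and merged afterwards, which is one way to read the flexibility and parallelism that the disclosure attributes to subgraph assembly.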

After the user association graph is generated, step S13 is executed: to-be-determined clique sets are generated based on the user association graph using a community partitioning algorithm. Referring to Fig. 3, generating the to-be-determined clique sets using the community partitioning algorithm includes:

S131: Generate n clique sets based on the similarity-weighted user association graph using the community partitioning algorithm, where n is a positive integer. The community partitioning algorithm includes a graph label propagation algorithm or the GN (Girvan-Newman) algorithm.

S132: Confirm that the number of users in each clique set is less than or equal to a maximum threshold.

S133: Confirm that the number of clique sets whose number of users is less than a minimum threshold is less than or equal to a preset threshold. The maximum threshold is greater than the minimum threshold.

S134: Determine the clique sets as the to-be-determined clique sets.
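As an illustration of steps S131 to S133, the sketch below runs a simple weighted label propagation over an edge-weight dictionary and then checks the two size conditions. It is only one possible reading: the update order, the tie-breaking rule, and the names `label_propagation`, `sizes_ok`, `max_size`, `min_size`, and `max_small` are assumptions made for this example.

```python
from collections import defaultdict


def label_propagation(edges, max_rounds=20):
    """Weighted label propagation with deterministic tie-breaking.

    edges maps (u, v) -> weight. Returns a list of node sets (clique sets).
    """
    neighbors = defaultdict(dict)
    for (u, v), w in edges.items():
        neighbors[u][v] = w
        neighbors[v][u] = w
    labels = {node: node for node in neighbors}
    for _ in range(max_rounds):
        changed = False
        for node in sorted(neighbors):
            votes = defaultdict(float)
            for nbr, w in neighbors[node].items():
                votes[labels[nbr]] += w
            # The label with the highest total weight wins; ties go to the
            # lexicographically smallest label so that runs are repeatable.
            best = min(votes, key=lambda lab: (-votes[lab], lab))
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break
    groups = defaultdict(set)
    for node, lab in labels.items():
        groups[lab].add(node)
    return sorted(groups.values(), key=sorted)


def sizes_ok(cliques, max_size, min_size, max_small):
    """S132/S133: no oversized set, and not too many undersized sets."""
    oversized = [c for c in cliques if len(c) > max_size]
    small = [c for c in cliques if len(c) < min_size]
    return not oversized and len(small) <= max_small


# Two tightly connected triangles joined by one weak edge.
EDGES = {
    ("a", "b"): 1.0, ("a", "c"): 1.0, ("b", "c"): 1.0,
    ("c", "d"): 0.1,
    ("d", "e"): 1.0, ("d", "f"): 1.0, ("e", "f"): 1.0,
}
CLIQUES = label_propagation(EDGES)
ACCEPTED = sizes_ok(CLIQUES, max_size=20, min_size=3, max_small=15)
```

The thresholds 20, 3, and 15 are borrowed from the numerical example in the text; real deployments would tune them.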

When the number of users (for example, the number of distinct accounts) in a clique set is greater than the maximum threshold, for example when one clique set contains more than 20 distinct accounts, the community partitioning algorithm is called again to partition it so that the number of users in each clique set is less than or equal to the maximum threshold. If the number of clique sets whose number of users is less than the minimum threshold is greater than the preset threshold, for example when more than 15 clique sets each contain fewer than 3 distinct accounts, the hierarchical clustering algorithm is called to agglomerate the clique sets whose number of users is less than the minimum threshold. The hierarchical clustering here may use either the agglomerative method or the divisive method.
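The agglomeration of undersized clique sets might look like the sketch below. The text does not specify the linkage rule, so this example assumes a greedy rule that repeatedly merges the two undersized sets joined by the largest total edge weight; `link_weight`, `merge_small_cliques`, and the sample data are hypothetical. Splitting oversized sets is not shown; it would amount to re-running the partitioning algorithm on the subgraph of the oversized set.

```python
def link_weight(a, b, edges):
    """Total weight of edges running between node sets a and b."""
    return sum(w for (u, v), w in edges.items()
               if (u in a and v in b) or (u in b and v in a))


def merge_small_cliques(cliques, edges, min_size, max_small):
    """Agglomerate undersized clique sets until at most max_small remain."""
    cliques = [set(c) for c in cliques]
    while True:
        small = [i for i, c in enumerate(cliques) if len(c) < min_size]
        if len(small) <= max_small or len(small) < 2:
            break
        # Candidate merges between every pair of undersized sets (i < j).
        pairs = [(link_weight(cliques[i], cliques[j], edges), i, j)
                 for k, i in enumerate(small) for j in small[k + 1:]]
        weight, i, j = max(pairs, key=lambda p: (p[0], -p[1], -p[2]))
        if weight == 0:
            break  # nothing connected is left; do not merge unrelated sets
        cliques[i] |= cliques[j]
        del cliques[j]  # safe because j > i
    return cliques


SMALL_EDGES = {("e", "f"): 0.9, ("f", "g"): 0.2}
RAW_CLIQUES = [{"a", "b", "c", "d"}, {"e"}, {"f"}, {"g"}]
MERGED = merge_small_cliques(RAW_CLIQUES, SMALL_EDGES, min_size=3, max_small=1)
```

In this toy input the three singleton sets are merged step by step into one set of three accounts, after which no undersized set remains.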

After the to-be-determined clique sets are generated, step S14 is executed to calculate the suspicion degree of the to-be-determined clique sets. The suspicion degree may be calculated in ways that include, but are not limited to, the following three:

First calculation method: referring to Fig. 4, calculating the suspicion degree score of the to-be-determined clique set includes the following steps:

S141a: Select a target data feature from the data features, where the difference between the distribution of the target data feature in the to-be-determined clique set and its distribution in the overall data exceeds a target threshold. Here the overall data refers to all of the basic user data.

S142a: Calculate the suspicion degree score of the to-be-determined clique set according to the proportion of the target data feature in the to-be-determined clique set.

For example, suppose 100 accounts were newly registered on a given day at a certain client, 8 of which were registered using virtual mobile phone numbers. The proportion of accounts registered with virtual mobile phone numbers among that day's new accounts is then 8%. Suppose one generated to-be-determined clique set contains 10 accounts, 7 of which were registered with virtual mobile phone numbers, a proportion of 70%. Compared with 8%, 70% is a very large difference. Registration with a virtual mobile phone number is therefore the target data feature, its proportion in the to-be-determined clique set is 0.7, and this proportion can be used as the suspicion degree score of the to-be-determined clique set.

Alternatively, suppose 100 accounts were newly registered on a given day at a certain client, 8 of which were registered by historical suspect users. The proportion of accounts registered by historical suspect users among that day's new accounts is then 8%. Suppose one generated to-be-determined clique set contains 10 accounts, 8 of which were registered by historical suspect users, a proportion of 80%. Compared with 8%, 80% is a very large difference. Registration by a historical suspect user is therefore the target data feature, its proportion in the to-be-determined clique set is 0.8, and this proportion can be used as the suspicion degree score of the to-be-determined clique set.
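The first calculation method can be sketched with the numbers from the virtual-phone-number example. The text does not say how the distribution difference is measured, so the absolute difference of proportions and the target threshold of 0.3 used here are assumptions; `proportion` and `proportion_score` are hypothetical names.

```python
def proportion(flags):
    """Share of records for which the target data feature is present."""
    flags = list(flags)
    return sum(flags) / len(flags)


def proportion_score(clique_flags, overall_flags, target_threshold):
    """Return the in-clique proportion as the suspicion degree score.

    The score is returned only when the feature's proportion in the clique set
    differs from its overall proportion by more than target_threshold
    (absolute difference, an assumed measure); otherwise None is returned.
    """
    p_clique = proportion(clique_flags)
    p_overall = proportion(overall_flags)
    if abs(p_clique - p_overall) > target_threshold:
        return p_clique
    return None


# 8 of 100 new accounts overall, and 7 of 10 in the clique set, were
# registered with a virtual mobile phone number.
OVERALL = [True] * 8 + [False] * 92
CLIQUE = [True] * 7 + [False] * 3
SCORE = proportion_score(CLIQUE, OVERALL, target_threshold=0.3)
```

Here the overall proportion is 0.08 and the in-clique proportion is 0.7, so the feature qualifies as a target data feature and the score is the in-clique proportion.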

Second calculation method: referring to Fig. 5, calculating the suspicion degree score of the to-be-determined clique set may include the following steps:

S141b: Extract clique features of each to-be-determined clique set. The clique features include at least the proportion of historical suspect users, and may also include features such as the clique size and the proportion of accounts that share a device.

S142b: Input the clique features into a trained regression model so that the regression model outputs the suspicion degree score of the to-be-determined clique set. The regression model may be a GBDT (Gradient Boosting Decision Tree) model.
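The feature extraction in S141b and the model call in S142b can be sketched as below. No trained GBDT model is available here, so `stand_in_predict` is a hand-written placeholder that only mimics the calling convention of a batch `predict` function; it is not a gradient boosting model. The feature layout and all names are assumptions for this example.

```python
def clique_features(members, suspects, device_of):
    """Build [clique size, historical-suspect ratio, shared-device ratio]."""
    size = len(members)
    suspect_ratio = sum(1 for m in members if m in suspects) / size
    device_counts = {}
    for m in members:
        device_counts[device_of[m]] = device_counts.get(device_of[m], 0) + 1
    # An account "shares a device" when another member uses the same device.
    shared = sum(1 for m in members if device_counts[device_of[m]] > 1)
    return [size, suspect_ratio, shared / size]


def score_clique(features, predict):
    """Feed one feature row to a batch-style predict function."""
    return predict([features])[0]


def stand_in_predict(rows):
    """Placeholder for a trained regressor: averages the two ratio features."""
    return [0.5 * row[1] + 0.5 * row[2] for row in rows]


MEMBERS = ["a1", "a2", "a3", "a4"]
SUSPECTS = {"a1", "a2"}
DEVICE_OF = {"a1": "d1", "a2": "d1", "a3": "d2", "a4": "d3"}

FEATURES = clique_features(MEMBERS, SUSPECTS, DEVICE_OF)
MODEL_SCORE = score_clique(FEATURES, stand_in_predict)
```

In practice the placeholder would be swapped for the `predict` method of a regression model trained on labeled clique sets.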

Third calculation method: referring to Fig. 6, calculating the suspicion degree score of the to-be-determined clique set may include the following steps:

S141c: Select a target data feature from the data features, where the difference between the distribution of the target data feature in the to-be-determined clique set and its distribution in the overall data exceeds a target threshold.

S142c: Calculate a first suspicion degree score of the to-be-determined clique set according to the proportion of the target data feature in the to-be-determined clique set.

S143c: Extract clique features of each to-be-determined clique set.

S14c: Input the clique features into a trained regression model so that the regression model outputs a second suspicion degree score of the to-be-determined clique set.

S144c: Calculate a comprehensive suspicion degree score of the to-be-determined clique set according to the first suspicion degree score and the second suspicion degree score.

Then, according to the calculation result, the determination result for the to-be-determined clique is output. For example, when the comprehensive suspicion degree score exceeds a preset value, the to-be-determined clique can be determined to be a fraud clique.

For example, if the first suspicion degree score of a to-be-determined clique is 0.7 and the second suspicion degree score is 0.8, the comprehensive suspicion degree score of the to-be-determined clique set can be taken as the average of the two scores, 0.75. Since this exceeds the preset value of 0.6, the to-be-determined clique is determined to be a fraud clique.
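The combination and decision in this example reduce to a few lines. The sketch below uses the plain average mentioned in the text; other combinations (for example a weighted average) would fit the description equally well, and the function names are hypothetical.

```python
def comprehensive_score(first_score, second_score):
    """Combine the two suspicion degree scores by simple averaging."""
    return (first_score + second_score) / 2


def is_fraud_clique(first_score, second_score, preset_value=0.6):
    """Flag the clique when the comprehensive score exceeds the preset value."""
    return comprehensive_score(first_score, second_score) > preset_value


COMBINED = comprehensive_score(0.7, 0.8)
DECISION = is_fraud_clique(0.7, 0.8, preset_value=0.6)
```

With the scores from the example, the combined score is 0.75 and the clique is flagged.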

It is worth noting that, for simplicity of description, the method embodiment shown in Fig. 1 is expressed as a series of combined actions, but those skilled in the art should understand that the present disclosure is not limited by the described order of actions. Those skilled in the art should also understand that the embodiments described in the specification are preferred embodiments, and the actions involved are not necessarily required by the present disclosure.

Fig. 7 shows an apparatus for detecting clique fraud based on a graph model according to an exemplary embodiment of the present disclosure. As shown in Fig. 7, the apparatus 300 for detecting clique fraud based on a graph model includes:

An acquisition module 310, configured to acquire basic user data and historical suspect user data;

A first generation module 320, configured to generate a user association graph according to the acquired data, wherein the nodes of the user association graph are user association subgraphs generated according to data features, and the edge weights of the user association graph include the similarity between nodes;

A second generation module 330, configured to generate to-be-determined clique sets based on the user association graph using a community partitioning algorithm;

A computing module 340, configured to calculate the suspicion degree of the to-be-determined clique sets;

An output module 350, configured to output a determination result for the to-be-determined clique according to the calculation result.

Optionally, as shown in figure 8, first generation module 320 includes:

First chooses submodule 321, for choosing in the user base data and the history suspicion user data Feature combination and group number；

First generates submodule 322, for equal using feature consistency or ambiguity equivalent way to correspond to and generates user It is associated with subgraph and is spliced using the user-association subgraph as node and generate user without weighted associations figure；

Second generates submodule 323, for the similarity using the user without weighted associations figure interior joint as side right weight Generate the similar weighted associations figure of user.

Optionally, as shown in Fig. 9, the second generation module 330 includes:

A third generation submodule 331, configured to generate n clique sets using the community partitioning algorithm based on the weighted user similarity association graph, where n is a positive integer;

A first confirmation submodule 332, configured to confirm that the number of users in the clique set is less than or equal to a maximum threshold;

A second confirmation submodule 333, configured to confirm that the number of clique sets whose number of users is less than a minimum threshold is less than or equal to a preset threshold;

A third confirmation submodule 334, configured to determine the clique set as the clique set to be determined.
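
The two size checks performed by submodules 332 and 333 can be sketched as a single predicate. This is a minimal illustration only: the function and parameter names are hypothetical, and each clique set is assumed to be represented as a Python set of user identifiers.

```python
def partition_is_acceptable(clique_sets, max_size, min_size, max_small_sets):
    """Return True if a partition passes both size checks described above."""
    # Check 1: no clique set may contain more users than the maximum threshold.
    if any(len(c) > max_size for c in clique_sets):
        return False
    # Check 2: the count of clique sets below the minimum threshold
    # must not exceed the preset threshold.
    small_sets = sum(1 for c in clique_sets if len(c) < min_size)
    return small_sets <= max_small_sets
```

Only a partition for which this predicate holds would be taken as the clique set to be determined.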

Optionally, as shown in Fig. 10, the device 300 for detecting clique fraud based on a graph model further includes:

A division module 360, configured to call the community partitioning algorithm to divide any clique set whose number of users is greater than the maximum threshold, so that the number of users in the clique set is less than or equal to the maximum threshold;

An agglomeration module 370, configured to, if the number of clique sets whose number of users is less than the minimum threshold is greater than the preset threshold, call a hierarchical clustering algorithm to agglomerate the clique sets whose number of users is less than the minimum threshold.
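
The control flow of the division module 360 and the agglomeration module 370 might be sketched as below. The community partitioning algorithm and the hierarchical clustering algorithm are not implemented here; they are passed in as the hypothetical callables `split_fn` and `merge_fn`, and every name in the sketch is an assumption made for illustration. The sketch does not re-check whether merged clique sets exceed the maximum threshold, since the disclosure does not say how that case is handled.

```python
def refine_clique_sets(clique_sets, max_size, min_size, max_small_sets,
                       split_fn, merge_fn):
    """Split oversized clique sets, then merge undersized ones if too numerous.

    split_fn(clique) -> list of smaller cliques (stand-in for re-running the
        community partitioning algorithm on one oversized clique set).
    merge_fn(cliques) -> list of merged cliques (stand-in for hierarchical
        clustering over the undersized clique sets).
    """
    # Step 1: re-partition until every clique set fits the maximum threshold.
    pending, sized = list(clique_sets), []
    while pending:
        clique = pending.pop()
        if len(clique) > max_size:
            parts = split_fn(clique)
            # Guard against a split that makes no progress (infinite loop).
            if len(parts) < 2 or any(len(p) >= len(clique) for p in parts):
                raise ValueError("split_fn did not reduce the clique size")
            pending.extend(parts)
        else:
            sized.append(clique)

    # Step 2: merge undersized clique sets only if there are too many of them.
    small = [c for c in sized if len(c) < min_size]
    if len(small) > max_small_sets:
        kept = [c for c in sized if len(c) >= min_size]
        sized = kept + merge_fn(small)
    return sized
```

Passing the two algorithms as callables keeps the threshold logic separate from the choice of partitioning and clustering method, which the disclosure leaves open.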

Optionally, as shown in Fig. 11, the computing module 340 includes:

A second selection submodule 341a, configured to select a target data feature from the data features, wherein the difference between the distribution of the target data feature in the clique set to be determined and its distribution in the overall data exceeds a target threshold;

A first calculation submodule 342a, configured to calculate the suspicion degree score of the clique set to be determined according to the proportion of the target data feature in the clique set to be determined.
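
One way to read submodules 341a and 342a is sketched below. It is an illustration under stated assumptions, not the scoring formula of the disclosure: each data feature is assumed to be a boolean flag per user, the distribution difference is taken as the absolute difference between the feature's proportion in the clique set and in the overall data, and the score is the mean clique proportion over the selected target features. All names are hypothetical, and both user lists are assumed to be non-empty.

```python
def proportion(users, feature):
    """Fraction of users for which the boolean feature is set."""
    return sum(1 for u in users if u.get(feature)) / len(users)


def suspicion_score(clique_users, all_users, features, diff_threshold):
    """Score a clique set from features that are over- or under-represented."""
    target_proportions = []
    for feature in features:
        p_clique = proportion(clique_users, feature)
        p_overall = proportion(all_users, feature)
        # A feature is a target feature if its distribution in the clique
        # differs from the overall distribution by more than the threshold.
        if abs(p_clique - p_overall) > diff_threshold:
            target_proportions.append(p_clique)
    if not target_proportions:
        return 0.0  # assumption: no distinguishing feature means no suspicion
    return sum(target_proportions) / len(target_proportions)
```

For example, if 75% of a clique logs in at night against 20% of all users, that feature is selected under a threshold of 0.3 and contributes 0.75 to the score.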

Optionally, as shown in Fig. 12, the computing module 340 includes:

A first extraction submodule 341b, configured to extract the clique features of each clique set to be determined;

A first input submodule 342b, configured to input the clique features into a trained regression model, so that the regression model outputs the suspicion degree score of the clique set to be determined.
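
The flow through submodules 341b and 342b can be illustrated with a small stand-in model. The disclosure does not specify which clique features are extracted or which regression model is trained, so everything concrete here is an assumption: the two features (clique size and the share of historical suspect users), the logistic form chosen so that the score lies between 0 and 1, and the weights, which in practice would come from training rather than being supplied by hand.

```python
import math


def extract_clique_features(clique, suspect_ids):
    """Hypothetical clique features: size and share of known suspect users."""
    size = len(clique)
    suspect_ratio = len(clique & suspect_ids) / size
    return [float(size), suspect_ratio]


def predict_suspicion(features, weights, bias):
    """Apply an already-trained logistic model to one feature vector."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


features = extract_clique_features({"a", "b", "c", "d"}, {"a", "b", "x"})
score = predict_suspicion(features, weights=[0.1, 2.0], bias=-1.4)
```

The weights and bias shown are placeholders for illustration and carry no meaning beyond exercising the function.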

Optionally, as shown in Fig. 13, the computing module 340 includes:

A third selection submodule 341c, configured to select a target data feature from the data features, wherein the difference between the distribution of the target data feature in the clique set to be determined and its distribution in the overall data exceeds a target threshold;

A second calculation submodule 342c, configured to calculate a first suspicion degree score of the clique set to be determined according to the proportion of the target data feature in the clique set to be determined;

A second extraction submodule 343c, configured to extract the clique features of each clique set to be determined;

A second input submodule 344c, configured to input the clique features into a trained regression model, so that the regression model outputs a second suspicion degree score of the clique set to be determined;

A third calculation submodule 345c, configured to calculate a comprehensive suspicion degree score of the clique set to be determined according to the first suspicion degree score and the second suspicion degree score.
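
The disclosure does not state how the third calculation submodule 345c combines the two scores. One simple possibility, shown here only as an assumption, is a weighted average controlled by a hypothetical parameter `alpha`:

```python
def comprehensive_score(first_score, second_score, alpha=0.5):
    """Weighted average of the feature-based and model-based scores.

    alpha is the (assumed) weight of the first score; 1 - alpha goes to
    the second score.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * first_score + (1.0 - alpha) * second_score
```

With `alpha` at 0.5 both scores contribute equally; moving it toward 1 favors the feature-proportion score, and toward 0 the regression model score.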

Regarding the device in the above embodiments, the specific manner in which each module performs its operations has been described in detail in the embodiments of the method and will not be elaborated here.

The disclosure also provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, it implements the steps of the method for detecting clique fraud based on a graph model according to any of the above optional embodiments.

The disclosure also provides a device for detecting clique fraud based on a graph model, comprising:

A memory on which a computer program is stored; and

A processor, configured to execute the computer program in the memory so as to implement the steps of the method for detecting clique fraud based on a graph model according to any of the above optional embodiments.

Fig. 14 is a block diagram of a device 400 for detecting clique fraud based on a graph model according to an exemplary embodiment. As shown in Fig. 14, the device 400 may include: a processor 401, a memory 402, a multimedia component 403, an input/output (I/O) interface 404, and a communication component 405.

The processor 401 is configured to control the overall operation of the device 400 so as to complete all or part of the steps of the above method for detecting clique fraud based on a graph model. The memory 402 is configured to store various types of data to support the operation of the device 400; such data may include, for example, instructions of any application or method operated on the device 400, as well as application-related data. The memory 402 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disc. The multimedia component 403 may include a screen and an audio component. The screen may be, for example, a touch screen, and the audio component is used to output and/or input audio signals. For example, the audio component may include a microphone for receiving external audio signals. The received audio signals may be further stored in the memory 402 or sent through the communication component 405. The audio component further includes at least one loudspeaker for outputting audio signals. The I/O interface 404 provides an interface between the processor 401 and other interface modules, which may be a keyboard, a mouse, buttons, and the like. These buttons may be virtual buttons or physical buttons. The communication component 405 is used for wired or wireless communication between the device 400 and other devices. The wireless communication may be, for example, Wi-Fi, Bluetooth, near field communication (NFC), 2G, 3G, or 4G, or a combination of one or more of them; accordingly, the communication component 405 may include a Wi-Fi module, a Bluetooth module, and an NFC module.

In an exemplary embodiment, the device 400 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for executing the above method for detecting clique fraud based on a graph model.

In another exemplary embodiment, a computer-readable storage medium including program instructions is also provided, for example the memory 402 including program instructions; the program instructions can be executed by the processor 401 of the device 400 to complete the above method for detecting clique fraud based on a graph model.

The preferred embodiments of the disclosure have been described in detail above with reference to the accompanying drawings. However, the disclosure is not limited to the specific details of the above embodiments. Within the scope of the technical concept of the disclosure, a variety of simple variations may be made to the technical solution of the disclosure, and these simple variations all fall within the protection scope of the disclosure.

It should further be noted that the specific technical features described in the above specific embodiments may be combined in any suitable manner, provided there is no contradiction. To avoid unnecessary repetition, the disclosure does not further describe the various possible combinations.

In addition, the various embodiments of the disclosure may also be combined arbitrarily; as long as such a combination does not depart from the idea of the disclosure, it should likewise be regarded as content disclosed by the disclosure.

Claims

1. A method for detecting clique fraud based on a graph model, characterized in that the method comprises:

obtaining user base data and historical suspect user data;

generating a user association graph according to the obtained data, wherein each node of the user association graph is a user association subgraph generated according to data features, and the edge weights of the user association graph include the similarity between nodes;

calculating the suspicion degree of the clique set to be determined;

2. The method according to claim 1, characterized in that generating the user association graph comprises:

generating the corresponding user association subgraphs by exact feature equality or fuzzy equality, and splicing the user association subgraphs together as nodes to generate an unweighted user association graph;

3. The method according to claim 2, characterized in that generating the clique set to be determined using the community partitioning algorithm comprises:

determining the clique set as the clique set to be determined.

4. The method according to claim 3, characterized by further comprising:

calling the community partitioning algorithm to divide any clique set whose number of users is greater than the maximum threshold, so that the number of users in the clique set is less than or equal to the maximum threshold;

if the number of clique sets whose number of users is less than the minimum threshold is greater than the preset threshold, calling a hierarchical clustering algorithm to agglomerate the clique sets whose number of users is less than the minimum threshold.

5. The method according to claim 4, characterized in that the community partitioning algorithm includes a graph label propagation algorithm or the GN algorithm, and the hierarchical clustering algorithm includes an agglomerative algorithm or a divisive algorithm.

6. The method according to claim 1, characterized in that calculating the suspicion degree score of the clique set to be determined comprises:

selecting a target data feature from the data features, wherein the difference between the distribution of the target data feature in the clique set to be determined and its distribution in the overall data exceeds a target threshold;

calculating the suspicion degree score of the clique set to be determined according to the proportion of the target data feature in the clique set to be determined.

7. The method according to claim 1, characterized in that calculating the suspicion degree score of the clique set to be determined comprises:

extracting the clique features of each clique set to be determined;

inputting the clique features into a trained regression model, so that the regression model outputs the suspicion degree score of the clique set to be determined.

8. A device for detecting clique fraud based on a graph model, characterized in that the device comprises:

an obtaining module, configured to obtain user base data and historical suspect user data;

a first generation module, configured to generate a user association graph according to the obtained data, wherein each node of the user association graph is a user association subgraph generated according to data features, and the edge weights of the user association graph include the similarity between nodes;

a second generation module, configured to generate, based on the user association graph, a clique set to be determined using a community partitioning algorithm;

9. A computer-readable storage medium on which a computer program is stored, characterized in that, when the program is executed by a processor, the steps of the method according to any one of claims 1 to 7 are implemented.

10. A device for detecting clique fraud based on a graph model, characterized by comprising:

a memory on which a computer program is stored; and

a processor, configured to execute the computer program in the memory so as to implement the steps of the method according to any one of claims 1 to 7.