CN113554308A - User community division and risk user identification method and device and electronic equipment - Google Patents


Info

Publication number
CN113554308A
Authority
CN
China
Prior art keywords
user
relation
community
risk
heterogeneous network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110833619.0A
Other languages
Chinese (zh)
Other versions
CN113554308B (en
Inventor
雷奥林
柯振德
姚光军
邱李晴
蒋菱
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Citic Bank Corp Ltd
Original Assignee
China Citic Bank Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Citic Bank Corp Ltd filed Critical China Citic Bank Corp Ltd
Priority to CN202110833619.0A priority Critical patent/CN113554308B/en
Publication of CN113554308A publication Critical patent/CN113554308A/en
Application granted granted Critical
Publication of CN113554308B publication Critical patent/CN113554308B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0635Risk analysis of enterprise or organisation activities
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01Social networking

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • General Physics & Mathematics (AREA)
  • Tourism & Hospitality (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Primary Health Care (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Development Economics (AREA)
  • Educational Administration (AREA)
  • Probability & Statistics with Applications (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The embodiment of the application provides a method and a device for dividing a user community and identifying a risk user and electronic equipment. The method comprises the following steps: acquiring a heterogeneous network to be processed, wherein the heterogeneous network to be processed is constructed based on the relationship between a user and an entity of a specified entity type; extracting a target relation from the relation of the heterogeneous network to be processed based on the homorelation value probability of the relation to obtain the processed heterogeneous network, wherein the homorelation value probability is the probability that the relation of the same relation category owned by every two users has the same relation value; and carrying out community division on the processed heterogeneous network to obtain a user community. Based on the scheme, the effective relation in the relation network can be extracted according to the homorelation value probability, the data scale of the relation network is reduced, the effect of community division is ensured, and the method and the system are favorable for risk prevention based on the divided user communities.

Description

User community division and risk user identification method and device and electronic equipment
Technical Field
The application relates to the technical field of computers, in particular to a method and a device for dividing a user community and identifying a risk user and electronic equipment.
Background
With the rapid expansion of the business volume of financial institutions such as banks, preventing business risk has become increasingly demanding.
At present, risk prevention may be carried out by building a relationship network from user data and dividing it into communities. When the amount of user data is large, however, the resulting relationship network is also large, which degrades the community division and, in turn, the risk prevention effect.
Disclosure of Invention
The present application aims to solve at least one of the above technical drawbacks. The technical scheme adopted by the application is as follows:
in a first aspect, an embodiment of the present application provides a method for dividing a user community, where the method includes:
acquiring a heterogeneous network to be processed, wherein the heterogeneous network to be processed is constructed based on the relationship between a user and an entity of a specified entity type;
extracting a target relation from the relation of the heterogeneous network to be processed based on the homorelation value probability of the relation to obtain the processed heterogeneous network, wherein the homorelation value probability is the probability that the relation of the same relation category owned by every two users has the same relation value;
and carrying out community division on the processed heterogeneous network to obtain a user community.
Optionally, extracting a target relationship from the relationship of the heterogeneous network to be processed based on the homorelation value probability of the relationship, including:
determining a relation clipping score corresponding to the relation based on the homorelation value probability of the relation;
and comparing the relation clipping score with the corresponding clipping score threshold, and determining a target relation from the relation.
Optionally, the method further includes:
acquiring an entity from user data of a user;
acquiring a risk label of a user;
and constructing a heterogeneous network to be processed based on the risk label and the entity.
Optionally, the method further includes:
determining a risk ratio based on the risk label;
and determining a clipping threshold value based on the relation clipping score corresponding to each relation under each relation category, the total number of the relations under each relation category, and the risk ratio.
Optionally, the method further includes:
determining risk evaluation indexes of user communities;
and determining a risk decision rule of each user community based on the risk evaluation index.
In a second aspect, an embodiment of the present application provides a method for identifying a risky user, where the method includes:
determining a user community to which a user to be identified belongs, wherein the user community is divided according to the dividing method of the user community;
and determining whether the user to be identified is a risk user or not based on the risk decision rule of the user community.
In a third aspect, an embodiment of the present application provides an apparatus for dividing a user community, the apparatus including:
the heterogeneous network acquisition module is used for acquiring a heterogeneous network to be processed, and the heterogeneous network to be processed is constructed based on the relationship between a user and an entity of a specified entity type;
the relation extraction module is used for extracting a target relation from the relation of the heterogeneous network to be processed based on the homorelation value probability of the relation to obtain the processed heterogeneous network, wherein the homorelation value probability is the probability that the relation of the same relation category owned by each two users has the same relation value;
and the community division module is used for carrying out community division on the processed heterogeneous network to obtain the user community.
Optionally, when the relationship extraction module extracts the target relationship from the relationship of the heterogeneous network to be processed based on the homorelation value probability of the relationship, the relationship extraction module is specifically configured to:
determining a relation clipping score corresponding to the relation based on the homorelation value probability of the relation;
and comparing the relation clipping score with the corresponding clipping score threshold, and determining a target relation from the relation.
Optionally, the apparatus further includes a heterogeneous network construction module, where the heterogeneous network construction module is configured to:
acquiring an entity from user data of a user;
acquiring a risk label of a user;
and constructing a heterogeneous network to be processed based on the risk label and the entity.
Optionally, the apparatus further includes a clipping threshold module, where the clipping threshold module is configured to:
determining a risk ratio based on the risk label;
and determining a clipping threshold value based on the relation clipping score corresponding to each relation under each relation category, the total number of the relations under each relation category, and the risk ratio.
Optionally, the apparatus further includes a risk decision rule determining module, where the risk decision rule determining module is configured to:
determining risk evaluation indexes of user communities;
and determining a risk decision rule of each user community based on the risk evaluation index.
In a fourth aspect, an embodiment of the present application provides an apparatus for identifying an at-risk user, where the apparatus includes:
the user community determining module is used for determining a user community to which the user to be identified belongs, wherein the user community is divided according to the dividing method of the user community;
and the risk identification module is used for determining whether the user to be identified is a risk user or not based on the risk decision rule of the user community.
In a fifth aspect, an embodiment of the present application provides an electronic device, including: a processor and a memory;
a memory for storing operating instructions;
a processor configured to perform the method as shown in any implementation of the first aspect or any implementation of the second aspect of the present application by calling an operation instruction.
In a sixth aspect, the present application provides a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the method shown in any of the embodiments of the first aspect or any of the embodiments of the second aspect of the present application.
The technical scheme provided by the embodiment of the application has the following beneficial effects:
according to the scheme provided by the embodiment of the application, the heterogeneous network to be processed is obtained, the target relation is extracted from the relation of the heterogeneous network to be processed based on the homorelation value probability of the relation, the heterogeneous network after processing is obtained, and then the heterogeneous network after processing is subjected to community division, so that the user community is obtained. Based on the scheme, the effective relation in the relation network can be extracted according to the homorelation value probability, the data scale of the relation network is reduced, the effect of community division is ensured, and the method and the system are favorable for risk prevention based on the divided user communities.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings used in the description of the embodiments of the present application will be briefly described below.
Fig. 1 is a flowchart illustrating a method for dividing a user community according to an embodiment of the present disclosure;
fig. 2 is a schematic flowchart of an identification method for a risky user according to an embodiment of the present disclosure;
fig. 3 is a schematic structural diagram of a device for dividing a user community according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of an identification apparatus for an at risk user according to an embodiment of the present disclosure;
fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary only for the purpose of explaining the present application and are not to be construed as limiting the present invention.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. As used herein, the term "and/or" includes all or any element and all combinations of one or more of the associated listed items.
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
With the rapid expansion of the business volume of financial institutions such as banks, the prevention of business risks becomes more severe, such as risk prevention and control of credit card application business, which directly affects the capital safety of financial institutions.
In the prior art, risk prevention in financial institutions mainly relies on scorecard models and business-rule strategies. The traditional scorecard model cannot effectively identify group fraud or intermediary packaging, while business-rule strategies perform poorly in risk prevention because rule updates iterate slowly.
If a relationship network is constructed from user data and community division is then performed on it, risk groups can be gathered into risk communities for risk identification and early warning. However, at present a relationship network is generally constructed from the full volume of user data, which makes the network large, degrades the effect of community division, and in turn degrades risk identification.
The method, the device and the electronic equipment for dividing the user community and identifying the risky user provided by the embodiment of the application aim to solve at least one of the above technical problems in the prior art.
The following describes the technical solutions of the present application and how to solve the above technical problems with specific embodiments. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments. Embodiments of the present application will be described below with reference to the accompanying drawings.
Fig. 1 is a flowchart illustrating a method for dividing a user community according to an embodiment of the present application, and as shown in fig. 1, the method mainly includes:
step S110: and acquiring a heterogeneous network to be processed, wherein the heterogeneous network to be processed is constructed based on the relationship between the user and the entity of the specified entity type.
The heterogeneous network to be processed can be a heterogeneous network constructed according to user data of the user, and the user data of the user can be obtained according to data submitted during service application and credit investigation data. Information such as user equipment, longitude and latitude, mobile phone numbers, emergency contacts, units and the like can be used as the types of the specified entities, and the heterogeneous network to be processed can be constructed according to the relationship between the user and the entities.
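As an illustrative, non-limiting sketch of step S110, the heterogeneous network may be represented as a list of user-to-entity edges. The record fields (`device`, `phone`, `employer`), the sample values and the function name `build_heterogeneous_network` are assumptions made for illustration only and are not specified by the application.

```python
from collections import defaultdict

# Hypothetical user records; field names and values are illustrative assumptions.
users = [
    {"user_id": "u1", "device": "d-01", "phone": "138-0001", "employer": "ACME"},
    {"user_id": "u2", "device": "d-01", "phone": "138-0002", "employer": "ACME"},
    {"user_id": "u3", "device": "d-02", "phone": "138-0003", "employer": "ACME"},
]

# Specified entity types (assumed subset of those named in the description).
ENTITY_TYPES = ("device", "phone", "employer")


def build_heterogeneous_network(records, entity_types=ENTITY_TYPES):
    """Return edges as (user_id, relation_category, entity_value) triples
    plus an index from (category, value) to the users attached to it."""
    edges = []
    entity_index = defaultdict(set)
    for rec in records:
        for category in entity_types:
            value = rec.get(category)
            if value is None:
                continue  # the user has no entity of this type
            edges.append((rec["user_id"], category, value))
            entity_index[(category, value)].add(rec["user_id"])
    return edges, entity_index


edges, entity_index = build_heterogeneous_network(users)
```

In this sketch two users are related under a relation category when they appear in the same `entity_index` entry, for example the two users sharing device `d-01`.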
The heterogeneous network to be processed may be very large. If community division is performed on it directly, an oversized community may form under certain relationship categories (for example, under the employment relationship category, the many employees of one unit form an oversized community), which degrades the efficiency of the clustering algorithm. To avoid oversized communities, relation clipping may be performed on the heterogeneous network.
Step S120: and extracting a target relation from the relation of the heterogeneous network to be processed based on the homorelation value probability of the relation to obtain the processed heterogeneous network, wherein the homorelation value probability is the probability that the relation of the same relation category owned by every two users has the same relation value.
The probability that the relationships owned by two users under the same relationship category in the heterogeneous network to be processed have the same relationship value, namely the homorelation value probability, can be calculated. A target relation is then extracted from the relations of the heterogeneous network to be processed based on the homorelation value probability to obtain the processed heterogeneous network; that is, the relations of the heterogeneous network to be processed are clipped so that the processed heterogeneous network contains only the target relations. Dividing communities on the processed heterogeneous network eliminates oversized communities that might otherwise exist and reduces the data scale of the relationship network.
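As an illustrative sketch only, the homorelation value probability may be estimated empirically. The estimator used here (the share of all user pairs that both hold value x under one relation category) is an assumption; the application does not state how the probability is estimated, and the function name and sample data are hypothetical.

```python
from collections import Counter
from math import comb


def homorelation_value_probability(values_by_user):
    """values_by_user maps user_id -> relation value under ONE relation category.

    Returns {x: p(x)}, the empirical share of user pairs that both hold value x.
    This estimator is an assumption made for illustration.
    """
    total_pairs = comb(len(values_by_user), 2)
    if total_pairs == 0:
        return {}
    counts = Counter(values_by_user.values())
    # comb(c, 2) is the number of user pairs sharing a value held by c users.
    return {x: comb(c, 2) / total_pairs for x, c in counts.items()}


# Hypothetical relation values under two relation categories.
employer = {"u1": "ACME", "u2": "ACME", "u3": "ACME", "u4": "Globex"}
device = {"u1": "d-01", "u2": "d-01", "u3": "d-02", "u4": "d-03"}

p_employer = homorelation_value_probability(employer)
p_device = homorelation_value_probability(device)
```

Under this assumed estimator, a value held by many users (a large employer) receives a high probability, whereas a value shared by only a few users (one device) receives a low probability.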
Step S130: and carrying out community division on the processed heterogeneous network to obtain a user community.
In the embodiment of the application, community division can be performed on the processed heterogeneous network through a clustering algorithm, for example the Louvain algorithm. By dividing the processed heterogeneous network into communities, high-risk users are grouped into the same communities as far as possible, so that the high-risk communities can be handled accordingly.
According to the method provided by the embodiment of the application, the heterogeneous network to be processed is obtained, the target relation is extracted from the relation of the heterogeneous network to be processed based on the homorelation value probability of the relation, the heterogeneous network after processing is obtained, and then the heterogeneous network after processing is subjected to community division, so that the user community is obtained. Based on the scheme, the effective relation in the relation network can be extracted according to the homorelation value probability, the data scale of the relation network is reduced, the effect of community division is ensured, and the method and the system are favorable for risk prevention based on the divided user communities.
In an optional manner of the embodiment of the present application, extracting a target relationship from a relationship of a heterogeneous network to be processed based on a homorelation value probability of the relationship includes:
determining a relation clipping score corresponding to the relation based on the homorelation value probability of the relation;
and comparing the relation clipping score with the corresponding clipping score threshold, and determining a target relation from the relation.
In the embodiment of the application, the relation clipping score corresponding to the relation can be determined based on the homorelation value probability of the relation.
In the embodiment of the application, clipping score thresholds can be determined separately for different relation categories, and the relation clipping score of each relation is compared with the clipping score threshold of its relation category, so that the target relations are extracted from the relations.
By comparing the relation clipping score with the clipping score threshold, relation edges are clipped, which reduces the number of relation edges and thus the data volume of the relationship network.
As an example, the relation clipping score may be calculated by the following formula:
Figure BDA0003176388760000071
wherein m is any relation category in the heterogeneous network to be processed; u_i and u_j are any two users in the heterogeneous network to be processed; the symbol shown in Figure BDA0003176388760000072 is the relationship value of relation a of user u_i, relation a belonging to relation category m; the symbol shown in Figure BDA0003176388760000073 is the relationship value of relation b of user u_j, relation b belonging to relation category m; p_m(x) is the probability that relation a of user u_i and relation b of user u_j have the same relationship value x; and the symbol shown in Figure BDA0003176388760000074 is the relation clipping score corresponding to relation a and relation b when user u_i and user u_j have relation a and relation b of the same relation category.
The relation clipping score calculated by the above formula is compared with the clipping score threshold, and a relation whose relation clipping score is not lower than the threshold is determined as a target relation. That is, when the relation clipping score is not lower than the clipping score threshold, the relation is a close relation between the user and the entity and can be extracted as a valid relation edge.
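The comparison described above may be sketched as follows. The scores and thresholds are hypothetical inputs (the application derives them from its own formulas), and the function name `extract_target_relations` is an assumption made for illustration.

```python
def extract_target_relations(scored_relations, thresholds):
    """Keep relations whose clipping score is not lower than the threshold
    of their relation category.

    scored_relations: list of (user_i, user_j, category, score) tuples.
    thresholds: dict mapping category -> clipping score threshold.
    """
    return [
        rel for rel in scored_relations
        if rel[3] >= thresholds[rel[2]]
    ]


# Hypothetical scores and thresholds, for illustration only.
scored = [
    ("u1", "u2", "device", 2.6),
    ("u1", "u2", "employer", 0.7),
    ("u2", "u3", "employer", 0.7),
    ("u3", "u4", "phone", 1.9),
]
thresholds = {"device": 1.0, "employer": 1.0, "phone": 2.0}

targets = extract_target_relations(scored, thresholds)
```

A relation whose score equals the threshold is retained, consistent with the "not lower than" condition stated above.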
In an optional manner of the embodiment of the present application, the method further includes:
acquiring an entity from user data of a user;
acquiring a risk label of a user;
and constructing a heterogeneous network to be processed based on the risk label and the entity.
In the embodiment of the application, users can be risk-labeled based on historical data, and the risk labels can be divided by stage: before the credit card application, during the credit card application, and after the credit card application.
In the embodiment of the application, the entity can be extracted from the user data of the user. In particular, the specified entity types may include contact information, unit information, family information, device information and geographic information, and the network may be built on entities such as device, unit, mailbox, address, account (transaction type), phone, and the like. Correspondingly, the relationship categories may include employment, emergency contact, mobile phone number, repayment account, mailbox, residence, and the like. A heterogeneous network can be constructed based on this information.
In actual use, when a heterogeneous network is constructed, the risk label can be referred to, so that the constructed heterogeneous network to be processed contains the risk label, and a foundation is provided for subsequent processing.
In an optional manner of the embodiment of the present application, the method further includes:
determining a risk ratio based on the risk label;
and determining a clipping threshold value based on the relation clipping score corresponding to each relation under each relation category, the total number of the relations under each relation category, and the risk ratio.
In the embodiment of the present application, the risk ratio is the ratio of users with risks in the total number of users, and the users with risks may be users marked by risk labels.
As an example, the clipping threshold may be obtained by:
Figure BDA0003176388760000081
wherein θ is the clipping threshold, C is the number of users with risk (C can be determined from the risk labels), and |V| is the total number of users.
s'(m) can be calculated by the following formula:
Figure BDA0003176388760000082
wherein u_i and u_j are any two users in the heterogeneous network to be processed, m is any one relation category in the heterogeneous network to be processed, M is the set of all relation categories, the quantity shown in Figure BDA0003176388760000083 is the sum of the relation clipping scores of any two users in the heterogeneous network to be processed under relation category m, and w is the total number of relations of any two users in the heterogeneous network to be processed under relation category m. s'(m) is thus the average relation clipping score of the relations under relation category m, and Σ_M s'(m) is the sum of the average relation clipping scores over all relation categories in the heterogeneous network to be processed.
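As an illustrative sketch, the quantities that enter the clipping threshold can be computed as follows: the risk ratio C / |V|, the average relation clipping score s'(m) of each relation category (taken here as the sum of scores divided by w, per the definitions above), and the sum of s'(m) over all categories. How these quantities are combined into θ is given by the formula in the figure, which is not reproduced here, so the sketch stops at the inputs. The function names and sample data are hypothetical.

```python
from collections import defaultdict


def risk_ratio(risk_labels):
    """risk_labels maps user_id -> True if the user carries a risk label.
    Returns C / |V|, the share of risk-labeled users."""
    if not risk_labels:
        return 0.0
    flagged = sum(1 for is_risky in risk_labels.values() if is_risky)
    return flagged / len(risk_labels)


def average_clipping_scores(scored_relations):
    """Return {category m: s'(m)}: the sum of clipping scores under m
    divided by w, the number of relations under m."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _, _, category, score in scored_relations:
        totals[category] += score
        counts[category] += 1
    return {m: totals[m] / counts[m] for m in totals}


# Hypothetical labels and scores, for illustration only.
labels = {"u1": True, "u2": False, "u3": False, "u4": False}
scored = [
    ("u1", "u2", "device", 3.0),
    ("u1", "u2", "employer", 1.0),
    ("u2", "u3", "employer", 2.0),
]

ratio = risk_ratio(labels)              # C / |V|
avg = average_clipping_scores(scored)   # s'(m) for each relation category m
total_avg = sum(avg.values())           # sum of s'(m) over all categories
```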
As an example, when community division is performed on the processed heterogeneous network, the specific steps are as follows:
(1) Initialization: assign each node to its own community.
(2) For each node i in the processed heterogeneous network, perform the following steps a and b:
a. for each neighbor node j of node i, calculate the modularity gain obtained by moving node i into the community of node j;
b. move node i to the community with the largest modularity gain.
(3) Iterate until the modularity gain no longer increases, thereby obtaining the user communities after community division.
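The steps above can be sketched as the local-moving phase of the Louvain algorithm on a user-to-user graph. This is a minimal sketch under stated assumptions: the graph is undirected with no self-loops, node identifiers are mutually sortable, a node moves only when the gain strictly improves, and the aggregation phase of the full Louvain algorithm is omitted. The function name `louvain_local_moves` is hypothetical.

```python
from collections import defaultdict


def louvain_local_moves(edges):
    """Local-moving phase of Louvain on an undirected weighted graph.

    edges: list of (node_a, node_b, weight). Returns {node: community_id}.
    """
    adj = defaultdict(dict)
    for a, b, w in edges:
        adj[a][b] = adj[a].get(b, 0.0) + w
        adj[b][a] = adj[b].get(a, 0.0) + w
    degree = {n: sum(nbrs.values()) for n, nbrs in adj.items()}
    two_m = sum(degree.values())              # twice the total edge weight

    community = {n: n for n in adj}           # (1) each node starts alone
    comm_degree = dict(degree)                # total degree per community

    improved = True
    while improved:                           # (3) repeat until no move helps
        improved = False
        for i in sorted(adj):                 # (2) visit every node
            current = community[i]
            comm_degree[current] -= degree[i]  # take i out of its community
            links = defaultdict(float)         # weight from i to each community
            for j, w in adj[i].items():
                if j != i:
                    links[community[j]] += w
            # (a) modularity gain (up to a constant factor) per candidate
            best = current
            best_gain = links.get(current, 0.0) - comm_degree[current] * degree[i] / two_m
            for c, k_in in sorted(links.items()):
                gain = k_in - comm_degree[c] * degree[i] / two_m
                if gain > best_gain:
                    best, best_gain = c, gain
            # (b) move i to the community with the largest gain
            community[i] = best
            comm_degree[best] += degree[i]
            if best != current:
                improved = True
    return community


# Two triangles joined by a single edge (hypothetical user-to-user graph).
graph = [
    ("a", "b", 1.0), ("b", "c", 1.0), ("a", "c", 1.0),
    ("d", "e", 1.0), ("e", "f", 1.0), ("d", "f", 1.0),
    ("c", "d", 1.0),
]
communities = louvain_local_moves(graph)
```

On this toy graph each triangle ends up in its own community, which matches the intuition that densely connected users are grouped together.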
In an optional manner of the embodiment of the present application, the method further includes:
determining risk evaluation indexes of user communities;
and determining a risk decision rule of each user community based on the risk evaluation index.
In the embodiment of the application, the risk assessment indexes of a user community can include indexes such as the proportion of bad cases in the community, the time density of applications submitted by the community, and the number of community members. The risk assessment indexes can represent the risk conditions in the user communities, so that the risk decision rules of the user communities can be determined according to the risk assessment indexes.
As an example, value intervals of the different risk assessment indexes may each be associated with a risk decision sub-rule; the sub-rules matched by the values of a user community's risk assessment indexes are then collected as the risk decision rule of that user community.
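The association between value intervals and sub-rules may be sketched as a lookup table. The index names (`bad_ratio`, `community_size`), the interval boundaries and the action labels are hypothetical; the application does not fix concrete values.

```python
# Hypothetical sub-rules: each index maps half-open intervals [low, high)
# to an action label. All names and boundaries are illustrative assumptions.
SUB_RULES = {
    "bad_ratio": [
        ((0.0, 0.05), "pass"),
        ((0.05, 0.20), "manual_review"),
        ((0.20, 1.01), "reject"),
    ],
    "community_size": [
        ((0, 10), "pass"),
        ((10, 50), "manual_review"),
        ((50, float("inf")), "reject"),
    ],
}


def decision_rule_for_community(indicators, sub_rules=SUB_RULES):
    """Collect, for each index, the sub-rule whose interval contains the
    community's value; the collected set is the community's decision rule."""
    rule = {}
    for name, value in indicators.items():
        for (low, high), action in sub_rules.get(name, []):
            if low <= value < high:
                rule[name] = action
                break
    return rule


community_indicators = {"bad_ratio": 0.12, "community_size": 64}
rule = decision_rule_for_community(community_indicators)
```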
In practical application, the scheme provided by the embodiment of the application can be applied to an approval link before business occurs, such as a pre-credit approval link, an application anti-fraud link before credit card issuance, and the like. The scheme is mainly used for identifying gang (group) fraud and packaged applications; it can effectively determine high-risk communities and carry out risk identification before the business occurs.
Fig. 2 shows a schematic flowchart of a method for identifying a risky user according to an embodiment of the present application, and as shown in fig. 2, the method mainly includes:
step S210: and determining a user community to which the user to be identified belongs, wherein the user community is divided according to the user community dividing method.
The user to be identified may be a user requiring risk identification, and the user community to which that user belongs may be obtained by applying the above-mentioned method for dividing the user community.
Step S220: and determining whether the user to be identified is a risk user or not based on the risk decision rule of the user community.
In the embodiment of the application, after the user community to which the user to be identified belongs is determined, risk identification can be performed on the user to be identified by adopting the risk decision rule of the user community, so that whether the user to be identified is a risk user is determined.
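Steps S210 and S220 may be sketched as a community lookup followed by application of that community's decision rule. The action labels, their severity ordering, the `risky_from` parameter and the sample mappings are hypothetical assumptions made for illustration.

```python
# Hypothetical severity ordering of action labels.
SEVERITY = {"pass": 0, "manual_review": 1, "reject": 2}


def is_risk_user(user_id, community_of, rule_of, risky_from="manual_review"):
    """Step S210: look up the community of the user to be identified.
    Step S220: apply that community's risk decision rule.

    Returns None when the user belongs to no known community.
    """
    community = community_of.get(user_id)
    if community is None:
        return None
    actions = rule_of.get(community, {}).values()
    worst = max((SEVERITY[a] for a in actions), default=0)
    return worst >= SEVERITY[risky_from]


# Hypothetical community membership and per-community decision rules.
community_of = {"u1": "c1", "u2": "c1", "u3": "c2"}
rule_of = {
    "c1": {"bad_ratio": "manual_review", "community_size": "reject"},
    "c2": {"bad_ratio": "pass", "community_size": "pass"},
}

u1_is_risky = is_risk_user("u1", community_of, rule_of)
u3_is_risky = is_risk_user("u3", community_of, rule_of)
```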
According to the method provided by the embodiment of the application, whether the user to be identified is a risk user is determined based on the risk decision rule of the user community by determining the user community to which the user to be identified belongs. Based on the scheme, when the user community is divided, the relation in the relation network can be extracted through the probability of the same relation value, so that the data scale of the relation network is reduced, the divided user community has a good effect, risk identification is carried out on the user to be identified through the risk decision rule of the user community, and the risk prevention effect is guaranteed.
Based on the same principle as the method shown in fig. 1, fig. 3 shows a schematic structural diagram of a dividing apparatus for a user community provided by an embodiment of the present application, and as shown in fig. 3, the dividing apparatus 30 for a user community may include:
a heterogeneous network obtaining module 310, configured to obtain a heterogeneous network to be processed, where the heterogeneous network to be processed is constructed based on a relationship between a user and an entity of a specified entity type;
the relationship extraction module 320 is configured to extract a target relationship from the relationships of the heterogeneous network to be processed based on the homorelation value probability of the relationships, so as to obtain the processed heterogeneous network, where the homorelation value probability is the probability that relationships of the same relationship category owned by any two users have the same relationship value;
and the community dividing module 330 is configured to perform community division on the processed heterogeneous network to obtain a user community.
The device provided by the embodiment of the application acquires the heterogeneous network to be processed, extracts the target relation from the relations of that network based on the homorelation value probability of the relations to obtain the processed heterogeneous network, and then performs community division on the processed heterogeneous network to obtain the user communities. With this scheme, effective relations in the relation network can be extracted according to the homorelation value probability, which reduces the data scale of the relation network, preserves the quality of the community division, and facilitates risk prevention based on the divided user communities.
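The three modules can be sketched end to end under several assumptions that are not taken from the application: each user has at most one value per relation category, the homorelation value probability of a category is estimated as the share of user pairs holding the same value, categories whose probability exceeds a hypothetical cutoff `max_probability` are pruned, and connected components stand in for whatever community division algorithm is actually used.

```python
from collections import Counter, defaultdict
from math import comb

# Each relation is (user, relation category, relation value); toy data only.
relations = [
    ("u1", "phone", "p1"), ("u2", "phone", "p1"),
    ("u3", "phone", "p2"), ("u4", "phone", "p3"),
    ("u1", "city", "c1"), ("u2", "city", "c1"),
    ("u3", "city", "c1"), ("u4", "city", "c1"),
]


def same_value_probability(relations, category):
    """Share of user pairs whose relation of this category has the same value."""
    values = [v for _, c, v in relations if c == category]
    total_pairs = comb(len(values), 2)
    if total_pairs == 0:
        return 0.0
    same_pairs = sum(comb(n, 2) for n in Counter(values).values())
    return same_pairs / total_pairs


def extract_target_relations(relations, max_probability):
    """Assumed pruning rule: drop categories that almost everyone shares."""
    categories = {c for _, c, _ in relations}
    keep = {c for c in categories
            if same_value_probability(relations, c) <= max_probability}
    return [r for r in relations if r[1] in keep]


def divide_communities(relations):
    """Connected components over user-entity edges, via union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for user, category, value in relations:
        root_entity = find(("entity", category, value))
        parent[find(("user", user))] = root_entity

    groups = defaultdict(set)
    for node in list(parent):
        if node[0] == "user":
            groups[find(node)].add(node[1])
    return sorted(sorted(g) for g in groups.values())
```

On the toy data, the shared city links all four users into a single community, whereas pruning it with `max_probability=0.5` leaves only the phone relations and separates `u1` and `u2` from the rest.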
Optionally, when the relationship extraction module extracts the target relationship from the relationship of the heterogeneous network to be processed based on the homorelation value probability of the relationship, the relationship extraction module is specifically configured to:
determining a relation clipping score corresponding to the relation based on the homorelation value probability of the relation;
and comparing the relation clipping score with the corresponding clipping score threshold, and determining the target relation from the relations.
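The two sub-steps above can be sketched as follows. The scoring function `1 - p` and the direction of the comparison (keep a relation when its score is at least the threshold) are placeholders chosen for illustration; this passage does not give the scoring formula or the comparison direction, and all names are hypothetical.

```python
from typing import Dict, List, Tuple

Relation = Tuple[str, str, str]  # (user, relation category, relation value)


def clipping_score(same_value_probability: float) -> float:
    # Placeholder scoring, not the application's formula:
    # the rarer a shared value is, the higher the score.
    return 1.0 - same_value_probability


def select_target_relations(
    relations: List[Relation],
    probability_by_category: Dict[str, float],
    threshold_by_category: Dict[str, float],
) -> List[Relation]:
    """Keep relations whose score reaches the threshold of their category."""
    targets = []
    for relation in relations:
        category = relation[1]
        score = clipping_score(probability_by_category[category])
        # Assumed direction: a relation is kept when score >= threshold.
        if score >= threshold_by_category[category]:
            targets.append(relation)
    return targets


relations = [("u1", "phone", "p1"), ("u2", "phone", "p1"), ("u1", "city", "c1")]
probabilities = {"phone": 0.1, "city": 0.9}
thresholds = {"phone": 0.5, "city": 0.5}
targets = select_target_relations(relations, probabilities, thresholds)
```

Keeping one threshold per relation category lets each category be pruned independently.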
Optionally, the apparatus further includes a heterogeneous network construction module, where the heterogeneous network construction module is configured to:
acquiring an entity from user data of a user;
acquiring a risk label of a user;
and constructing a heterogeneous network to be processed based on the risk label and the entity.
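A minimal sketch of the construction step, assuming user data arrives as one attribute dictionary per user and that the specified entity types are given as a tuple of attribute names. The function name, the node and edge layout, and the boolean form of the risk label are illustrative assumptions, not the application's data model.

```python
def build_heterogeneous_network(user_data, risk_labels, entity_types):
    """Return (nodes, edges) of a user-entity network.

    nodes maps a node key to its attributes; user nodes carry the risk label.
    edges is a list of (user node, entity node) pairs.
    """
    nodes = {}
    edges = []
    for user, attributes in user_data.items():
        user_node = ("user", user)
        # Assumption: users without a label are treated as not risky.
        nodes[user_node] = {"risk": risk_labels.get(user, False)}
        for entity_type in entity_types:
            value = attributes.get(entity_type)
            if value is None:
                continue  # this user has no entity of this type
            entity_node = (entity_type, value)
            nodes.setdefault(entity_node, {})
            edges.append((user_node, entity_node))
    return nodes, edges


user_data = {
    "u1": {"phone": "p1", "device": "d1", "name": "x"},
    "u2": {"phone": "p1"},
}
risk_labels = {"u1": True}
nodes, edges = build_heterogeneous_network(user_data, risk_labels, ("phone", "device"))
```

Attributes that are not specified entity types, such as `name` here, are ignored, so only the chosen entity types create edges between users.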
Optionally, the apparatus further includes a clipping threshold module, where the clipping threshold module is configured to:
determining a risk ratio based on the risk label;
and determining the clipping score threshold based on the risk ratio, the relation clipping score corresponding to each relation under each relation category, and the total number of relations under each relation category.
Optionally, the apparatus further includes a risk decision rule determining module, where the risk decision rule determining module is configured to:
determining risk evaluation indexes of user communities;
and determining a risk decision rule of each user community based on the risk evaluation index.
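As one possible illustration, the risk evaluation index of a community could be the share of its members that carry a risk label, and the decision rule could flag a community whose share reaches a cutoff. Both the index and the cutoff are assumptions made for this sketch; this passage does not fix a particular index or rule.

```python
def community_risk_ratio(communities, risk_labels):
    """Hypothetical index: fraction of risk-labelled users per community."""
    return {
        cid: sum(1 for u in members if risk_labels.get(u, False)) / len(members)
        for cid, members in communities.items()
        if members
    }


def decision_rules(ratios, cutoff):
    """Hypothetical rule: a community is high risk when its ratio >= cutoff."""
    return {cid: ratio >= cutoff for cid, ratio in ratios.items()}


communities = {0: ["u1", "u2", "u3", "u4"], 1: ["u5", "u6"]}
risk_labels = {"u1": True, "u2": True, "u3": True}
ratios = community_risk_ratio(communities, risk_labels)
rules = decision_rules(ratios, cutoff=0.5)
```

Other indexes, such as community size or edge density, could be combined in the same way; the per-community dictionary keeps each rule addressable by community identifier.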
It is understood that the above modules of the dividing apparatus of a user community in the present embodiment have functions of implementing the respective steps of the dividing method of a user community in the embodiment shown in fig. 1. The function can be realized by hardware, and can also be realized by executing corresponding software by hardware. The hardware or software includes one or more modules corresponding to the functions described above. The modules can be software and/or hardware, and each module can be implemented independently or by integrating a plurality of modules. For the functional description of each module of the apparatus for dividing a user community, reference may be specifically made to the corresponding description of the method for dividing a user community in the embodiment shown in fig. 1, and details are not repeated here.
Based on the same principle as the method shown in fig. 2, fig. 4 shows a schematic structural diagram of an apparatus for identifying an at risk user provided by an embodiment of the present application, and as shown in fig. 4, the apparatus for identifying an at risk user 40 may include:
a user community determining module 410, configured to determine a user community to which the user to be identified belongs, where the user community is divided according to the dividing method of the user community;
and the risk identification module 420 is used for determining whether the user to be identified is a risk user based on the risk decision rule of the user community.
The device provided by the embodiment of the application determines the user community to which the user to be identified belongs, and then determines whether that user is a risk user based on the risk decision rule of the user community. With this scheme, relations in the relation network can be extracted according to the homorelation value probability when the user communities are divided, which reduces the data scale of the relation network and keeps the community division effective; risk identification is then performed on the user to be identified through the risk decision rule of the user community, which helps ensure the risk prevention effect.
It is understood that the above modules of the identification apparatus of the risky user in the embodiment have functions of implementing the corresponding steps of the identification method of the risky user in the embodiment shown in fig. 2. The function can be realized by hardware, and can also be realized by executing corresponding software by hardware. The hardware or software includes one or more modules corresponding to the functions described above. The modules can be software and/or hardware, and each module can be implemented independently or by integrating a plurality of modules. For the functional description of each module of the identification apparatus for the risky user, reference may be specifically made to the corresponding description of the identification method for the risky user in the embodiment shown in fig. 2, and details are not repeated here.
The embodiment of the application provides an electronic device, which comprises a processor and a memory;
a memory for storing operating instructions;
and the processor is used for executing the method provided by any embodiment of the application by calling the operation instruction.
As an example, Fig. 5 shows a schematic structural diagram of an electronic device to which an embodiment of the present application is applicable. As shown in Fig. 5, the electronic device 2000 includes a processor 2001 and a memory 2003. The processor 2001 is coupled to the memory 2003, for example via a bus 2002. Optionally, the electronic device 2000 may also include a transceiver 2004. It should be noted that, in practical applications, the number of transceivers 2004 is not limited to one, and the structure of the electronic device 2000 does not constitute a limitation on the embodiments of the present application.
The processor 2001 is used in the embodiments of the present application to implement the methods shown in the above method embodiments. The transceiver 2004 may include a receiver and a transmitter, and is used in the embodiments of the present application to enable the electronic device to communicate with other devices.
The processor 2001 may be a CPU (Central Processing Unit), a general-purpose processor, a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or perform the various illustrative logical blocks, modules, and circuits described in connection with the disclosure of the present application. The processor 2001 may also be a combination that implements computing functions, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
Bus 2002 may include a path that conveys information between the aforementioned components. The bus 2002 may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus 2002 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in FIG. 5, but this is not intended to represent only one bus or type of bus.
The memory 2003 may be a ROM (Read Only Memory) or another type of static storage device that can store static information and instructions, a RAM (Random Access Memory) or another type of dynamic storage device that can store information and instructions, an EEPROM (Electrically Erasable Programmable Read Only Memory), a CD-ROM (Compact Disc Read Only Memory) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited to these.
Optionally, the memory 2003 is used to store the application program code for executing the scheme of the present application, and execution of that code is controlled by the processor 2001. The processor 2001 is used to execute the application program code stored in the memory 2003 to implement the methods provided in any of the embodiments of the present application.
The electronic device provided by the embodiment of the application is applicable to any embodiment of the method, and is not described herein again.
Compared with the prior art, the electronic device acquires the heterogeneous network to be processed, extracts the target relation from the relations of that network based on the homorelation value probability of the relations to obtain the processed heterogeneous network, and then performs community division on the processed heterogeneous network to obtain the user communities. With this scheme, effective relations in the relation network can be extracted according to the homorelation value probability, which reduces the data scale of the relation network, preserves the quality of the community division, and facilitates risk prevention based on the divided user communities.
The present application provides a computer-readable storage medium, on which a computer program is stored, and when the computer program is executed by a processor, the computer program implements the method shown in the above method embodiments.
The computer-readable storage medium provided in the embodiments of the present application is applicable to any of the embodiments of the foregoing method, and is not described herein again.
Compared with the prior art, with the computer-readable storage medium provided by the embodiments of the present application, the heterogeneous network to be processed is acquired, the target relation is extracted from the relations of that network based on the homorelation value probability of the relations to obtain the processed heterogeneous network, and community division is then performed on the processed heterogeneous network to obtain the user communities. With this scheme, effective relations in the relation network can be extracted according to the homorelation value probability, which reduces the data scale of the relation network, preserves the quality of the community division, and facilitates risk prevention based on the divided user communities.
It should be understood that, although the steps in the flowcharts of the figures are shown in the order indicated by the arrows, the steps are not necessarily performed in that order. Unless explicitly stated herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed in sequence but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
The foregoing describes only some embodiments of the present invention. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present invention, and these improvements and refinements should also be regarded as falling within the protection scope of the present invention.

Claims (10)

1. A method for dividing a community of users, comprising:
acquiring a heterogeneous network to be processed, wherein the heterogeneous network to be processed is constructed based on the relationship between a user and an entity of a specified entity type;
extracting a target relation from the relations of the heterogeneous network to be processed based on the homorelation value probability of the relations to obtain the processed heterogeneous network, wherein the homorelation value probability is the probability that relations of the same relation category owned by any two users have the same relation value;
and carrying out community division on the processed heterogeneous network to obtain a user community.
2. The method of claim 1, wherein extracting a target relationship from the relationships of the heterogeneous network to be processed based on the homorelation value probability of the relationships comprises:
determining a relation clipping score corresponding to the relation based on the homorelation value probability of the relation;
and comparing the relation clipping score with the corresponding clipping score threshold, and determining the target relation from the relations.
3. The method of claim 2, further comprising:
obtaining the entity from user data of the user;
acquiring a risk label of the user;
and constructing the heterogeneous network to be processed based on the risk label and the entity.
4. The method of claim 3, further comprising:
determining a risk ratio based on the risk label;
and determining the clipping score threshold based on the risk ratio, the relationship clipping score corresponding to each relationship under each relationship category, and the total number of relationships under each relationship category.
5. The method according to any one of claims 1-4, further comprising:
determining a risk assessment index of each user community;
and determining a risk decision rule of each user community based on the risk assessment index.
6. A method for identifying an at-risk user, comprising:
determining a user community to which a user to be identified belongs, wherein the user community is divided according to the dividing method of the user community in any one of claims 1-5;
determining whether the user to be identified is a risky user based on a risk decision rule of the community of users.
7. An apparatus for dividing a community of users, comprising:
the heterogeneous network acquisition module is used for acquiring a heterogeneous network to be processed, and the heterogeneous network to be processed is constructed based on the relationship between a user and an entity of a specified entity type;
a relationship extraction module, configured to extract a target relationship from the relationships of the heterogeneous network to be processed based on a homorelation value probability of the relationships, to obtain a processed heterogeneous network, where the homorelation value probability is the probability that relationships of the same relationship category owned by any two users have the same relationship value;
and the community dividing module is used for carrying out community division on the processed heterogeneous network to obtain a user community.
8. An apparatus for identifying an at-risk user, comprising:
a user community determining module, configured to determine a user community to which a user to be identified belongs, where the user community is divided according to the user community dividing method of any one of claims 1 to 5;
and the risk identification module is used for determining whether the user to be identified is a risk user or not based on a risk decision rule of the user community.
9. An electronic device comprising a processor and a memory;
the memory is used for storing operation instructions;
the processor is used for executing the method of any one of claims 1-6 by calling the operation instruction.
10. A computer-readable storage medium, characterized in that the storage medium has stored thereon a computer program which, when being executed by a processor, carries out the method of any one of claims 1-6.
CN202110833619.0A 2021-07-23 2021-07-23 User community division and risk user identification method and device and electronic equipment Active CN113554308B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110833619.0A CN113554308B (en) 2021-07-23 2021-07-23 User community division and risk user identification method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110833619.0A CN113554308B (en) 2021-07-23 2021-07-23 User community division and risk user identification method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN113554308A true CN113554308A (en) 2021-10-26
CN113554308B CN113554308B (en) 2024-05-28

Family

ID=78104224

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110833619.0A Active CN113554308B (en) 2021-07-23 2021-07-23 User community division and risk user identification method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN113554308B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103166988A (en) * 2011-12-13 2013-06-19 深圳市腾讯计算机系统有限公司 Method and device for dividing user groups in social network site (SNS) community
CN103778186A (en) * 2013-12-31 2014-05-07 南京财经大学 Method for detecting sockpuppet
CN103824115A (en) * 2014-02-28 2014-05-28 中国科学院计算技术研究所 Open-network-knowledge-base-oriented between-entity relationship deduction method and system
CN108228706A (en) * 2017-11-23 2018-06-29 中国银联股份有限公司 For identifying the method and apparatus of abnormal transaction corporations
CN108319727A (en) * 2018-03-01 2018-07-24 南开大学 A method of any two points shortest path in social networks is found based on community structure
US20190132224A1 (en) * 2017-10-26 2019-05-02 Accenture Global Solutions Limited Systems and methods for identifying and mitigating outlier network activity
CN110177094A (en) * 2019-05-22 2019-08-27 武汉斗鱼网络科技有限公司 A kind of user community recognition methods, device, electronic equipment and storage medium
CN110297912A (en) * 2019-05-20 2019-10-01 平安科技(深圳)有限公司 Cheat recognition methods, device, equipment and computer readable storage medium
CN110457404A (en) * 2019-08-19 2019-11-15 电子科技大学 Social media account-classification method based on complex heterogeneous network
CN110674290A (en) * 2019-08-09 2020-01-10 国家计算机网络与信息安全管理中心 Relationship prediction method, device and storage medium for overlapping community discovery
CN112165401A (en) * 2020-09-28 2021-01-01 长春工业大学 Edge community discovery algorithm based on network pruning and local community expansion
CN113111133A (en) * 2021-04-09 2021-07-13 北京沃东天骏信息技术有限公司 User classification method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ANUBHUTI GARG 等: ""Extended Core-based Community Detection for Directed Networks"", 《2017 INTERNATIONAL CONFERENCE ON COMPUTER, INFORMATION AND TELECOMMUNICATION SYSTEMS (CITS)》, 14 September 2017 (2017-09-14), pages 1 - 5 *
王雪 等: ""混合隶属度对股票复杂网络社团划分的信息提示功能研究"", 《情报科学》, 6 July 2018 (2018-07-06), pages 113 - 119 *

Also Published As

Publication number Publication date
CN113554308B (en) 2024-05-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant