CN111598713A

CN111598713A - Cluster recognition method and device based on similarity weight updating and electronic equipment

Info

Publication number: CN111598713A
Application number: CN202010724429.0A
Authority: CN
Inventors: 宋孟楠; 苏绥绥
Original assignee: Beijing Qiyu Information Technology Co Ltd
Current assignee: Beijing Qiyu Information Technology Co Ltd
Priority date: 2020-07-24
Filing date: 2020-07-24
Publication date: 2020-08-28
Anticipated expiration: 2040-07-24
Also published as: CN111598713B

Abstract

The invention discloses a method, a device and electronic equipment for gang identification based on similarity weight updating. The method comprises the following steps: acquiring a black seed user sub-relationship graph according to the social relationship network of black seed users; updating the weight of each observation dimension Si according to the relationship between the observation dimensions Si in the black seed user sub-relationship graph; determining the similarity of the users in the black seed user sub-relationship graph according to the updated weights of the observation dimensions Si; and determining risk groups according to the similarity of the users. Because the weights of the observation dimensions Si in the black seed user sub-relationship graph are updated in real time before the similarity between users is determined, the accuracy of the similarity calculation in gang identification is improved, so that risk groups can be identified promptly and accurately, business risk-control requirements are met, and the economic loss of enterprises is reduced.

Description

Cluster recognition method and device based on similarity weight updating and electronic equipment

Technical Field

The invention relates to the technical field of computer information processing, in particular to a method and a device for identifying a group based on similarity weight updating, electronic equipment and a computer readable medium.

Background

With the rapid development of the internet and the popularization of intelligent terminals, people can handle many services, such as online shopping, online transfers and online loans, over the network without leaving home. Meanwhile, in pursuit of profit, lawbreakers increasingly form gangs and commit fraud by forging other persons' information.

Group fraud causes greater economic loss to internet enterprises than personal fraud, and therefore how to identify and avoid group fraud so as to reduce economic loss is a problem to be solved urgently by internet enterprises.

Disclosure of Invention

The invention aims to solve the technical problem that the existing network technology cannot intelligently, quickly and accurately identify the group cheating behavior.

In order to solve the above technical problem, a first aspect of the present invention provides a method for group identification based on similarity weight update, where the method includes:

acquiring a black seed user sub-relationship graph according to the social relationship network of black seed users;

updating the weight of the observation dimension Si according to the relationship between the observation dimensions Si in the black seed user sub-relationship graph;

determining the similarity of the users in the black seed user sub-relationship graph according to the updated weight of the observation dimension Si;

determining risk groups according to the similarity of the users;

wherein i is a natural number.

According to a preferred embodiment of the present invention, updating the weight of the observation dimension Si according to the relationship between the observation dimensions Si in the black seed user sub-relationship graph includes:

generating an n-order similarity matrix Di according to the relation between the user observation dimensions Si;

performing spectral clustering on the similarity matrix Di to obtain a final similarity matrix D;

determining the number of the shared edges according to the final similarity matrix D;

updating the weight of Si according to the number of shared edges until the objective function meets the condition;

wherein n is the number of users contained in the black seed user sub-relationship graph.

According to a preferred embodiment of the present invention, the final similarity matrix D is obtained by the following formula:

D=PiDi+T；

wherein Pi is a random value and T is a predetermined matrix.

According to a preferred embodiment of the present invention, determining the similarity of the users in the black seed user sub-relationship graph according to the updated weight of the observation dimension Si includes:

determining a user observation dimension value ri in the black seed user sub-relationship graph;

and determining the similarity of the users in the black seed user sub-relationship graph according to the user observation dimension value ri and the weight of the corresponding updated observation dimension Si.

According to a preferred embodiment of the present invention, obtaining the black seed user sub-relationship graph according to the social relationship network of black seed users includes:

determining black seed users according to user historical data;

diffusing the contacts of the black seed users according to the group size to obtain a diffusion relation graph;

and segmenting the diffusion relation graph to obtain a black seed user sub-relation graph.

According to a preferred embodiment of the invention, the observation dimensions comprise at least one of: the home location of the user ID number, the operating system of the device used by the user, and the longitude and latitude of the position where the user is located.

In order to solve the above technical problem, a second aspect of the present invention provides a group identification apparatus based on similarity weight updating, the apparatus comprising:

the acquisition module is used for acquiring a black seed user sub-relationship graph according to the social relationship network of black seed users;

the updating module is used for updating the weight of the observation dimension Si according to the relationship between the observation dimensions Si in the black seed user sub-relationship graph;

the first determining module is used for determining the similarity of users in the black seed user sub-relationship graph according to the updated weight of the observation dimension Si;

the second determining module is used for determining risk groups according to the user similarity;

wherein i is a natural number.

According to a preferred embodiment of the present invention, the update module includes:

the generating module is used for generating an n-order similarity matrix Di according to the relation between the user observation dimensions Si;

the clustering module is used for carrying out spectral clustering on the similarity matrix Di to obtain a final similarity matrix D;

a sub-determination module, configured to determine the number of shared edges according to the final similarity matrix D;

the sub-updating module is used for updating the Si weight according to the number of shared edges until the objective function meets the condition;

According to a preferred embodiment of the present invention, the clustering module obtains a final similarity matrix D by the following formula:

D=PiDi+T；

wherein Pi is a random value and T is a predetermined matrix.

According to a preferred embodiment of the present invention, the first determining module includes:

the first sub-determination module is used for determining a user observation dimension value ri in the black seed user sub-relationship graph;

and the second sub-determining module is used for determining the similarity of the users in the black seed user sub-relationship graph according to the user observation dimension value ri and the weight of the corresponding updated observation dimension Si.

According to a preferred embodiment of the present invention, the obtaining module includes:

the third sub-determining module is used for determining black seed users according to the historical data of the users;

the diffusion module is used for diffusing the contacts of the black seed user according to the group size to obtain a diffusion relation graph;

and the segmentation module is used for segmenting the diffusion relation graph to obtain a black seed user sub-relation graph.

To solve the above technical problem, a third aspect of the present invention provides an electronic device, comprising:

a processor; and

a memory storing computer executable instructions that, when executed, cause the processor to perform the method described above.

In order to solve the above technical problem, a fourth aspect of the present invention proposes a computer-readable storage medium, wherein the computer-readable storage medium stores one or more programs that, when executed by a processor, implement the above method.

In the invention, a black seed user sub-relationship graph formed by closely related users among the contacts of black seed users is first acquired according to the social relationship network; the weight of each observation dimension Si is updated according to the relationship between the observation dimensions Si in the black seed user sub-relationship graph; the similarity of every two users in the black seed user sub-relationship graph is determined according to the updated weights of the observation dimensions Si; and risk groups are then determined according to the user similarity. Because the weights of the observation dimensions Si are updated in real time before the similarity between users is determined, the accuracy of the similarity calculation in gang identification is improved, so that risk groups can be identified promptly and accurately, business risk-control requirements are met, and the economic loss of enterprises is reduced.

Drawings

In order to make the technical problems solved by the present invention, the technical means adopted and the technical effects obtained clearer, embodiments of the present invention are described in detail below with reference to the accompanying drawings. It should be noted, however, that the drawings described below only illustrate exemplary embodiments of the invention, from which those skilled in the art can derive other embodiments without inventive effort.

Fig. 1 is a schematic flow chart of a group identification method based on similarity weight updating according to the present invention;

Fig. 2 is a schematic flow chart of updating the weights of the observation dimensions Si according to the relationship between the observation dimensions Si in the black seed user sub-relationship graph according to the present invention;

Fig. 3 is a schematic illustration of determining shared edges according to the present invention;

Fig. 4 is a schematic structural framework diagram of a group identification device based on similarity weight updating according to the present invention;

Fig. 5 is a block diagram of an exemplary embodiment of an electronic device in accordance with the present invention;

Fig. 6 is a diagrammatic representation of one embodiment of a computer-readable medium of the present invention.

Detailed Description

Exemplary embodiments of the present invention will now be described more fully with reference to the accompanying drawings. The exemplary embodiments may, however, be embodied in many specific forms and should not be construed as limited to the embodiments set forth herein. Rather, these exemplary embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the invention to those skilled in the art.

The structures, properties, effects or other characteristics described in a certain embodiment may be combined in any suitable manner in one or more other embodiments, while still complying with the technical idea of the invention.

In describing particular embodiments, specific details of structures, properties, effects, or other features are set forth in order to provide a thorough understanding of the embodiments by one skilled in the art. However, it is not excluded that a person skilled in the art may implement the invention in a specific case without the above-described structures, performances, effects or other features.

The flow chart in the drawings is only an exemplary flow demonstration, and does not represent that all the contents, operations and steps in the flow chart are necessarily included in the scheme of the invention, nor does it represent that the execution is necessarily performed in the order shown in the drawings. For example, some operations/steps in the flowcharts may be divided, some operations/steps may be combined or partially combined, and the like, and the execution order shown in the flowcharts may be changed according to actual situations without departing from the gist of the present invention.

The block diagrams in the figures generally represent functional entities and do not necessarily correspond to physically separate entities. I.e. these functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor means and/or microcontroller means.

The same reference numerals denote the same or similar elements, components, or parts throughout the drawings, and thus a repetitive description thereof may be omitted hereinafter. It will be further understood that, although the terms first, second, third, etc. may be used herein to describe various elements, components, or sections, these elements, components, or sections should not be limited by these terms. That is, these terms are used only to distinguish one from another. For example, a first device may also be referred to as a second device without departing from the spirit of the present invention. Furthermore, the term "and/or" is intended to include all combinations of any one or more of the listed items.

To address the group fraud that internet enterprises currently face, the invention identifies risk groups by combining the specific scenario characteristics of internet services and provides the identification result to the staff of the internet enterprise. The staff can then handle resource applications from the persons concerned by rejecting the application (for example, rejecting the resource request) or adding manual review, so as to reduce the risk of economic loss to the internet enterprise.

Fig. 1 is a flowchart of a method for group identification based on similarity weight updating according to the present invention. As shown in Fig. 1, the method includes:

S1, acquiring a black seed user sub-relationship graph according to the social relationship network of black seed users;

Illustratively, this step includes:

S11, determining black seed users according to the user historical data;

In the invention, a black seed user is a user with a record of bad behavior, such as fraud or failure to return resources. Specifically, users with fraud records or records of unreturned funds can be identified from user history data and marked as black seed users.

The user history data may include user service information, user identification information, user contact information, and the like. The user service information records the service data of a user: for a loan service, it records the user's borrowing and repayment data; for online shopping, it records the user's ordering, payment, return and refund data. The user identification information uniquely identifies the user and can be the user's identity (ID) number. The user contact information can include an email address, a telephone number, a social app account, an address, a device fingerprint, login IP information, and the like.

S12, diffusing the contacts of the black seed user according to the group size to obtain a diffusion relation graph;

By diffusing the social relationship network of the black seed user so that the users having a first-degree or second-degree contact relationship with the black seed user form the diffusion relation graph, the invention balances the efficiency and the accuracy of risk group identification.

The first-degree contact relation means that two users have a direct association relation, and the second-degree contact relation means that two users have an indirect association relation. For example, user a and user B have a first degree contact relationship, and user B and user C also have a first degree contact relationship, then user a and user C have an indirect association relationship through user B, that is, user a and user C have a second degree contact relationship.

The invention adds the black seed user into a pre-established social relationship network as one node. Each node in the social relationship network is used for representing different users, and connecting lines among the nodes are used for representing contact relationships among the users. In the invention, as the black seed user is a user with fraud records, the user who has a contact relation with the black seed user is suspected of fraud.

In practical applications, there are many methods for calculating the contact relationship between two users in a social relationship network, and the embodiments of the present application impose no particular limitation. If two users have exchanged email, talked by telephone, used the same device, logged in from the same IP address, or communicated through a social app, a contact relationship can be regarded as existing between them; the contact relationship between two users can therefore be calculated from the users' contact information.

Specifically, a weight is set for each type of contact information of the black seed user, and the number of times the black seed user and a first user have established contact through each type of contact information is counted, where the first user is any user other than the black seed user in the social relationship network. The contact degree between the black seed user and the first user is then calculated from these counts and the weights of the corresponding contact information. If the calculated contact degree meets a preset condition, the first user is determined to have a first-degree contact relationship with the black seed user. In an optional embodiment, the preset condition is that the contact degree is greater than 1.
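The contact-degree calculation described above can be sketched as a weighted sum. This is a minimal sketch, assuming the contact degree is the sum over contact channels of (count × weight); the source does not give the exact formula. The channel names, the weight values and the function names are hypothetical, and the threshold of 1 follows the optional embodiment.

```python
from typing import Dict

def contact_degree(counts: Dict[str, int], weights: Dict[str, float]) -> float:
    """Weighted sum of contact counts over contact channels (assumed form)."""
    return sum(weights.get(channel, 0.0) * n for channel, n in counts.items())

def is_first_degree(counts: Dict[str, int], weights: Dict[str, float],
                    threshold: float = 1.0) -> bool:
    """Preset condition from the optional embodiment: contact degree > 1."""
    return contact_degree(counts, weights) > threshold

# Hypothetical per-channel weights, chosen only for illustration.
WEIGHTS = {"phone": 0.5, "email": 0.25, "same_device": 1.0}

# Two phone calls and one email between the black seed user and a first user.
counts = {"phone": 2, "email": 1}
print(contact_degree(counts, WEIGHTS))   # 0.5*2 + 0.25*1 = 1.25
print(is_first_degree(counts, WEIGHTS))  # True, since 1.25 > 1
```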

In addition, if the first user has a first-degree contact relationship with the black seed user and a second user has a first-degree contact relationship with the first user, the second user is determined to have a second-degree contact relationship with the black seed user, where the second user is any user in the social relationship network other than the black seed user and the first user.

After the social relationship network of the black seed user is established, whether to diffuse only the first-degree contacts or both the first-degree and second-degree contacts of the black seed user is determined according to the group size.

The group size is the number of persons included in a risk group, and can be set according to business experience. In one example, the first-degree contacts of the black seed user are diffused when the group size is 3 persons or fewer, and the first-degree and second-degree contacts of the black seed user are diffused when the group size is greater than 3 persons.

In the invention, if the first-degree contact of the black seed user is diffused, all the first-degree contacts of the black seed user are searched, and a diffusion relation graph is formed by all the first-degree contacts of the black seed user; each node in the diffusion relation graph is used for representing different users, connecting lines between the nodes are used for representing contact person relations between the users, and the users in the diffusion relation graph are black seed users or first-degree contact persons of the black seed users. And if the first-degree contact and the second-degree contact of the black seed user are diffused, searching the first-degree contact and the second-degree contact of the black seed user to obtain a diffusion relation graph. The users in the diffusion relation graph are black seed users, first degree contacts of the black seed users, or second degree contacts of the black seed users.
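The diffusion step can be illustrated with a depth-limited breadth-first search. This is a minimal sketch, assuming the social relationship network is available as an adjacency dictionary; the `diffuse` function, the graph literal and the user names are hypothetical, and the cutoff of 3 persons follows the example given above.

```python
from collections import deque
from typing import Dict, Set

def diffuse(graph: Dict[str, Set[str]], seed: str, group_size: int) -> Set[str]:
    """Return the seed plus its first-degree (or first- and second-degree) contacts."""
    # Example rule from the text: depth 1 for groups of 3 or fewer, else depth 2.
    max_depth = 1 if group_size <= 3 else 2
    seen = {seed}
    queue = deque([(seed, 0)])
    while queue:
        user, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the allowed contact degree
        for neighbor in graph.get(user, set()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return seen

# Hypothetical network: seed - A - B - C (a simple chain).
network = {"seed": {"A"}, "A": {"seed", "B"}, "B": {"A", "C"}, "C": {"B"}}
print(sorted(diffuse(network, "seed", group_size=3)))  # ['A', 'seed']
print(sorted(diffuse(network, "seed", group_size=5)))  # ['A', 'B', 'seed']
```

The returned user set, together with the edges of the original network among those users, plays the role of the diffusion relation graph.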

S13, segmenting the diffusion relation graph to obtain a black seed user sub-relation graph;

The invention completes the first, coarse-grained stage of group identification by segmenting the diffusion relation graph. After segmentation, closely related users in the diffusion relation graph are placed in the same black seed user sub-relationship graph, and each black seed user sub-relationship graph corresponds to a suspected risk group; users not placed in any sub-relationship graph do not form a risk group.

The method for segmenting the diffusion relation graph can adopt the existing heap method or the method for constructing a confidence network and segmenting connected subgraphs through the confidence network.
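The source names the segmentation methods without giving details. As a rough illustration of the connected-subgraph idea only, the sketch below splits an undirected graph into connected components and drops isolated users. It is not the confidence-network construction itself; the function name, the `min_size` parameter and the sample graph are assumptions.

```python
from typing import Dict, List, Set

def connected_subgraphs(graph: Dict[str, Set[str]], min_size: int = 2) -> List[Set[str]]:
    """Split an undirected graph (symmetric adjacency dict) into connected components."""
    seen: Set[str] = set()
    components: List[Set[str]] = []
    for start in graph:
        if start in seen:
            continue
        component: Set[str] = set()
        stack = [start]
        while stack:  # iterative depth-first traversal
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph.get(node, set()) - component)
        seen |= component
        # Assumption: a user with no close relations forms no suspected group.
        if len(component) >= min_size:
            components.append(component)
    return components

# Hypothetical diffusion relation graph with two groups and one isolated user.
diffusion_graph = {"A": {"B"}, "B": {"A"}, "C": set(), "D": {"E"}, "E": {"D"}}
print(connected_subgraphs(diffusion_graph))
```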

S2, updating the weight of the observation dimension Si according to the relation between the observation dimensions Si in the black seed user sub-relation graph;

The invention carries out a second stage of risk group identification based on the similarity of the users in the black seed user sub-relationship graph, so as to improve the accuracy of group identification. When the user similarity is calculated, the weight of each observation dimension Si is updated in real time according to the relationship between the observation dimensions Si in the black seed user sub-relationship graph, which improves the accuracy of the similarity calculation in gang identification.

The observation dimensions may be user information of different dimensions, and may specifically include: the home location of the ID number, the operating system of the device used by the user, the longitude and latitude of the user's position, the account name, the registration time, the IP address used at registration, information about the device used at registration, and the like.

Illustratively, as shown in fig. 2, the present step includes:

S21, generating an n-order similarity matrix Di according to the relation between the user observation dimensions Si;

Here n is the number of users contained in the black seed user sub-relationship graph, and i is the index of the observation dimension. In the similarity matrix Di corresponding to each observation dimension Si, 0 indicates that the observation dimension Si differs between two users, and 1 indicates that the observation dimension Si is similar between two users.

Take as an example a black seed user sub-relationship graph that includes three users, user 1, user 2 and user 3 (i.e., n = 3), with the home location of the ID number selected as the user observation dimension S1. Suppose the ID numbers of user 1 and user 2 have different home locations, those of user 1 and user 3 have the same home location, and those of user 2 and user 3 have the same home location. A 3rd-order similarity matrix D1 is then generated from the relationship between the home locations of the user ID numbers; in D1, the entry for user 1 and user 2 is 0, and the entries for user 1 and user 3 and for user 2 and user 3 are 1.

In the same way, the similarity matrix Di corresponding to each observation dimension Si can be obtained.
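Building one per-dimension matrix Di can be sketched as follows. This is a minimal sketch, assuming that "similar" means strict equality of the observed values and that each user is marked as similar to itself (diagonal entries of 1); the source states neither point. The function name and the sample home locations are hypothetical.

```python
from typing import List, Sequence

def similarity_matrix(values: Sequence[str]) -> List[List[int]]:
    """Build the n-order 0/1 matrix Di for one observation dimension Si."""
    n = len(values)
    return [
        [1 if values[a] == values[b] else 0 for b in range(n)]
        for a in range(n)
    ]

# Hypothetical ID-number home locations for three users.
home_locations = ["Beijing", "Shanghai", "Beijing"]
D1 = similarity_matrix(home_locations)
for row in D1:
    print(row)
# [1, 0, 1]
# [0, 1, 0]
# [1, 0, 1]
```

A looser comparison, for example a distance threshold on longitude and latitude, could replace the equality test for dimensions where exact matches are unlikely.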

S22, performing spectral clustering on the similarity matrix Di to obtain a final similarity matrix D;

The main idea of spectral clustering is to regard all data as points in space that can be connected by edges. The edge between two points that are far apart has a low weight, and the edge between two points that are close together has a high weight. The graph formed by all data points is then cut so that the sum of edge weights between different subgraphs is as low as possible and the sum of edge weights within each subgraph is as high as possible, which achieves the purpose of clustering. Spectral clustering is based on spectral partitioning of a graph: the set of objects to be clustered is taken as the vertex set of a weighted graph, and the clustering result is obtained by analyzing the eigenvectors and eigenvalues of a matrix related to that weighted graph.

In the invention, an n-order final similarity matrix D is obtained by the following formula:

D=PiDi+T；

wherein Pi is a random value, preferably 1/n, and T is a preset matrix.
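The formula is written with a single index i, and the source does not state how the per-dimension matrices are combined. The sketch below assumes one possible reading, in which the terms Pi·Di are summed over all observation dimensions before T is added. That summation, the function name and the sample values are assumptions, not something the source confirms.

```python
from typing import List

Matrix = List[List[float]]

def combine(matrices: List[Matrix], coefficients: List[float], offset: Matrix) -> Matrix:
    """Compute T + sum_i Pi * Di element-wise (the summation over i is assumed)."""
    n = len(offset)
    result = [row[:] for row in offset]  # start from a copy of T
    for p, d in zip(coefficients, matrices):
        for a in range(n):
            for b in range(n):
                result[a][b] += p * d[a][b]
    return result

# Hypothetical 2-user example with two observation dimensions.
D1 = [[1, 0], [0, 1]]
D2 = [[1, 1], [1, 1]]
T = [[0.0, 0.0], [0.0, 0.0]]          # preset matrix, taken as zero here
D = combine([D1, D2], [0.5, 0.5], T)  # hypothetical coefficients
print(D)  # [[1.0, 0.5], [0.5, 1.0]]
```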

After spectral clustering, it is known which users are clustered into one class and which are not. Correspondingly, in the final similarity matrix D, 1 indicates that two users are clustered into one class, and 0 indicates that they are not. Illustratively, if in the final similarity matrix the entry for user 2 and user 3 is 1 while the entries for user 1 and user 2 and for user 1 and user 3 are 0, then user 2 and user 3 are clustered into one class, and user 1 cannot be clustered into one class with user 2 or user 3.

S23, determining the number of the shared edges according to the final similarity matrix D;

A shared edge is an edge formed by an observation dimension that is the same for two users clustered into one class in the black seed user sub-relationship graph. Specifically, the users clustered into one class are determined according to the final similarity matrix D, and the number of observation dimensions that two users clustered into one class have in common is taken as the number of shared edges. As shown in Fig. 3, the black seed user sub-relationship graph includes three users, user 1, user 2 and user 3, and the selected observation dimensions are S1, S2 and S3. User 1 and user 2 have the same observation dimensions S2 and S3, user 1 and user 3 have the same observation dimension S2, and user 2 and user 3 have the same observation dimensions S1 and S3. If it is determined according to the final similarity matrix D that user 1 and user 2 are clustered into one class, the number of observation dimensions that user 1 and user 2 have in common is taken as the number of shared edges, that is, two shared edges.
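Counting shared edges can be sketched as a comparison of per-user observation values within one cluster. This is a minimal sketch, assuming each user is described by a dictionary from dimension name to value and that "the same" means equality; the function name and the profile values are hypothetical. The values are chosen so that user 1 and user 2 agree on S2 and S3, as in the example above, and the profile of user 3 is arbitrary.

```python
from itertools import combinations
from typing import Dict, Iterable

Profile = Dict[str, str]

def shared_edges(profiles: Dict[str, Profile], cluster: Iterable[str]) -> Dict[str, int]:
    """Per observation dimension, count user pairs in the cluster with equal values."""
    counts: Dict[str, int] = {}
    for u, v in combinations(sorted(cluster), 2):
        for dim, value in profiles[u].items():
            if dim in profiles[v] and profiles[v][dim] == value:
                counts[dim] = counts.get(dim, 0) + 1
    return counts

# Hypothetical observation values for three users.
profiles = {
    "user1": {"S1": "Beijing", "S2": "Android", "S3": "39.9,116.4"},
    "user2": {"S1": "Shanghai", "S2": "Android", "S3": "39.9,116.4"},
    "user3": {"S1": "Guangzhou", "S2": "iOS", "S3": "23.1,113.3"},
}

edges = shared_edges(profiles, ["user1", "user2"])  # user 1 and user 2 are one class
print(edges)                # {'S2': 1, 'S3': 1}
print(sum(edges.values()))  # 2 shared edges
```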

S24, updating the Si weight according to the number of the shared edges until the objective function meets the condition;

The objective function is used to judge whether the result of the spectral clustering meets the condition for ending the Si weight update iteration, and can be set as needed. Steps S21 to S24 are executed in a loop, and the iteration stops when the objective function meets the condition. In general, the result of spectral clustering converges within 3 to 5 iterations, so the Si weight update can be stopped after 3 to 5 iterations.

Specifically, for users clustered into one class, the Si weights may be updated according to the proportion of the total number of shared edges contributed by each observation dimension. For example, if user 1 and user 2 are clustered into one class and have four shared edges, of which S1 contributes 1, S2 contributes 2 and S3 contributes 1, then the weight of S1 is 1/4, the weight of S2 is 2/4 and the weight of S3 is 1/4.
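The proportional update in this example can be reproduced with exact fractions. This is a minimal sketch of that one update step only; the function name is hypothetical, and the surrounding iteration and the objective function are not modeled.

```python
from fractions import Fraction
from typing import Dict

def update_weights(edge_counts: Dict[str, int]) -> Dict[str, Fraction]:
    """Weight of each dimension = its share of the total number of shared edges."""
    total = sum(edge_counts.values())
    if total == 0:
        raise ValueError("no shared edges to derive weights from")
    return {dim: Fraction(count, total) for dim, count in edge_counts.items()}

# Shared-edge contributions from the example: S1 -> 1, S2 -> 2, S3 -> 1.
weights = update_weights({"S1": 1, "S2": 2, "S3": 1})
for dim, weight in weights.items():
    print(dim, weight)
# S1 1/4
# S2 1/2
# S3 1/4
```

`Fraction` reduces 2/4 to 1/2, which is the same weight as in the text.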

S3, determining the similarity of the users in the black seed user sub-relationship graph according to the updated weight of the observation dimension Si;

Illustratively, this step includes:

S31, determining a user observation dimension value ri in the black seed user sub-relationship graph;

If the observation dimension Si is the same for two users, the corresponding observation dimension value ri is 1; if the observation dimension Si differs between the two users, the corresponding observation dimension value ri is 0.

S32, determining the similarity of the users in the black seed user sub-relationship graph according to the user observation dimension value ri and the weight of the corresponding updated observation dimension Si.

Specifically, the similarity of two users in the black seed user sub-relationship graph can be obtained by multiplying each observation dimension value ri by the weight of the corresponding observation dimension Si and summing the products.
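The weighted sum in steps S31 and S32 can be sketched as follows. This is a minimal sketch, assuming ri is 1 when both users have the same value for dimension Si and 0 otherwise; the function name, the weights and the profile values are hypothetical.

```python
from typing import Dict

def user_similarity(profile_a: Dict[str, str], profile_b: Dict[str, str],
                    weights: Dict[str, float]) -> float:
    """Sum of weight(Si) * ri over all observation dimensions."""
    total = 0.0
    for dim, weight in weights.items():
        # ri = 1 if both users have the dimension and the values match, else 0.
        same = dim in profile_a and profile_a.get(dim) == profile_b.get(dim)
        total += weight * (1 if same else 0)
    return total

# Hypothetical updated weights and observation values.
dim_weights = {"S1": 0.25, "S2": 0.5, "S3": 0.25}
user_a = {"S1": "Beijing", "S2": "Android", "S3": "39.9,116.4"}
user_b = {"S1": "Shanghai", "S2": "Android", "S3": "39.9,116.4"}

print(user_similarity(user_a, user_b, dim_weights))  # 0.5 + 0.25 = 0.75
```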

S4, determining risk groups according to the user similarity;

Specifically, if two users have equal similarity, they are determined to belong to one risk group; the similarity of each of them to the other users in the sub-relationship graph is then compared in the same way, and the risk group is finally determined.

In this embodiment, the risk group data obtained in step S4 is preliminary and may include data of some normal accounts, for example a streamer's secondary account. In this case, a white list may be generated from user remarks, for example a remark that an account is a secondary account. If an account in the risk group data is on the white list, that account is deleted from the risk group data to obtain the final risk group data. Therefore, the present invention may further perform the following steps:

S5, determining whether a preset white list contains users in the risk group;

S6, if the preset white list contains users in the risk group, deleting those users from the risk group.
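Steps S5 and S6 amount to a set difference. This is a minimal sketch with a hypothetical function name and hypothetical account identifiers.

```python
from typing import Iterable, Set

def remove_whitelisted(risk_group: Iterable[str], whitelist: Iterable[str]) -> Set[str]:
    """Drop every risk-group account that also appears on the white list."""
    return set(risk_group) - set(whitelist)

# Hypothetical preliminary risk group and white list.
preliminary_group = {"acct_1", "acct_2", "acct_3"}
white_list = {"acct_2", "acct_9"}

final_group = remove_whitelisted(preliminary_group, white_list)
print(sorted(final_group))  # ['acct_1', 'acct_3']
```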

In an embodiment, after finally determining the risk group data, the embodiment may further include the following steps:

The risk group data is sent to a risk control platform. The risk control platform then automatically monitors the accounts in the risk group data and, once some of the accounts in a group have violated rules, automatically associates and bans all accounts of that group.

For example, whether an account has been banned can be determined from data fed back by the business itself or by a third party (for instance, an account is banned if it is reported by other accounts for violations in daily business, is flagged during inspection, or triggers high-risk behavior). If the risk group data contains 10 accounts, 7 of which have already been banned, and the banned proportion exceeds a preset threshold, the risk group can be regarded as a high-risk group and the remaining 3 accounts in the group are banned.
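The banned-proportion rule in this example can be sketched as a simple ratio check. This is a minimal sketch; the function name, the account identifiers and the threshold value of 0.6 are hypothetical, since the source says only that the threshold is preset.

```python
from typing import Iterable, Set

def accounts_to_ban(group: Iterable[str], banned: Iterable[str], threshold: float) -> Set[str]:
    """Return the not-yet-banned accounts if the banned share exceeds the threshold."""
    group_set = set(group)
    if not group_set:
        return set()
    already_banned = group_set & set(banned)
    if len(already_banned) / len(group_set) > threshold:
        return group_set - already_banned  # high-risk group: ban the rest
    return set()

# Hypothetical risk group of 10 accounts, 7 of which are already banned.
risk_group = {f"acct_{k}" for k in range(10)}
banned_accounts = {f"acct_{k}" for k in range(7)}

print(sorted(accounts_to_ban(risk_group, banned_accounts, threshold=0.6)))
# ['acct_7', 'acct_8', 'acct_9']
```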

Fig. 4 is a schematic diagram of the architecture of a group identification device based on similarity weight updating according to the present invention. As shown in Fig. 4, the device includes:

an obtaining module 41, configured to obtain a black seed user sub-relationship graph according to the social relationship network of black seed users;

an updating module 42, configured to update the weight of the observation dimension Si according to the relationship between the observation dimensions Si in the black seed user sub-relationship graph, wherein the observation dimensions include at least one of the home location of the user ID number, the operating system of the device used by the user, and the longitude and latitude of the position where the user is located;

a first determining module 43, configured to determine the similarity of users in the black seed user sub-relationship graph according to the updated weight of the observation dimension Si;

a second determining module 44, configured to determine risk groups according to the user similarity;

wherein i is a natural number.

In a specific embodiment, the obtaining module 41 includes:

a third sub-determining module 411, configured to determine a black seed sub-user according to the user history data;

the diffusion module 412 is configured to diffuse the contact of the black seed user according to the group scale to obtain a diffusion relation graph;

and the segmentation module 413 is configured to segment the diffusion relation graph to obtain a black seed user sub-relation graph.
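One way to picture the diffusion module 412 and the segmentation module 413 is sketched below. It rests on assumptions that go beyond the text: the "group scale" is modelled as a hop limit `max_hops`, diffusion is a breadth-first expansion over an undirected contact list, and segmentation is a split into connected components. The names `diffuse`, `segment` and `contacts` are hypothetical.

```python
from collections import deque


def diffuse(contacts, seeds, max_hops):
    """Breadth-first expansion from the seed users, at most max_hops away."""
    depth = {seed: 0 for seed in seeds}
    queue = deque(depth)
    while queue:
        user = queue.popleft()
        if depth[user] == max_hops:
            continue  # assumed stand-in for the group-scale limit
        for neighbor in contacts.get(user, ()):
            if neighbor not in depth:
                depth[neighbor] = depth[user] + 1
                queue.append(neighbor)
    return set(depth)


def segment(contacts, users):
    """Split the diffused users into connected components (sub-graphs)."""
    remaining = set(users)
    components = []
    while remaining:
        start = remaining.pop()
        component, stack = {start}, [start]
        while stack:
            user = stack.pop()
            for neighbor in contacts.get(user, ()):
                if neighbor in remaining:
                    remaining.remove(neighbor)
                    component.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return components


# Hypothetical symmetric contact lists; "a" and "x" are the seed users.
contacts = {
    "a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"],
    "x": ["y"], "y": ["x"],
}
users = diffuse(contacts, ["a", "x"], max_hops=2)
print(sorted(sorted(c) for c in segment(contacts, users)))
# [['a', 'b', 'c'], ['x', 'y']]
```

User "d" is three hops from seed "a" and is therefore left out of the diffusion graph under this hop-limit assumption.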

The update module 42 includes:

the generating module 421 is configured to generate an n-order similarity matrix Di according to a relationship between the user observation dimensions Si;

the clustering module 422 is configured to perform spectral clustering on the similarity matrix Di to obtain a final similarity matrix D;

the sub-determining module 423 is configured to determine the number of the shared edges of the black seed sub-user sub-relational graph according to the final similarity matrix D;

a sub-updating module 424, configured to update the Si weight according to the number of the common edges until the objective function meets the condition.

Specifically, the clustering module 422 obtains the final similarity matrix D by the following formula:

D = Pi·Di + T;

wherein Pi is a random value and T is a predetermined matrix.
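The formula can be evaluated element-wise as in the sketch below. The text does not say whether Pi is a scalar or how several matrices Di are combined, so this sketch assumes a single scalar Pi drawn uniformly from [0, 1) and one n-order matrix Di given as nested lists; the function name `final_similarity` is hypothetical.

```python
import random


def final_similarity(d_i, t, p_i=None):
    """Compute D = Pi * Di + T element-wise for n x n nested-list matrices."""
    if p_i is None:
        p_i = random.random()  # assumption: Pi is a scalar in [0, 1)
    n = len(d_i)
    return [[p_i * d_i[r][c] + t[r][c] for c in range(n)] for r in range(n)]


# Hypothetical 2-order matrices; Pi is fixed here so the result is repeatable.
d_i = [[1.0, 0.5], [0.5, 1.0]]
t = [[0.0, 0.0], [0.0, 0.0]]
print(final_similarity(d_i, t, p_i=0.5))  # [[0.5, 0.25], [0.25, 0.5]]
```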

The first determination module 43 includes:

a first sub-determining module 431, configured to determine a user observation dimension value ri in the black seed sub-user sub-relationship graph;

a second sub-determining module 432, configured to determine the similarity of the users in the black seed sub-user sub-relationship graph according to the user observation dimension value ri and the weight of the corresponding updated observation dimension Si.
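The two sub-modules can be illustrated with a weighted comparison of observation dimension values. The scoring rule here (an exact-match test per dimension, normalised by the total weight) is an assumption made for the sketch; the text only states that similarity is determined from the values ri and the updated weights of Si. The dimension keys and sample values are hypothetical.

```python
def user_similarity(values_u, values_v, weights):
    """Weighted share of observation dimensions on which two users agree."""
    total = sum(weights.values())
    if total == 0:
        return 0.0
    matched = sum(
        weight
        for dim, weight in weights.items()
        if dim in values_u and dim in values_v and values_u[dim] == values_v[dim]
    )
    return matched / total


# Hypothetical updated weights for the three observation dimensions.
weights = {"id_region": 0.5, "os": 0.25, "location": 0.25}
u = {"id_region": "110000", "os": "Android", "location": (39.9, 116.4)}
v = {"id_region": "110000", "os": "iOS", "location": (39.9, 116.4)}
print(user_similarity(u, v, weights))  # 0.75
```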

Those skilled in the art will appreciate that the modules in the above-described embodiments of the apparatus may be distributed in the apparatus as described, or may be correspondingly modified and located in one or more apparatuses different from those of the above-described embodiments. The modules of the above embodiments may be combined into one module, or further split into multiple sub-modules.

In the following, embodiments of the electronic device of the present invention are described, which may be regarded as an implementation in physical form for the above-described embodiments of the method and apparatus of the present invention. Details described in the embodiments of the electronic device of the invention should be considered supplementary to the embodiments of the method or apparatus described above; for details which are not disclosed in embodiments of the electronic device of the invention, reference may be made to the above-described embodiments of the method or the apparatus.

Fig. 5 is a block diagram of an exemplary embodiment of an electronic device according to the present invention. The electronic device shown in fig. 5 is only an example, and should not impose any limitation on the functions or the scope of use of the embodiments of the present invention.

As shown in fig. 5, the electronic device 500 of the exemplary embodiment is represented in the form of a general-purpose data processing device. The components of the electronic device 500 may include, but are not limited to: at least one processing unit 510, at least one memory unit 520, a bus 530 connecting different electronic device components (including the memory unit 520 and the processing unit 510), a display unit 540, and the like.

The memory unit 520 stores a computer readable program, which may be source code or read-only program code. The program may be executed by the processing unit 510 so that the processing unit 510 performs the steps of the various embodiments of the present invention. For example, the processing unit 510 may perform the steps shown in fig. 1.

The memory unit 520 may include a readable medium in the form of a volatile memory unit, such as a random access memory unit (RAM) 5201 and/or a cache memory unit 5202, and may further include a read only memory unit (ROM) 5203. The memory unit 520 may also include a program/utility 5204 having a set (at least one) of program modules 5205, such program modules 5205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.

Bus 530 may be one or more of any of several types of bus structures including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.

The electronic device 500 may also communicate with one or more external devices 300 (e.g., keyboard, display, network device, Bluetooth device, etc.), enable a user to interact with the electronic device 500 via the external devices 300, and/or enable the electronic device 500 to communicate with one or more other data processing devices (e.g., router, modem, etc.). Such communication can occur via input/output (I/O) interfaces 550, and can also occur via network adapter 560 to one or more networks, such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet. The network adapter 560 may communicate with other modules of the electronic device 500 via the bus 530. It should be appreciated that although not shown in fig. 5, other hardware and/or software modules may be used in the electronic device 500, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.

Fig. 6 is a schematic diagram of one computer-readable medium embodiment of the present invention. As shown in fig. 6, the computer program may be stored on one or more computer readable media. The computer readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. The computer program, when executed by one or more data processing devices, enables the computer-readable medium to implement the above-described method of the invention, namely: acquiring a black seed sub-user sub-relationship graph according to the black seed sub-user social relationship network; updating the weight of the observation dimension Si according to the relationship between the observation dimensions Si in the black seed user sub-relationship graph; determining the similarity of the users in the black seed sub-user sub-relationship graph according to the updated weight of the observation dimension Si; and determining risk groups according to the similarity of the users.

Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments of the present invention described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present invention can be embodied in the form of a software product, which can be stored in a computer-readable storage medium (which can be a CD-ROM, a USB disk, a removable hard disk, etc.) or on a network, and which includes several instructions to make a data processing device (which can be a personal computer, a server, or a network device, etc.) execute the above-mentioned method according to the present invention.

The computer readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including object oriented programming languages such as Java, C++ or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).

In summary, the present invention can be implemented as a method, an apparatus, an electronic device, or a computer-readable medium executing a computer program. Some or all of the functions of the present invention may be implemented in practice using a general purpose data processing device such as a microprocessor or a Digital Signal Processor (DSP).

While the foregoing embodiments have described the objects, aspects and advantages of the present invention in further detail, it should be understood that the present invention is not inherently related to any particular computer, virtual machine or electronic device, and various general-purpose machines may be used to implement it. The invention is not limited to the specific embodiments described above; all modifications, changes and equivalents that come within the spirit and scope of the invention are intended to be covered.

Claims

1. A method for group recognition based on similarity weight update, the method comprising:

updating Si weight according to the number of the shared edges until an objective function meets a condition, wherein n is the number of users contained in the black seed user sub-relational graph;

determining risk groups according to the similarity of the users;

wherein i is a natural number.

2. The method of claim 1, wherein the final similarity matrix D is obtained by the following formula:

D = Pi·Di + T;

wherein Pi is a random value and T is a predetermined matrix.

3. The method according to claim 1, wherein the determining the similarity of the users in the black seed sub-user relationship graph according to the updated weights of the observation dimensions Si comprises:

4. The method of claim 1, wherein obtaining the black seed sub-user sub-relationship graph according to the black seed sub-user social relationship network comprises:

determining black seed users according to user historical data;

5. The method of claim 1, wherein the observation dimensions comprise: at least one of the attribution of the user ID number, the operating system of the device used by the user and the longitude and latitude of the position where the user is located.

6. A group recognition apparatus based on similarity weight update, the apparatus comprising:

wherein i is a natural number;

the update module includes:

7. An electronic device, comprising:

a processor; and

a memory storing computer-executable instructions that, when executed, cause the processor to perform the method of any of claims 1-5.

8. A computer readable storage medium, wherein the computer readable storage medium stores one or more programs which, when executed by a processor, implement the method of any of claims 1-5.