CN112148767A - Group mining method, abnormal group identification method and device and electronic equipment - Google Patents


Info

Publication number
CN112148767A
Authority
CN
China
Prior art keywords
group
partnership
target user
map
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010952147.6A
Other languages
Chinese (zh)
Inventor
张屹綮
王维强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd
Priority to CN202010952147.6A
Publication of CN112148767A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services

Abstract

The embodiments of this specification provide a group mining method, an abnormal group identification method and device, and electronic equipment. The group mining method comprises the following steps. A first clustering algorithm is used to cluster the network relationship map of a target user group based on service connections, yielding a first group relationship map. Preset service characteristic data of the users in the first group relationship map is input into a neural network model to obtain the expression characteristics of those users, wherein the neural network model is trained on the preset service characteristic data of users in sample groups and the labels of the sample groups, and each label indicates whether the sample group can be clustered by the first clustering algorithm. A second clustering algorithm is used to cluster the network relationship map and/or the first group relationship map based on expression characteristic similarity, yielding a second group relationship map. A group mining result for the target user group is generated based on the first group relationship map and the second group relationship map.

Description

Group mining method, abnormal group identification method and device and electronic equipment
Technical Field
The present invention relates to the field of artificial intelligence technologies, and in particular, to a method and an apparatus for group mining, a method and an apparatus for identifying abnormal groups, and an electronic device.
Background
Illegal group activities (such as gambling, marketing, ticket-brushing, cash-out, etc.) are important targets of risk prevention and control for organizations. With the development of artificial intelligence, more and more organizations adopt deep learning models to mine potential groups. However, such models currently analyze and identify groups only according to how closely users are connected and do not consider the similarity or correlation of features between users. As a result, users within a mined group can differ greatly from one another, which lowers mining accuracy and limits practical use.
In view of the foregoing, there is a need for a group mining scheme that considers both connection compactness and feature similarity.
Disclosure of Invention
The embodiments of this specification aim to provide a group mining method, an abnormal group identification method and device, and electronic equipment, which can mine, in an automated manner, groups whose members have service connections and a certain similarity of service characteristics, and can further identify abnormal groups according to the mining result.
In order to achieve the above object, the embodiments of the present specification are implemented as follows:
In a first aspect, a group mining method is provided, including:
using a first clustering algorithm to perform service-connection-based clustering on network relationship maps corresponding to a target user group to obtain a first group relationship map corresponding to the target user group, wherein nodes in the network relationship maps corresponding to the target user group represent users of the target user group, and edges of two nodes in the network relationship maps represent connection established by the two corresponding users under preset association logic;
inputting preset service characteristic data corresponding to a user in the first partnership map into a neural network model to obtain expression characteristics corresponding to the user in the first partnership map, wherein the neural network model is obtained by training based on preset service characteristic data corresponding to the user in a sample partnership and a label corresponding to the sample partnership, and the label corresponding to the sample partnership is used for indicating whether the sample partnership can be clustered by the first clustering algorithm;
clustering the network relationship graph and/or the first group relationship graph based on the similarity of the expression characteristics by using a second clustering algorithm to obtain a second group relationship graph corresponding to the target user group;
and generating a group mining result corresponding to the target user group based on the first group relationship map and the second group relationship map.
In a second aspect, a method for identifying abnormal group is provided, which includes:
using a first clustering algorithm to perform service-connection-based clustering on network relationship maps corresponding to a target user group to obtain a first group relationship map corresponding to the target user group, wherein nodes in the network relationship maps corresponding to the target user group represent users of the target user group, and edges of two nodes in the network relationship maps represent connection established by the two corresponding users under preset association logic;
inputting preset service characteristic data corresponding to a user in the first partnership map into a neural network model to obtain expression characteristics corresponding to the user in the first partnership map, wherein the neural network model is obtained by training based on preset service characteristic data corresponding to the user in a sample partnership and a label corresponding to the sample partnership, and the label corresponding to the sample partnership is used for indicating whether the sample partnership can be clustered by the first clustering algorithm;
clustering the network relationship graph and/or the first group relationship graph based on the similarity of the expression characteristics by using a second clustering algorithm to obtain a second group relationship graph corresponding to the target user group;
generating a group mining result corresponding to the target user group based on the first group relationship map and the second group relationship map;
and determining abnormal group partners from the group partner mining results corresponding to the target user groups.
In a third aspect, a group mining device is provided, comprising:
the first clustering module is used for clustering the network relationship maps corresponding to the target user group based on service connection by using a first clustering algorithm to obtain a first group partner relationship map corresponding to the target user group, wherein nodes in the network relationship maps corresponding to the target user group represent users of the target user group, and edges of two nodes in the network relationship maps represent connection established by two corresponding users under preset association logic;
the characteristic expression module is used for inputting preset service characteristic data corresponding to the user in the first partnership map into a neural network model to obtain expression characteristics corresponding to the user in the first partnership map, wherein the neural network model is obtained by training based on preset service characteristic data corresponding to the user in a sample partnership and a label corresponding to the sample partnership, and the label corresponding to the sample partnership is used for indicating whether the sample partnership can be clustered by the first clustering algorithm or not;
the second clustering module is used for clustering the network relationship map and/or the first group-partner relationship map based on the similarity of the expression characteristics by using a second clustering algorithm to obtain a second group-partner relationship map corresponding to the target user group;
and the mining module is used for generating a group mining result corresponding to the target user group based on the first group relation map and the second group relation map.
In a fourth aspect, an electronic device is provided comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, the computer program being executed by the processor to:
using a first clustering algorithm to perform service-connection-based clustering on network relationship maps corresponding to a target user group to obtain a first group relationship map corresponding to the target user group, wherein nodes in the network relationship maps corresponding to the target user group represent users of the target user group, and edges of two nodes in the network relationship maps represent connection established by the two corresponding users under preset association logic;
inputting preset service characteristic data corresponding to a user in the first partnership map into a neural network model to obtain expression characteristics corresponding to the user in the first partnership map, wherein the neural network model is obtained by training based on preset service characteristic data corresponding to the user in a sample partnership and a label corresponding to the sample partnership, and the label corresponding to the sample partnership is used for indicating whether the sample partnership can be clustered by the first clustering algorithm;
clustering the network relationship graph and/or the first group relationship graph based on the similarity of the expression characteristics by using a second clustering algorithm to obtain a second group relationship graph corresponding to the target user group;
and generating a group mining result corresponding to the target user group based on the first group relationship map and the second group relationship map.
In a fifth aspect, a computer-readable storage medium is provided, having stored thereon a computer program which, when executed by a processor, performs the steps of:
using a first clustering algorithm to perform service-connection-based clustering on network relationship maps corresponding to a target user group to obtain a first group relationship map corresponding to the target user group, wherein nodes in the network relationship maps corresponding to the target user group represent users of the target user group, and edges of two nodes in the network relationship maps represent connection established by the two corresponding users under preset association logic;
inputting preset service characteristic data corresponding to a user in the first partnership map into a neural network model to obtain expression characteristics corresponding to the user in the first partnership map, wherein the neural network model is obtained by training based on preset service characteristic data corresponding to the user in a sample partnership and a label corresponding to the sample partnership, and the label corresponding to the sample partnership is used for indicating whether the sample partnership can be clustered by the first clustering algorithm;
clustering the network relationship graph and/or the first group relationship graph based on the similarity of the expression characteristics by using a second clustering algorithm to obtain a second group relationship graph corresponding to the target user group;
and generating a group mining result corresponding to the target user group based on the first group relationship map and the second group relationship map.
In a sixth aspect, an apparatus for identifying an abnormal group is provided, which includes:
the first clustering module is used for clustering the network relationship maps corresponding to the target user group based on service connection by using a first clustering algorithm to obtain a first group partner relationship map corresponding to the target user group, wherein nodes in the network relationship maps corresponding to the target user group represent users of the target user group, and edges of two nodes in the network relationship maps represent connection established by two corresponding users under preset association logic;
the characteristic expression module is used for inputting preset service characteristic data corresponding to the user in the first partnership map into a neural network model to obtain expression characteristics corresponding to the user in the first partnership map, wherein the neural network model is obtained by training based on preset service characteristic data corresponding to the user in a sample partnership and a label corresponding to the sample partnership, and the label corresponding to the sample partnership is used for indicating whether the sample partnership can be clustered by the first clustering algorithm or not;
the second clustering module is used for clustering the network relationship map and/or the first group-partner relationship map based on the similarity of the expression characteristics by using a second clustering algorithm to obtain a second group-partner relationship map corresponding to the target user group;
the mining module generates a group mining result corresponding to the target user group based on the first group relationship map and the second group relationship map;
and the abnormal recognition module is used for determining abnormal group partners from the group partner mining results corresponding to the target user group.
In a seventh aspect, an electronic device is provided that includes: a memory, a processor, and a computer program stored on the memory and executable on the processor, the computer program being executed by the processor to:
using a first clustering algorithm to perform service-connection-based clustering on network relationship maps corresponding to a target user group to obtain a first group relationship map corresponding to the target user group, wherein nodes in the network relationship maps corresponding to the target user group represent users of the target user group, and edges of two nodes in the network relationship maps represent connection established by the two corresponding users under preset association logic;
inputting preset service characteristic data corresponding to a user in the first partnership map into a neural network model to obtain expression characteristics corresponding to the user in the first partnership map, wherein the neural network model is obtained by training based on preset service characteristic data corresponding to the user in a sample partnership and a label corresponding to the sample partnership, and the label corresponding to the sample partnership is used for indicating whether the sample partnership can be clustered by the first clustering algorithm;
clustering the network relationship graph and/or the first group relationship graph based on the similarity of the expression characteristics by using a second clustering algorithm to obtain a second group relationship graph corresponding to the target user group;
generating a group mining result corresponding to the target user group based on the first group relationship map and the second group relationship map;
and determining abnormal group partners from the group partner mining results corresponding to the target user groups.
In an eighth aspect, a computer readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of:
using a first clustering algorithm to perform service-connection-based clustering on network relationship maps corresponding to a target user group to obtain a first group relationship map corresponding to the target user group, wherein nodes in the network relationship maps corresponding to the target user group represent users of the target user group, and edges of two nodes in the network relationship maps represent connection established by the two corresponding users under preset association logic;
inputting preset service characteristic data corresponding to a user in the first partnership map into a neural network model to obtain expression characteristics corresponding to the user in the first partnership map, wherein the neural network model is obtained by training based on preset service characteristic data corresponding to the user in a sample partnership and a label corresponding to the sample partnership, and the label corresponding to the sample partnership is used for indicating whether the sample partnership can be clustered by the first clustering algorithm;
clustering the network relationship graph and/or the first group relationship graph based on the similarity of the expression characteristics by using a second clustering algorithm to obtain a second group relationship graph corresponding to the target user group;
generating a group mining result corresponding to the target user group based on the first group relationship map and the second group relationship map;
and determining abnormal group partners from the group partner mining results corresponding to the target user groups.
The scheme of the embodiment of the specification can comprehensively consider the business connection and the commonality of the business characteristics among the users to mine the potential group, thereby improving the accuracy and the coverage rate of the group mining. Powerful data support is provided for subsequent related applications based on the result of group mining, such as assisting in identifying groups participating in illegal activities.
Drawings
To illustrate the embodiments of this specification or the technical solutions in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some of the embodiments of this specification, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a first flowchart of a group mining method according to an embodiment of the present disclosure.
Fig. 2 is a second flowchart of a group mining method according to an embodiment of the present disclosure.
Fig. 3 is a process diagram of an abnormal group identification method provided in an embodiment of the present specification.
Fig. 4 is a schematic structural diagram of a group mining device provided in an embodiment of the present disclosure.
Fig. 5 is a schematic structural diagram of an abnormal group identification device according to an embodiment of the present disclosure.
Fig. 6 is a schematic structural diagram of an electronic device provided in an embodiment of this specification.
Detailed Description
To help those skilled in the art better understand the technical solutions in this specification, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of this specification. All other embodiments obtained by a person skilled in the art based on the embodiments in this specification without inventive effort fall within the scope of protection of this specification.
As mentioned above, conventional group mining schemes analyze and identify groups mechanically according to how closely users are connected. Because the similarity or correlation of features between users is not considered, users within a mined group can differ greatly from one another, which lowers mining accuracy and limits practical use.
To this end, this document aims to provide a group mining scheme that takes into account both connection tightness and feature similarity, and an abnormal group identification scheme that builds on the results of the group mining.
Fig. 1 is a flowchart of a group mining method according to an embodiment of the present disclosure. The method shown in Fig. 1 may be performed by the corresponding apparatus described below and comprises the following steps:
S102, using a first clustering algorithm to perform service-connection-based clustering on the network relationship map corresponding to a target user group to obtain a first group relationship map corresponding to the target user group, wherein nodes in the network relationship map represent users of the target user group, and an edge between two nodes represents a connection established by the two corresponding users under preset association logic.
It should be understood that the embodiments of the present specification do not specifically limit the network relationship represented by the network relationship map. That is, edges in the network relationship map may represent any type of connection between users, for example interaction, business, regional, or work connections.
Here, the network relationship map of the target user group can be regarded as a map of the whole network. After service-connection-based clustering with the first clustering algorithm (which is not specifically limited and may include a community discovery algorithm, a label propagation algorithm, a connected subgraph algorithm, and the like), a sub-relationship map of each potential group in the target user group is obtained, that is, the first group relationship map referred to herein.
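As an illustration only, the connected subgraph option named above can be sketched as follows. This is a minimal sketch, not the patented implementation: the function name `connected_groups`, the edge-list input, and the sample user identifiers are assumptions introduced for the example.

```python
from collections import defaultdict

def connected_groups(edges):
    """Return candidate groups as the connected components of an undirected graph.

    edges: iterable of (user_a, user_b) pairs, one per connection (assumed input).
    """
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen, groups = set(), []
    for start in sorted(adjacency):      # sorted only to make the output order stable
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:                     # iterative depth-first traversal
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        groups.append(component)
    return groups

# Hypothetical connections between five users.
edges = [("A", "B"), ("B", "C"), ("D", "E")]
groups = connected_groups(edges)
```

Each returned set plays the role of one potential group in the first group relationship map. A community discovery or label propagation algorithm would typically split large components further.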
S104, inputting preset service characteristic data corresponding to the users in the first group relationship map into a neural network model to obtain expression characteristics corresponding to the users in the first group relationship map, wherein the neural network model is obtained by training based on preset service characteristic data corresponding to the users in sample groups and labels corresponding to the sample groups, and the label corresponding to a sample group indicates whether the sample group can be clustered by the first clustering algorithm.
Specifically, during training of the neural network model, the preset service characteristic data corresponding to the users in a sample group is used as input data, and the label corresponding to the sample group is used as output data. The neural network model encodes the preset service characteristic data corresponding to the users in the sample group into expression characteristics and, based on those expression characteristics, predicts whether the sample group can be clustered by the first clustering algorithm. The training result, that is, the prediction of the neural network model, deviates from the true value indicated by the label. In the embodiments of this specification, the error between the training result and the true value indicated by the label may be calculated with a loss function derived by maximum likelihood estimation. With the aim of reducing this error, the parameters of the functional layer (for example, a fully connected layer) that encodes the preset service characteristic data into expression characteristics are optimized, which improves the model's performance in encoding preset service characteristic data into expression characteristics and thereby achieves the training effect. It should be understood that the trained neural network model is able to generate expression characteristics for users in a group.
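The specification does not give the loss function itself. For a binary label, one loss commonly derived from maximum likelihood estimation is binary cross-entropy, shown here as an assumption rather than as the formula used in the embodiments. The symbols are introduced for illustration: $y_g \in \{0, 1\}$ is the label of sample group $g$, and $p_\theta(g)$ is the probability, predicted by the model with parameters $\theta$, that group $g$ can be clustered by the first clustering algorithm.

```latex
\mathcal{L}(\theta) = -\sum_{g} \Big[\, y_g \log p_\theta(g) + (1 - y_g) \log\big(1 - p_\theta(g)\big) \Big]
```

Minimizing $\mathcal{L}(\theta)$ is equivalent to maximizing the likelihood of the observed labels under a Bernoulli model, and the gradients of this loss are what would update the encoding layer.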
S106, clustering the network relationship map and/or the first group relationship map based on the similarity of the expression characteristics by using a second clustering algorithm to obtain a second group relationship map corresponding to the target user group.
Similarly, the second clustering algorithm is not unique (it may include at least one of a K-means clustering algorithm, a density clustering algorithm, and a density peak clustering algorithm) and is not specifically limited herein.
It should be understood that, in this step, clustering the network relationship map corresponding to the target user group with the second clustering algorithm can be regarded as clustering the whole network relationship map by expression characteristic similarity, which yields a second group relationship map that stands in parallel with the first group relationship map. Clustering the first group relationship map with the second clustering algorithm can be regarded as further subdividing the groups shown in the first group relationship map by expression characteristic similarity, which yields a second group relationship map that refines the first group relationship map.
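To make the similarity-based clustering concrete, the following is a minimal K-means sketch over per-user expression characteristics. It is an illustration under stated assumptions: the two-dimensional vectors, the user identifiers, and the deterministic initialization from the first `k` points are all hypothetical choices, and a production system would more likely use a library implementation.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain K-means; centroids start at the first k points for reproducibility."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid (Euclidean distance).
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Recompute each centroid as the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignment

# Hypothetical expression characteristics produced by the neural network model.
expression = {"A": (0.9, 0.1), "B": (1.0, 0.0), "C": (0.1, 0.9), "D": (0.0, 1.0)}
users = list(expression)
labels = kmeans([expression[u] for u in users], k=2)

# Collect users with the same cluster label into one group.
second_groups = {}
for user, label in zip(users, labels):
    second_groups.setdefault(label, set()).add(user)
```

Running the same function on the users of a single first-stage group, rather than on all users, corresponds to the subdividing variant described above.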
S108, generating a group mining result corresponding to the target user group based on the first group relationship map and the second group relationship map.
Specifically, this step may combine the first group relationship map and the second group relationship map to obtain the group mining result. For example, if in S106 the second clustering algorithm clusters the network relationship map based on expression characteristic similarity, both the groups shown in the first group relationship map and the groups shown in the second group relationship map can be taken as the final group mining result.
Alternatively, this step may take the users that a group in the first group relationship map and a group in the second group relationship map have in common as the group mining result. For example, if the first group relationship map shows a group comprising users A, B, C, and D, and the second group relationship map shows a group comprising users A, B, C, and E, this step may finally determine users A, B, and C as a group.
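The intersection variant can be sketched in a few lines. The function name `intersect_groups` and the `min_size` cutoff are assumptions added for the example; the specification does not state a minimum group size.

```python
def intersect_groups(first_groups, second_groups, min_size=2):
    """Keep the users shared by a first-stage group and a second-stage group.

    min_size is a hypothetical cutoff that drops overlaps too small to be a group.
    """
    results = []
    for g1 in first_groups:
        for g2 in second_groups:
            common = g1 & g2
            if len(common) >= min_size:
                results.append(common)
    return results

# The example from the text: {A, B, C, D} and {A, B, C, E} share A, B, and C.
first = [{"A", "B", "C", "D"}]
second = [{"A", "B", "C", "E"}]
mined = intersect_groups(first, second)
```

The union variant described first would instead return the groups of both maps together, for example `first + second`.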
Based on the method shown in fig. 1, it can be known that the solution of the embodiment of the present specification can comprehensively consider business connections and commonalities of business features between users to mine potential group partners, thereby improving accuracy and coverage of group partner mining. Powerful data support is provided for subsequent related applications based on the result of group mining, such as assisting in identifying groups participating in illegal activities.
The group mining method according to the embodiment of the present disclosure will be described in detail below.
The scheme of the embodiments of this specification uses two kinds of clustering algorithms to realize group mining. One clusters based on business connections, and the other clusters based on the similarity of business characteristics. The two clustering results are then combined to output groups whose members both have business connections and share similar business characteristics.
As shown in fig. 2, the specific process includes the following steps:
S201, generating a network relationship map of the target user group.
By way of example, this step may divide the target user group into payers and payees based on the historical business data of the target user group and construct a bipartite graph of payers and payees, which serves as the network relationship map of the target user group.
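A minimal sketch of this construction follows. The record layout `(payer, payee, amount)` and the choice of the summed amount as edge weight are assumptions; the specification does not say how edges are weighted, and a transaction count would be an equally plausible choice.

```python
from collections import defaultdict

def build_bipartite_graph(transactions):
    """Aggregate (payer, payee, amount) records into weighted payer-payee edges.

    Nodes are tagged with their role, so a user who both pays and receives
    appears once on each side of the bipartite graph.
    """
    weights = defaultdict(float)
    for payer, payee, amount in transactions:
        weights[(("payer", payer), ("payee", payee))] += amount
    return dict(weights)

# Hypothetical historical business data.
transactions = [("u1", "m1", 10.0), ("u1", "m1", 5.0), ("u2", "m1", 7.0)]
graph = build_bipartite_graph(transactions)
```

The resulting edge dictionary can be passed to whichever first clustering algorithm is chosen.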
S202, clustering based on business connection is carried out on the network relation maps corresponding to the target user group based on a first clustering algorithm, and a first group-partner relation map is obtained.
Here, a community discovery algorithm is taken as an example of the first clustering algorithm (other feasible algorithms belong to the prior art and are not described in detail here).
Based on the community discovery algorithm, this step divides the payers and payees in the bipartite graph into communities (a community in the community discovery algorithm can be regarded as a group herein). In the process of dividing communities, the following steps a) to c) are performed iteratively:
a) Building communities in one-to-one correspondence with the nodes of the bipartite graph, wherein the payers and payees of the bipartite graph serve as the initial nodes of the bipartite graph.
b) Determining, for each node, a target community based on the modularity increment of the node with respect to each community, and classifying the node into the corresponding target community, until the communities to which all nodes belong no longer change.
It should be noted that the modularity increment is determined according to the edge weight and the ring weight of the node; because this belongs to the prior art, it is not described in detail here.
c) Merging all nodes classified into the same community into a new node, converting the edge weights between nodes within the community before merging into the ring weight of the new node after merging, and converting the edge weights between communities before merging into the edge weights between the new nodes after merging; the new nodes formed by merging serve as the updated nodes of the bipartite graph. If the number of communities that differ between the communities constructed in the current iteration and those constructed in the previous iteration is less than or equal to a preset threshold, the iteration ends; alternatively, if the current iteration reaches the preset number of iterations, the iteration ends.
In this step, the communities of the bipartite graph constructed during the iterations, together with the payers and payees they contain, or the communities constructed in the last iteration, together with the payers and payees they contain, may be used as the community discovery result of the bipartite graph, that is, the first partnership map herein.
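The merging in step c) can be sketched as follows. This is an illustrative example only: edges are assumed to be undirected and stored as `{(u, v): weight}`, an internal edge is assumed to contribute its weight once to the ring weight (some implementations count it twice), and the function name `merge_communities` is hypothetical.

```python
from collections import defaultdict


def merge_communities(edge_weights, ring_weights, community_of):
    """Collapse each community into one new node.

    edge_weights: {(u, v): w} for undirected edges between distinct nodes.
    ring_weights: {u: w} ring (self-loop) weight already carried by each node.
    community_of: {u: community_id} assignment produced by step b).
    """
    new_rings = defaultdict(float)
    new_edges = defaultdict(float)
    for node, weight in ring_weights.items():
        new_rings[community_of[node]] += weight
    for (u, v), weight in edge_weights.items():
        cu, cv = community_of[u], community_of[v]
        if cu == cv:
            # An edge inside a community becomes ring weight of the merged node.
            new_rings[cu] += weight
        else:
            # An edge between communities becomes an edge between merged nodes.
            new_edges[tuple(sorted((cu, cv)))] += weight
    return dict(new_edges), dict(new_rings)


edges = {("a", "b"): 2.0, ("b", "c"): 1.0, ("c", "d"): 3.0}
rings = {"a": 0.0, "b": 0.0, "c": 0.0, "d": 0.0}
assignment = {"a": 0, "b": 0, "c": 1, "d": 1}
merged_edges, merged_rings = merge_communities(edges, rings, assignment)
```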
S203, preprocessing the first partnership map.
The preprocessing may include structure restoration and/or structure reconstruction. Structure restoration means restoring the edges between nodes in the first partnership map that were lost due to the classification calculation of the first clustering algorithm; structure reconstruction means reconstructing the first partnership map into a fully connected graph.
It should be appreciated that this step may enhance the aggregation of the node connection relationships in the first partnership map by way of structure restoration and structure reconstruction.
In practical applications, considering execution efficiency, this step may choose to perform structure reconstruction for a first partnership map with fewer than 1000 nodes, and structure restoration for a first partnership map with more than 1000 nodes.
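The choice between the two preprocessing modes can be sketched as follows. This is a minimal example under assumptions: a group is represented by its node list, the original network relationship map by an edge list, and a group with exactly 1000 nodes (a case the text leaves open) is sent to structure restoration. All function names are hypothetical.

```python
from itertools import combinations

NODE_LIMIT = 1000  # size threshold mentioned in the text


def restore_structure(group_nodes, original_edges):
    """Recover original edges whose two endpoints both fall inside the group."""
    members = set(group_nodes)
    return {(u, v) for u, v in original_edges if u in members and v in members}


def reconstruct_structure(group_nodes):
    """Rebuild the group as a fully connected graph."""
    return set(combinations(sorted(group_nodes), 2))


def preprocess(group_nodes, original_edges, node_limit=NODE_LIMIT):
    # Full connection grows quadratically, so it is used only for small groups.
    if len(group_nodes) < node_limit:
        return reconstruct_structure(group_nodes)
    return restore_structure(group_nodes, original_edges)


group = ["a", "b", "c"]
network_edges = [("a", "b"), ("b", "c"), ("c", "x")]
small = preprocess(group, network_edges)                # reconstruction
large = preprocess(group, network_edges, node_limit=2)  # restoration
```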
S204, generating the expression features corresponding to the users in the first partnership map by using a graph neural network (GNN) model.
It should be understood that a GNN model is a neural network model adapted to take a graph as input. That is, in this step, the preset service feature data corresponding to the users in the first partnership map may be added to the first partnership map, which is then input directly to the GNN model; the GNN model in turn generates the expression features corresponding to the users in the first partnership map.
In the embodiment of the present disclosure, the GNN model may be obtained by training in a self-supervision manner, that is, the training of the GNN model may be completed without manually labeling training data.
The following describes the self-supervised training of the GNN model in detail.
First, labeled training data is generated automatically.
Specifically, the network relationship maps of the sample user groups may be clustered by using a first clustering algorithm to obtain each group relationship map of the sample user groups.
Here, positive and negative sample groups are constructed with reference to the partnership maps of the sample user groups. A positive sample group is composed of users that, after the first clustering algorithm clusters the network relationship map of the sample user group, form an edge structure; a negative sample group is composed of users that, after the same clustering, do not form an edge structure.
It should be understood that positive and negative sample groups are marked by labels. That is, the label of a positive sample group indicates that the positive sample group can be clustered by the first clustering algorithm, and the label of a negative sample group indicates that the negative sample group cannot be clustered by the first clustering algorithm.
To keep the scheme simple, a pair of nodes (sample users) without an edge structure in the partnership map of the sample user group can be randomly sampled as a negative sample, or a pair of nodes with small node degrees in the partnership map can be extracted as a negative sample. Similarly, a pair of nodes with an edge structure in the partnership map of the sample user group can be randomly sampled as a positive sample, or a pair of nodes with large node degrees in the partnership map can be extracted as a positive sample. Under this extraction scheme, a pair of nodes extracted as a positive sample serves as a positive sample group, and a pair of nodes extracted as a negative sample serves as a negative sample group.
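The random pair-sampling variant can be sketched as follows. This is only an illustration covering the random option (not the node-degree option); the function name `sample_pairs`, its parameters, and the 1/0 label encoding are assumptions. Enumerating all pairs is quadratic and is used here only for clarity.

```python
import random
from itertools import combinations


def sample_pairs(nodes, edges, n_pos, n_neg, seed=0):
    """Randomly draw node pairs with an edge (label 1) and without one (label 0)."""
    rng = random.Random(seed)
    edge_set = {frozenset(e) for e in edges}
    linked, unlinked = [], []
    for u, v in combinations(sorted(nodes), 2):
        (linked if frozenset((u, v)) in edge_set else unlinked).append((u, v))
    positives = rng.sample(linked, min(n_pos, len(linked)))
    negatives = rng.sample(unlinked, min(n_neg, len(unlinked)))
    return [(p, 1) for p in positives] + [(p, 0) for p in negatives]


group_nodes = ["a", "b", "c", "d"]
group_edges = [("a", "b"), ("b", "c")]
labelled_pairs = sample_pairs(group_nodes, group_edges, n_pos=2, n_neg=2)
```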
Then, training data is constructed for the positive and negative sample groups; the training data comprises the preset business feature data common to the users in the sample group and the label of the sample group. Here, to simplify the scheme, the training data corresponding to one sample group may include only one value of preset service feature data, namely the average preset service feature data of the users in the sample group.
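Building one training record from a sample group can be sketched as follows. This is a minimal example assuming that each user's preset service feature data is a fixed-length numeric vector; the record layout and the function name `build_training_record` are hypothetical.

```python
def build_training_record(members, features, label):
    """Average the members' feature vectors and attach the group label."""
    vectors = [features[user] for user in members]
    dim = len(vectors[0])
    mean = [sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)]
    return {"members": tuple(members), "features": mean, "label": label}


user_features = {"a": [1.0, 4.0], "b": [3.0, 0.0], "c": [5.0, 2.0]}
positive = build_training_record(("a", "b"), user_features, label=1)
negative = build_training_record(("a", "c"), user_features, label=0)
```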
After the training data is accumulated to a certain extent, the training data can be used for training the GNN model, so that the GNN model obtains the capability of outputting the expression features (the training process is described above, and is not described here again).
S205, clustering based on the similarity of the expression characteristics is carried out on the network relationship map and/or the first group relationship map of the target user group based on a second clustering algorithm, and a second group relationship map is obtained.
Here, a K-means clustering algorithm is taken as an example of the second clustering algorithm (other feasible algorithms belong to the prior art and are not described in detail here). The K-means clustering algorithm comprises the following specific steps:
Step 1: K feature vectors are selected from the feature vectors (i.e., preset service feature data) generated by a plurality of users of the target user group in a certain service operation as initial clustering centers, and the K initial clustering centers correspond to K categories, where K is a positive integer greater than or equal to 1.
Step 2: The distances from the other feature vectors to each initial clustering center are calculated. Various distance measures can be used, such as the Euclidean distance or the Manhattan distance. According to the calculated distances, each feature vector is classified into the category of the nearest initial clustering center.
Step 3: After the feature vectors corresponding to the users are divided into K categories according to the K initial clustering centers, the arithmetic mean of the feature vectors contained in each category is calculated to obtain a new clustering center.
Step 4: The above steps are repeated with the new clustering centers, and the feature vectors corresponding to the users are reclassified until a certain convergence condition is met.
It should be understood that the final output of the K-means clustering algorithm is the second partnership map described herein.
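Steps 1 to 4 can be sketched as follows. This is a plain illustrative implementation under assumptions: the Euclidean distance is used, the initial centers are drawn at random, and the convergence condition is taken to be stable assignments or a maximum number of iterations; the function name `k_means` and its parameters are hypothetical.

```python
import math
import random


def k_means(vectors, k, max_iter=100, seed=0):
    """Plain K-means with Euclidean distance; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = [list(v) for v in rng.sample(vectors, k)]  # Step 1
    labels = None
    for _ in range(max_iter):
        # Step 2: assign every vector to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: math.dist(v, centers[c])) for v in vectors
        ]
        if new_labels == labels:  # Step 4: stop once assignments are stable.
            break
        labels = new_labels
        # Step 3: move each center to the arithmetic mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


expression_features = [[0.0, 0.1], [0.2, 0.0], [5.0, 5.1], [5.2, 4.9]]
labels, centers = k_means(expression_features, k=2)
```

Users that receive the same label would then be treated as one group in the second partnership map.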
S206, summarizing the group shown in the first group relationship map and the second group relationship map to obtain the group mining result corresponding to the target user group.
Obviously, after the group mining result is obtained by the method shown in fig. 1 in the embodiment of the present specification, a target group can be further selected from the group mining result according to business requirements to implement related extended applications. For example, the group mining method of the embodiment of the present specification can be applied to the field of risk control to assist in identifying groups engaged in illegal financial activities such as gambling, pyramid selling, fraudulent order brushing, and illegal cash-out; or it can be applied to the field of media delivery to help identify the groups suited to different marketing schemes.
The above is a description of the method of the embodiments of the present specification. It will be appreciated that appropriate modifications may be made without departing from the principles outlined herein, and such modifications are intended to be included within the scope of the embodiments herein.
Corresponding to the above method, the present specification further provides an abnormal group identification method, which can identify an abnormal group from the group mining results provided by the method shown in fig. 1. Fig. 3 is a flowchart of the identification method according to an embodiment of the present disclosure. The method shown in fig. 3 may be performed by the corresponding apparatus described below, and comprises the following steps:
S302, clustering based on service connection is carried out on the network relationship map corresponding to the target user group by using a first clustering algorithm to obtain a first partnership map corresponding to the target user group, wherein nodes in the network relationship map corresponding to the target user group represent users of the target user group, and an edge between two nodes in the network relationship map represents a connection established by the two corresponding users under preset association logic.
S304, inputting preset service characteristic data corresponding to the user in the first partnership map into a neural network model to obtain expression characteristics corresponding to the user in the first partnership map, wherein the neural network model is obtained based on preset service characteristic data corresponding to the user in the sample partnership and labels corresponding to the sample partnership, and the labels corresponding to the sample partnership are used for indicating whether the sample partnership can be clustered by the first clustering algorithm.
S306, clustering the network relationship map and/or the first partnership map based on the similarity of the expression features by using a second clustering algorithm to obtain a second partnership map corresponding to the target user group.
S308, generating a group mining result corresponding to the target user group based on the first group relationship map and the second group relationship map.
S310, determining an abnormal group from the group mining result corresponding to the target user group.
For example, anomaly evaluation can be performed on the groups in the group mining result based on historical service data to determine abnormal groups; or a group in the group mining result that appears as an outlier in the first partnership map and/or the second partnership map can be determined as an abnormal group.
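One possible reading of the outlier criterion is sketched below. The specification does not prescribe any particular outlier test; this example swaps in a median-absolute-deviation (modified z-score) check over an assumed per-group score, such as an aggregate computed from historical service data. The function name `flag_outlier_groups`, the score itself, and the threshold of 3.5 are all assumptions.

```python
from statistics import median


def flag_outlier_groups(group_scores, threshold=3.5):
    """Return group IDs whose modified z-score exceeds the threshold."""
    values = list(group_scores.values())
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        return []  # no spread, so nothing can be called an outlier
    return [
        gid for gid, v in group_scores.items()
        if 0.6745 * abs(v - center) / mad > threshold
    ]


# Hypothetical per-group scores; g5 is far from the others.
scores = {"g1": 10.0, "g2": 11.0, "g3": 9.5, "g4": 10.5, "g5": 60.0}
abnormal = flag_outlier_groups(scores)
```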
Obviously, the identification method of the embodiment of the specification can comprehensively consider the business connection and the commonality of the business characteristics among the users to mine the potential group, thereby improving the accuracy and the coverage rate of the group mining. Then, the abnormal group is further found out from the group mining result to be used for executing the related risk precautionary measures.
In addition, corresponding to the group mining method shown in fig. 1, the embodiment of the present specification further provides a group mining device. Fig. 4 is a schematic structural diagram of a group mining device 400 according to an embodiment of the present disclosure, including:
the first clustering module 410 performs service-connection-based clustering on network relationship maps corresponding to a target user group by using a first clustering algorithm to obtain a first group relationship map corresponding to the target user group, wherein nodes in the network relationship maps corresponding to the target user group represent users of the target user group, and edges of two nodes in the network relationship maps represent connections established by two corresponding users under preset association logic.
The feature expression module 420 is configured to input preset service feature data corresponding to a user in the first partnership map into a neural network model to obtain an expression feature corresponding to the user in the first partnership map, where the neural network model is obtained by training based on the preset service feature data corresponding to the user in a sample partnership and a label corresponding to the sample partnership, and the label corresponding to the sample partnership is used to indicate whether the sample partnership can be clustered by the first clustering algorithm.
The second clustering module 430 performs clustering based on the similarity of the expression features on the network relationship graph and/or the first partnership graph by using a second clustering algorithm to obtain a second partnership graph corresponding to the target user group.
The mining module 440 generates a group mining result corresponding to the target user group based on the first partnership map and the second partnership map.
The group mining device in the embodiment of the specification can comprehensively consider the business connection and the commonality of the business characteristics among the users to mine potential groups, so that the accuracy and the coverage rate of group mining are improved. Powerful data support is provided for subsequent related applications based on the result of group mining, such as assisting in identifying groups participating in illegal activities.
Optionally, before inputting the preset service feature data corresponding to the user in the first partnership map into the neural network model, the feature expression module 420 may further perform structure restoration and/or structure reconstruction on the first partnership map to enhance aggregation of node connection relationships in the first partnership map. The structure recovery means recovering edges between nodes in the first partnership project lost by the classification calculation of the first clustering algorithm, and the structure reconstruction means reconstructing the first partnership project into a full connection graph.
Optionally, the neural network model is trained based on average preset service feature data of users in a sample group and a label corresponding to the sample group.
Optionally, the sample group comprises a positive sample group and a negative sample group, wherein the positive sample group is composed of users which are clustered by the first clustering algorithm on the network relationship maps of the sample user group and form an edge structure, and the negative sample group is composed of users which are clustered by the first clustering algorithm on the network relationship maps of the sample user group and do not form an edge structure.
Optionally, the first clustering algorithm comprises at least one of a community discovery algorithm, a label propagation algorithm, and a connected subgraph algorithm.
Optionally, the second clustering algorithm comprises at least one of a K-means clustering algorithm, a density clustering algorithm, and a density peak clustering algorithm.
Obviously, the group mining device according to the embodiment of the present disclosure may serve as an execution subject of the group mining method shown in fig. 1, and thus can implement the functions of the group mining method implemented in fig. 1 and fig. 2. Since the principle is the same, the detailed description is omitted here.
In addition, corresponding to the identification method shown in fig. 3, the embodiment of the present specification further provides an identification apparatus for an abnormal group. Fig. 5 is a schematic structural diagram of an identification apparatus 500 according to an embodiment of the present disclosure, including:
the first clustering module 510 performs service-connection-based clustering on a network relationship graph corresponding to a target user group by using a first clustering algorithm to obtain a first group relationship graph corresponding to the target user group, wherein nodes in the network relationship graph corresponding to the target user group represent users of the target user group, and edges of two nodes in the network relationship graph represent connections established by two corresponding users under preset association logic.
The feature expression module 520 inputs preset service feature data corresponding to the user in the first partnership map into a neural network model to obtain an expression feature corresponding to the user in the first partnership map, wherein the neural network model is obtained by training based on the preset service feature data corresponding to the user in a sample partnership and a label corresponding to the sample partnership, and the label corresponding to the sample partnership is used for indicating whether the sample partnership can be clustered by the first clustering algorithm.
The second clustering module 530 performs clustering on the network relationship graph and/or the first partnership graph based on the similarity of the expression features by using a second clustering algorithm to obtain a second partnership graph corresponding to the target user group.
The mining module 540 generates a group mining result corresponding to the target user group based on the first partnership map and the second partnership map.
The abnormal recognition module 550 determines an abnormal group from the group mining result corresponding to the target user group; for example, by performing anomaly evaluation on the groups in the group mining result based on historical service data to determine abnormal groups, or by determining a group in the group mining result that appears as an outlier in the first partnership map and/or the second partnership map as an abnormal group.
Obviously, the identification device of the embodiment of the present specification can comprehensively consider the business connection and the commonality of the business features between the users to mine the potential group, thereby improving the accuracy and the coverage rate of the group mining. Then, the abnormal group is further found out from the group mining result to be used for executing the related risk precautionary measures.
Obviously, the recognition apparatus in the embodiment of the present specification can be used as the execution subject of the recognition method shown in fig. 3, and thus can realize the function of the recognition method realized in fig. 3. Since the principle is the same, the detailed description is omitted here.
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present specification. Referring to fig. 6, at the hardware level, the electronic device includes a processor, and optionally further includes an internal bus, a network interface, and a memory. The memory may include internal memory, such as Random-Access Memory (RAM), and may further include non-volatile memory, such as at least one disk storage. Of course, the electronic device may also include hardware required for other services.
The processor, the network interface, and the memory may be connected to each other via an internal bus, which may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one double-headed arrow is shown in FIG. 6, but that does not indicate only one bus or one type of bus.
The memory is used for storing programs. In particular, a program may include program code comprising computer operating instructions. The memory may include both internal memory and non-volatile storage, and provides instructions and data to the processor.
The processor reads the corresponding computer program from the non-volatile memory into the internal memory and then runs it, forming the group mining device at the logical level. The processor is used for executing the program stored in the memory, and is specifically used for executing the following operations:
using a first clustering algorithm to perform service-connection-based clustering on network relationship maps corresponding to a target user group to obtain a first group relationship map corresponding to the target user group, wherein nodes in the network relationship maps corresponding to the target user group represent users of the target user group, and edges of two nodes in the network relationship maps represent connections established by the two corresponding users under preset association logic.
Inputting preset service characteristic data corresponding to the user in the first partnership map into a neural network model to obtain expression characteristics corresponding to the user in the first partnership map, wherein the neural network model is obtained by training based on preset service characteristic data corresponding to the user in a sample partnership and a label corresponding to the sample partnership, and the label corresponding to the sample partnership is used for indicating whether the sample partnership can be clustered by the first clustering algorithm.
Clustering the network relationship map and/or the first partnership map based on the similarity of the expression features by using a second clustering algorithm to obtain a second partnership map corresponding to the target user group.
Generating a group mining result corresponding to the target user group based on the first partnership map and the second partnership map.
Alternatively, the processor reads the corresponding computer program from the non-volatile memory into the internal memory and then runs it, forming the abnormal group identification device at the logical level. The processor is used for executing the program stored in the memory, and is specifically used for executing the following operations:
using a first clustering algorithm to perform service-connection-based clustering on network relationship maps corresponding to a target user group to obtain a first group relationship map corresponding to the target user group, wherein nodes in the network relationship maps corresponding to the target user group represent users of the target user group, and edges of two nodes in the network relationship maps represent connections established by the two corresponding users under preset association logic.
Inputting preset service characteristic data corresponding to the user in the first partnership map into a neural network model to obtain expression characteristics corresponding to the user in the first partnership map, wherein the neural network model is obtained by training based on preset service characteristic data corresponding to the user in a sample partnership and a label corresponding to the sample partnership, and the label corresponding to the sample partnership is used for indicating whether the sample partnership can be clustered by the first clustering algorithm.
Clustering the network relationship map and/or the first partnership map based on the similarity of the expression features by using a second clustering algorithm to obtain a second partnership map corresponding to the target user group.
Generating a group mining result corresponding to the target user group based on the first partnership map and the second partnership map.
Determining an abnormal group from the group mining result corresponding to the target user group.
The group mining method disclosed in the embodiment of fig. 1 or the identification method disclosed in the embodiment of fig. 3 may be applied to, or implemented by, a processor. The processor may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above methods may be performed by integrated logic circuits of hardware in the processor or by instructions in the form of software. The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. Such a processor may implement or perform the various methods, steps, and logic blocks disclosed in the embodiments of the present specification. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of a method disclosed in connection with the embodiments of the present specification may be embodied directly in a hardware decoding processor, or in a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or registers. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above methods in combination with its hardware.
It should be understood that the electronic device of the embodiment of the present specification may implement the functions of the above-described group mining apparatus in the embodiment shown in fig. 1 and 2, or may implement the functions of the above-described abnormal group identification apparatus in the embodiment shown in fig. 3. Since the principle is the same, the detailed description is omitted here.
Of course, besides the software implementation, the electronic device in this specification does not exclude other implementations, such as logic devices or a combination of software and hardware, and the like, that is, the execution subject of the following processing flow is not limited to each logic unit, and may also be hardware or logic devices.
Furthermore, the present specification embodiments also propose a computer-readable storage medium storing one or more programs, the one or more programs including instructions.
Wherein the above instructions, when executed by a portable electronic device comprising a plurality of application programs, enable the portable electronic device to perform the method of the embodiment shown in fig. 1, and in particular to perform the following method:
using a first clustering algorithm to perform service-connection-based clustering on network relationship maps corresponding to a target user group to obtain a first group relationship map corresponding to the target user group, wherein nodes in the network relationship maps corresponding to the target user group represent users of the target user group, and edges of two nodes in the network relationship maps represent connections established by the two corresponding users under preset association logic.
Inputting preset service characteristic data corresponding to the user in the first partnership map into a neural network model to obtain expression characteristics corresponding to the user in the first partnership map, wherein the neural network model is obtained by training based on preset service characteristic data corresponding to the user in a sample partnership and a label corresponding to the sample partnership, and the label corresponding to the sample partnership is used for indicating whether the sample partnership can be clustered by the first clustering algorithm.
Clustering the network relationship map and/or the first partnership map based on the similarity of the expression features by using a second clustering algorithm to obtain a second partnership map corresponding to the target user group.
Generating a group mining result corresponding to the target user group based on the first partnership map and the second partnership map.
Alternatively, the above instructions, when executed by a portable electronic device comprising a plurality of application programs, can cause the portable electronic device to perform the method of the embodiment shown in fig. 3, and in particular to perform the following method:
using a first clustering algorithm to perform service-connection-based clustering on network relationship maps corresponding to a target user group to obtain a first group relationship map corresponding to the target user group, wherein nodes in the network relationship maps corresponding to the target user group represent users of the target user group, and edges of two nodes in the network relationship maps represent connections established by the two corresponding users under preset association logic.
Inputting preset service characteristic data corresponding to the user in the first partnership map into a neural network model to obtain expression characteristics corresponding to the user in the first partnership map, wherein the neural network model is obtained by training based on preset service characteristic data corresponding to the user in a sample partnership and a label corresponding to the sample partnership, and the label corresponding to the sample partnership is used for indicating whether the sample partnership can be clustered by the first clustering algorithm.
Clustering the network relationship map and/or the first partnership map based on the similarity of the expression features by using a second clustering algorithm to obtain a second partnership map corresponding to the target user group.
Generating a group mining result corresponding to the target user group based on the first partnership map and the second partnership map.
Determining an abnormal group from the group mining result corresponding to the target user group.
It will be appreciated that the above instructions, when executed by a portable electronic device comprising a plurality of applications, can cause a group mining apparatus as described above to carry out the functions of the embodiments of figures 1 and 2, or alternatively, can cause a recognition apparatus as described above to carry out the functions of the embodiment of figure 3. Since the principle is the same, the detailed description is omitted here.
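The second clustering step and the generation of the group mining result can be sketched as follows, under several stated assumptions. The expression features are taken as given two-dimensional vectors (the neural network model is not reproduced here), K-means is used as the second clustering algorithm with a farthest-point initialisation chosen only to keep the example deterministic, and the merge rule (uniting groups that share a member) is a hypothetical choice, since the text does not fix how the two maps are combined.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain K-means with farthest-point initialisation (an assumed choice)."""
    centers = [list(points[0])]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        far = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest center by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def groups_from_labels(users, labels):
    """Collect users that share a cluster label into one group."""
    groups = {}
    for user, label in zip(users, labels):
        groups.setdefault(label, []).append(user)
    return [sorted(g) for g in groups.values()]


def merge_groups(first, second):
    """Hypothetical merge rule: unite groups that share at least one user."""
    merged = []
    for group in map(set, list(first) + list(second)):
        for other in [m for m in merged if m & group]:
            group |= other
            merged.remove(other)
        merged.append(group)
    return sorted(sorted(g) for g in merged)


# Hypothetical expression features; u6 has no edge but resembles u1..u3.
users = ["u1", "u2", "u3", "u4", "u5", "u6"]
expression = {"u1": (0.9, 0.1), "u2": (1.0, 0.0), "u3": (0.8, 0.2),
              "u4": (0.1, 0.9), "u5": (0.0, 1.0), "u6": (0.95, 0.05)}
first_groups = [["u1", "u2", "u3"], ["u4", "u5"]]  # assumed first-stage output

labels = kmeans([expression[u] for u in users], k=2)
second_groups = groups_from_labels(users, labels)
mining_result = merge_groups(first_groups, second_groups)
print(mining_result)  # [['u1', 'u2', 'u3', 'u6'], ['u4', 'u5']]
```

Here the feature-based stage places `u6` with `u1` to `u3`, so the merged result recalls a user that the connection-based stage could not reach.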
As will be appreciated by one skilled in the art, embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the description may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the description may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The above description is only an example of the present specification, and is not intended to limit the present specification. Various modifications and alterations to this description will become apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present specification should be included in the scope of the claims of the present specification. Moreover, all other embodiments obtained by a person skilled in the art without making any inventive step shall fall within the scope of protection of this document.

Claims (14)

1. A group mining method, comprising:
using a first clustering algorithm to perform service-connection-based clustering on a network relationship map corresponding to a target user group to obtain a first group relationship map corresponding to the target user group, wherein nodes in the network relationship map represent users of the target user group, and an edge between two nodes in the network relationship map represents a connection established between the two corresponding users under preset association logic;
inputting preset service feature data corresponding to users in the first group relationship map into a neural network model to obtain expression features corresponding to the users in the first group relationship map, wherein the neural network model is trained on preset service feature data corresponding to users in a sample group and a label corresponding to the sample group, and the label corresponding to the sample group indicates whether the sample group can be clustered by the first clustering algorithm;
using a second clustering algorithm to cluster the network relationship map and/or the first group relationship map based on similarity of the expression features to obtain a second group relationship map corresponding to the target user group;
and generating a group mining result corresponding to the target user group based on the first group relationship map and the second group relationship map.
2. The method of claim 1, wherein
before inputting the preset service feature data corresponding to the users in the first group relationship map into the neural network model, the method further comprises:
performing structure recovery and/or structure reconstruction on the first group relationship map, wherein the structure recovery refers to restoring edges between nodes in the first group relationship map that were lost in the classification calculation of the first clustering algorithm, and the structure reconstruction refers to reconstructing the first group relationship map into a fully connected graph.
3. The method of claim 1, wherein
the neural network model is trained on average preset service feature data of the users in the sample group and the label corresponding to the sample group.
4. The method of claim 1, wherein
the sample group comprises a positive sample group and a negative sample group, the positive sample group being a group that is obtained when the first clustering algorithm clusters a network relationship map of a sample user group and that forms an edge structure, and the negative sample group being a group that is obtained when the first clustering algorithm clusters the network relationship map of the sample user group and that does not form an edge structure.
5. The method of claim 1, wherein
the first clustering algorithm includes at least one of a community discovery algorithm, a label propagation algorithm, and a connected subgraph algorithm.
6. The method of claim 1, wherein
the second clustering algorithm includes at least one of a K-means clustering algorithm, a density clustering algorithm, and a density peak clustering algorithm.
7. An abnormal group identification method, comprising:
using a first clustering algorithm to perform service-connection-based clustering on a network relationship map corresponding to a target user group to obtain a first group relationship map corresponding to the target user group, wherein nodes in the network relationship map represent users of the target user group, and an edge between two nodes in the network relationship map represents a connection established between the two corresponding users under preset association logic;
inputting preset service feature data corresponding to users in the first group relationship map into a neural network model to obtain expression features corresponding to the users in the first group relationship map, wherein the neural network model is trained on preset service feature data corresponding to users in a sample group and a label corresponding to the sample group, and the label corresponding to the sample group indicates whether the sample group can be clustered by the first clustering algorithm;
using a second clustering algorithm to cluster the network relationship map and/or the first group relationship map based on similarity of the expression features to obtain a second group relationship map corresponding to the target user group;
generating a group mining result corresponding to the target user group based on the first group relationship map and the second group relationship map;
and determining an abnormal group from the group mining result corresponding to the target user group.
8. The method of claim 1, wherein
determining the abnormal group from the group mining result corresponding to the target user group comprises:
performing abnormality evaluation on groups in the group mining result based on historical service data to determine the abnormal group; or
determining, as the abnormal group, a group in the group mining result that appears as an outlier in the first group relationship map and/or the second group relationship map.
9. A group mining apparatus, comprising:
a first clustering module, configured to use a first clustering algorithm to perform service-connection-based clustering on a network relationship map corresponding to a target user group to obtain a first group relationship map corresponding to the target user group, wherein nodes in the network relationship map represent users of the target user group, and an edge between two nodes in the network relationship map represents a connection established between the two corresponding users under preset association logic;
a feature expression module, configured to input preset service feature data corresponding to users in the first group relationship map into a neural network model to obtain expression features corresponding to the users in the first group relationship map, wherein the neural network model is trained on preset service feature data corresponding to users in a sample group and a label corresponding to the sample group, and the label corresponding to the sample group indicates whether the sample group can be clustered by the first clustering algorithm;
a second clustering module, configured to use a second clustering algorithm to cluster the network relationship map and/or the first group relationship map based on similarity of the expression features to obtain a second group relationship map corresponding to the target user group;
and a mining module, configured to generate a group mining result corresponding to the target user group based on the first group relationship map and the second group relationship map.
10. An electronic device, comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the computer program, when executed by the processor, performs the following steps:
using a first clustering algorithm to perform service-connection-based clustering on a network relationship map corresponding to a target user group to obtain a first group relationship map corresponding to the target user group, wherein nodes in the network relationship map represent users of the target user group, and an edge between two nodes in the network relationship map represents a connection established between the two corresponding users under preset association logic;
inputting preset service feature data corresponding to users in the first group relationship map into a neural network model to obtain expression features corresponding to the users in the first group relationship map, wherein the neural network model is trained on preset service feature data corresponding to users in a sample group and a label corresponding to the sample group, and the label corresponding to the sample group indicates whether the sample group can be clustered by the first clustering algorithm;
using a second clustering algorithm to cluster the network relationship map and/or the first group relationship map based on similarity of the expression features to obtain a second group relationship map corresponding to the target user group;
and generating a group mining result corresponding to the target user group based on the first group relationship map and the second group relationship map.
11. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, performs the following steps:
using a first clustering algorithm to perform service-connection-based clustering on a network relationship map corresponding to a target user group to obtain a first group relationship map corresponding to the target user group, wherein nodes in the network relationship map represent users of the target user group, and an edge between two nodes in the network relationship map represents a connection established between the two corresponding users under preset association logic;
inputting preset service feature data corresponding to users in the first group relationship map into a neural network model to obtain expression features corresponding to the users in the first group relationship map, wherein the neural network model is trained on preset service feature data corresponding to users in a sample group and a label corresponding to the sample group, and the label corresponding to the sample group indicates whether the sample group can be clustered by the first clustering algorithm;
using a second clustering algorithm to cluster the network relationship map and/or the first group relationship map based on similarity of the expression features to obtain a second group relationship map corresponding to the target user group;
and generating a group mining result corresponding to the target user group based on the first group relationship map and the second group relationship map.
12. An abnormal group identification apparatus, comprising:
a first clustering module, configured to use a first clustering algorithm to perform service-connection-based clustering on a network relationship map corresponding to a target user group to obtain a first group relationship map corresponding to the target user group, wherein nodes in the network relationship map represent users of the target user group, and an edge between two nodes in the network relationship map represents a connection established between the two corresponding users under preset association logic;
a feature expression module, configured to input preset service feature data corresponding to users in the first group relationship map into a neural network model to obtain expression features corresponding to the users in the first group relationship map, wherein the neural network model is trained on preset service feature data corresponding to users in a sample group and a label corresponding to the sample group, and the label corresponding to the sample group indicates whether the sample group can be clustered by the first clustering algorithm;
a second clustering module, configured to use a second clustering algorithm to cluster the network relationship map and/or the first group relationship map based on similarity of the expression features to obtain a second group relationship map corresponding to the target user group;
a mining module, configured to generate a group mining result corresponding to the target user group based on the first group relationship map and the second group relationship map;
and an abnormality identification module, configured to determine an abnormal group from the group mining result corresponding to the target user group.
13. An electronic device, comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the computer program, when executed by the processor, performs the following steps:
using a first clustering algorithm to perform service-connection-based clustering on a network relationship map corresponding to a target user group to obtain a first group relationship map corresponding to the target user group, wherein nodes in the network relationship map represent users of the target user group, and an edge between two nodes in the network relationship map represents a connection established between the two corresponding users under preset association logic;
inputting preset service feature data corresponding to users in the first group relationship map into a neural network model to obtain expression features corresponding to the users in the first group relationship map, wherein the neural network model is trained on preset service feature data corresponding to users in a sample group and a label corresponding to the sample group, and the label corresponding to the sample group indicates whether the sample group can be clustered by the first clustering algorithm;
using a second clustering algorithm to cluster the network relationship map and/or the first group relationship map based on similarity of the expression features to obtain a second group relationship map corresponding to the target user group;
generating a group mining result corresponding to the target user group based on the first group relationship map and the second group relationship map;
and determining an abnormal group from the group mining result corresponding to the target user group.
14. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, performs the following steps:
using a first clustering algorithm to perform service-connection-based clustering on a network relationship map corresponding to a target user group to obtain a first group relationship map corresponding to the target user group, wherein nodes in the network relationship map represent users of the target user group, and an edge between two nodes in the network relationship map represents a connection established between the two corresponding users under preset association logic;
inputting preset service feature data corresponding to users in the first group relationship map into a neural network model to obtain expression features corresponding to the users in the first group relationship map, wherein the neural network model is trained on preset service feature data corresponding to users in a sample group and a label corresponding to the sample group, and the label corresponding to the sample group indicates whether the sample group can be clustered by the first clustering algorithm;
using a second clustering algorithm to cluster the network relationship map and/or the first group relationship map based on similarity of the expression features to obtain a second group relationship map corresponding to the target user group;
generating a group mining result corresponding to the target user group based on the first group relationship map and the second group relationship map;
and determining an abnormal group from the group mining result corresponding to the target user group.
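The structure recovery and structure reconstruction of claim 2, and the averaged feature data of claim 3, might be sketched as below. This is only one possible reading: recovery is interpreted here as re-adding every original edge whose two endpoints both belong to the group, reconstruction as connecting every pair of members, and the function names, user identifiers, and feature values are all hypothetical.

```python
from itertools import combinations


def recover_structure(group, original_edges):
    """Restore original edges whose two endpoints both lie in the group."""
    members = set(group)
    return sorted((a, b) for a, b in original_edges
                  if a in members and b in members)


def reconstruct_fully_connected(group):
    """Rebuild the group as a fully connected graph (every pair linked)."""
    return list(combinations(sorted(group), 2))


def average_features(group, features):
    """Average the preset service feature data over the users of a group."""
    rows = [features[user] for user in group]
    return [sum(col) / len(rows) for col in zip(*rows)]


# Hypothetical group, original edges, and per-user feature vectors.
group = ["u1", "u2", "u3"]
original_edges = [("u1", "u2"), ("u2", "u3"), ("u3", "u4")]
features = {"u1": [1.0, 4.0], "u2": [2.0, 6.0], "u3": [3.0, 8.0]}

recovered = recover_structure(group, original_edges)
full = reconstruct_fully_connected(group)
averaged = average_features(group, features)
print(recovered)  # [('u1', 'u2'), ('u2', 'u3')]
print(full)       # [('u1', 'u2'), ('u1', 'u3'), ('u2', 'u3')]
print(averaged)   # [2.0, 6.0]
```

The averaged vector, paired with a label saying whether the sample group could be clustered by the first clustering algorithm, would then form one training example for the neural network model under this reading.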
CN202010952147.6A 2020-09-11 2020-09-11 Group mining method, abnormal group identification method and device and electronic equipment Pending CN112148767A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010952147.6A CN112148767A (en) 2020-09-11 2020-09-11 Group mining method, abnormal group identification method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010952147.6A CN112148767A (en) 2020-09-11 2020-09-11 Group mining method, abnormal group identification method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN112148767A true CN112148767A (en) 2020-12-29

Family

ID=73890826

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010952147.6A Pending CN112148767A (en) 2020-09-11 2020-09-11 Group mining method, abnormal group identification method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN112148767A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112910888A (en) * 2021-01-29 2021-06-04 杭州迪普科技股份有限公司 Illegal domain name registration group mining method and device
CN112968870A (en) * 2021-01-29 2021-06-15 国家计算机网络与信息安全管理中心 Network group discovery method based on frequent itemset
CN112967105A (en) * 2021-03-03 2021-06-15 北京嘀嘀无限科技发展有限公司 Order information processing method, equipment, storage medium and computer program product
CN113284027A (en) * 2021-06-10 2021-08-20 支付宝(杭州)信息技术有限公司 Method for training group recognition model, and method and device for recognizing abnormal group
CN113284027B (en) * 2021-06-10 2023-05-09 支付宝(杭州)信息技术有限公司 Training method of partner recognition model, abnormal partner recognition method and device
CN113569059A (en) * 2021-09-07 2021-10-29 浙江网商银行股份有限公司 Target user identification method and device
CN115150130A (en) * 2022-06-08 2022-10-04 北京天融信网络安全技术有限公司 Method, device, equipment and storage medium for tracking and analyzing attack group
CN115150130B (en) * 2022-06-08 2023-11-10 北京天融信网络安全技术有限公司 Tracking analysis method, device, equipment and storage medium for attack group

Similar Documents

Publication Publication Date Title
CN112148767A (en) Group mining method, abnormal group identification method and device and electronic equipment
US10943582B2 (en) Method and apparatus of training acoustic feature extracting model, device and computer storage medium
Wang et al. From semantic categories to fixations: A novel weakly-supervised visual-auditory saliency detection approach
US11423307B2 (en) Taxonomy construction via graph-based cross-domain knowledge transfer
CN116978011B (en) Image semantic communication method and system for intelligent target recognition
CN115687732A (en) User analysis method and system based on AI and stream computing
CN111639230A (en) Similar video screening method, device, equipment and storage medium
CN110796240A (en) Training method, feature extraction method, device and electronic equipment
US11574250B2 (en) Classification of erroneous cell data
CN116881430A (en) Industrial chain identification method and device, electronic equipment and readable storage medium
CN115757900B (en) User demand analysis method and system applying artificial intelligent model
CN112364198A (en) Cross-modal Hash retrieval method, terminal device and storage medium
CN114003648B (en) Identification method and device for risk transaction group partner, electronic equipment and storage medium
CN111400764B (en) Personal information protection wind control model training method, risk identification method and hardware
CN116150429A (en) Abnormal object identification method, device, computing equipment and storage medium
CN114155410A (en) Graph pooling, classification model training and reconstruction model training method and device
CN113627514A (en) Data processing method and device of knowledge graph, electronic equipment and storage medium
CN112651753A (en) Intelligent contract generation method and system based on block chain and electronic equipment
CN111611531A (en) Personnel relationship analysis method and device and electronic equipment
CN111310806B (en) Classification network, image processing method, device, system and storage medium
CN114241243B (en) Training method and device for image classification model, electronic equipment and storage medium
CN116405330B (en) Network abnormal traffic identification method, device and equipment based on transfer learning
CN115510243A (en) Knowledge graph construction method and device, electronic equipment and storage medium
Xu et al. A Clustering Method with Graph Maximum Decoding Information
JP2017142712A (en) Call graph difference extraction method, call graph difference extraction program, and information processing device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code
Ref country code: HK; Ref legal event code: DE; Ref document number: 40043788; Country of ref document: HK