CN113987087A - Account processing method and device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN113987087A
Authority
CN
China
Prior art keywords
account
accounts
attribution
same
prediction
Prior art date
Legal status
Pending
Application number
CN202111253169.4A
Other languages
Chinese (zh)
Inventor
张戎
Current Assignee
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd
Priority to CN202111253169.4A
Publication of CN113987087A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/288 Entity relationship models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/26 Visual data mining; Browsing structured data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Mathematical Physics (AREA)
  • Business, Economics & Management (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Economics (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Primary Health Care (AREA)
  • Marketing (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Operations Research (AREA)
  • Probability & Statistics with Applications (AREA)
  • Health & Medical Sciences (AREA)
  • Algebra (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present disclosure relates to an account processing method, apparatus, device, and storage medium. The method includes: taking each account in a preset account set as a node, and taking the association relationship between any two accounts as an edge between the corresponding nodes, to construct an account association relationship graph, where the association relationship indicates that the two accounts are associated with at least one same entity object; performing community division on the account association relationship graph to generate a community graph; deleting a target edge in the community graph to obtain an adjusted community graph, where the prediction probability that the accounts corresponding to the two nodes connected by the target edge belong to the same owner is smaller than a preset threshold, the prediction probability is determined according to the difference between the owner behavior features corresponding to the two accounts, and the owner behavior features are behavior features generated when the owner of the corresponding account uses the account; and determining the accounts corresponding to the connected nodes in the adjusted community graph as accounts belonging to the same owner. The method can improve analysis accuracy while maintaining computational efficiency.

Description

Account processing method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to an account processing method and apparatus, an electronic device, and a storage medium.
Background
With the development of computer technology, the same natural person can use multiple accounts on a network platform (such as a social platform or a shopping platform). The network platform can push content of interest to the natural person to the corresponding account. Because a natural person may have multiple accounts, the following problems often occur. First, in a content recommendation scenario, after pushing content to one account of a natural person, the network platform pushes the same content to that person's other accounts, which disturbs the person. Second, in a scenario of cracking down on black-market (illicit) activity, even if an account is found to be malicious and is banned, the natural person operating it can continue to operate the other accounts they own, so the crackdown is ineffective. Therefore, to reduce the disturbance that pushed content causes to a natural person, or to improve the accuracy of cracking down on black-market activity, it is necessary to identify the accounts that belong to the same natural person. However, because the number of accounts is large, it is difficult to achieve both efficiency and accuracy when analyzing which accounts belong to the same natural person.
Disclosure of Invention
The present disclosure provides an account processing method, an account processing apparatus, an electronic device, and a storage medium, to at least solve the problem in the related art that it is difficult to achieve both efficiency and accuracy when analyzing the accounts of the same natural person. The technical solution of the disclosure is as follows:
According to a first aspect of the embodiments of the present disclosure, there is provided an account processing method, including:
taking each account in a preset account set as a node, and taking the association relationship between any two accounts as an edge between the corresponding nodes, to construct an account association relationship graph; the association relationship indicates that the two accounts are associated with at least one same entity object;
performing community division on the account association relationship graph to generate a community graph;
deleting a target edge in the community graph to obtain an adjusted community graph; the prediction probability that the accounts corresponding to the two nodes connected by the target edge belong to the same owner is smaller than a preset threshold; the prediction probability is determined according to the difference between the owner behavior features corresponding to the two accounts; the owner behavior features are behavior features generated when the owner of the corresponding account uses the account;
and determining the accounts corresponding to the connected nodes in the adjusted community graph as accounts belonging to the same owner.
In one possible implementation, before deleting the target edge in the community graph, the method further includes:
determining each edge in the community graph;
and predicting the probability that the accounts corresponding to the two nodes connected by each edge belong to the same owner, and taking each edge whose prediction probability is smaller than the preset threshold as a target edge.
In one possible implementation, predicting the probability that the accounts corresponding to the two nodes connected by each edge belong to the same owner includes:
obtaining the owner behavior feature similarity between the accounts corresponding to the two nodes connected by each edge, based on the owner behavior features of those accounts;
inputting the owner behavior feature similarity into a pre-constructed prediction model;
and obtaining the prediction probability, output by the prediction model, that the accounts corresponding to the two nodes connected by each edge belong to the same owner.
In one possible implementation, the method further includes:
when each sample account is bound to a unique owner identifier, determining the sample account pairs bound to the same unique owner identifier;
taking the same unique owner identifier as the training label of the sample account pair;
and obtaining the prediction model based on the training labels and the owner behavior feature similarities of the sample account pairs.
In one possible implementation, when there are multiple owner behavior feature similarities, obtaining the prediction probability, output by the prediction model, that the accounts corresponding to the two nodes connected by each edge belong to the same owner includes:
obtaining a fused owner behavior feature similarity that the prediction model forms by weighting and summing the owner behavior feature similarities according to the prediction weight corresponding to each similarity;
and taking the fused owner behavior feature similarity output by the prediction model as the prediction probability.
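The weighted summation described above can be written compactly as follows. This is only an illustrative formulation: the symbols p (prediction probability), s_i (the i-th behavior feature similarity), w_i (its prediction weight), and n (the number of similarities) are notation introduced here, and the constraint that the weights are non-negative and sum to 1 is an assumption, made so that p stays within [0, 1] whenever every s_i does.

```latex
p = \sum_{i=1}^{n} w_i \, s_i, \qquad w_i \ge 0, \qquad \sum_{i=1}^{n} w_i = 1
```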
In one possible implementation, the method further includes:
when the sample accounts are not bound to unique owner identifiers, obtaining the account content similarity between the two sample accounts of a sample account pair;
and determining the prediction weights based on the account content similarity and the owner behavior feature similarities of the sample account pair.
In one possible implementation, determining the prediction weights based on the account content similarity and the owner behavior feature similarities of the sample account pair includes:
obtaining preset initial prediction weights;
obtaining the prediction probability that the two sample accounts of the sample account pair belong to the same owner, based on the initial prediction weights and the owner behavior feature similarities of the sample account pair;
when the prediction probability indicates that the two sample accounts of the sample account pair do not belong to the same owner while the account content similarity indicates that they do, adjusting the initial prediction weights and re-determining, based on the adjusted weights, the prediction probability that the sample account pair belongs to the same owner;
and when the re-determined prediction probability indicates that the two sample accounts of the sample account pair belong to the same owner, taking the adjusted initial prediction weights as the prediction weights.
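The adjust-and-recheck procedure above could be sketched roughly as follows. The disclosure does not specify how the weights are adjusted, so the update rule used here (adding a multiple of each similarity to its weight and renormalizing), the function names `fuse` and `adjust_weights`, and all thresholds and step sizes are assumptions chosen purely for illustration.

```python
def fuse(weights, similarities):
    """Weighted sum of the behavior feature similarities."""
    return sum(w * s for w, s in zip(weights, similarities))


def adjust_weights(initial_weights, similarities, content_similarity,
                   prob_threshold=0.5, content_threshold=0.7,
                   step=0.5, max_rounds=100):
    """Return prediction weights for one sample account pair.

    Assumed rule: when account content suggests the same owner but the
    fused probability disagrees, shift weight towards the features with
    higher similarity, renormalize, and check again.
    """
    weights = list(initial_weights)
    if content_similarity < content_threshold:
        # Content gives no same-owner signal, so nothing is corrected.
        return weights
    for _ in range(max_rounds):
        if fuse(weights, similarities) >= prob_threshold:
            return weights  # prediction now agrees with the content signal
        raw = [w + step * s for w, s in zip(weights, similarities)]
        total = sum(raw)
        weights = [r / total for r in raw]
    # May still disagree if this simple rule cannot reach the threshold.
    return weights


similarities = [0.9, 0.2]  # hypothetical similarities for one pair
weights = adjust_weights([0.2, 0.8], similarities, content_similarity=0.8)
print(weights, fuse(weights, similarities))
```

The `max_rounds` cap is there because this simple rule is not guaranteed to reach the threshold for every input.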
According to a second aspect of the embodiments of the present disclosure, there is provided an account processing apparatus including:
an account association relationship graph construction module, configured to take each account in a preset account set as a node, and take the association relationship between any two accounts as an edge between the corresponding nodes, to construct an account association relationship graph; the association relationship indicates that the two accounts are associated with at least one same entity object;
a community division module, configured to perform community division on the account association relationship graph to generate a community graph;
a community graph adjustment module, configured to delete a target edge in the community graph to obtain an adjusted community graph; the prediction probability that the accounts corresponding to the two nodes connected by the target edge belong to the same owner is smaller than a preset threshold; the prediction probability is determined according to the difference between the owner behavior features corresponding to the two accounts; the owner behavior features are behavior features generated when the owner of the corresponding account uses the account;
and an account processing module, configured to determine the accounts corresponding to the connected nodes in the adjusted community graph as accounts belonging to the same owner.
In one possible implementation, the apparatus further includes a target edge determination module, configured to determine each edge in the community graph, predict the probability that the accounts corresponding to the two nodes connected by each edge belong to the same owner, and take each edge whose prediction probability is smaller than the preset threshold as a target edge.
In one possible implementation, the target edge determination module is configured to obtain the owner behavior feature similarity between the accounts corresponding to the two nodes connected by each edge, based on the owner behavior features of those accounts; input the owner behavior feature similarity into a pre-constructed prediction model; and obtain the prediction probability, output by the prediction model, that the accounts corresponding to the two nodes connected by each edge belong to the same owner.
In one possible implementation, the apparatus further includes a prediction model training module, configured to determine, when each sample account is bound to a unique owner identifier, the sample account pairs bound to the same unique owner identifier; take the same unique owner identifier as the training label of the sample account pair; and obtain the prediction model based on the training labels and the owner behavior feature similarities of the sample account pairs.
In one possible implementation, when there are multiple owner behavior feature similarities, the target edge determination module is configured to obtain a fused owner behavior feature similarity that the prediction model forms by weighting and summing the owner behavior feature similarities according to the prediction weight corresponding to each similarity, and take the fused owner behavior feature similarity output by the prediction model as the prediction probability.
In one possible implementation, the apparatus further includes a prediction weight determination module, configured to obtain, when the sample accounts are not bound to unique owner identifiers, the account content similarity between the two sample accounts of a sample account pair, and determine the prediction weights based on the account content similarity and the owner behavior feature similarities of the sample account pair.
In one possible implementation, the prediction weight determination module is configured to obtain preset initial prediction weights; obtain the prediction probability that the two sample accounts of the sample account pair belong to the same owner, based on the initial prediction weights and the owner behavior feature similarities of the sample account pair; when the prediction probability indicates that the two sample accounts do not belong to the same owner while the account content similarity indicates that they do, adjust the initial prediction weights and re-determine, based on the adjusted weights, the prediction probability that the sample account pair belongs to the same owner; and when the re-determined prediction probability indicates that the two sample accounts belong to the same owner, take the adjusted initial prediction weights as the prediction weights.
According to a third aspect of the embodiments of the present disclosure, there is provided an electronic device, including a memory and a processor, where the memory stores a computer program, and the processor implements the account processing method according to the first aspect or any one of the possible implementation manners of the first aspect when executing the computer program.
According to a fourth aspect of embodiments of the present disclosure, there is provided a storage medium having stored thereon a computer program which, when executed by a processor, implements an account processing method according to the first aspect or any one of the possible implementations of the first aspect.
According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product, the program product comprising a computer program, the computer program being stored on a readable storage medium, from which at least one processor of a device reads and executes the computer program, so that the device performs the account processing method of any one of the possible implementations of the first aspect.
The technical solution provided by the embodiments of the disclosure brings at least the following beneficial effects:
First, the association relationship between accounts is obtained according to whether any two accounts are associated with the same entity objects, and the nodes corresponding to associated accounts are connected by edges to obtain the account association relationship graph. Accounts without an association relationship are not connected, so the resulting graph is accurate and subsequent processing is more efficient. Second, community division is performed on the account association relationship graph to obtain a community graph, and each target edge, for which the prediction probability that the accounts corresponding to its two nodes belong to the same owner is smaller than the preset threshold, is deleted from the community graph to obtain an adjusted community graph. Deleting target edges from the community graph rather than directly from the account association relationship graph improves processing efficiency. Third, the prediction probability of a deleted target edge is determined according to the difference between the owner behavior features corresponding to the two accounts, where the owner behavior features are behavior features generated when the owner of the corresponding account uses the account. This improves the accuracy of the predicted probability that the accounts connected by an edge belong to the same owner. The accounts corresponding to the connected nodes in the adjusted community graph are then determined to belong to the same owner, which improves analysis accuracy.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.
FIG. 1 is a flow diagram illustrating a method of account processing according to an example embodiment.
FIG. 2 is a flow diagram illustrating a method of account processing according to an example embodiment.
Fig. 3(a) is a flow diagram illustrating a method of account processing according to an example embodiment.
Fig. 3(b) is a flow diagram illustrating a method of account processing according to an example embodiment.
Fig. 4 is a flow diagram illustrating a method of account processing according to an example embodiment.
FIG. 5 is a block diagram illustrating an account processing device according to an example embodiment.
FIG. 6 is a block diagram illustrating an electronic device in accordance with an example embodiment.
Detailed Description
In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
Fig. 1 is a flowchart illustrating an account processing method for use in an electronic device such as a server according to an exemplary embodiment, including the following steps.
In step S101, an account association relationship graph is constructed by taking each account in a preset account set as a node and taking the association relationship between any two accounts as an edge between the corresponding nodes.
The preset account set includes a plurality of accounts, such as user_id1, user_id2, and user_id3. A user can use related applications, such as a social application or a short video application, by logging in to an account, and the user may also be called the account owner. The same user may use the same application through multiple accounts.
An account used by a user may be associated with other types of information, such as the device used to log in to the account, the network (such as mobile data or WiFi) used to log in to the account, the geographic location where the account is used, the identity card bound to the account, the mobile phone number bound to the account, and the cash-out card number bound to the account. These other types of information may be referred to as entity objects associated with the account.
Two accounts may be associated with at least one same entity object, in which case the two accounts can be considered to have an association relationship. Examples include: (1) association with the same identity card (the two accounts are bound to the same identity card); (2) association with the same mobile phone number (the two accounts are bound to the same mobile phone number); (3) association with the same cash-out card number (the two accounts are bound to the same cash-out card number); (4) association with the same device (the two accounts are logged in on the same device); (5) association with the same WiFi (the two accounts are used under the same WiFi); (6) association with the same geographic location (the two accounts are used at the same geographic location). That is, the association relationship indicates that the two accounts are associated with at least one same entity object.
After the association relationships between the accounts are obtained, the accounts may be used as nodes, and the nodes that have an association relationship are connected by edges to obtain the account association relationship graph. The graph may be denoted G = <V, E>, where V is the set of nodes corresponding to the accounts and E is the set of edges connecting the account nodes that have an association relationship.
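As a minimal sketch of step S101, the graph G = <V, E> could be built by inverting the account-to-entity mapping and connecting every pair of accounts that share an entity object. The function name `build_association_graph`, the `(entity_type, value)` tuple representation, and the sample data are hypothetical and are not taken from the disclosure.

```python
from collections import defaultdict
from itertools import combinations


def build_association_graph(account_entities):
    """Return (V, E) for a mapping account -> set of (entity_type, value)."""
    nodes = set(account_entities)
    # Invert the mapping: entity object -> accounts associated with it.
    accounts_by_entity = defaultdict(set)
    for account, entities in account_entities.items():
        for entity in entities:
            accounts_by_entity[entity].add(account)
    # Two accounts sharing at least one entity object get one undirected edge.
    edges = set()
    for accounts in accounts_by_entity.values():
        for a, b in combinations(sorted(accounts), 2):
            edges.add((a, b))
    return nodes, edges


account_entities = {
    "user_id1": {("device", "dev-A"), ("phone", "p-1")},
    "user_id2": {("device", "dev-A")},
    "user_id3": {("wifi", "w-9")},
    "user_id4": {("wifi", "w-9"), ("phone", "p-1")},
}
V, E = build_association_graph(account_entities)
print(sorted(E))
```

Grouping by entity object first avoids comparing every pair of accounts in the set, which is one plausible way to keep construction tractable when the account set is large.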
In step S102, community division is performed on the account association relationship graph to generate a community graph.
Community division mainly means mining, from the account association relationship graph, the accounts that are connected to one another through edges. For example, all accounts connected to the account user_id1 are mined from the account association relationship graph to form one community graph; that is, the accounts in the same community graph are connected by edges.
In this step, the electronic device may use a connected component algorithm from graph mining to extract multiple connected components from the account association relationship graph and treat each connected component as a community graph; the accounts in the same community graph may belong to the same owner.
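A connected component search of the kind mentioned in step S102 can be sketched with a breadth-first traversal. This is a generic textbook version, not necessarily the variant the electronic device would run at scale; the function name `connected_components` and the sample graph are illustrative.

```python
from collections import defaultdict, deque


def connected_components(nodes, edges):
    """Split an undirected graph into connected components (communities)."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen = set()
    communities = []
    for start in sorted(nodes):
        if start in seen:
            continue
        # Breadth-first search collects every node reachable from `start`.
        seen.add(start)
        queue = deque([start])
        component = set()
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        communities.append(component)
    return communities


nodes = {"user_id1", "user_id2", "user_id3", "user_id4", "user_id5"}
edges = {("user_id1", "user_id2"), ("user_id2", "user_id3"),
         ("user_id4", "user_id5")}
communities = connected_components(nodes, edges)
print(communities)
```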
In step S103, the target edge in the community graph is deleted to obtain an adjusted community graph.
The owner behavior features are behavior features generated when the owner of the corresponding account uses the account. They may include the entity objects associated with the account (for example, the device used when the account is used, the network used when the account is used, and the geographic location where the account is used), and may further include the times at which the account is used.
Each account has its own owner behavior features, so the owner behavior features corresponding to two accounts may differ. The differences may include: a1, whether the two accounts use the same device; a2, whether the two accounts use the same WiFi; a3, whether the two accounts are used in the same geographic location; a4, whether the two accounts are used in different geographic locations at the same time; a5, the conditional probability that the two accounts are used simultaneously, such as the conditional probability P(user_id2 | user_id1) that user_id2 is in use when user_id1 is used, and the conditional probability P(user_id1 | user_id2) that user_id1 is in use when user_id2 is used.
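The differences a1 to a5 could be computed from usage records roughly as shown below. The record layout (dicts with `slot`, `device`, `wifi`, and `geo` keys), the use of discrete time slots to decide what counts as "at the same time", and the estimate of each conditional probability as the share of one account's active slots in which the other account is also active are all assumptions made for this sketch.

```python
from collections import defaultdict


def difference_features(sessions_a, sessions_b):
    """Compute a1..a5 for two accounts from their usage sessions."""
    def values(sessions, key):
        return {s[key] for s in sessions}

    def geo_by_slot(sessions):
        table = defaultdict(set)
        for s in sessions:
            table[s["slot"]].add(s["geo"])
        return table

    geo_a, geo_b = geo_by_slot(sessions_a), geo_by_slot(sessions_b)
    slots_a, slots_b = set(geo_a), set(geo_b)
    shared = slots_a & slots_b  # time slots in which both accounts are used
    return {
        "a1_same_device": bool(values(sessions_a, "device") & values(sessions_b, "device")),
        "a2_same_wifi": bool(values(sessions_a, "wifi") & values(sessions_b, "wifi")),
        "a3_same_geo": bool(values(sessions_a, "geo") & values(sessions_b, "geo")),
        "a4_same_time_different_geo": any(
            geo_a[t].isdisjoint(geo_b[t]) for t in shared),
        "a5_p_b_given_a": len(shared) / len(slots_a) if slots_a else 0.0,
        "a5_p_a_given_b": len(shared) / len(slots_b) if slots_b else 0.0,
    }


sessions_1 = [
    {"slot": 1, "device": "dev-A", "wifi": "w-1", "geo": "Beijing"},
    {"slot": 2, "device": "dev-A", "wifi": "w-1", "geo": "Beijing"},
]
sessions_2 = [
    {"slot": 2, "device": "dev-B", "wifi": "w-2", "geo": "Shanghai"},
    {"slot": 3, "device": "dev-A", "wifi": "w-1", "geo": "Beijing"},
]
features = difference_features(sessions_1, sessions_2)
print(features)
```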
Based on a comprehensive analysis of the differences between the owner behavior features corresponding to two accounts, the probability that the two accounts belong to the same owner can be predicted to obtain a prediction probability. The relationship between the differences a1 to a5 above and the probability that the accounts belong to the same owner is as follows: when two accounts use the same device, the probability that they belong to the same owner is higher; when two accounts use the same WiFi, the probability that they belong to the same owner is higher; when two accounts are used in the same geographic location, the probability that they belong to the same owner is higher; when two accounts are used in different geographic locations at the same time, the probability that they belong to the same owner is lower; when the conditional probability that two accounts are used at the same time is large, the probability that they belong to the same owner is lower.
When the prediction probability for two accounts that have an association relationship and are located in the same community graph is smaller than a preset threshold, it can be determined that the two accounts are unlikely to belong to the same owner. In that case, the edge connecting the nodes of the two accounts in the community graph is taken as a target edge and deleted from the community graph, and the community graph with the target edge deleted is taken as the adjusted community graph.
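The thresholding in step S103 might look like the following sketch, in which `predict` stands in for whatever prediction model supplies the same-owner probability. The function name `prune_target_edges`, the lookup-table model, and the threshold value 0.5 are hypothetical.

```python
def prune_target_edges(edges, predict, threshold):
    """Split edges into those kept and those deleted as target edges."""
    kept, removed = set(), set()
    for a, b in edges:
        # An edge is a target edge when the predicted same-owner
        # probability falls below the preset threshold.
        if predict(a, b) < threshold:
            removed.add((a, b))
        else:
            kept.add((a, b))
    return kept, removed


# Hypothetical per-edge probabilities standing in for a prediction model.
probabilities = {
    ("user_id1", "user_id2"): 0.92,
    ("user_id2", "user_id3"): 0.15,
}
kept, removed = prune_target_edges(
    probabilities, lambda a, b: probabilities[(a, b)], threshold=0.5)
print(kept, removed)
```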
In step S104, the accounts corresponding to the connected nodes in the adjusted community graph are determined as accounts belonging to the same owner.
After the target edges of the community graph are deleted to obtain the adjusted community graph, the prediction probability that the accounts corresponding to the nodes connected by each retained edge belong to the same owner is greater than or equal to the preset threshold. Those accounts are therefore highly likely to belong to the same owner, so the accounts corresponding to the nodes connected by a retained edge can be taken as accounts belonging to the same owner.
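One way to read off the result of step S104 is to group the nodes that remain connected through retained edges, sketched here with a small union-find structure. Treating connectivity transitively (accounts linked through a chain of retained edges end up in one group) and omitting accounts left without any retained edge are interpretations assumed for this sketch; the function name `same_owner_groups` is illustrative.

```python
def same_owner_groups(nodes, kept_edges):
    """Group accounts that stay connected in the adjusted community graph."""
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links to the root, halving the path on the way.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in kept_edges:
        parent[find(a)] = find(b)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    # Accounts with no retained edge form singleton groups and are skipped.
    return [g for g in groups.values() if len(g) > 1]


nodes = {"user_id1", "user_id2", "user_id3", "user_id4"}
kept_edges = {("user_id1", "user_id2")}
groups = same_owner_groups(nodes, kept_edges)
print(groups)
```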
In the account processing method, the incidence relation between the accounts is obtained based on the same condition of entity objects associated with any two accounts, the nodes corresponding to the accounts are connected by using the edges based on the incidence relation to obtain the account incidence relation graph, and the incidence account pair is determined in the account set, so that any account user without the incidence relation can be prevented from being connected by using the edges to obtain the accurate account incidence relation graph, the analysis is carried out on the accurate account incidence relation graph, and the subsequent processing calculation efficiency is improved; secondly, carrying out community division on the account association relationship graph to obtain a community graph, deleting a target edge, with the prediction probability of belonging to the same attribution being smaller than a preset threshold value, of accounts corresponding to two connected nodes from the community graph to obtain an adjusted community graph, avoiding directly deleting the target edge on the account association relationship graph, and improving the processing efficiency; and the deleted prediction probability of the target edge is determined according to the difference between the behavior characteristics of the affiliate corresponding to the two accounts, wherein the behavior characteristics of the affiliate are the behavior characteristics generated when the affiliate of the corresponding account uses the account, so the accuracy of the prediction probability of whether the accounts corresponding to the nodes connected by the edge belong to the same affiliate is improved, and then the accounts corresponding to the nodes connected in the adjusted community graph are determined to belong to the same affiliate, thereby improving the analysis accuracy.
In one possible implementation, before deleting the target edge in the community graph, the electronic device may further perform the following steps: determining each edge in the community graph; and performing probability prediction on whether the accounts corresponding to the two nodes connected with each side belong to the same belonger or not, and taking the side with the prediction probability smaller than a preset threshold value as the target side.
Since the edges of the community graph are used for connecting account nodes with association, the present embodiment mainly performs probability prediction on accounts with association belonging to the same owner.
Illustratively, suppose the account user_id1 has an association relationship with the account user_id2. Difference analysis can then be performed on attribution behavior features of the same type for user_id1 and user_id2 to obtain the difference between the two accounts for that type of feature. When there are multiple types of attribution behavior features (e.g., the device used when using an account, the network used, the geographic location, and the time of use), multiple types of differences (e.g., a1 to a5 described above) are obtained. These differences are then analyzed together, based on the relationship between each difference and the probability that the accounts belong to the same owner, to predict the probability that user_id1 and user_id2 belong to the same owner. When the prediction probability is smaller than the preset threshold, the edge connecting the node of user_id1 and the node of user_id2 is taken as a target edge and deleted from the community graph.
In the above manner, same-owner probability prediction is performed only on the accounts corresponding to two nodes connected by an edge in the community graph, and not on accounts whose nodes are not connected by an edge. This improves the efficiency of determining target edges and the efficiency of analyzing whether different accounts belong to the same owner.
In a possible implementation manner, the step of predicting the probability that the accounts corresponding to the two nodes connected by each edge belong to the same owner may specifically include: obtaining the attribution behavior feature similarity between the accounts corresponding to the two nodes connected by each edge, based on the attribution behavior features of those accounts; inputting the attribution behavior feature similarity into a pre-constructed prediction model; and obtaining the prediction probability, output by the prediction model, that the accounts corresponding to the two nodes connected by each edge belong to the same owner.
The attribution behavior feature similarity between two accounts can be obtained by comparing the attribution behavior features of the two accounts; see a1 to a5 described above, namely: a1, whether the two accounts use the same device; a2, whether the two accounts use the same WiFi; a3, whether the two accounts are used in the same geographic location; a4, whether the two accounts are used in different geographic locations at the same time; a5, the conditional probability that the two accounts are used simultaneously, such as the conditional probability P(user_id2 | user_id1) that user_id2 is in use when user_id1 is in use, and the conditional probability P(user_id1 | user_id2) that user_id1 is in use when user_id2 is in use. By comparing the attribution behavior features of each pair of associated accounts, a data table of attribution behavior feature similarities, as shown in Table 1 below, can be obtained.
Figure BDA0003323047650000101
TABLE 1
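As a rough illustration of how one row of such a table might be computed, the sketch below derives a1 to a5 from per-account usage records. The `Usage` record layout, the discretized `time_slot` field, and the estimate of the conditional probability as the share of one account's time slots in which the other account is also active are all assumptions for illustration, not details given in the disclosure.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Set


class Usage(NamedTuple):
    device: str
    wifi: str
    location: str
    time_slot: int  # hypothetical discretized time, e.g. an hour index


def _locations_by_slot(records: List[Usage]) -> Dict[int, Set[str]]:
    slots: Dict[int, Set[str]] = defaultdict(set)
    for r in records:
        slots[r.time_slot].add(r.location)
    return slots


def pair_features(u1: List[Usage], u2: List[Usage]) -> List[float]:
    """Return [a1, a2, a3, a4, a5] for the ordered pair (account 1, account 2)."""
    a1 = float(bool({r.device for r in u1} & {r.device for r in u2}))
    a2 = float(bool({r.wifi for r in u1} & {r.wifi for r in u2}))
    a3 = float(bool({r.location for r in u1} & {r.location for r in u2}))
    s1, s2 = _locations_by_slot(u1), _locations_by_slot(u2)
    common = set(s1) & set(s2)
    # a4: some shared time slot in which the two accounts were in disjoint places.
    a4 = float(any(s1[t].isdisjoint(s2[t]) for t in common))
    # a5: estimate of P(account 2 in use | account 1 in use); swap arguments
    # to obtain the conditional probability in the other direction.
    a5 = len(common) / len(s1) if s1 else 0.0
    return [a1, a2, a3, a4, a5]


records_1 = [Usage("dev_A", "wifi_X", "loc_1", 10), Usage("dev_A", "wifi_X", "loc_1", 11)]
records_2 = [Usage("dev_A", "wifi_Y", "loc_2", 11), Usage("dev_B", "wifi_Y", "loc_2", 12)]
features = pair_features(records_1, records_2)
```

In this made-up example the two accounts share a device but were seen in different locations during the same time slot, which the later prediction step would weigh against each other.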
The similarity of the attribution human behavior features is used for representing the similarity between the attribution human behavior features, and the larger the similarity between the attribution human behavior features is, the smaller the difference between the attribution human behavior features is. Illustratively, when the attribute is the device used when using an account, the more similar the devices used when using the two accounts, the less the difference in the devices used when using the two accounts; for example, when the attribution behavior characteristic is a geographic location when using an account, the more similar the geographic locations when using two accounts, the less the difference in geographic locations when using the two accounts.
Under the condition that a sample account as a training sample is bound with the unique identifier of the affiliate, the prediction model can be constructed by taking the unique identifier of the affiliate as a training label in a supervised training mode; in the case that the sample account as the training sample is not bound with the unique identifier of the attribution, the prediction model can be constructed in an unsupervised training mode on the basis of the content similarity between accounts.
After determining accounts corresponding to two nodes connected by an edge in a community graph, obtaining the similarity of the behavior characteristics of the affiliation person between the two accounts based on the comparison of the behavior characteristics of the affiliation person of the two accounts, and inputting the similarity of the behavior characteristics of the affiliation person into a pre-constructed prediction model; the prediction model predicts the probability of the two accounts belonging to the same attribution person based on the attribution person behavior feature similarity, and outputs the prediction probability.
In the above manner, the attribution behavior feature similarity between two accounts is obtained from the attribution behavior features of the accounts corresponding to the two nodes connected by each edge and is input into a pre-constructed prediction model, which predicts the probability that the two accounts belong to the same owner; this improves prediction accuracy.
When the sample accounts used as training samples are bound to owners' unique identifiers, constructing the prediction model by supervised training with the unique identifier as the training label specifically includes the following steps: when each sample account is bound to an owner's unique identifier, determining the sample account pairs bound to the same unique identifier; taking the shared unique identifier as the training label of the sample account pair; and obtaining the prediction model based on the training labels and the attribution behavior feature similarities of the sample account pairs.
The sample account refers to an account used as a training sample for constructing the prediction model. The unique identification code of the attribution is used for uniquely representing the attribution, such as a mobile phone number and an identification card number; when opening an account, a user can determine whether to provide the unique identifier of the attribution according to own intention, and the opened account is bound with the unique identifier of the attribution. When some users open accounts, the unique identification codes of the affiliates are voluntarily provided, the opened accounts are bound with the unique identification codes of the affiliates, and at the moment, the accounts bound with the unique identification codes of the affiliates can be used as sample accounts for training a prediction model.
Sample account pairs bound to the same unique identifier are determined from the sample accounts that are bound to an owner's unique identifier. The shared unique identifier serves as the training label of the sample account pair and indicates that the two sample accounts of the pair belong to the same owner. During training, the prediction model assigns a weight to each attribution behavior feature similarity of the sample account pair and obtains a corresponding prediction result. When the prediction result indicates that the two sample accounts belong to the same owner, the prediction result is consistent with the training label of the pair, and the weights of the prediction model can be adjusted by a small amount. When the prediction result indicates that the two sample accounts do not belong to the same owner, the prediction result is inconsistent with the training label, and the weights can be adjusted by a larger amount. The weights assigned to the attribution behavior feature similarities are adjusted in this way until supervised training is complete, and the trained prediction model is taken as the constructed prediction model.
In the above manner, under the condition that the sample account is bound with the unique identifier of the affiliate, the sample account pair bound with the same unique identifier of the affiliate can be determined, the unique identifier of the affiliate is used as a training label of the sample account pair, and a prediction model is constructed based on the training label and the similarity of behavior characteristics of the affiliate of the sample account pair, so that supervised training is realized, and the prediction accuracy of the prediction model is ensured.
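One way to picture this supervised procedure is the sketch below, which uses plain logistic regression trained by gradient descent as a stand-in for the prediction model. This is an assumption: the disclosure does not name a model type, and the bias term, the use of differently bound accounts as negative pairs, the helper names, and the feature values are all hypothetical. The error-proportional update does mirror the description, since a prediction that agrees with the label produces a small adjustment and a disagreeing one a larger adjustment.

```python
import math
from itertools import combinations
from typing import Dict, List, Tuple


def label_pairs(owner_id: Dict[str, str]) -> Dict[Tuple[str, str], int]:
    """Label 1 when both sample accounts are bound to the same identifier.

    Treating differently bound accounts as label 0 is an assumption.
    """
    return {(a, b): int(owner_id[a] == owner_id[b])
            for a, b in combinations(sorted(owner_id), 2)}


def predict(weights: List[float], bias: float, x: List[float]) -> float:
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def train(samples: List[Tuple[List[float], int]],
          epochs: int = 500, lr: float = 0.5) -> Tuple[List[float], float]:
    n = len(samples[0][0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n, 0.0
        for x, y in samples:
            err = predict(weights, bias, x) - y  # small when prediction matches label
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        weights = [w - lr * g / len(samples) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(samples)
    return weights, bias


# Hypothetical sample accounts bound to owner identifiers.
bound = {"u1": "id_A", "u2": "id_A", "u3": "id_B"}
labels = label_pairs(bound)
# Hypothetical feature similarity rows (a1..a5) for each labeled pair.
rows = {("u1", "u2"): [1, 1, 1, 0, 0.1],
        ("u1", "u3"): [0, 0, 0, 1, 0.9],
        ("u2", "u3"): [0, 1, 0, 1, 0.8]}
weights, bias = train([(rows[pair], y) for pair, y in labels.items()])
```

After training, `predict(weights, bias, row)` can be compared against the preset threshold to decide whether an edge is a target edge.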
In a possible implementation manner, when there are multiple attribution behavior feature similarities, the step of obtaining the prediction probability, output by the prediction model, that the accounts corresponding to the two nodes connected by each edge belong to the same owner may specifically include: obtaining a fused attribution behavior feature similarity, formed by the prediction model as the weighted sum of the attribution behavior feature similarities using the prediction weight corresponding to each similarity; and taking the fused attribution behavior feature similarity output by the prediction model as the prediction probability.
After the prediction model is constructed, it includes the weight assigned to each attribution behavior feature similarity. When the accounts user_id1 and user_id2 have an association relationship, the electronic device may look up the attribution behavior feature similarities between user_id1 and user_id2 in Table 1 and input them into the prediction model. The prediction model assigns the corresponding weight to each attribution behavior feature similarity and sums the weighted similarities; the weighted sum is the fused attribution behavior feature similarity. The electronic device takes the fused attribution behavior feature similarity output by the prediction model as the prediction probability. When the prediction probability is greater than or equal to the probability threshold, user_id1 and user_id2 belong to the same owner, and the edge connecting the node of user_id1 and the node of user_id2 in the community graph is retained; when the prediction probability is smaller than the probability threshold, user_id1 and user_id2 do not belong to the same owner, and that edge is deleted from the community graph.
In the above manner, the prediction model performs weighted summation processing on the behavior feature similarity of each attribution based on the prediction weight corresponding to the behavior feature similarity of each attribution, so as to reflect the importance degree of different attribution behavior features on two accounts corresponding to prediction belonging to the same attribution, and ensure the accuracy of the prediction probability obtained by the weighted summation processing.
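The weighted summation and threshold comparison can be written out as a short sketch. The function names, the example weights, and the assumption that every similarity is oriented so that a larger value means "more likely the same owner" are illustrative choices, not part of the disclosure.

```python
from typing import Sequence


def fused_similarity(similarities: Sequence[float],
                     weights: Sequence[float]) -> float:
    """Weighted sum of the attribution behavior feature similarities."""
    if len(similarities) != len(weights):
        raise ValueError("one prediction weight is needed per similarity")
    return sum(w * a for w, a in zip(weights, similarities))


def keep_edge(similarities: Sequence[float],
              weights: Sequence[float], p: float) -> bool:
    """Retain the edge when the fused similarity reaches the threshold p."""
    return fused_similarity(similarities, weights) >= p


# Hypothetical similarities and prediction weights for one account pair.
sims = [1.0, 1.0, 0.0, 0.0, 0.5]
w = [0.4, 0.2, 0.2, 0.1, 0.1]
score = fused_similarity(sims, w)   # roughly 0.65
retain = keep_edge(sims, w, p=0.5)
```

Keeping the weights non-negative and summing to one, as in this example, keeps the fused value within [0, 1] so that it can be read as a probability; that normalization is an assumption of the sketch.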
When most users do not wish to provide an owner's unique identifier and the sample accounts are therefore not bound to one, the prediction model can be constructed by unsupervised training to obtain its prediction weights, which specifically includes the following steps: when the sample accounts are not bound to an owner's unique identifier, obtaining the account content similarity between the two sample accounts of each sample account pair; and determining the prediction weights based on the account content similarity and the attribution behavior feature similarities of the sample account pair.
The account content can be account nicknames, account head portraits, video information issued by accounts and the like which are provided by user consent; comparing the account contents of the two accounts to obtain the account content similarity between the two accounts; illustratively, comparing the account nickname, the account avatar and the video information issued by the account between two accounts, when the account contents between the two accounts are similar, the two accounts have a higher probability of belonging to the same owner.
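As a small illustration of one component of account content similarity, the sketch below compares account nicknames using the standard-library `difflib.SequenceMatcher`. Choosing this matcher, lowercasing the nicknames, and restricting the example to nicknames are assumptions; comparing avatars or published videos would need other techniques that are not covered here.

```python
from difflib import SequenceMatcher


def nickname_similarity(nick_a: str, nick_b: str) -> float:
    """Similarity in [0, 1] between two account nicknames (case-insensitive)."""
    return SequenceMatcher(None, nick_a.lower(), nick_b.lower()).ratio()


same = nickname_similarity("SunnyCat", "sunnycat")
different = nickname_similarity("abc", "xyz")
```

A value close to 1 would count as evidence that the two accounts have similar content, and thus a higher probability of belonging to the same owner.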
After a sample account pair is obtained, the account content similarity and the attribution behavior feature similarities between the two sample accounts of the pair are input into the prediction model. During training, the prediction model assigns a weight to each attribution behavior feature similarity and obtains a corresponding prediction result. When the prediction result indicates that the two sample accounts belong to the same owner and the account content similarity indicates the same, the two are consistent and the weights are adjusted by a small amount. When the prediction result indicates that the two sample accounts do not belong to the same owner but the account content similarity indicates that they do, the two are inconsistent and the weights are adjusted by a large amount. The weights assigned to the attribution behavior feature similarities are adjusted in this way until unsupervised training is complete, and the trained prediction model is taken as the constructed prediction model.
In the above manner, under the condition that the sample account is not bound with the unique identifier of the affiliate, unsupervised training is performed based on the content similarity between accounts, so that the prediction accuracy of the constructed prediction model is ensured.
In one possible implementation manner, the step of determining the prediction weights based on the account content similarity and the attribution behavior feature similarities of the sample account pair specifically includes: obtaining preset initial prediction weights; obtaining the prediction probability that the two sample accounts of the pair belong to the same owner based on the initial prediction weights and the attribution behavior feature similarities of the pair; when the prediction probability indicates that the two sample accounts do not belong to the same owner while the account content similarity indicates that they do, adjusting the initial prediction weights and determining the prediction probability again based on the adjusted weights; and when the re-determined prediction probability indicates that the two sample accounts belong to the same owner, taking the adjusted initial prediction weights as the prediction weights.
After obtaining the preset initial prediction weights, the electronic device assigns the corresponding initial prediction weight to each attribution behavior feature similarity of the sample account pair and takes the weighted sum as the prediction probability. When the prediction probability indicates that the two sample accounts do not belong to the same owner while the account content similarity indicates that they do, the initial prediction weights can be adjusted by a larger amount; the adjusted weights are then assigned to the attribution behavior feature similarities, and the resulting weighted sum is taken as the new prediction probability. When the new prediction probability indicates that the two sample accounts belong to the same owner, consistent with what the account content similarity indicates, the adjusted initial prediction weights are taken as the trained prediction weights, and the constructed prediction model is obtained.
In the above manner, when the sample accounts are not bound to an owner's unique identifier, preset initial prediction weights are assigned to the attribution behavior feature similarities of the sample account pair. When the prediction result obtained from the weighted sum is inconsistent with what the account content similarity indicates, the initial prediction weights are adjusted until the prediction result becomes consistent with it, and the adjusted weights are taken as the trained prediction weights. This realizes unsupervised training and ensures the prediction accuracy of the prediction model.
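The adjust-until-consistent loop can be sketched for a single sample account pair as follows. The additive update rule (raising each weight in proportion to its feature similarity), the `step` size, and the iteration cap are assumptions for illustration; the disclosure only states that the initial weights are adjusted until the prediction agrees with the account content similarity.

```python
from typing import List, Sequence


def fit_weights_unsupervised(similarities: Sequence[float],
                             content_says_same: bool,
                             initial_weights: Sequence[float],
                             threshold: float,
                             step: float = 0.1,
                             max_iters: int = 100) -> List[float]:
    """Adjust the initial prediction weights until the weighted sum agrees
    with the account content similarity, or until max_iters is reached."""
    weights = list(initial_weights)
    for _ in range(max_iters):
        prob = sum(w * a for w, a in zip(weights, similarities))
        predicted_same = prob >= threshold
        # Only the inconsistency described in the text triggers an adjustment:
        # prediction says "different owners", content similarity says "same owner".
        if predicted_same or not content_says_same:
            break
        weights = [w + step * a for w, a in zip(weights, similarities)]
    return weights


sims = [1.0, 1.0, 0.0]
trained = fit_weights_unsupervised(sims, content_says_same=True,
                                   initial_weights=[0.1, 0.1, 0.1],
                                   threshold=0.5)
final_prob = sum(w * a for w, a in zip(trained, sims))
```

In this example two adjustments are enough for the weighted sum to reach the threshold, and the weight of the feature whose similarity is zero is left unchanged.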
After the prediction model is constructed, it includes the weights (w1, …, wn) assigned to the attribution behavior feature similarities. When the accounts user_id1 and user_id2 have an association relationship, the electronic device may look up the attribution behavior feature similarities (a1, …, an) between user_id1 and user_id2 in Table 1 and input them into the prediction model. The prediction model assigns the corresponding weight to each attribution behavior feature similarity and sums the weighted similarities to obtain the weighted sum

$$\mathrm{sim}(\mathrm{user\_id1}, \mathrm{user\_id2}) = \sum_{i=1}^{n} w_i a_i$$

which serves as the fused attribution behavior feature similarity. The electronic device takes the fused attribution behavior feature similarity output by the prediction model as the prediction probability. When the prediction probability is greater than or equal to the probability threshold, namely sim(user_id1, user_id2) ≥ p, user_id1 and user_id2 belong to the same owner, and the edge connecting the node of user_id1 and the node of user_id2 in the community graph is retained; when the prediction probability is smaller than the probability threshold, user_id1 and user_id2 do not belong to the same owner, and that edge is deleted from the community graph.
For a better understanding of the above method, an application example of the account processing method of the present disclosure is explained in detail below with reference to fig. 3(a), fig. 3(b), and fig. 4. The application example can be used for account analysis within one social network to construct the correspondence from accounts (user_id) to natural persons (person_id). Overall, the application example combines multiple kinds of association relationships between accounts to construct an account association relationship graph (in which accounts are nodes, and the nodes of associated accounts are connected by edges). When the sample accounts are bound to owners' unique identifiers, real-name authentication information such as ID card numbers or mobile phone numbers is used as the training label of the prediction model for supervised training; when the sample accounts are not bound to owners' unique identifiers, unsupervised training is performed based on the content similarity between accounts. The weights of the edges are calculated by a statistical method, the graph is then partitioned using graph mining algorithms (for example, connected branch computation and community mining), and finally the correspondence from accounts to natural persons is output.
The overall framework of this application example uses a recall layer plus a fine selection layer, wherein:
(1) user_id refers to the account id and corresponds to the unique identifier on the social network platform;
(2) community_id refers to the community id; accounts in the same community may belong to the same natural person;
(3) person_id refers to the natural person id, corresponding to the natural person behind a user_id.
The application example mainly constructs a mapping from user_id to person_id and can comprise two parts: the first part constructs a mapping from user_id to community_id, and the second part constructs a mapping from community_id to person_id, as shown in fig. 3(a).
The first part is to ensure recall rate, that is, to ensure that the rest of the associated accounts of each account user _ id are in the same community, that is, all the accounts of each natural person are in the community as much as possible, so as to form a candidate account set of natural persons. The second part is to ensure accuracy, that is, to ensure accuracy of mapping relationship after the partition from community _ id to person _ id. The application example can effectively guarantee the accuracy and recall rate of natural person extraction through the two parts. The above two parts are described in detail below:
First part: the recall layer.
In the construction scheme of the recall layer, the association relationships between accounts need to be extracted (step S401). The association relationships between accounts include, but are not limited to, the following: (1) association through ID cards; (2) association through mobile phone numbers; (3) association through the same cash withdrawal account; (4) association through devices; (5) association through WiFi; (6) association through geographic locations. If any of the above association relationships exists between the accounts user_id1 and user_id2, an edge can be formed between user_id1 and user_id2, yielding a graph G = <V, E>, where V represents the set of accounts and E represents the set of edges between accounts. There are also several ways to construct edges: (1) edges formed using only a certain type of association relationship; (2) edges formed using the intersection or union of multiple types of association relationships. How the edges are established needs to be decided according to the specific situation.
After the account association graph is constructed, a plurality of connected branches can be mined from the association graph by using a connected branch algorithm (connected component) in graph mining, and the connected branch ids can be used as community ids which represent a candidate set of natural people under the same community id. Through the graph mining method, all the other account ids associated with a certain user _ id can be mined to form a community _ id (step S402).
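The recall layer can be sketched as two small functions: one forms edges between accounts that share an entity object (here, the union of all association types), and one assigns community ids via connected components using a union-find structure. The function names, the string encoding of entity objects, and the choice of union-find are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Iterable, Set, Tuple


def build_edges(account_entities: Dict[str, Set[str]]) -> Set[Tuple[str, str]]:
    """Connect every pair of accounts that shares at least one entity object."""
    accounts_by_entity: Dict[str, Set[str]] = defaultdict(set)
    for account, entities in account_entities.items():
        for entity in entities:
            accounts_by_entity[entity].add(account)
    edges: Set[Tuple[str, str]] = set()
    for accounts in accounts_by_entity.values():
        edges.update(combinations(sorted(accounts), 2))
    return edges


def connected_components(nodes: Iterable[str],
                         edges: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Map each account to a community id (one id per connected component)."""
    nodes = sorted(nodes)
    parent = {n: n for n in nodes}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)
    ids: Dict[str, int] = {}
    return {n: ids.setdefault(find(n), len(ids)) for n in nodes}


# Hypothetical accounts and the entity objects associated with them.
account_entities = {
    "user_id1": {"device:d1", "wifi:w1"},
    "user_id2": {"device:d1"},
    "user_id3": {"wifi:w1"},
    "user_id4": {"phone:p9"},
}
edges = build_edges(account_entities)
community_of = connected_components(account_entities, edges)
```

Using the union of association types favors recall, which matches the stated goal of this layer; restricting `account_entities` to one entity type would give the single-association variant of edge construction.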
Second part: the fine selection layer.
The goal of the fine selection layer is to refine the natural person attribute of the user, that is, to map the community id (community_id) to the natural person id (person_id). In other words, dissimilar persons in a community are separated and similar persons in the community are gathered together, so as to form accurate natural persons. Identifying the same natural person is essentially a matter of judging the similarity of two accounts, i.e., determining whether the pair (user_id1, user_id2) is the same natural person. If user_id1 and user_id2 are the same natural person, this is represented by the label 1; if they are not the same natural person, this is represented by the label 0.
The first step: to extract natural persons, a feature table of associated account pairs (as shown in Table 1) needs to be constructed (step S403). The features include but are not limited to: (1) whether user_id1 and user_id2 use the same device; (2) whether user_id1 and user_id2 use the same WiFi; (3) whether user_id1 and user_id2 appear in the same geographic location; (4) whether user_id1 and user_id2 appear in different geographic locations at the same time; (5) conditional probabilities. The feature function of the accounts user_id1 and user_id2 is written f(user_id1, user_id2) = (a1, …, an).
The second step: the labeled case and the unlabeled case.
Case 1: if there is a label (e.g., ID card, phone number) that can determine whether user_id1 and user_id2 are the same natural person, a binary classifier can be used for prediction (step S404a). The binary classifier is trained on the existing labels (obtained through manual labeling or through verification information such as ID cards and mobile phone numbers). After the trained classifier is obtained, the existing edges in each connected branch are predicted to judge whether each edge should exist. If the predicted value of the binary classifier is close to 1, the edge is retained; if it is close to 0, the edge is deleted. Illustratively, as shown in fig. 3(b), the account usage similarity between two accounts of connected branch 1 is 0.2, which indicates that this edge should not exist, i.e., the two accounts do not belong to the same natural person, and the edge is deleted.
Case 2: if there is no label that can determine whether user_id1 and user_id2 are the same natural person, an unsupervised algorithm is needed (step S404b). For the n features, prediction weights (w1, …, wn) are set, and the similarity between the accounts user_id1 and user_id2 is calculated as

$$\mathrm{sim}(\mathrm{user\_id1}, \mathrm{user\_id2}) = \sum_{i=1}^{n} w_i a_i$$

If sim(user_id1, user_id2) ≥ p, the edge between the accounts is retained; otherwise, it is deleted.
The third step: a natural person extraction process (step S405), wherein if the edges are deleted based on the account usage similarity or the result of the two classifiers, the natural person can be further extracted using a connected branch algorithm, and the same natural person is in one connected branch; if edges are not deleted, but are weighted, natural people can be extracted using a community finding algorithm, and the same natural person is in a community. In addition, a natural person set can be obtained by using a clustering algorithm such as hierarchical clustering. As shown in fig. 3(b), after the edges of the connected branches 1 are deleted, the connected branches algorithm is used to extract the natural persons 1 to 4.
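The connected-branch variant of natural person extraction can be sketched as a breadth-first search over the edges that survive the threshold, producing a user_id to person_id mapping. The function name, the integer person ids, and the sample scores are hypothetical; the sketch only covers the case where low-scoring edges are deleted, not the weighted community discovery or clustering alternatives.

```python
from collections import defaultdict, deque
from typing import Dict, Iterable, Set, Tuple


def extract_natural_persons(nodes: Iterable[str],
                            scored_edges: Dict[Tuple[str, str], float],
                            p: float) -> Dict[str, int]:
    """Drop edges scoring below p, then give every remaining connected
    branch its own person id."""
    adjacency: Dict[str, Set[str]] = defaultdict(set)
    for (a, b), score in scored_edges.items():
        if score >= p:  # retained edge
            adjacency[a].add(b)
            adjacency[b].add(a)

    person_of: Dict[str, int] = {}
    next_id = 0
    for start in sorted(nodes):
        if start in person_of:
            continue
        person_of[start] = next_id
        queue = deque([start])
        while queue:  # breadth-first search within one branch
            current = queue.popleft()
            for neighbor in adjacency.get(current, ()):
                if neighbor not in person_of:
                    person_of[neighbor] = next_id
                    queue.append(neighbor)
        next_id += 1
    return person_of


# One hypothetical community whose middle edge scores only 0.2.
accounts = ["user_id1", "user_id2", "user_id3", "user_id4"]
scores = {("user_id1", "user_id2"): 0.9,
          ("user_id2", "user_id3"): 0.2,
          ("user_id3", "user_id4"): 0.8}
person_of = extract_natural_persons(accounts, scores, p=0.5)
```

With these scores the single community splits into two natural persons, one holding user_id1 and user_id2 and the other holding user_id3 and user_id4.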
In the community division of the recall layer, a community division algorithm can be applied directly to form the community graph. In the natural person extraction part of the fine selection layer, a clustering algorithm such as hierarchical clustering or KMeans can be used in addition to the connected branch algorithm and the community division algorithm.
This application example has the following effects. By combining the recall layer with the fine selection layer, accuracy can be improved as far as possible while the recall rate and computational efficiency are maintained. In actual use, either the result of the recall layer or the result of the fine selection layer can be chosen according to actual service requirements, which makes the scheme easier to integrate into different systems. For the fine selection layer, a supervised algorithm can be used when labels exist, and an unsupervised algorithm can be used when they do not.
It should be understood that, although the steps in the flowcharts of fig. 1 to 4 are shown in the sequence indicated by the arrows, they are not necessarily performed in that sequence. Unless explicitly stated otherwise, the order of the steps is not strictly limited, and the steps may be performed in other orders. Moreover, at least some of the steps in fig. 1 to 4 may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
FIG. 5 is a block diagram illustrating an account processing device according to an example embodiment. Referring to fig. 5, the apparatus includes:
an account association relationship graph building module 501, configured to build an account association relationship graph by taking each account in a preset account set as a node and taking the association relationship between any two accounts as an edge between the corresponding nodes, where the association relationship indicates that the two accounts are associated with at least one same entity object;
a community division module 502, configured to perform community division on the account association relationship graph to generate a community graph;
a community graph adjustment module 503, configured to delete a target edge in the community graph to obtain an adjusted community graph, where the prediction probability that the accounts corresponding to the two nodes connected by the target edge belong to the same owner is smaller than a preset threshold, the prediction probability is determined according to the difference between the owner behavior features of the two accounts, and the owner behavior features are behavior features generated when the owner of an account uses that account;
an account processing module 504, configured to determine the accounts corresponding to connected nodes in the adjusted community graph as accounts belonging to the same owner.
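The graph construction attributed to the graph building module can be sketched as below. The input layout (a mapping from each account to the set of entity objects it is associated with), the entity names and the function name `build_association_graph` are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_association_graph(account_entities):
    """Connect two accounts with an edge when they share at least one entity."""
    accounts_by_entity = defaultdict(set)
    for account, entities in account_entities.items():
        for entity in entities:
            accounts_by_entity[entity].add(account)

    edges = set()
    for accounts in accounts_by_entity.values():
        # Every pair of accounts sharing this entity is associated.
        for a, b in combinations(sorted(accounts), 2):
            edges.add((a, b))
    return set(account_entities), edges


# Hypothetical accounts and the entity objects (devices, IP addresses) they used.
account_entities = {
    "u1": {"device_1", "ip_1"},
    "u2": {"device_1"},
    "u3": {"ip_1", "device_2"},
    "u4": {"device_3"},
}
nodes, edges = build_association_graph(account_entities)
print(sorted(edges))
```

Indexing accounts by entity avoids comparing every pair of accounts directly, which matters when the account set is large.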
In one possible implementation, the apparatus further includes a target edge determination module, configured to determine the edges in the community graph, perform probability prediction on whether the accounts corresponding to the two nodes connected by each edge belong to the same owner, and take each edge whose prediction probability is smaller than the preset threshold as a target edge.
In one possible implementation, the target edge determination module is configured to obtain, based on the owner behavior features of the accounts corresponding to the two nodes connected by each edge, the owner behavior feature similarity between those accounts; input the owner behavior feature similarity into a pre-constructed prediction model; and obtain the prediction probability, output by the prediction model, that the accounts corresponding to the two nodes connected by each edge belong to the same owner.
In one possible implementation, the apparatus further includes a prediction model training module, configured to determine, when each sample account is bound to an owner unique identifier, the sample account pairs that are bound to the same owner unique identifier; take the shared owner unique identifier as the training label of each such sample account pair; and obtain the prediction model based on the training labels and the owner behavior feature similarities of the sample account pairs.
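One way to turn bound identifiers into training labels is sketched below. Deriving a binary label per pair (1 for the same identifier, 0 otherwise) is an assumption of this example; the text only states that the shared identifier serves as the training label. The account names and identifier values are made up.

```python
from itertools import combinations


def label_sample_pairs(account_identifier):
    """Label each sample account pair by whether both accounts are bound
    to the same owner unique identifier (1) or to different ones (0)."""
    labels = {}
    for a, b in combinations(sorted(account_identifier), 2):
        labels[(a, b)] = 1 if account_identifier[a] == account_identifier[b] else 0
    return labels


# Hypothetical sample accounts and the identifier each one is bound to.
account_identifier = {"u1": "id_A", "u2": "id_A", "u3": "id_B"}
pair_labels = label_sample_pairs(account_identifier)
print(pair_labels)
```

The resulting labels, paired with the owner behavior feature similarities of the same account pairs, form the training set for the prediction model.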
In one possible implementation, when there are multiple owner behavior feature similarities, the target edge determination module is configured to obtain a fused owner behavior feature similarity, which the prediction model forms by weighting and summing the owner behavior feature similarities with their corresponding prediction weights, and to take the fused owner behavior feature similarity output by the prediction model as the prediction probability.
In one possible implementation, the apparatus further includes a prediction weight determination module, configured to obtain, when the sample accounts are not bound to an owner unique identifier, the account content similarity between the two sample accounts of a sample account pair, and to determine the prediction weights based on the account content similarity and the owner behavior feature similarity of the sample account pair.
In one possible implementation, the prediction weight determination module is configured to obtain preset initial prediction weights; obtain, based on the initial prediction weights and the owner behavior feature similarity of the sample account pair, the prediction probability that the two sample accounts of the pair belong to the same owner; when the prediction probability indicates that the two sample accounts do not belong to the same owner while the account content similarity indicates that they do, adjust the initial prediction weights and determine again, based on the adjusted weights, the prediction probability that the two sample accounts belong to the same owner; and when the re-determined prediction probability indicates that the two sample accounts belong to the same owner, take the adjusted weights as the prediction weights.
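The weight adjustment loop can be sketched as follows. The text does not say how the weights are adjusted, so the update rule used here (raise the weight of the most similar feature by a fixed step, then renormalize), the thresholds and all names are assumptions chosen only to make the control flow concrete.

```python
def fused_similarity(sims, weights):
    """Weighted sum of owner behavior feature similarities."""
    return sum(w * s for w, s in zip(weights, sims))


def determine_weights(initial_weights, behavior_sims, content_sim,
                      p=0.6, content_threshold=0.8, step=0.05, max_rounds=100):
    """Adjust weights until the behavior-based prediction agrees with the
    content-based judgement that the pair belongs to the same owner."""
    weights = list(initial_weights)
    content_says_same = content_sim >= content_threshold
    for _ in range(max_rounds):
        predicted_same = fused_similarity(behavior_sims, weights) >= p
        if predicted_same or not content_says_same:
            break  # nothing to correct, or the two judgements already agree
        # Assumed update rule: favor the feature with the highest similarity.
        best = max(range(len(weights)), key=lambda i: behavior_sims[i])
        weights[best] += step
        total = sum(weights)
        weights = [w / total for w in weights]
    return weights, fused_similarity(behavior_sims, weights) >= p


# Hypothetical sample pair: the behavior features disagree with each other,
# but the account content is nearly identical.
weights, same_owner = determine_weights([0.5, 0.5], [0.9, 0.1], content_sim=0.95)
print(weights, same_owner)
```

If no weighting can push the fused similarity above p (for example when every behavior similarity is below p), the loop stops after `max_rounds` and reports that the pair is still predicted as different owners.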
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
Fig. 6 is a block diagram illustrating an electronic device 600 for account processing in accordance with an exemplary embodiment. For example, the electronic device 600 may be a server. Referring to fig. 6, electronic device 600 includes a processing component 620 that further includes one or more processors, and memory resources, represented by memory 622, for storing instructions, such as application programs, that are executable by processing component 620. The application programs stored in memory 622 may include one or more modules that each correspond to a set of instructions. Further, the processing component 620 is configured to execute instructions to perform the methods of account processing described above.
The electronic device 600 may also include a power component 624 configured to perform power management for the electronic device 600, a wired or wireless network interface 626 configured to connect the electronic device 600 to a network, and an input/output (I/O) interface 628. The electronic device 600 may operate based on an operating system stored in the memory 622, such as Windows Server, Mac OS X, Unix, Linux, FreeBSD, or the like.
In an exemplary embodiment, a computer-readable storage medium comprising instructions is also provided, such as the memory 622 comprising instructions, where the instructions are executable by the processor of the electronic device 600 to perform the above-described method. The storage medium may be, for example, a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
In an exemplary embodiment, there is also provided a computer program product comprising a computer program for execution by a processor to perform the method of account processing described above.
It should be noted that the information (including but not limited to device information, user personal information, etc.) and data (including but not limited to data for analysis, stored data, presented data, etc.) related to the account referred to in the present application are information and data authorized by the user or sufficiently authorized by each party.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure that follow the general principles of the disclosure and include such departures from the present disclosure as come within known or customary practice in the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with the true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings, and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

1. An account processing method, comprising:
taking each account in a preset account set as a node and taking the association relationship between any two accounts as an edge between the corresponding nodes, to establish an account association relationship graph; wherein the association relationship indicates that the two accounts are associated with at least one same entity object;
performing community division on the account association relationship graph to generate a community graph;
deleting a target edge in the community graph to obtain an adjusted community graph; wherein the prediction probability that the accounts corresponding to the two nodes connected by the target edge belong to the same owner is smaller than a preset threshold; the prediction probability is determined according to the difference between the owner behavior features of the two accounts; and the owner behavior features are behavior features generated when the owner of an account uses that account; and
determining the accounts corresponding to connected nodes in the adjusted community graph as accounts belonging to the same owner.
2. The account processing method of claim 1, wherein before the deleting of the target edge in the community graph, the method further comprises:
determining edges in the community graph;
and performing probability prediction on whether the accounts corresponding to the two nodes connected by each edge belong to the same owner, and taking each edge whose prediction probability is smaller than the preset threshold as the target edge.
3. The account processing method according to claim 2, wherein the performing probability prediction on whether the accounts corresponding to the two nodes connected to each edge belong to the same owner comprises:
obtaining, based on the owner behavior features of the accounts corresponding to the two nodes connected by each edge, the owner behavior feature similarity between the accounts corresponding to the two nodes connected by each edge;
inputting the owner behavior feature similarity into a pre-constructed prediction model;
and obtaining the prediction probability, output by the prediction model, that the accounts corresponding to the two nodes connected by each edge belong to the same owner.
4. The account processing method of claim 3, wherein the method further comprises:
when each sample account is bound to an owner unique identifier, determining a sample account pair bound to the same owner unique identifier;
taking the shared owner unique identifier as a training label of the sample account pair;
and obtaining the prediction model based on the training labels and the owner behavior feature similarities of the sample account pairs.
5. The account processing method according to claim 3, wherein when there are multiple owner behavior feature similarities, the obtaining the prediction probability, output by the prediction model, that the accounts corresponding to the two nodes connected by each edge belong to the same owner comprises:
obtaining a fused owner behavior feature similarity formed by the prediction model by weighting and summing the owner behavior feature similarities based on the prediction weights corresponding to the owner behavior feature similarities;
and taking the fused owner behavior feature similarity output by the prediction model as the prediction probability.
6. The account processing method of claim 3, wherein the method further comprises:
when the sample accounts are not bound to an owner unique identifier, acquiring the account content similarity between the two sample accounts of a sample account pair;
and determining a prediction weight based on the account content similarity and the owner behavior feature similarity of the sample account pair.
7. An account processing apparatus, comprising:
an account association relationship graph building module, configured to build an account association relationship graph by taking each account in a preset account set as a node and taking the association relationship between any two accounts as an edge between the corresponding nodes; wherein the association relationship indicates that the two accounts are associated with at least one same entity object;
a community division module, configured to perform community division on the account association relationship graph to generate a community graph;
a community graph adjustment module, configured to delete a target edge in the community graph to obtain an adjusted community graph; wherein the prediction probability that the accounts corresponding to the two nodes connected by the target edge belong to the same owner is smaller than a preset threshold; the prediction probability is determined according to the difference between the owner behavior features of the two accounts; and the owner behavior features are behavior features generated when the owner of an account uses that account; and
an account processing module, configured to determine the accounts corresponding to connected nodes in the adjusted community graph as accounts belonging to the same owner.
8. An electronic device, comprising:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the account processing method of any of claims 1 to 6.
9. A computer-readable storage medium having instructions thereon which, when executed by a processor of an electronic device, enable the electronic device to perform the account processing method of any of claims 1 to 6.
10. A computer program product comprising a computer program, wherein the computer program when executed by a processor implements the account processing method of any of claims 1 to 6.
CN202111253169.4A 2021-10-27 2021-10-27 Account processing method and device, electronic equipment and storage medium Pending CN113987087A (en)


Publications (1)

Publication Number Publication Date
CN113987087A true CN113987087A (en) 2022-01-28

Family

ID=79742225


Country Status (1)

Country Link
CN (1) CN113987087A (en)



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination