WO2020259054A1 - Associated account analysis method and apparatus, and computer-readable storage medium - Google Patents

Associated account analysis method and apparatus, and computer-readable storage medium

Info

Publication number
WO2020259054A1
WO2020259054A1 PCT/CN2020/086930 CN2020086930W WO2020259054A1 WO 2020259054 A1 WO2020259054 A1 WO 2020259054A1 CN 2020086930 W CN2020086930 W CN 2020086930W WO 2020259054 A1 WO2020259054 A1 WO 2020259054A1
Authority
WO
WIPO (PCT)
Prior art keywords
account
predetermined
relationship
threshold
information
Prior art date
Application number
PCT/CN2020/086930
Other languages
French (fr)
Chinese (zh)
Inventor
周亮
Original Assignee
京东数字科技控股有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 京东数字科技控股有限公司
Publication of WO2020259054A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/45 Structures or tools for the administration of authentication

Definitions

  • the present disclosure relates to the technical field of big data analysis, in particular to an associated account analysis method, device and computer-readable storage medium.
  • account identification is often carried out by means of real-name account information and registered mobile phone number.
  • on the one hand, a real-name account management system can improve security, prevent online fraud, and enable rapid recovery of losses; on the other hand, it allows business promotion to be targeted, which improves promotion efficiency.
  • an associated account analysis method, which includes: establishing an association relationship between an account and a number identifier according to account information, where the number identifier is a mobile phone number or an ID number; establishing an association relationship between the account and a login terminal according to login information; establishing a relationship graph based on the association relationship between the account and the number identifier and the association relationship between the account and the login terminal, where the account, the number identifier, and the login terminal are vertices, and vertices having an association relationship are connected; and determining, according to the relationship graph, that an account whose shortest path between accounts is less than a predetermined first threshold is an associated account.
  • the associated account analysis method further includes: determining, according to account data information associated with the account, the weights of the association relationship between the account and the number identifier and of the association relationship between the account and the login terminal, as the weights of the paths between the vertices of the corresponding association relationships. The weight of an association relationship is greater than 0 and is negatively correlated with at least one of the number of occurrences of the association relationship in the account data information or the predetermined importance of the event in which the association relationship appears in the account data information.
  • determining that an account whose shortest path between accounts is less than a predetermined first threshold is an associated account includes: determining that an account whose shortest path to a predetermined seed account is less than the predetermined first threshold is a target account; and determining that the target account is an associated account of the predetermined seed account.
  • establishing the relationship graph includes: generating abstract vertex identifiers for the account, the number identifier, and the login terminal, and recording the correspondence between the abstract vertex identifiers and the accounts; and connecting abstract vertex identifiers that have an association relationship to generate the relationship graph. Determining that an account whose shortest path between accounts is less than a predetermined first threshold is an associated account includes: taking the abstract vertex identifier of the predetermined seed account as a starting point, determining the abstract vertex identifiers in the relationship graph that correspond to accounts and whose distance from the predetermined seed account is less than the predetermined first threshold; and restoring the determined abstract vertex identifiers to accounts according to the correspondence between the abstract vertex identifiers and the accounts.
  • the associated account analysis method further includes: selecting, in descending order of the weight of the path, the first accounts, up to a predetermined second threshold in number, as the key associated accounts of the account.
  • the edge lengths in the relationship graph are positively correlated with the weights; determining, according to the relationship graph, that an account whose shortest path between accounts is less than the predetermined first threshold is an associated account includes: determining, according to a shortest path algorithm, the accounts whose path length is less than the predetermined value.
  • the associated account analysis method further includes: storing at least one of the associated account or key associated account of each account; and supplementing the account real-name system information according to at least one of the associated account or the key associated account.
  • the associated account analysis method further includes: storing at least one of the associated account or key associated account of each account; reducing the pushing of information to at least one of the associated account or the key associated account; and adjusting at least one of the predetermined first threshold or the predetermined second threshold according to the promotion effect of the information.
  • determining, according to the relationship graph, that an account whose shortest path between accounts is less than a predetermined first threshold is an associated account includes: calculating, according to the breadth-first search (BFS) algorithm, all vertices that can be reached from the source vertex within the predetermined first threshold distance; calculating, according to the Dijkstra algorithm, the first predetermined number of vertices reaching the source vertex within the range of all reachable vertices; and determining the accounts corresponding to the first predetermined number of vertices as the associated accounts of the source vertex account.
  • an associated account analysis device, including: an association relationship establishing unit configured to establish an association relationship between an account and a number identifier according to account information, and to establish an association relationship between the account and a login terminal according to login information, where the number identifier is a mobile phone number or an ID number; a relationship graph generating unit configured to establish a relationship graph based on the association relationship between the account and the number identifier and the association relationship between the account and the login terminal, where the account, the number identifier, and the login terminal are vertices, and vertices having an association relationship are connected; and an associated account determination unit configured to determine, according to the relationship graph, that an account whose shortest path between accounts is less than a predetermined first threshold is an associated account.
  • an associated account analysis device, including: a memory; and a processor coupled to the memory, the processor being configured to execute any of the above associated account analysis methods based on instructions stored in the memory.
  • a computer-readable storage medium on which computer program instructions are stored, which when executed by a processor, implement the steps of any of the above associated account analysis methods.
  • FIG. 1 is a flowchart of some embodiments of the associated account analysis method of the present disclosure.
  • FIG. 2 is a schematic diagram of vertex relationships of some embodiments in the associated account analysis method of the present disclosure.
  • FIG. 3 is a flowchart of other embodiments of the associated account analysis method of the present disclosure.
  • FIG. 4 is a flowchart of still other embodiments of the associated account analysis method of the present disclosure.
  • FIG. 5 is a schematic diagram of some embodiments of account conversion in the associated account analysis method of the present disclosure.
  • FIG. 6 is a schematic diagram of some embodiments of the associated account analysis device of the present disclosure.
  • FIG. 7 is a schematic diagram of other embodiments of the associated account analysis device of the present disclosure.
  • FIG. 8 is a schematic diagram of still other embodiments of the associated account analysis device of the present disclosure.
  • the flowchart of some embodiments of the associated account analysis method of the present disclosure is shown in FIG. 1 and includes steps 101 to 104.
  • in step 101, an association relationship between the account and the number identifier is established according to the account information.
  • the number identifier is a mobile phone number or an ID number.
  • the account information may include an account number, and one or more of the user's name, contact information, or ID number.
  • in step 102, an association relationship between the account and the login terminal is established according to the login information.
  • steps 101 and 102 may be executed in any order.
  • in step 103, a relationship graph is established according to the association relationships between the account and the number identifier and between the account and the login terminal, with the account, the number identifier, and the login terminal as vertices, as shown in FIG. 2.
  • in the relationship graph, vertices having an association relationship are connected. For example, if two accounts log in using the same terminal, the vertices of both accounts are connected to the vertex of that login terminal.
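  • as a minimal sketch of steps 101 to 103, the relationship graph can be held as an adjacency map. The record layouts (account/number pairs and account/terminal pairs), the vertex tagging scheme, the sample values, and the name `build_relationship_graph` are illustrative assumptions and are not taken from the disclosure.

```python
from collections import defaultdict


def build_relationship_graph(account_info, login_info):
    """Build an undirected graph whose vertices are accounts, number
    identifiers (phone or ID numbers), and login terminals.

    account_info: iterable of (account, number_identifier) pairs (assumed layout)
    login_info:   iterable of (account, login_terminal) pairs    (assumed layout)
    """
    graph = defaultdict(set)
    # Step 101: account <-> number identifier associations.
    for account, number in account_info:
        a, n = ("account", account), ("number", number)
        graph[a].add(n)
        graph[n].add(a)
    # Step 102: account <-> login terminal associations.
    for account, terminal in login_info:
        a, t = ("account", account), ("terminal", terminal)
        graph[a].add(t)
        graph[t].add(a)
    # Step 103: the adjacency map is the relationship graph.
    return graph


# Hypothetical sample data: A and B share a phone number, B and C share a device.
account_info = [("A", "13800000001"), ("B", "13800000001")]
login_info = [("B", "device-1"), ("C", "device-1")]
graph = build_relationship_graph(account_info, login_info)
```

    Tagging each vertex with its kind keeps an account name from colliding with a phone number or device number that happens to have the same text.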
  • in step 104, an account whose shortest path between accounts is less than a predetermined first threshold is determined to be an associated account according to the relationship graph.
  • the path length between vertices may be determined based on the shortest path algorithm.
  • if account A and account B have the same mobile phone number, account B is identified as an associated account (or "vest" account, that is, an alternate account of the same user) of account A. If account C does not match account A on either login terminal or mobile phone number, the two are recognized as unassociated accounts. However, if account C and account B share a login terminal, account C is likely to be a vest account of account A, and this situation is often not recognized. In addition, using a relational database to identify the associated accounts of millions of seed users within a large-scale account base is very complicated and time-consuming.
  • in this way, a relationship graph of accounts, number identifiers, and login terminals can be established based on account information and login information, and associated accounts can be determined according to the shortest path between accounts. This improves the ability to identify associated accounts and thereby account management capabilities, which is conducive to network security management and control.
  • in order to reduce the amount of calculation, a predetermined seed account can be selected as needed.
  • the target account whose association relationships need to be analyzed is used as the predetermined seed account, and the shortest path lengths from the predetermined seed account to the vertices of other accounts are determined. This avoids the excessive calculation of traversing from every starting point and improves calculation efficiency.
  • the predetermined first threshold may be set as the threshold of the shortest path length. If the shortest path length between two accounts is less than the predetermined first threshold, the two are associated accounts; if it is greater than or equal to the predetermined first threshold, the associated account relationship is excluded.
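  • the seed-account search with a first threshold can be sketched as follows. This is an illustration only: it assumes every edge has length 1 (the weighted case is described separately), and the names `associated_accounts` and `is_account` as well as the sample graph are hypothetical.

```python
from collections import deque


def associated_accounts(graph, seed, first_threshold, is_account):
    """Return {account: path_length} for accounts whose shortest path from
    `seed` is strictly less than `first_threshold` (unit edge lengths assumed)."""
    dist = {seed: 0}
    queue = deque([seed])
    while queue:
        v = queue.popleft()
        if dist[v] + 1 >= first_threshold:
            continue  # neighbours of v would not be below the threshold
        for nbr in graph.get(v, ()):
            if nbr not in dist:
                dist[nbr] = dist[v] + 1
                queue.append(nbr)
    return {v: d for v, d in dist.items() if v != seed and is_account(v)}


# Hypothetical graph: A and B share phone P1, B and C share device D1.
graph = {
    "A": ["P1"],
    "P1": ["A", "B"],
    "B": ["P1", "D1"],
    "D1": ["B", "C"],
    "C": ["D1"],
}
accounts = {"A", "B", "C"}
near = associated_accounts(graph, "A", 3, accounts.__contains__)
far = associated_accounts(graph, "A", 5, accounts.__contains__)
```

    With a threshold of 3 only account B (two edges away) is returned; raising the threshold to 5 also returns account C, which is reached through B's login terminal.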
  • an upper limit on the number of associated accounts of the source account can also be set. Associated accounts are taken in order of shortest path length so that their number does not exceed the preset upper limit, which avoids excessive association between accounts and reduces the probability of error.
  • the weights of the association relationships between the account and the number identifier and between the account and the login terminal may also be determined according to the account data information associated with the account.
  • the weight of an association relationship is used as the weight of the path between the vertices of the corresponding association relationship.
  • the size of the weight is negatively correlated with the strength of the relationship; for example, it is the reciprocal of the strength of the relationship.
  • the account data information may include one or more of order information, login information, or delivery information corresponding to the account.
  • the weight of the association relationship may be determined according to at least one of the number of occurrences of the association relationship in the account data information or the predetermined importance of the event in which the association relationship appears in the account data information. For example, if the weight of an order event is 2, the reciprocal of the number of shopping orders associated with the mobile phone number, multiplied by 2, is the weight between the mobile phone number and the account generated by order events; if the weight of a login event is 1, the reciprocal of the number of times the user account has logged in on a certain device, multiplied by 1, is the weight between the login terminal and the account generated by login events.
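  • the weighting example above can be written as a short sketch. The event weights 2 and 1 come from the example in the text; the names `EVENT_WEIGHTS` and `association_weight` are hypothetical, and how weights from several event types on the same edge would be combined is not stated in the source and is not modelled here.

```python
# Event weights taken from the example in the text (order = 2, login = 1).
EVENT_WEIGHTS = {"order": 2.0, "login": 1.0}


def association_weight(event_type, occurrences):
    """Weight of an association relationship: the event weight multiplied by
    the reciprocal of the number of occurrences, so it is always > 0 and
    shrinks as the relationship is observed more often."""
    if occurrences <= 0:
        raise ValueError("an association relationship needs at least one occurrence")
    return EVENT_WEIGHTS[event_type] * (1.0 / occurrences)


# Four orders tied to a phone number, five logins on one device.
phone_weight = association_weight("order", 4)
device_weight = association_weight("login", 5)
```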
  • the similarity between associated accounts can be determined according to the sum of the weights of the paths in the shortest path between the associated accounts.
  • the sum of the weights is negatively correlated with the similarity, for example, inversely proportional to it.
  • after the associated accounts are filtered out, further determining the degree of similarity between them helps to further measure the degree of association between accounts.
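  • one way to read "negatively correlated, for example inversely proportional" is to take the reciprocal of the summed path weight. The sketch below assumes exactly that mapping; the function names, the `frozenset` edge keys, and the sample weights are illustrative choices.

```python
def path_weight(path, edge_weights):
    """Sum the weights of consecutive edges along `path` (a list of vertices).
    `edge_weights` maps frozenset({u, v}) to the weight of the edge u-v."""
    return sum(edge_weights[frozenset(pair)] for pair in zip(path, path[1:]))


def similarity(path, edge_weights):
    """Similarity taken as the reciprocal of the path weight (an assumed
    instance of 'negatively correlated'); weights are > 0 by definition."""
    return 1.0 / path_weight(path, edge_weights)


# Hypothetical weights on the path account A - phone P1 - account B.
edge_weights = {
    frozenset({"A", "P1"}): 0.5,
    frozenset({"P1", "B"}): 1.5,
}
score = similarity(["A", "P1", "B"], edge_weights)
```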
  • the flowchart of other embodiments of the associated account analysis method of the present disclosure is shown in FIG. 3 and includes steps 301 to 307.
  • in step 301, an association relationship between the account and the number identifier is established according to the account information.
  • in step 302, an association relationship between the account and the login terminal is established according to the login information.
  • steps 301 and 302 may be performed in any order.
  • in step 303, a relationship graph is established according to the association relationships between the account and the number identifier and between the account and the login terminal, and then step 304 and step 305 are performed respectively.
  • in step 304, the weight of each association relationship is determined according to the account data information and used as the weight of the path between the vertices of the corresponding association relationship, and step 306 is then performed.
  • in step 305, an account whose shortest path between accounts is less than a predetermined first threshold is determined to be an associated account according to the relationship graph, and step 306 is then performed.
  • the Dijkstra algorithm can be adapted to calculate the first N shortest paths within distance K of a vertex (K and N are positive integers); the source vertex is called root. The calculation process of the algorithm is as follows:
  • the vertices that can be reached from root within distance K are calculated by breadth-first search and recorded in U, and the first N shortest paths to the root vertex are then calculated within the range of the vertices recorded in U. Because the Dijkstra algorithm generates shortest paths in increasing order of path length, there is no need to find all the shortest paths of the root vertex: the shortest paths generated in the first N rounds are the first N shortest paths among all the shortest paths.
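  • the two-phase calculation can be sketched as below. This is an illustration under stated assumptions, not the disclosed implementation: K is treated as a hop count with "within K" inclusive, the graph is a dict of dicts of edge weights, and the names `bfs_within`, `top_n_associated`, and `is_account` are hypothetical. Note that the sketch restricts the Dijkstra phase to the vertex set U only; it does not separately limit the hop count of each weighted path.

```python
import heapq
from collections import deque


def bfs_within(graph, root, k):
    """Return the set U of vertices reachable from `root` in at most k hops."""
    hops = {root: 0}
    queue = deque([root])
    while queue:
        v = queue.popleft()
        if hops[v] == k:
            continue
        for nbr in graph.get(v, {}):
            if nbr not in hops:
                hops[nbr] = hops[v] + 1
                queue.append(nbr)
    return set(hops)


def top_n_associated(graph, root, k, n, is_account):
    """Dijkstra restricted to U, stopped once n account vertices are settled.
    Returns [(account, path_weight)] in increasing order of path weight."""
    u = bfs_within(graph, root, k)
    dist = {root: 0.0}
    heap = [(0.0, root)]
    settled = set()
    result = []
    while heap and len(result) < n:
        d, v = heapq.heappop(heap)
        if v in settled:
            continue  # stale heap entry
        settled.add(v)
        if v != root and is_account(v):
            result.append((v, d))
        for nbr, w in graph.get(v, {}).items():
            if nbr in u and nbr not in settled and d + w < dist.get(nbr, float("inf")):
                dist[nbr] = d + w
                heapq.heappush(heap, (d + w, nbr))
    return result


# Hypothetical weighted graph: A-P1-B share a phone, B-D1-C share a device.
graph = {
    "A": {"P1": 0.5},
    "P1": {"A": 0.5, "B": 1.0},
    "B": {"P1": 1.0, "D1": 1.0},
    "D1": {"B": 1.0, "C": 0.25},
    "C": {"D1": 0.25},
}
accounts = {"A", "B", "C"}
found = top_n_associated(graph, "A", 4, 2, accounts.__contains__)
```

    Because vertices are settled in increasing order of distance, the loop can stop as soon as N account vertices have been settled, which is the property the paragraph above relies on.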
  • in step 306, the similarity between the associated accounts is determined according to the sum of the weights of the paths in the shortest path between the associated accounts.
  • in step 307, the key associated accounts are selected in descending order of the weight of the path; for example, the first accounts, up to the predetermined second threshold in number, in descending order of the weight of the shortest path are selected as the key associated accounts of the account.
  • the weight of the association relationship can be determined according to the number of events, the event weights, and so on, so as to exclude account associations caused by accidental events during screening and further ensure the reliability of the association between accounts.
  • when generating the relationship graph, the edge length can be set to be positively correlated with the path weight. For example, if the edge length is equal to the path weight, the edge length is negatively correlated with the strength of the association relationship: the stronger the association, the shorter the edge.
  • the shortest path algorithm is used to calculate the associated accounts, up to the second predetermined threshold in number, whose path length is less than the first predetermined threshold; that is, the accounts with the shortest paths (highest similarity) within the first predetermined threshold.
  • a predetermined number of the most similar associated accounts can be obtained through a single shortest path calculation while ensuring that the association strength exceeds the predetermined requirement, which improves calculation efficiency and reduces the computing load on the device.
  • the flowchart of still other embodiments of the associated account analysis method of the present disclosure is shown in FIG. 4 and includes steps 401 to 405.
  • in step 401, association relationships between the account, the number identifier, and the login terminal are established.
  • in step 402, abstract vertex identifiers of the account, the number identifier, and the login terminal are generated, and the correspondence between the abstract vertex identifiers and the accounts is recorded.
  • each record (row) represents an edge, and each edge contains a starting vertex, an ending vertex, and the weight value of the edge.
  • Each edge record is divided into three columns, the first column is the user account, the second column is the phone number or device number, and the third column is the weight value of the edge. The first two columns represent the starting and ending vertices of the edge, and the third column is the weight information of the edge.
  • the seed user data file stores the accounts of all seed users. By specifying the account of the seed user, the associated account of the seed user's account can be obtained in a targeted manner, which improves the pertinence and execution efficiency.
  • abstract graph data is generated based on the original edge data and the seed user data.
  • for the original edge data, a unique consecutive value (abstract vertex) is generated for each user account, device number, and mobile phone number in the original edges and stored in the abstract graph edge data file as the edge data input, and the mapping relationship between each user account and its corresponding abstract vertex is saved.
  • for the seed user data, the abstract vertex of each seed user account is obtained according to the mapping relationship between user accounts and abstract vertices, and stored in the abstract graph vertex data file as the vertex data input.
  • user account 1 is abstracted as identifier 0
  • mobile phone number 1 is abstracted as identifier 1
  • mobile phone number 2 is abstracted as identifier 2
  • device number 1 (login terminal identifier) is abstracted as identifier 3
  • user account 2 is abstracted as identifier 4
  • device number 2 is abstracted as identifier 5
  • user account 3 is abstracted as identifier 6
  • mobile phone number 3 is abstracted as identifier 7
  • device number 3 is abstracted as identifier 8.
  • the vertex data identifiers corresponding to seed user accounts 1 and 3 are 0 and 6, respectively.
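  • the abstraction step can be sketched as assigning consecutive integers to vertices in order of first appearance. The three-column edge layout follows the description above; the concrete edge list, the unit weights, and the names `abstract_vertices`, `vertex_id`, and `id_to_vertex` are hypothetical and chosen only so that the identifiers match the example.

```python
def abstract_vertices(edge_records):
    """Assign consecutive integer identifiers to vertices in order of first
    appearance and rewrite each (start, end, weight) edge with them."""
    vertex_id = {}
    abstract_edges = []
    for start, end, weight in edge_records:
        for v in (start, end):
            if v not in vertex_id:
                vertex_id[v] = len(vertex_id)
        abstract_edges.append((vertex_id[start], vertex_id[end], weight))
    return vertex_id, abstract_edges


# Hypothetical original edge data: account, phone or device number, weight.
edge_records = [
    ("account1", "phone1", 1.0),
    ("account1", "phone2", 1.0),
    ("account1", "device1", 1.0),
    ("account2", "device1", 1.0),
    ("account2", "device2", 1.0),
    ("account3", "device2", 1.0),
    ("account3", "phone3", 1.0),
    ("account3", "device3", 1.0),
]
vertex_id, abstract_edges = abstract_vertices(edge_records)

# Seed user accounts 1 and 3 map to their abstract vertex identifiers.
seed_ids = [vertex_id[a] for a in ("account1", "account3")]

# Reverse mapping used later to restore abstract identifiers to accounts.
id_to_vertex = {i: v for v, i in vertex_id.items()}
```

    Keeping the reverse mapping alongside the forward one is what allows the result vertices to be restored to accounts after the graph calculation.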
  • in step 403, the abstract vertex identifiers having association relationships are connected to generate the relationship graph.
  • in step 404, taking the abstract vertex identifier of the predetermined seed account as a starting point, the abstract vertex identifiers in the relationship graph that correspond to accounts and whose distance from the predetermined seed account is less than the predetermined first threshold are determined.
  • by loading the abstract graph edge data and vertex data, the graph can be represented in memory in the form of an adjacency matrix. The shortest path algorithm is then executed to calculate, for each seed user vertex, the first N shortest paths within K steps whose destination vertices are user accounts. The destination vertex of each shortest path is an associated account of the seed user, and the sum of the weights on the path (the path weight) indicates the degree of account similarity.
  • at most N shortest paths are calculated for each seed user, giving at most N associated accounts for that seed user, and all the calculated associated account vertices and seed account vertices are output to the result file.
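  • loading abstract edges into an adjacency matrix can be sketched as follows. The function name `adjacency_matrix`, the use of infinity for "no edge", and the choice to keep the smaller weight when an edge is listed twice are assumptions made for illustration.

```python
import math


def adjacency_matrix(num_vertices, abstract_edges):
    """Build a symmetric weight matrix from (start, end, weight) edges whose
    endpoints are abstract vertex identifiers 0..num_vertices-1."""
    matrix = [[math.inf] * num_vertices for _ in range(num_vertices)]
    for i in range(num_vertices):
        matrix[i][i] = 0.0
    for start, end, weight in abstract_edges:
        # Undirected graph; keep the smaller weight if an edge repeats.
        matrix[start][end] = min(matrix[start][end], weight)
        matrix[end][start] = min(matrix[end][start], weight)
    return matrix


# Hypothetical abstract edges: 0-1 with weight 0.5, 1-2 with weight 1.0.
matrix = adjacency_matrix(3, [(0, 1, 0.5), (1, 2, 1.0)])
```

    Consecutive integer identifiers are what make this representation possible, since they can be used directly as row and column indices. A dense matrix needs memory proportional to the square of the vertex count, so an adjacency list may be a more practical choice for very large graphs.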
  • in step 405, the determined abstract vertex identifiers are restored to accounts according to the correspondence between the abstract vertex identifiers and the accounts.
  • the original data can be abstracted and then the graph calculation can be performed, which reduces the amount of data that needs to be processed during graph calculation, improves the accuracy of the calculation, and also improves the efficiency of the calculation.
  • the associated account analysis method may further include steps 406 and 407.
  • in step 406, at least one of the associated account or the key associated account of each account is stored.
  • in step 407, the account real-name system information is supplemented according to at least one of the associated account or the key associated account. For example, two accounts whose similarity is greater than a predetermined similarity threshold can be considered to belong to the same user, and both have the same real-name information.
  • the identity of a user without real-name information can be determined by investigating the real-name user of its associated account, thereby increasing the probability of successful network security tracing.
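  • supplementing real-name information from a sufficiently similar associated account can be sketched as below. The data layout, the function name `supplement_real_names`, the sample names, and the single-pass behaviour (no transitive propagation) are all assumptions for illustration.

```python
def supplement_real_names(real_names, similar_pairs, similarity_threshold):
    """Copy a known real name to an associated account that lacks one when the
    pair's similarity exceeds the threshold.

    real_names:    {account: real_name or None}
    similar_pairs: iterable of (account_a, account_b, similarity)
    Returns a new dict; the input is left unchanged.
    """
    result = dict(real_names)
    for a, b, sim in similar_pairs:
        if sim <= similarity_threshold:
            continue
        if result.get(a) and not result.get(b):
            result[b] = result[a]
        elif result.get(b) and not result.get(a):
            result[a] = result[b]
    return result


# Hypothetical data: A has real-name information, B and C do not.
real_names = {"A": "Zhang San", "B": None, "C": None}
similar_pairs = [("A", "B", 0.9), ("A", "C", 0.3)]
filled = supplement_real_names(real_names, similar_pairs, 0.5)
```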
  • step 408 can also be performed when applying the account data.
  • in step 408, the pushing of information to at least one of the associated account or the key associated account is reduced.
  • for example, the colleagues of known high-value users are identified based on address information, and marketing information is then sent to these colleagues by SMS.
  • the user's vest accounts can be identified and excluded first, which avoids repeatedly pushing information to the same user; this not only improves operational efficiency but also saves SMS costs.
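  • excluding vest accounts from a push list can be sketched as keeping one account per group of associated accounts. The function name `dedupe_recipients` and the assumption that the stored association map is symmetric are illustrative.

```python
def dedupe_recipients(recipients, associated):
    """Keep the first account of each associated group and drop the rest.

    recipients: ordered list of accounts selected for a push
    associated: {account: set of its associated accounts}, assumed symmetric
    """
    kept = []
    excluded = set()
    for account in recipients:
        if account in excluded:
            continue  # an associated account was already kept
        kept.append(account)
        excluded.update(associated.get(account, ()))
    return kept


# Hypothetical data: B is a vest account of A; C and D are unrelated.
recipients = ["A", "B", "C", "D"]
associated = {"A": {"B"}, "B": {"A"}, "C": set()}
push_list = dedupe_recipients(recipients, associated)
```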
  • the predetermined first threshold and the predetermined second threshold mentioned above can be modified according to the execution effect of steps 407 and 408, and the generation rule of the association weight in the weight determination process, the weights of events, and so on can also be modified, so that the parameters are continuously revised during operation and application to further improve accuracy.
  • the association relationship establishment unit 601 can establish an association relationship between an account and a number identifier according to account information, and establish an association relationship between an account and a login terminal according to login information.
  • the number identification is a mobile phone number or an ID number.
  • the relationship graph generating unit 602 can establish a relationship graph according to the association relationships between the account and the number identifier and between the account and the login terminal, with the account, the number identifier, and the login terminal as vertices.
  • the associated account determination unit 603 can determine, according to the relationship graph, that an account whose shortest path between accounts is less than a predetermined first threshold is an associated account.
  • such an associated account analysis device can establish a relationship graph of accounts, number identifiers, and login terminals based on account information and login information, and determine associated accounts based on the shortest path between accounts. This improves the ability to identify associated accounts and thereby account management capabilities, which is conducive to network security management and control.
  • the associated account analysis device may further include a weight determination unit 604 and a similarity determination unit 605.
  • the weight determination unit 604 can determine, according to the account data information associated with the account, the weights of the association relationships between the account and the number identifier and between the account and the login terminal; these weights are used as the weights of the paths between the vertices of the corresponding association relationships when the relationship graph is generated.
  • the similarity determination unit 605 can determine the similarity between the associated accounts according to the sum of the weights of the paths in the shortest path of the associated accounts.
  • the sum of the weights is negatively correlated with the similarity, for example, inversely proportional to it, so that after the associated accounts are filtered out, further determining the degree of similarity between them helps to further measure the degree of association between accounts.
  • the relationship graph generating unit 602 may set the edge length to be positively correlated with the path weight when generating the relationship graph. If the edge length is equal to the path weight, the edge length is negatively correlated with the strength of the association relationship: the stronger the association, the shorter the edge.
  • the associated account determination unit 603 uses the shortest path algorithm to calculate the associated accounts, up to the second predetermined threshold in number, whose path length is less than the first predetermined threshold; that is, the associated accounts with the shortest paths (highest similarity) within the first predetermined threshold.
  • such an associated account analysis device can obtain a predetermined number of the most similar associated accounts through a single shortest path calculation while ensuring that the association strength exceeds a predetermined requirement, which improves computing efficiency and reduces the computing load on the equipment.
  • the associated account analysis device may further include an associated information application unit 606, which can supplement real-name account information based on associated accounts and key associated accounts. For example, when the similarity of two accounts is greater than a predetermined similarity threshold, the two accounts can be considered to belong to the same user and to have the same real-name information. Such an associated account analysis device can supplement user real-name information and improve network supervision. In addition, when searching for a user who does not have real-name information, the associated information application unit 606 can determine the identity of that user by investigating the real-name user of the associated account, thereby increasing the probability of successful security tracing.
  • the associated information application unit 606 can reduce information pushes to at least one of an associated account or a key associated account, so that the user's vest accounts are identified and excluded first during operation. This avoids repeatedly pushing information to the same user, improving operational efficiency while also saving SMS costs.
  • the associated account analysis device may further include a threshold adjustment unit 607, which can modify the predetermined first threshold and the predetermined second threshold mentioned above according to the operating effect of the associated information application unit 606, and can also modify the generation rules of the association weights in the weight determination process, the weights of events, and so on, so that the parameters are continuously revised during operation and application to further improve accuracy.
  • the associated account analysis device includes a memory 701 and a processor 702.
  • the memory 701 may be a magnetic disk, flash memory or any other non-volatile storage medium.
  • the memory is used to store the instructions in the corresponding embodiment of the above associated account analysis method.
  • the processor 702 is coupled to the memory 701 and can be implemented as one or more integrated circuits, such as a microprocessor or a microcontroller.
  • the processor 702 is configured to execute instructions stored in the memory, which can improve the ability to identify associated accounts, thereby improving account management capabilities, and is beneficial to network security management and control.
  • the associated account analysis device 800 includes a memory 801 and a processor 802.
  • the processor 802 is coupled to the memory 801 through the bus 803.
  • the associated account analysis device 800 can also be connected to an external storage device 805 via a storage interface 804 to call external data, and can also be connected to a network or another computer system (not shown) via a network interface 806. These are not described in detail here.
  • storing data instructions through a memory and processing the above instructions through a processor can improve the ability to identify associated accounts, thereby improving account management capabilities, which is beneficial to network security management and control.
  • a computer-readable storage medium has computer program instructions stored thereon, which, when executed by a processor, implement the steps of the method in the corresponding embodiment of the associated account analysis method.
  • the embodiments of the present disclosure may be provided as methods, devices, or computer program products. Therefore, the present disclosure may adopt the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware.
  • the present disclosure may take the form of a computer program product implemented on one or more computer-usable non-transitory storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
  • These computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing equipment to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device that implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing equipment, so that a series of operation steps are executed on the computer or other programmable equipment to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable equipment provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • the method and apparatus of the present disclosure may be implemented in many ways.
  • the method and apparatus of the present disclosure can be implemented by software, hardware, firmware or any combination of software, hardware, and firmware.
  • the above-mentioned order of the steps for the method is for illustration only, and the steps of the method of the present disclosure are not limited to the order specifically described above, unless specifically stated otherwise.
  • the present disclosure may also be implemented as programs recorded in a recording medium, and these programs include machine-readable instructions for implementing the method according to the present disclosure.
  • the present disclosure also covers a recording medium storing a program for executing the method according to the present disclosure.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Telephonic Communication Services (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Provided are an associated account analysis method and apparatus, and a computer-readable storage medium, which relate to the technical field of big data analysis. The associated account analysis method comprises: establishing, according to account information, an association relationship between an account and a number identifier, wherein the number identifier is a mobile phone number or an identity card number; establishing, according to login information, an association relationship between the account and a login terminal; building a relationship graph according to the association relationships between the account and the number identifier and between the account and the login terminal, wherein the account, the number identifier and the login terminal are vertices, and vertices that have an association relationship are connected; and determining, according to the relationship graph, accounts whose shortest path between each other is smaller than a predetermined first threshold to be associated accounts. This method can improve the capability of identifying associated accounts, thus improving account management capability and facilitating network security management and control.

Description

Associated account analysis method, apparatus, and computer-readable storage medium
Cross-reference to related applications
This application is based on and claims priority to CN application No. 201910572269.X, filed on June 28, 2019, the disclosure of which is hereby incorporated into this application in its entirety.
Technical field
The present disclosure relates to the technical field of big data analysis, and in particular to an associated account analysis method, apparatus, and computer-readable storage medium.
Background
In user account management, accounts are often identified by means of real-name account information, real-name registration of the registered mobile phone number, and similar mechanisms. Real-name account management improves security, helps prevent online fraud, and allows losses to be recovered quickly; it also allows business promotion to be targeted, which improves promotion efficiency.
Summary
According to an aspect of some embodiments of the present disclosure, an associated account analysis method is provided, including: establishing an association relationship between an account and a number identifier according to account information, where the number identifier is a mobile phone number or an identity card number; establishing an association relationship between the account and a login terminal according to login information; building a relationship graph according to the association relationships between the account and the number identifier and between the account and the login terminal, where accounts, number identifiers, and login terminals are vertices, and vertices that have an association relationship are connected; and determining, according to the relationship graph, accounts whose shortest path between each other is less than a predetermined first threshold to be associated accounts.
In some embodiments, the associated account analysis method further includes: determining, according to account data information associated with the account, association relationship weights for the association relationship between the account and the number identifier and for the association relationship between the account and the login terminal, each weight serving as the weight of the path between the vertices of the corresponding association relationship, where the account data information includes one or more of order information, login information, or delivery information; and determining the similarity between associated accounts according to the sum of the weights of the paths along the shortest path between the associated accounts.
In some embodiments, the association relationship weight is greater than 0 and is negatively correlated with at least one of the number of times the association relationship appears in the account data information, or the predetermined importance of the event to which the association relationship belongs when it appears in the account data information.
In some embodiments, determining accounts whose shortest path between each other is less than the predetermined first threshold to be associated accounts includes: determining an account whose shortest path to a predetermined seed account is less than the predetermined first threshold to be a target account; and determining the target account to be an associated account of the predetermined seed account.
In some embodiments, building the relationship graph includes: generating abstract vertex identifiers for the accounts, number identifiers, and login terminals, and recording the correspondence between abstract vertex identifiers and accounts; and connecting the abstract vertex identifiers that have an association relationship to generate the relationship graph. Determining accounts whose shortest path between each other is less than the predetermined first threshold to be associated accounts includes: taking the abstract vertex identifier of the predetermined seed account as the starting point, determining the abstract vertex identifiers in the relationship graph whose distance from the predetermined seed account is less than the predetermined first threshold and which correspond to accounts; and restoring the determined abstract vertex identifiers to accounts according to the correspondence between abstract vertex identifiers and accounts.
In some embodiments, the associated account analysis method further includes: selecting, in ascending order of path weight, a number of accounts equal to a predetermined second threshold as the key associated accounts of the account.
In some embodiments, the edge lengths in the relationship graph are positively correlated with the weights; determining, according to the relationship graph, accounts whose shortest path between each other is less than the predetermined first threshold to be associated accounts is: determining, by a shortest path algorithm, a number of accounts equal to the predetermined second threshold whose path length is less than the predetermined first threshold as the key associated accounts of the account.
In some embodiments, the associated account analysis method further includes: storing at least one of the associated accounts or the key associated accounts of each account; and supplementing account real-name information according to at least one of the associated accounts or the key associated accounts.
In some embodiments, the associated account analysis method further includes: storing at least one of the associated accounts or the key associated accounts of each account; and reducing the information pushed to at least one of the associated accounts or the key associated accounts, and adjusting at least one of the predetermined first threshold or the predetermined second threshold according to the promotion effect of the information.
In some embodiments, determining, according to the relationship graph, accounts whose shortest path between each other is less than the predetermined first threshold to be associated accounts includes: finding, by a breadth-first search (BFS) algorithm, all vertices reachable within the predetermined first threshold distance of a source vertex; computing, by Dijkstra's algorithm and within the set of reachable vertices, the first predetermined number of vertices with the shortest paths to the source vertex; and determining the accounts corresponding to those vertices to be associated accounts of the source vertex's account.
According to an aspect of other embodiments of the present disclosure, an associated account analysis apparatus is provided, including: an association relationship establishing unit configured to establish an association relationship between an account and a number identifier according to account information, and to establish an association relationship between the account and a login terminal according to login information, where the number identifier is a mobile phone number or an identity card number; a relationship graph generating unit configured to build a relationship graph according to the association relationships between the account and the number identifier and between the account and the login terminal, where accounts, number identifiers, and login terminals are vertices, and vertices that have an association relationship are connected; and an associated account determining unit configured to determine, according to the relationship graph, accounts whose shortest path between each other is less than a predetermined first threshold to be associated accounts.
According to an aspect of still other embodiments of the present disclosure, an associated account analysis apparatus is provided, including: a memory; and a processor coupled to the memory, the processor being configured to execute any one of the associated account analysis methods above based on instructions stored in the memory.
According to an aspect of yet other embodiments of the present disclosure, a computer-readable storage medium is provided, on which computer program instructions are stored; when executed by a processor, the instructions implement the steps of any one of the associated account analysis methods above.
Brief description of the drawings
The drawings described here are provided for a further understanding of the present disclosure and constitute a part of it. The exemplary embodiments of the present disclosure and their descriptions serve to explain the present disclosure and do not unduly limit it. In the drawings:
FIG. 1 is a flowchart of some embodiments of the associated account analysis method of the present disclosure.
FIG. 2 is a schematic diagram of vertex relationships in some embodiments of the associated account analysis method of the present disclosure.
FIG. 3 is a flowchart of other embodiments of the associated account analysis method of the present disclosure.
FIG. 4 is a flowchart of still other embodiments of the associated account analysis method of the present disclosure.
FIG. 5 is a schematic diagram of some embodiments of account conversion in the associated account analysis method of the present disclosure.
FIG. 6 is a schematic diagram of some embodiments of the associated account analysis apparatus of the present disclosure.
FIG. 7 is a schematic diagram of other embodiments of the associated account analysis apparatus of the present disclosure.
FIG. 8 is a schematic diagram of still other embodiments of the associated account analysis apparatus of the present disclosure.
Detailed description
The technical solutions of the present disclosure are described in further detail below with reference to the drawings and embodiments.
A flowchart of some embodiments of the associated account analysis method of the present disclosure is shown in FIG. 1 and includes steps 101 to 104.
In step 101, an association relationship between an account and a number identifier is established according to account information. In some embodiments, the number identifier is a mobile phone number or an identity card number. In some embodiments, the account information may include the account, together with one or more of the user's name, contact information, or identity card number.
In step 102, an association relationship between the account and a login terminal is established according to login information.
In some embodiments, steps 101 and 102 may be executed in either order.
In step 103, a relationship graph is built according to the association relationships between accounts and number identifiers and between accounts and login terminals; accounts, number identifiers, and login terminals are the vertices, as shown in FIG. 2. In the relationship graph, vertices that have an association relationship are connected. For example, if two accounts log in from the same terminal, the vertices of both accounts are connected to the vertex of that login terminal.
In step 104, accounts whose shortest path between each other is less than a predetermined first threshold are determined to be associated accounts according to the relationship graph. In some embodiments, the path length between vertices may be determined by a shortest path algorithm.
The inventor found that, in the related art, if account A and account B have the same mobile phone number, account B is identified as an associated account (or sockpuppet account) of account A. If account C matches account A on neither login terminal nor mobile phone number, it is identified as a non-associated account. However, if account C and account B share a login terminal, account C is very likely a sockpuppet account of account A, and such cases often go undetected. In addition, using a relational database to identify, among a large number of accounts, the associated accounts within multiple degrees of millions of seed users is very complex and time-consuming.
With the method in the embodiments above, a relationship graph of accounts, number identifiers, and login terminals can be built from account information and login information, and associated accounts can be determined according to the shortest path between accounts. This improves the ability to identify associated accounts, which in turn improves account management and benefits network security management and control.
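Steps 101 to 104 and the A/B/C case described by the inventor can be illustrated with a minimal sketch. The function names `build_graph` and `associated_accounts`, the tuple-based vertex labels, and the sample data are assumptions made for illustration only; path length is counted here in unweighted hops.

```python
from collections import defaultdict, deque


def build_graph(account_numbers, account_terminals):
    """Build an undirected graph of accounts, number identifiers and terminals.

    Vertices are (kind, name) tuples so that an account and a phone number
    with the same text cannot collide (an illustrative choice).
    """
    graph = defaultdict(set)
    for account, number in account_numbers:        # step 101
        graph[("account", account)].add(("number", number))
        graph[("number", number)].add(("account", account))
    for account, terminal in account_terminals:    # step 102
        graph[("account", account)].add(("terminal", terminal))
        graph[("terminal", terminal)].add(("account", account))
    return graph                                    # step 103


def associated_accounts(graph, seed, first_threshold):
    """Step 104: accounts whose hop distance from `seed` is < first_threshold."""
    start = ("account", seed)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        # Neighbours would sit at dist + 1; skip them once that reaches the threshold.
        if dist[vertex] + 1 >= first_threshold:
            continue
        for neighbour in graph[vertex]:
            if neighbour not in dist:
                dist[neighbour] = dist[vertex] + 1
                queue.append(neighbour)
    return {name: d for (kind, name), d in dist.items()
            if kind == "account" and name != seed}


# A and B share a phone number; B and C share a login terminal.
graph = build_graph([("A", "phone1"), ("B", "phone1")],
                    [("B", "terminal1"), ("C", "terminal1")])
print(associated_accounts(graph, "A", first_threshold=5))
```

With a threshold of 5, account C is reached through B at a distance of 4 hops, even though C shares no identifier with A directly; with a threshold of 3, only B qualifies.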
In some embodiments, to reduce the amount of computation, predetermined seed accounts may be selected as needed. For example, the target accounts whose associations need to be analyzed are used as the predetermined seed accounts, and the shortest path lengths to other account vertices are determined starting from the predetermined seed accounts. This avoids the excessive computation caused by traversing every possible starting point and improves efficiency.
In some embodiments, the predetermined first threshold may be set as a threshold on the shortest path length. If the shortest path length between two accounts is less than the predetermined first threshold, the two are associated accounts; if it is greater than or equal to the predetermined first threshold, they are not treated as associated accounts.
In some embodiments, an upper limit on the number of associated accounts of a source account may also be set. Associated accounts are taken in ascending order of shortest path length so that the number of associated accounts of the source account does not exceed the preset upper limit, which avoids over-associating accounts and reduces the probability of error.
In some embodiments, the association relationship weights between an account and its number identifiers and login terminals may also be determined according to account data information associated with the account. When the relationship graph is generated, the association relationship weight is used as the weight of the path between the vertices of the corresponding association relationship; the weight is negatively correlated with the strength of the association, for example, it is the reciprocal of the association strength. The account data information may include one or more of order information, login information, or delivery information corresponding to the account. In some embodiments, the association relationship weight may be determined according to at least one of the number of times the association appears in the account data information, or the predetermined importance of the event to which the association belongs when it appears. For example, if the weight of an order event is 2, the weight of the edge between a mobile phone number and an account arising from order events is the reciprocal of the number of shopping orders associated with the phone number multiplied by 2; if the weight of a login event is 1, the weight of the edge between a login terminal and an account arising from login events is the reciprocal of the number of times the account logged in on that device multiplied by 1. The similarity between associated accounts can be determined from the sum of the path weights along their shortest path; the sum is negatively correlated with the similarity, for example in inverse proportion to it. After the associated accounts have been filtered out, this similarity further measures how closely the accounts are associated.
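The weight rule in this example can be written out as a short sketch. The event weights 2 and 1 come from the text; the names `EVENT_WEIGHTS`, `edge_weight`, and `similarity`, and the choice of the reciprocal of the path weight sum as the similarity measure, are assumptions that follow the "inverse proportion" example rather than a formula given by the disclosure.

```python
# Event importance values taken from the example in the text.
EVENT_WEIGHTS = {"order": 2, "login": 1}


def edge_weight(event_type, occurrences):
    """Edge weight = 1 / (event weight * number of occurrences); always > 0."""
    if occurrences <= 0:
        raise ValueError("an association must occur at least once")
    return 1.0 / (EVENT_WEIGHTS[event_type] * occurrences)


def similarity(path_edge_weights):
    """Assumed similarity measure: reciprocal of the summed path weight."""
    return 1.0 / sum(path_edge_weights)


# A phone number tied to an account by 4 orders, and a device the account
# logged in on 4 times.
order_edge = edge_weight("order", 4)   # 1 / (2 * 4) = 0.125
login_edge = edge_weight("login", 4)   # 1 / (1 * 4) = 0.25
print(order_edge, login_edge, similarity([order_edge, order_edge]))
```

More frequent or more important events give smaller edge weights, so strongly associated accounts end up closer together in the graph and receive a higher similarity.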
A flowchart of other embodiments of the associated account analysis method of the present disclosure is shown in FIG. 3 and includes steps 301 to 307.
In step 301, an association relationship between an account and a number identifier is established according to account information.
In step 302, an association relationship between the account and a login terminal is established according to login information.
In some embodiments, steps 301 and 302 may be executed in either order.
In step 303, a relationship graph is built according to the association relationships between accounts and number identifiers and between accounts and login terminals, and then steps 304 and 305 are each performed.
In step 304, association relationship weights are determined according to the account data information and used as the weights of the paths between the vertices of the corresponding association relationships, and then step 306 is performed.
In step 305, accounts whose shortest path between each other is less than the predetermined first threshold are determined to be associated accounts according to the relationship graph, and then step 306 is performed.
In some embodiments, an improved version of Dijkstra's algorithm may be used to compute the first N shortest paths within distance K of a vertex (K and N are positive integers). Calling the source vertex root, the algorithm proceeds as follows:
First, use the BFS algorithm to find all vertices reachable within distance K of the root vertex; denote the set of these vertices as U.
Then use Dijkstra's algorithm to compute, within the vertices recorded in U, the first N shortest paths from the root vertex. Because Dijkstra's algorithm produces shortest paths in order of increasing path length, it is not necessary to compute all shortest paths from the root vertex: the shortest paths produced in its first N iterations are the N shortest of all the shortest paths.
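The two-phase procedure above can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: "within distance K" is read as "at most K hops", only paths ending at account vertices are counted toward N (as in the later embodiment that restricts destinations to user accounts), and the names `build_adjacency`, `vertices_within_hops`, `top_n_associated`, and the `is_account` predicate are hypothetical.

```python
import heapq
from collections import defaultdict, deque


def build_adjacency(edges):
    """Undirected weighted adjacency list from (u, v, weight) triples."""
    adj = defaultdict(list)
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    return adj


def vertices_within_hops(adj, root, k):
    """Phase 1 (BFS): the set U of vertices reachable in at most k hops."""
    hops = {root: 0}
    queue = deque([root])
    while queue:
        vertex = queue.popleft()
        if hops[vertex] == k:
            continue
        for neighbour, _weight in adj.get(vertex, []):
            if neighbour not in hops:
                hops[neighbour] = hops[vertex] + 1
                queue.append(neighbour)
    return set(hops)


def top_n_associated(adj, root, k, n, is_account):
    """Phase 2 (Dijkstra restricted to U): stop after n accounts are settled."""
    allowed = vertices_within_hops(adj, root, k)
    best = {root: 0.0}
    heap = [(0.0, root)]
    settled = set()
    result = []
    while heap and len(result) < n:
        dist, vertex = heapq.heappop(heap)
        if vertex in settled:
            continue
        settled.add(vertex)  # vertices are settled in increasing distance order
        if vertex != root and is_account(vertex):
            result.append((vertex, dist))
        for neighbour, weight in adj.get(vertex, []):
            candidate = dist + weight
            if (neighbour in allowed and neighbour not in settled
                    and candidate < best.get(neighbour, float("inf"))):
                best[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return result


adj = build_adjacency([
    ("acct1", "phone1", 0.125), ("acct2", "phone1", 0.5),
    ("acct2", "dev1", 0.25), ("acct3", "dev1", 0.25), ("acct4", "dev1", 1.0),
])
print(top_n_associated(adj, "acct1", k=4, n=2,
                       is_account=lambda v: v.startswith("acct")))
```

Because the search stops as soon as N account vertices have been settled, the remaining vertices in U are never fully explored, which is the saving the paragraph above describes.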
In step 306, the similarity between associated accounts is determined according to the sum of the weights of the paths along the shortest path between the associated accounts.
In step 307, key associated accounts are selected in ascending order of path weight; for example, in ascending order of the weight of the shortest path, a number of accounts equal to the predetermined second threshold are selected as the key associated accounts of the account.
With this method, the weight of an association relationship can be determined according to the number of occurrences of events, their weights, and so on, so that account associations caused by incidental events are excluded during screening, which further ensures the reliability of the associations between accounts.
In some embodiments, when the relationship graph is generated, the edge length may be set to be positively correlated with the path weight. If, for example, the edge length equals the path weight, then the edge length is negatively correlated with the strength of the association: the stronger the association, the shorter the edge. A shortest path algorithm is then used to compute a number of associated accounts equal to the second predetermined threshold whose path length is less than the first predetermined threshold; these are the associated accounts, up to the second predetermined threshold in number, that have the shortest paths (the highest similarity) among those within the first predetermined threshold.
With this method, a predetermined number of the most similar associated accounts can be obtained in a single shortest path computation while ensuring that the association strength exceeds the predetermined requirement, which improves computational efficiency and reduces the computational load on the device.
A flowchart of still other embodiments of the associated account analysis method of the present disclosure is shown in FIG. 4 and includes steps 401 to 405.
In step 401, association relationships between accounts and number identifiers and between accounts and login terminals are established.
In step 402, abstract vertex identifiers are generated for the accounts, number identifiers, and login terminals, and the correspondence between abstract vertex identifiers and accounts is recorded.
In some embodiments, as shown in FIG. 5, in the raw edge data file, which contains the order information, login information, delivery information, and so on associated with accounts, each row represents one edge, and each edge contains a start vertex, an end vertex, and an edge weight. Each edge record has three columns: the first column is the user account, the second column is the mobile phone number or device number, and the third column is the edge weight. The first two columns represent the start and end vertices of the edge. The seed user data file stores the accounts of all seed users; by specifying seed user accounts, the associated accounts of those seed accounts can be obtained in a targeted way, which improves both focus and execution efficiency.
Abstract graph data is generated from the raw edge data and the seed user data. For the raw edge data, a unique consecutive numeric value (an abstract vertex) is generated for each user account, device number, and mobile phone number in the raw edges and stored in the abstract graph's edge data file as the edge data input, and the mapping between user accounts and their abstract vertices is saved. For the seed user data, the abstract vertex of each seed user account is obtained from that mapping and stored in the abstract graph's vertex data file as the vertex data input.
For example, in FIG. 5, user account 1 is abstracted as identifier 0, mobile phone number 1 as identifier 1, mobile phone number 2 as identifier 2, device number 1 (a login terminal identifier) as identifier 3, user account 2 as identifier 4, device number 2 as identifier 5, user account 3 as identifier 6, mobile phone number 3 as identifier 7, and device number 3 as identifier 8. The vertex data identifiers corresponding to seed user accounts 1 and 3 are 0 and 6, respectively.
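The abstraction step can be sketched as below. The identifier assignment follows first appearance in the raw edge rows, which reproduces the numbering quoted from FIG. 5; however, the actual rows of FIG. 5 are not reproduced in the text, so the edge rows, their weights, and the function name `abstract_edges` are hypothetical and chosen only so that the resulting identifiers match.

```python
def abstract_edges(raw_edges):
    """Map each account, phone number and device number to a consecutive integer.

    raw_edges: rows of (user_account, phone_or_device, weight).
    Returns the abstract edges, the name -> id mapping, and the
    id -> account mapping used later to restore accounts (step 405).
    """
    vertex_id = {}

    def to_id(name):
        if name not in vertex_id:
            vertex_id[name] = len(vertex_id)  # next unused consecutive value
        return vertex_id[name]

    edges = [(to_id(account), to_id(other), weight)
             for account, other, weight in raw_edges]
    # The first column of every row is a user account.
    account_of = {vertex_id[account]: account for account, _other, _w in raw_edges}
    return edges, vertex_id, account_of


# Hypothetical rows; only the resulting identifiers are taken from the text.
raw = [
    ("account1", "phone1", 0.5), ("account1", "phone2", 1.0),
    ("account1", "device1", 0.25), ("account2", "device1", 0.5),
    ("account2", "device2", 1.0), ("account3", "device2", 0.5),
    ("account3", "phone3", 0.5), ("account3", "device3", 1.0),
]
edges, vertex_id, account_of = abstract_edges(raw)
seed_vertices = [vertex_id[a] for a in ("account1", "account3")]
print(vertex_id)
print(seed_vertices)
```

The sketch assumes that account names, phone numbers, and device numbers never share the same text; real data would likely need a type prefix on each name before mapping. After the graph computation, `account_of` turns the resulting vertex identifiers back into accounts.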
In step 403, abstract vertex identifiers that have an association relationship are connected to generate the relationship graph.
In step 404, taking the abstract vertex identifier of the predetermined seed account as the starting point, the abstract vertex identifiers in the relationship graph that correspond to an account and whose distance from the predetermined seed account is less than the predetermined first threshold are determined.
In some embodiments, the abstract graph edge data and vertex data can be loaded so that the graph is represented in memory as an adjacency matrix, and a shortest path algorithm is then executed to compute, for each seed user vertex, the top N shortest paths within K steps whose destination vertex is a user account. The destination vertex of each shortest path is an associated account of the seed user, and the sum of the weights of the edges on the path (the path weight) represents the degree of similarity between the accounts, a smaller sum indicating a higher similarity. The at most N shortest paths computed for each seed user yield at most N associated accounts for that seed user; all computed associated account vertices and seed account vertices are output to a result file.
In step 405, the determined abstract vertex identifiers are restored to accounts according to the correspondence between abstract vertex identifiers and accounts.
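Steps 404 and 405 can be sketched as a breadth-first search that bounds the search region, followed by Dijkstra's algorithm inside that region. This is one possible reading, offered as an illustration only. The names `build_adjacency`, `vertices_within_hops`, and `associated_accounts`, the edge weights, and the graph itself are assumptions. An adjacency list is used here in place of the adjacency matrix mentioned above, purely for brevity. Note that the sketch restricts Dijkstra to vertices within `max_hops` hops of the seed; it does not separately cap the hop count of each weighted path.

```python
import heapq
from collections import deque

# Hypothetical weighted, undirected edges over abstract vertex identifiers.
EDGES = [(0, 1, 1.0), (0, 2, 1.0), (0, 3, 0.5), (4, 3, 0.5),
         (4, 5, 1.0), (6, 5, 1.0), (6, 7, 1.0), (6, 8, 1.0)]
ACCOUNT_IDS = {0, 4, 6}  # abstract vertices that correspond to accounts
ID_TO_ACCOUNT = {0: "account 1", 4: "account 2", 6: "account 3"}


def build_adjacency(edges):
    """Build an undirected adjacency list: vertex -> [(neighbor, weight)]."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    return adj


def vertices_within_hops(adj, source, max_hops):
    """BFS: vertices reachable from source using at most max_hops edges."""
    hops = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if hops[u] == max_hops:
            continue  # do not expand beyond the hop limit
        for v, _ in adj.get(u, []):
            if v not in hops:
                hops[v] = hops[u] + 1
                queue.append(v)
    return set(hops)


def associated_accounts(adj, source, account_ids, max_hops, max_dist, top_n):
    """Return up to top_n (vertex, distance) pairs for account vertices whose
    shortest-path distance from source is strictly less than max_dist."""
    allowed = vertices_within_hops(adj, source, max_hops)
    dist = {source: 0.0}
    heap = [(0.0, source)]
    done = set()
    result = []
    while heap and len(result) < top_n:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if d >= max_dist:
            break  # every remaining vertex is at least this far away
        if u != source and u in account_ids:
            result.append((u, d))
        for v, w in adj.get(u, []):
            if v in allowed and d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return result


if __name__ == "__main__":
    adj = build_adjacency(EDGES)
    found = associated_accounts(adj, 0, ACCOUNT_IDS, max_hops=4, max_dist=3.5, top_n=2)
    # Step 405: restore abstract vertex identifiers to accounts.
    print([(ID_TO_ACCOUNT[v], d) for v, d in found])
```

Because Dijkstra's algorithm settles vertices in order of increasing distance, the search can stop as soon as `top_n` accounts have been found or the distance threshold is reached.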
With this method, the original data is abstracted before the graph computation is performed, which reduces the amount of data to be processed during graph computation and improves both the accuracy and the efficiency of the computation.
In some embodiments, as shown in FIG. 4, the associated account analysis method may further include steps 406 and 407.
In step 406, at least one of the associated accounts and the key associated accounts of each account is stored.
In step 407, real-name information of accounts is supplemented according to at least one of the associated accounts or the key associated accounts. For example, two accounts whose similarity is greater than a predetermined similarity threshold can be considered to belong to the same user and to have the same real-name information.
With this method, users' real-name information can be supplemented, strengthening network supervision.
In addition, when looking up a user who has no real-name information, the identity of that user can be determined by investigating the real-name users of the associated accounts, increasing the probability that a network security trace succeeds.
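The real-name supplementation of step 407 can be illustrated with a short sketch. The function name `supplement_real_names`, the `(account, account, similarity)` triple format, and the sample data are all assumptions made for illustration; the sketch makes a single pass and does not propagate names transitively.

```python
def supplement_real_names(real_names, similar_pairs, threshold):
    """Copy real-name information between two accounts whose similarity
    exceeds threshold, when exactly one of them already has it."""
    filled = dict(real_names)  # leave the caller's mapping untouched
    for a, b, sim in similar_pairs:
        if sim <= threshold:
            continue
        if a in filled and b not in filled:
            filled[b] = filled[a]
        elif b in filled and a not in filled:
            filled[a] = filled[b]
    return filled


if __name__ == "__main__":
    names = {"a": "Alice"}  # hypothetical known real-name record
    pairs = [("a", "b", 0.9), ("c", "d", 0.95), ("a", "e", 0.2)]
    print(supplement_real_names(names, pairs, threshold=0.8))
```

Pairs in which neither account has real-name information are left unchanged, since there is nothing to copy.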
In some embodiments, step 408 may also be performed when the account data is applied.
In step 408, the information pushed to at least one of the associated accounts or the key associated accounts is reduced.
In e-commerce product promotion, in order to discover more potentially valuable users, the colleagues of known high-value users (seed users) are first computed from address information, and marketing messages are then sent to those colleagues by SMS. With the method in the above embodiments, a user's sockpuppet (alternate) accounts can be identified and excluded first, so that the same user is not sent the same information repeatedly, which improves operational efficiency and saves SMS costs.
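The exclusion of alternate accounts before a push can be sketched as follows. The function name `dedupe_recipients` and the shape of the `associated` mapping (account to the set of its associated accounts, assumed symmetric) are hypothetical choices for illustration.

```python
def dedupe_recipients(candidates, associated):
    """Keep the first account from each group of associated accounts.

    candidates: accounts in push-priority order.
    associated: account -> set of associated accounts (assumed symmetric).
    """
    kept = []
    blocked = set()
    for account in candidates:
        if account in blocked:
            continue  # an associated account has already been selected
        kept.append(account)
        blocked.add(account)
        blocked.update(associated.get(account, ()))
    return kept


if __name__ == "__main__":
    links = {"a": {"b"}, "b": {"a"}, "c": {"d"}, "d": {"c"}}
    print(dedupe_recipients(["a", "b", "c", "d", "e"], links))
```

Only one message is then sent per group of associated accounts, which is the saving described above.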
In some embodiments, the predetermined first threshold and the predetermined second threshold mentioned above may be modified according to the results of executing steps 407 and 408; the rules for generating association weights, the weights of events, and the like used in the weight determination process may also be modified. The parameters are thereby corrected continuously during operation and application, further improving accuracy.
A schematic diagram of some embodiments of the associated account analysis apparatus of the present disclosure is shown in FIG. 6. The association relationship establishment unit 601 can establish association relationships between accounts and number identifiers according to account information, and establish association relationships between accounts and login terminals according to login information. In some embodiments, the number identifier is a mobile phone number or an identity card number.
The relationship graph generating unit 602 can establish the relationship graph according to the association relationships between accounts and number identifiers and between accounts and login terminals, with accounts, number identifiers, and login terminals as vertices.
The associated account determination unit 603 can determine, according to the relationship graph, that accounts whose shortest path to one another is less than the predetermined first threshold are associated accounts.
Such an associated account analysis apparatus can establish a relationship graph of accounts, number identifiers, and login terminals based on account information and login information, and determine associated accounts according to the shortest paths between accounts. This improves the ability to identify associated accounts and, in turn, account management, which benefits network security management and control.
In some embodiments, as shown in FIG. 6, the associated account analysis apparatus may further include a weight determination unit 604 and a similarity determination unit 605. The weight determination unit 604 can determine the association weights between an account and its number identifiers and login terminals according to the account data information associated with the account; when the relationship graph is generated, each association weight is used as the weight of the path between the vertices of the corresponding association relationship. The similarity determination unit 605 can determine the similarity between associated accounts according to the sum of the path weights along the shortest path between them. The sum of the weights is negatively correlated with the similarity, for example inversely proportional to it. After the associated accounts have been screened out, their similarity is thus further determined, which helps to measure the degree of association between accounts.
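The two relationships described here (an association weight that falls as supporting evidence grows, and a similarity that falls as the path weight grows) can be illustrated with a small sketch. The specific formulas are assumptions chosen only to satisfy those monotonicity conditions; the disclosure does not fix them. The names `association_weight`, `similarity`, and the event importance values are hypothetical.

```python
def association_weight(event_counts, event_importance):
    """Positive weight that decreases as an association appears more often
    or in more important events (hypothetical formula: 1 / weighted count)."""
    score = sum(event_importance[event] * count
                for event, count in event_counts.items())
    if score <= 0:
        raise ValueError("an association needs at least one supporting event")
    return 1.0 / score


def similarity(path_weight):
    """Similarity inversely proportional to the summed weights of the
    shortest path (hypothetical formula: 1 / path weight)."""
    if path_weight <= 0:
        raise ValueError("path weight must be positive")
    return 1.0 / path_weight


if __name__ == "__main__":
    importance = {"order": 2.0, "login": 1.0}  # hypothetical event importance
    w = association_weight({"order": 3, "login": 2}, importance)
    print(w, similarity(w + w))
```

With such a weight used as the edge length, a shortest-path search returns the most strongly associated accounts first.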
In some embodiments, when generating the relationship graph, the relationship graph generating unit 602 may set the edge length to be positively correlated with the path weight, for example equal to it. The edge length is then negatively correlated with the strength of the association relationship: the stronger the association, the shorter the edge. The associated account determination unit 603 then uses a shortest path algorithm to compute a number of associated accounts, equal to the predetermined second threshold, whose path length is less than the predetermined first threshold; these are the associated accounts with the shortest paths (highest similarity) among those whose association lies within the predetermined first threshold.
Such an associated account analysis apparatus can obtain the predetermined number of associated accounts with the highest similarity through a single shortest-path computation while ensuring that the association strength exceeds the predetermined requirement, which improves computational efficiency and reduces the computational load on the equipment.
In some embodiments, the associated account analysis apparatus may further include an associated information application unit 606, which can supplement real-name information of accounts according to the associated accounts and the key associated accounts. For example, when the similarity between two accounts is greater than a predetermined similarity threshold, the two accounts can be considered to belong to the same user and to have the same real-name information. Such an apparatus can supplement users' real-name information and strengthen network supervision. In addition, when looking up a user who has no real-name information, the associated information application unit 606 can determine that user's identity by investigating the real-name users of the associated accounts, increasing the probability that a security trace succeeds.
In some embodiments, the associated information application unit 606 can reduce the information pushed to at least one of the associated accounts or the key associated accounts, so that a user's sockpuppet (alternate) accounts are identified and excluded first during operation. This avoids sending the same information to the same user repeatedly, improving operational efficiency and saving SMS costs.
In some embodiments, the associated account analysis apparatus may further include a threshold adjustment unit 607, which can modify the predetermined first threshold and the predetermined second threshold mentioned above according to the operating results of the associated information application unit 606, and can also modify the rules for generating association weights, the weights of events, and the like used in the weight determination process. The parameters are thereby corrected continuously during operation and application, further improving accuracy.
A schematic structural diagram of one embodiment of the associated account analysis apparatus of the present disclosure is shown in FIG. 7. The apparatus includes a memory 701 and a processor 702. The memory 701 may be a magnetic disk, flash memory, or any other non-volatile storage medium, and is used to store the instructions of the corresponding embodiments of the associated account analysis method described above. The processor 702 is coupled to the memory 701 and may be implemented as one or more integrated circuits, such as a microprocessor or microcontroller. The processor 702 is used to execute the instructions stored in the memory, which improves the ability to identify associated accounts and, in turn, account management, and benefits network security management and control.
In one embodiment, as shown in FIG. 8, the associated account analysis apparatus 800 includes a memory 801 and a processor 802. The processor 802 is coupled to the memory 801 through a bus 803. The apparatus 800 may also be connected to an external storage device 805 through a storage interface 804 in order to access external data, and to a network or another computer system (not shown) through a network interface 806. These are not described in detail here.
In this embodiment, data and instructions are stored in the memory and the instructions are processed by the processor, which improves the ability to identify associated accounts and, in turn, account management, and benefits network security management and control.
In another embodiment, a computer-readable storage medium stores computer program instructions that, when executed by a processor, implement the steps of the method in the corresponding embodiments of the associated account analysis method. Those skilled in the art should understand that embodiments of the present disclosure may be provided as a method, an apparatus, or a computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present disclosure may take the form of a computer program product implemented on one or more computer-usable non-transitory storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code.
The present disclosure is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present disclosure. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing equipment to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing equipment produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing equipment to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing equipment, so that a series of operational steps is performed on the computer or other programmable equipment to produce computer-implemented processing, such that the instructions executed on the computer or other programmable equipment provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
The present disclosure has now been described in detail. To avoid obscuring the concept of the present disclosure, some details well known in the art have not been described. From the above description, those skilled in the art can fully understand how to implement the technical solutions disclosed herein.
The method and apparatus of the present disclosure may be implemented in many ways, for example by software, hardware, firmware, or any combination of software, hardware, and firmware. The above order of the steps of the method is for illustration only, and the steps of the method of the present disclosure are not limited to the order specifically described above unless otherwise specifically stated. In addition, in some embodiments, the present disclosure may also be implemented as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the method according to the present disclosure. The present disclosure thus also covers a recording medium storing a program for executing the method according to the present disclosure.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present disclosure and not to limit them. Although the present disclosure has been described in detail with reference to preferred embodiments, those of ordinary skill in the art should understand that the specific embodiments of the present disclosure may still be modified, or some technical features may be replaced by equivalents, without departing from the spirit of the technical solutions of the present disclosure, and all such modifications and replacements shall fall within the scope of the technical solutions claimed by the present disclosure.

Claims (13)

  1. An associated account analysis method, comprising:
    establishing an association relationship between an account and a number identifier according to account information, wherein the number identifier is a mobile phone number or an identity card number;
    establishing an association relationship between the account and a login terminal according to login information;
    establishing a relationship graph according to the association relationship between the account and the number identifier and the association relationship between the account and the login terminal, wherein accounts, number identifiers, and login terminals are vertices, and vertices having an association relationship are connected; and
    determining, according to the relationship graph, that accounts whose shortest path to one another is less than a predetermined first threshold are associated accounts.
  2. The associated account analysis method according to claim 1, further comprising:
    determining, according to account data information associated with the account, association weights of the association relationship between the account and the number identifier and of the association relationship between the account and the login terminal, as the weights of the paths between the vertices of the corresponding association relationships, wherein the account data information includes one or more of order information, login information, or delivery information; and
    determining the similarity between associated accounts according to the sum of the path weights along the shortest path between the associated accounts.
  3. The associated account analysis method according to claim 2, wherein the association weight is greater than 0 and is negatively correlated with at least one of: the number of times the association relationship appears in the account data information, or the predetermined importance of the event to which the association relationship belongs when it appears in the account data information.
  4. The associated account analysis method according to claim 1, wherein the determining that accounts whose shortest path to one another is less than a predetermined first threshold are associated accounts comprises:
    determining that an account whose shortest path to a predetermined seed account is less than the predetermined first threshold is a target account;
    determining that the target account is an associated account of the predetermined seed account.
  5. The associated account analysis method according to claim 4, wherein:
    the establishing a relationship graph comprises:
    generating abstract vertex identifiers for the account, the number identifier, and the login terminal, and recording the correspondence between the abstract vertex identifiers and the accounts; and
    connecting abstract vertex identifiers that have an association relationship to generate the relationship graph;
    the determining that accounts whose shortest path to one another is less than a predetermined first threshold are associated accounts comprises:
    taking the abstract vertex identifier of the predetermined seed account as a starting point, determining the abstract vertex identifiers in the relationship graph whose distance from the predetermined seed account is less than the predetermined first threshold and which correspond to an account; and
    restoring the determined abstract vertex identifiers to accounts according to the correspondence between the abstract vertex identifiers and the accounts.
  6. The associated account analysis method according to claim 2, further comprising:
    selecting, in ascending order of the path weights, the first predetermined-second-threshold number of accounts as the key associated accounts of the account.
  7. The method according to claim 3, wherein the edge length in the relationship graph is positively correlated with the weight;
    the determining, according to the relationship graph, that accounts whose shortest path to one another is less than a predetermined first threshold are associated accounts is: determining, according to a shortest path algorithm, a predetermined-second-threshold number of accounts whose path length is less than the predetermined distance as the key associated accounts of the account.
  8. The associated account analysis method according to claim 6 or 7, further comprising:
    storing at least one of the associated accounts or the key associated accounts of each account; and
    supplementing real-name information of accounts according to at least one of the associated accounts or the key associated accounts.
  9. The associated account analysis method according to claim 6 or 7, further comprising:
    storing at least one of the associated accounts or the key associated accounts of each account; and
    reducing the information pushed to at least one of the associated accounts or the key associated accounts, and adjusting at least one of the predetermined first threshold or the predetermined second threshold according to the promotion effect of the information.
  10. The associated account analysis method according to claim 1, wherein the determining, according to the relationship graph, that accounts whose shortest path to one another is less than a predetermined first threshold are associated accounts comprises:
    finding, according to a breadth-first search (BFS) algorithm, all vertices that can be reached within the predetermined first threshold distance of a source vertex;
    computing, according to Dijkstra's algorithm and within the range of all the reachable vertices, the first predetermined number of vertices reaching the source vertex; and
    determining that the accounts corresponding to the first predetermined number of vertices are associated accounts of the source vertex account.
  11. An associated account analysis apparatus, comprising:
    an association relationship establishment unit configured to establish an association relationship between an account and a number identifier according to account information, and to establish an association relationship between the account and a login terminal according to login information, wherein the number identifier is a mobile phone number or an identity card number;
    a relationship graph generating unit configured to establish a relationship graph according to the association relationship between the account and the number identifier and the association relationship between the account and the login terminal, wherein accounts, number identifiers, and login terminals are vertices, and vertices having an association relationship are connected; and
    an associated account determination unit configured to determine, according to the relationship graph, that accounts whose shortest path to one another is less than a predetermined first threshold are associated accounts.
  12. An associated account analysis apparatus, comprising:
    a memory; and a processor coupled to the memory, the processor being configured to execute the method according to any one of claims 1 to 10 based on instructions stored in the memory.
  13. A computer-readable storage medium having computer program instructions stored thereon, wherein the instructions, when executed by a processor, implement the steps of the method according to any one of claims 1 to 10.
PCT/CN2020/086930 2019-06-28 2020-04-26 Associated account analysis method and apparatus, and computer-readable storage medium WO2020259054A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910572269.X 2019-06-28
CN201910572269.XA CN110287688B (en) 2019-06-28 2019-06-28 Associated account analysis method and device and computer-readable storage medium

Publications (1)

Publication Number Publication Date
WO2020259054A1 true WO2020259054A1 (en) 2020-12-30

Family

ID=68019412

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/086930 WO2020259054A1 (en) 2019-06-28 2020-04-26 Associated account analysis method and apparatus, and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN110287688B (en)
WO (1) WO2020259054A1 (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110287688B (en) * 2019-06-28 2020-11-24 京东数字科技控股有限公司 Associated account analysis method and device and computer-readable storage medium
CN111291234A (en) * 2020-03-31 2020-06-16 京东数字科技控股有限公司 Account risk probability assessment method, device and system and storage medium
CN111695019B (en) * 2020-06-11 2023-08-08 腾讯科技(深圳)有限公司 Method and device for identifying associated account
CN113760939A (en) * 2020-07-01 2021-12-07 北京沃东天骏信息技术有限公司 Account type determination method, device and equipment
CN111701247B (en) * 2020-07-13 2022-03-22 腾讯科技(深圳)有限公司 Method and equipment for determining unified account
CN112995283B (en) * 2021-02-03 2023-03-14 杭州海康威视系统技术有限公司 Object association method and device and electronic equipment
CN113254351B (en) * 2021-06-24 2022-02-15 支付宝(杭州)信息技术有限公司 Graph data generation method and device
CN114006737B (en) * 2021-10-25 2023-09-01 北京三快在线科技有限公司 Account safety detection method and detection device
CN114742479B (en) * 2022-06-10 2022-09-06 深圳竹云科技股份有限公司 Account identification method, account identification device, server and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102156726A (en) * 2011-04-01 2011-08-17 中国测绘科学研究院 Geographic element querying and extending method based on semantic similarity
CN107404408A (en) * 2017-08-30 2017-11-28 北京邮电大学 A kind of virtual identity association recognition methods and device
CN107563885A (en) * 2017-08-08 2018-01-09 阿里巴巴集团控股有限公司 A kind of arbitrage recognition methods and device
US20180342173A1 (en) * 2017-05-24 2018-11-29 Winston Jordan Method of Monitoring Glucose Levels for Weight Loss
CN109086317A (en) * 2018-06-28 2018-12-25 招联消费金融有限公司 Risk control method and relevant apparatus
US10192461B2 (en) * 2017-06-12 2019-01-29 Harmony Helper, LLC Transcribing voiced musical notes for creating, practicing and sharing of musical harmonies
CN110287688A (en) * 2019-06-28 2019-09-27 京东数字科技控股有限公司 Associated account number analysis method, device and computer readable storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105262794B (en) * 2015-09-17 2018-08-17 腾讯科技(深圳)有限公司 Content put-on method and device
CN107733774B (en) * 2016-08-11 2020-06-16 北京国双科技有限公司 Account number association method and device
CN108288168A (en) * 2018-02-10 2018-07-17 张宇 Borrow or lend money register method, terminal and the computer readable storage medium of service account
CN109639719B (en) * 2019-01-07 2020-01-24 武汉稀云科技有限公司 Identity verification method and device based on temporary identifier

Also Published As

Publication number Publication date
CN110287688B (en) 2020-11-24
CN110287688A (en) 2019-09-27

Similar Documents

Publication Publication Date Title
WO2020259054A1 (en) Associated account analysis method and apparatus, and computer-readable storage medium
US11222285B2 (en) Feature selection method, device and apparatus for constructing machine learning model
CN107424069B (en) Wind control feature generation method, risk monitoring method and equipment
KR102175226B1 (en) Methods and devices for controlling data risk
CN107122369B (en) Service data processing method, device and system
JP2017123168A (en) Method for making entity mention in short text associated with entity in semantic knowledge base, and device
US10311288B1 (en) Determining identity of a person in a digital image
RU2617921C2 (en) Category path recognition method and system
WO2021129379A1 (en) Information sharing chain generation method and apparatus, electronic device, and storage medium
US20210335025A1 (en) Data processing method and apparatus, electronic device, and storage medium
WO2020177450A1 (en) Information merging method, transaction query method and apparatus, computer and storage medium
CN110224859B (en) Method and system for identifying a group
CN111966912A (en) Recommendation method and device based on knowledge graph, computer equipment and storage medium
CN111324883B (en) Internet-based E-commerce platform intrusion detection method and computer equipment
WO2019061667A1 (en) Electronic apparatus, data processing method and system, and computer-readable storage medium
CN111831279A (en) Interface code generation method and device
CN107016028B (en) Data processing method and apparatus thereof
CN105550240B (en) A kind of method and device of recommendation
CN105183867B (en) Data processing method and device
CN111860655B (en) User processing method, device and equipment
CN113792800B (en) Feature generation method and device, electronic equipment and storage medium
CN114818843A (en) Data analysis method and device and computing equipment
CN107248929B (en) Strong correlation data generation method of multi-dimensional correlation data
CN114490095B (en) Request result determination method and device, storage medium and electronic device
CN112667679B (en) Data relationship determination method, device and server

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 20830803

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 EP: PCT application non-entry in European phase

Ref document number: 20830803

Country of ref document: EP

Kind code of ref document: A1