CN108009168B - User account identification method and device - Google Patents

Info

Publication number
CN108009168B
Authority
CN
China
Prior art keywords
user account
user
information
account
matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610933697.7A
Other languages
Chinese (zh)
Other versions
CN108009168A (en)
Inventor
黄引刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201610933697.7A
Publication of CN108009168A
Application granted
Publication of CN108009168B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a user account identification method and device, and belongs to the technical field of internet. The method comprises the following steps: acquiring attribute information of a first user account and attribute information of a second user account from an account server, wherein the attribute information comprises at least one of a bound third party account, historical login point information and user data information; detecting whether the similarity of the attribute information of the first user account and the attribute information of the second user account reaches a preset condition or not; if yes, acquiring real name information of the first user account and real name information of the second user account; detecting whether the real name information of the first user account is the same as the real name information of the second user account; and if so, determining the first user account and the second user account as corresponding to the same natural person. The invention solves the problem that the identification of whether two different user accounts correspond to the same natural person is inaccurate, and achieves the effect of more accurate identification.

Description

User account identification method and device
Technical Field
The embodiment of the invention relates to the technical field of internet, in particular to a user account identification method and device.
Background
The user account is an identification of the user using the internet service, and when the user uses the internet service provided by the network platform, the user account of the network platform usually needs to be registered.
Because many network platforms provide different types of internet services, the same natural person may register multiple user accounts across multiple network platforms. For example, a person may register one user account on network platform A and two user accounts on network platform B, one used for work and the other for life and leisure. When it is necessary to identify whether two different user accounts correspond to the same natural person, the prior art compares the similarity of the information published by the two user accounts. For example, a first keyword set is extracted from the information published by user account a, a second keyword set is extracted from the information published by user account b, and the similarity between the two user accounts is calculated according to the occurrence probability of identical keywords in the first keyword set and the second keyword set.
Suppose user account a and user account b correspond to the same natural person. If the two accounts belong to different network platforms, the information that the natural person publishes through each account may differ, so the similarity comparison is inaccurate. Likewise, if the two accounts are on the same network platform but are used for work and for leisure respectively, they generally do not publish similar information, and the similarity comparison is again inaccurate. Because the result of comparing information similarity is inaccurate, this approach is inaccurate in identifying whether two different user accounts correspond to the same natural person.
Disclosure of Invention
In order to solve the problem that in the prior art, the identification of whether two different user accounts correspond to the same natural person is inaccurate, the embodiment of the invention provides a user account identification method and device. The technical scheme is as follows:
in a first aspect, a method for identifying a user account is provided, where the method includes:
acquiring attribute information of a first user account and attribute information of a second user account from an account server, wherein the attribute information comprises at least one of a bound third party account, historical login point information and user data information;
detecting whether the similarity of the attribute information of the first user account and the attribute information of the second user account reaches a preset condition or not;
when the similarity between the attribute information of the first user account and the attribute information of the second user account reaches the preset condition, acquiring the real name information of the first user account and the real name information of the second user account;
detecting whether the real name information of the first user account is the same as the real name information of the second user account;
and when the real name information of the first user account is the same as that of the second user account, determining the first user account and the second user account as corresponding to the same natural person.
In a second aspect, an apparatus for identifying a user account is provided, the apparatus comprising:
the first acquisition module is used for acquiring the attribute information of a first user account and the attribute information of a second user account from an account server, wherein the attribute information comprises at least one of a bound third party account, historical login point information and user data information;
the first detection module is used for detecting whether the similarity between the attribute information of the first user account acquired by the first acquisition module and the attribute information of the second user account reaches a preset condition or not;
the second obtaining module is used for obtaining the real name information of the first user account and the real name information of the second user account when the first detecting module detects that the similarity between the attribute information of the first user account and the attribute information of the second user account reaches the preset condition;
the second detection module is used for detecting whether the real name information of the first user account acquired by the second acquisition module is the same as the real name information of the second user account;
the determining module is configured to determine that the first user account and the second user account correspond to the same natural person when the second detecting module detects that the real name information of the first user account is the same as the real name information of the second user account.
The technical scheme provided by the embodiment of the invention has the following beneficial effects:
when the similarity of the attribute information of the first user account and the second user account reaches a preset condition, whether the real name information of the first user account and the second user account is the same or not is detected, and whether the first user account and the second user account correspond to the same natural person or not is determined according to the detection result of the real name information; compared with the prior art, the method and the device have the advantages that whether the two user accounts belong to the same natural person or not is identified by combining the two dimensions of the attribute information and the real name information of the two user accounts, and when the similarity of the attribute information of the two user accounts reaches the preset condition and the real name information of the two user accounts is the same, the two user accounts are determined to correspond to the same natural person, so that the identification of whether the different user accounts correspond to the same natural person or not is more accurate.
Drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an implementation environment involved in a user account identification method according to some embodiments of the present invention;
FIG. 2 is a flowchart of a method for identifying a user account according to an embodiment of the present invention;
FIG. 3 is a flowchart of a method for identifying a user account according to another embodiment of the present invention;
FIG. 4A is a flowchart of a method for identifying a user account provided in a further embodiment of the present invention;
FIG. 4B is a flowchart of a method for calculating the association degree between the first user account and the second user account according to an embodiment of the present invention;
FIG. 4C is a schematic illustration of a weighted bipartite graph provided in an embodiment of the invention;
FIG. 4D is a flowchart of a method for calculating the association between the first user account and the second user account according to another embodiment of the present invention;
FIG. 5 is a flow chart of a method for identifying a user account provided in a further embodiment of the present invention;
FIG. 6 is a flow chart of a method for identifying a user account provided in a further embodiment of the present invention;
FIG. 7 is a flowchart of a method for identifying a user account provided in a further embodiment of the present invention;
FIG. 8 is a block diagram showing the structure of a user account identification apparatus according to an embodiment of the present invention;
FIG. 9 is a schematic structural diagram of a server provided in one embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
FIG. 1 is a schematic diagram of an implementation environment involved in a user account identification method according to some embodiments of the present invention. The implementation environment includes: a server 110 of application A, a server 120 of application B, an account analysis server 130, and a user terminal 140.
The server 110 of application A and the server 120 of application B provide account generation, account management and other network services for the user terminal 140. The server 110 of application A can also obtain and store the attribute information of user account A of application A, and the server 120 of application B can also obtain and store the attribute information of user account B of application B. Optionally, the attribute information includes: a third party account bound to the user account, information on the devices from which the user account has historically logged in, the wireless network identifier used when the user account logged in, the IP address information used when the user account logged in, and the like.
Optionally, the server 110 of application A may obtain the real name information of user account A by mining the social information of user account A, and the server 120 of application B may obtain the real name information of user account B by mining the social information of user account B. Real name information is information that a natural person uses to establish identity in real society, such as a name, an identification number, a driver's license number, or a passport number.
The account analysis server 130 is configured to obtain the attribute information and real name information of user accounts stored in the server 110 of application A or the server 120 of application B, and is able to analyze, according to that attribute information and real name information, whether different user accounts correspond to the same natural person.
Each of the server 110 of application A, the server 120 of application B, and the account analysis server 130 may be a single server, a server cluster composed of multiple servers among which data may be shared, or a cloud computing service center. The embodiment of the present invention does not limit the physical implementation of each server.
The user terminal 140 may be a mobile phone, a tablet computer, a laptop computer, a desktop computer, or the like. A client of application A or a client of application B may be installed on the user terminal 140, which uses the network service provided by the server 110 of application A through the client of application A, and the network service provided by the server 120 of application B through the client of application B. Generally, before the user terminal 140 can use the network service provided by the server 110 of application A, user account A of application A must be registered, and before it can use the network service provided by the server 120 of application B, user account B of application B must be registered. The user terminal 140 may also register two or more user accounts for the same application.
The user terminal 140 is connected to the server 110 of application A or the server 120 of application B via a communication network. Optionally, the communication network is a wired network or a wireless network.
Optionally, the implementation environment further includes an application server 150, where the application server 150 is configured to provide the user terminal 140 with a network service for payment or advertisement promotion according to the analysis result of the account analysis server 130.
In one possible implementation, the account analysis server 130 may be combined with the server 110 of application A or the server 120 of application B; that is, the server 110 of application A or the server 120 of application B can also provide the network service provided by the account analysis server 130.
After determining that the first user account and the second user account correspond to the same natural person, the account analysis server 130 sends the identification result to the application server 150. For example, in a payment system, the account analysis server 130 sets the same credit score for the first user account and the second user account, and the application server 150 sets a uniform credit limit for the two accounts according to the received credit scores. Alternatively, the account analysis server 130 obtains the information published by the first user account and the information published by the second user account and combines them to determine the interest characteristics of the natural person corresponding to both accounts, and the application server 150 provides network services, such as advertisement promotion and interest-related application recommendation, to the first user account and/or the second user account according to those interest characteristics.
Those skilled in the art will appreciate that the above implementation environment may include more or fewer devices. For example, the application server 150 is optional. The implementation environment described above is only illustrative and does not constitute a specific limitation.
FIG. 2 is a flowchart of a method for identifying a user account according to an embodiment of the present invention. The method may be applied to the account analysis server 130 shown in FIG. 1, or to another server capable of analyzing, according to the attribute information of user accounts, whether different user accounts correspond to the same natural person. As shown in FIG. 2, the user account identification method includes:
step 201, obtaining attribute information of a first user account and attribute information of a second user account from an account server.
The account server is the server 110 of the application program a or the server 120 of the application program B shown in fig. 1, and is used for acquiring and storing the attribute information of the user account.
Optionally, the first user account and the second user account are user accounts of the same application program, such as: the first user account and the second user account are both user accounts of the application program A or both user accounts of the application program B.
Optionally, the first user account and the second user account are user accounts of different applications, for example, the first user account is a user account of application A, and the second user account is a user account of application B.
Optionally, the attribute information includes a bound third party account, where the third party account is an account that serves as a unique identifier, for example, any one of a social account, a mailbox address, a mobile phone number, and a certificate number. The term "third party" is relative to the account server: the third party account may be another user account, registered by the user on a different account server, that is distinct from the first user account and the second user account and is capable of uniquely identifying one natural person among many.
Optionally, the attribute information includes historical login point information, where the historical login point information is: any one of a history login device, a history login wireless network and a history login IP address.
Optionally, the attribute information further includes user profile information, where the user profile information refers to related information actively filled or issued by the user, such as: any one of a nickname of the user, age of the user, sex of the user, address, and issued status information.
Step 202, detecting whether the similarity between the attribute information of the first user account and the attribute information of the second user account reaches a preset condition.
Illustratively, when the attribute information is a bound third party account, the preset condition is that the third party account bound to the first user account is the same as the third party account bound to the second user account, that is, the similarity between the attribute information of the first user account and the attribute information of the second user account is 100%.
Illustratively, when the attribute information is historical login point information, that is, any one of historical login device information, a historical login wireless network and a historical login IP address, the preset condition is that the association degree between the first user account and the second user account, calculated from the historical login point information of the two accounts, reaches a predetermined threshold. The predetermined threshold is set according to how strong an association between the first user account and the second user account is required: the stricter the requirement, the larger the predetermined threshold.
Step 203, when the similarity between the attribute information of the first user account and the attribute information of the second user account reaches a preset condition, acquiring the real name information of the first user account and the real name information of the second user account.
When the similarity between the attribute information of the first user account and the attribute information of the second user account reaches the preset condition, a certain association exists between the two accounts, and they may correspond to the same natural person. For example, when the attribute information is a bound third party account and the third party account bound to the first user account is the same as that bound to the second user account, the users of the two accounts may be the same natural person or persons in a close relationship. For another example, when the attribute information is historical login point information and the association degree calculated from the historical login point information of the two accounts reaches the predetermined threshold, the two accounts have logged in at the same historical login points many times, so the users of the two accounts may be the same natural person or persons in a close relationship.
In one possible implementation, the user fills in real name information in the personal profile when registering the user account; the account server obtains the real name information from the personal profile filled in by the user, and the account analysis server 130 then obtains the real name information of the user account from the account server.
In another possible implementation, the account server mines the real name information of the user account from the social information of the user account. For example, the account server obtains the real name information from the remark name of the user account in a chat group; for another example, the account server obtains the real name information from the remark names that other user accounts having a friend relationship with the user account have assigned to it. The account analysis server 130 then obtains the real name information of the user account from the account server.
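The remark-name mining described above can be illustrated with a minimal sketch. The description does not specify how remark names are turned into real name information, so the majority vote below, the function name `infer_real_name`, and the `min_votes` parameter are all assumptions made purely for illustration.

```python
from collections import Counter
from typing import Iterable, Optional


def infer_real_name(remark_names: Iterable[str], min_votes: int = 2) -> Optional[str]:
    """Guess a real name from the remark names that friends assigned to an account.

    Assumption: the remark name used most often is taken as the real name,
    and only if at least `min_votes` friends agree on it.
    """
    # Ignore empty remarks and surrounding whitespace before counting.
    counts = Counter(name.strip() for name in remark_names if name and name.strip())
    if not counts:
        return None
    name, votes = counts.most_common(1)[0]
    return name if votes >= min_votes else None
```

With this hypothetical rule, an account remarked as "Zhang San" by two friends and as "Boss Zhang" by a third would be assigned the real name "Zhang San", while an account with a single remark would be left without real name information.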
Step 204, detecting whether the real name information of the first user account is the same as the real name information of the second user account.
Whether the real name information of the first user account is the same as that of the second user account is detected, so as to further confirm whether the first user account and the second user account correspond to the same natural person.
Step 205, when the real name information of the first user account is the same as the real name information of the second user account, determining the first user account and the second user account as corresponding to the same natural person.
In summary, in the user account identification method provided in the embodiment of the present invention, when it is detected that the similarity between the attribute information of the first user account and the attribute information of the second user account reaches the preset condition, it is detected whether the real name information of the first user account and the real name information of the second user account are the same, and it is determined whether the first user account and the second user account correspond to the same natural person according to the detection result of the real name information; compared with the prior art, the method and the device have the advantages that whether the two user accounts belong to the same natural person or not is identified by combining the two dimensions of the attribute information and the real name information of the two user accounts, and when the similarity of the attribute information of the two user accounts reaches the preset condition and the real name information of the two user accounts is the same, the two user accounts are determined to correspond to the same natural person, so that the identification of whether the different user accounts correspond to the same natural person or not is more accurate.
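Steps 201 to 205 can be summarized in a short sketch: the attribute similarity acts as a gate, and real name information is compared only when the gate passes. The names `AccountRecord`, `same_third_party_account`, `association_at_least` and `same_natural_person` are hypothetical. The sketch also assumes that an account may bind several third party accounts (so "the same" is read as "at least one in common"), that missing real name information counts as a mismatch, and that the association degree function is supplied by the caller.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Optional


@dataclass(frozen=True)
class AccountRecord:
    """Hypothetical view of what the account server stores for one user account."""
    account_id: str
    third_party_accounts: FrozenSet[str] = frozenset()
    real_name: Optional[str] = None  # e.g. a name or certificate number, if known


PresetCondition = Callable[[AccountRecord, AccountRecord], bool]


def same_third_party_account(a: AccountRecord, b: AccountRecord) -> bool:
    """Preset condition for bound third party accounts: at least one in common."""
    return bool(a.third_party_accounts & b.third_party_accounts)


def association_at_least(
    association: Callable[[AccountRecord, AccountRecord], float],
    threshold: float,
) -> PresetCondition:
    """Preset condition for historical login points: association degree reaches a threshold."""
    def condition(a: AccountRecord, b: AccountRecord) -> bool:
        return association(a, b) >= threshold
    return condition


def same_natural_person(
    a: AccountRecord, b: AccountRecord, preset_condition: PresetCondition
) -> bool:
    # Steps 201-202: the attribute similarity must reach the preset condition.
    if not preset_condition(a, b):
        return False
    # Step 203: real name information is consulted only after the gate passes.
    if a.real_name is None or b.real_name is None:
        return False  # assumption: missing real name information means no match
    # Steps 204-205: identical real name information confirms the same natural person.
    return a.real_name == b.real_name
```

Two accounts that share a bound phone number but carry different real names, such as family members, are rejected at the second stage; two accounts with the same real name but no attribute similarity are rejected at the first.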
The attribute information in the embodiment of FIG. 2 may include any one of a bound third party account, a historical login device, a historical login wireless network, and a historical login IP address.
When the attribute information is the bound third party account, refer to the embodiment corresponding to FIG. 3.
When the attribute information is any one of historical login device information, historical login wireless network information, and historical login IP address information, refer to the embodiment corresponding to FIG. 4A. The association degree between two user accounts may be calculated from their attribute information by either of two algorithms: the first is a random walk algorithm (see the embodiment corresponding to FIG. 4B), and the second calculates the association degree according to the vector length of the historical login points (see the embodiment corresponding to FIG. 4D).
When the attribute information is the user profile information, refer to the embodiment corresponding to FIG. 5.
Fig. 3 is a flowchart of a method for identifying a user account according to another embodiment of the present invention, which is illustrated by applying the method to the account analysis server 130 shown in fig. 1. Illustratively, the attribute information in this embodiment is a bound third party account. As shown in fig. 3, the user account identification method includes:
step 301, acquiring a third party account bound to the first user account and a third party account bound to the second user account from an account server.
The third party account is an account that serves as a unique identifier, for example, any one of a social account, a mailbox address, a mobile phone number and a certificate number. The term "third party" is relative to the account server: the third party account may be another user account, registered by the user on a different account server, that is distinct from the first user account and the second user account and is capable of uniquely identifying one natural person among many.
Step 302, detecting whether the third party account bound to the first user account is the same as the third party account bound to the second user account.
When the attribute information is the bound third party account, corresponding to step 202, the similarity of the attribute information reaches the preset condition, that is, the bound third party accounts are the same.
Step 303, when the third party account bound to the first user account is the same as the third party account bound to the second user account, acquiring the real name information of the first user account and the real name information of the second user account.
When the first user account and the second user account are bound to the same third party account, the users of the two accounts may be the same natural person or persons in a close relationship.
Step 304, detecting whether the real name information of the first user account is the same as the real name information of the second user account.
Step 305, when the real name information of the first user account is the same as the real name information of the second user account, determining the first user account and the second user account as corresponding to the same natural person.
Steps 303 to 305 are similar to steps 203 to 205 and are not described again here.
In summary, in the user account identification method provided in the embodiment of the present invention, when it is detected that the third party account bound to the first user account is the same as the third party account bound to the second user account, whether the real-name information of the first user account is the same as the real-name information of the second user account is detected, and whether the first user account and the second user account correspond to the same natural person is determined according to the detection result of the real-name information; compared with the prior art, the method and the device have the advantages that whether the two user accounts belong to the same natural person or not is identified by combining the two dimensions of the third party account and the real name information bound by the two user accounts, and the two user accounts correspond to the same natural person or not is determined when the third party accounts bound by the two user accounts are the same and the real name information of the two user accounts is the same, so that the identification of whether the different user accounts correspond to the same natural person or not is more accurate.
FIG. 4A is a flowchart of a method for identifying a user account according to another embodiment of the present invention, illustrated by applying the method to the account analysis server 130 shown in FIG. 1. In this embodiment, the attribute information is any one of a historical login device, a historical login wireless network, and a historical login IP address, and the algorithm for calculating the association degree between two user accounts is a random walk algorithm. As shown in FIG. 4A, the user account identification method includes:
step 401, obtaining the historical login point information of the first user account and the historical login point information of the second user account from the account server.
The historical login point information is any one of a historical login device, a historical login wireless network and a historical login IP address.
Optionally, if the historical login point information is the historical login device, the account server obtains unique identification information of the historical login device, such as the IMEI (International Mobile Equipment Identity) of a mobile device.
Optionally, if the historical login point information is a historical login wireless network, the account server obtains the wireless network identifier of the wireless network used when the user account was logged in historically. The wireless network identifier may be the SSID (Service Set Identifier) name of a WiFi (Wireless Fidelity) network or hardware identification information of the wireless router. Optionally, when the historical login point information is a historical login wireless network, the influence of wireless networks used in public places needs to be eliminated, which is specifically implemented as follows: the number of user accounts simultaneously using the same wireless network is detected, and when that number exceeds a threshold, the wireless network is determined to be a wireless network used in a public place. For example, any WiFi network used by more than 10 user accounts at the same time is excluded.
Optionally, if the historical login point information is the historical login IP address, the account server obtains the IP address used when the user account was logged in historically. When accessing the network by dial-up, the terminal device generally uses an IP address assigned by the operator. Optionally, when the historical login point information is the historical login IP address, the case in which users surfing the Internet in a public place share the same IP address needs to be eliminated, which is specifically implemented as follows: the number of user accounts simultaneously using the same IP address is detected, and when that number reaches a threshold, the IP address is determined to be an IP address used for surfing the Internet in a public place. For example, any IP address used by more than 5 user accounts at the same time is excluded.
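The exclusion of public login points described above can be sketched as follows. This is only an illustrative sketch: the function name `shared_login_points`, the `(account, login_point)` record layout, and the simplification that all records fall within one time window (standing in for "simultaneously") are assumptions, not part of the described method.

```python
from collections import defaultdict

def shared_login_points(sessions, max_accounts):
    """Return the login points used by more than max_accounts distinct accounts.

    sessions: iterable of (account_id, login_point_id) pairs, assumed to fall
    within the same time window (a simplification of "simultaneously").
    """
    accounts_by_point = defaultdict(set)
    for account, point in sessions:
        accounts_by_point[point].add(account)
    return {point for point, accounts in accounts_by_point.items()
            if len(accounts) > max_accounts}

# Hypothetical records: three accounts share "cafe", two share "home".
sessions = [("u1", "cafe"), ("u2", "cafe"), ("u3", "cafe"),
            ("u1", "home"), ("u2", "home")]
public_points = shared_login_points(sessions, max_accounts=2)
# Keep only the records whose login point is not treated as public.
private_sessions = [s for s in sessions if s[1] not in public_points]
```

With the thresholds from the examples above, `max_accounts` would be 10 for WiFi networks and 5 for IP addresses.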
Optionally, when the attribute information is historical login point information, step 202 is replaced with steps 402 to 404.
Step 402, acquiring the login times of the first user account on each historical login point and the login times of the second user account on each historical login point.
Optionally, in order to reduce the amount of calculation, the number of times that the user account has logged in at each historical login point within a predetermined time period may be obtained. For example, the number of times the user account has logged in at each historical login point within one month is obtained.
Step 403, calculating the association degree between the first user account and the second user account according to the login times of the first user account on each historical login point and the login times of the second user account on each historical login point.
If the first user account and the second user account correspond to the same natural person, the historical login points of the first user account and the second user account are similar, and the calculated association degree of the first user account and the second user account is higher.
For example, when the historical login point information is the historical login device, a higher calculated association degree indicates that the two user accounts often use the same login device, and the two user accounts may correspond to the same natural person or to closely related persons. When the historical login point information is the historical login wireless network, a higher calculated association degree indicates that the two user accounts have used the same wireless router more times, and the two user accounts may correspond to the same natural person or to closely related persons. When the historical login point information is the historical login IP address, a higher calculated association degree indicates that the two user accounts have used the same IP address more times, and the two user accounts may correspond to the same natural person or to closely related persons.
Optionally, the association degree between the first user account and the second user account may be calculated by a random walk algorithm, as shown in fig. 4B:
Step 403a, establishing a first matrix with the user account as a matrix row and the historical login point as a matrix column.
The matrix elements in the first matrix are first weights, the first weights are obtained by dividing the login times of the ith user account on the kth historical login point by the total login times of the ith user account on each historical login point, i is equal to 1 or 2, and k represents an integer greater than 0.
That is,
$$A_0(i,k) = \frac{c(i,k)}{\sum_{k'=1}^{m} c(i,k')}$$
where $c(i,k)$ denotes the number of times the ith user account has logged in at the kth historical login point, $A_0(i,k)$ represents the weight of the ith user account logging in at the kth historical login point, namely the first weight, and $m$ represents the total number of historical login points at which the ith user account has logged in.
$A_0$ is the first matrix, with the ith user account as the matrix row and the kth historical login point as the matrix column.
Step 403b, establishing a second matrix with the historical login points as matrix rows and the user account as matrix columns.
The matrix elements in the second matrix are second weights, and the second weights are obtained by dividing the login times of the ith user account on the kth historical login point by the total login times of all user accounts logged in the kth historical login point.
That is,
$$B_0(i,k) = \frac{c(i,k)}{\sum_{i'=1}^{n} c(i',k)}$$
where $c(i,k)$ denotes the number of times the ith user account has logged in at the kth historical login point, $B_0(i,k)$ represents the weight of the ith user account among the logins at the kth historical login point, namely the second weight, and $n$ represents the total number of user accounts that have logged in at the kth historical login point.
$B_0$ is the second matrix, with the kth historical login point as the matrix row and the ith user account as the matrix column.
Illustratively, a weighted bipartite graph as shown in fig. 4C may be obtained according to the calculated first and second weights. As shown in FIG. 4C, the 1 st user account has been logged in a total of 15 times, wherein the number of times of logging in to the 1 st historical login point is 5 times, the number of times of logging in to the 2 nd historical login point is 10 times, and the 2 nd user account has been logged in a total of 50 times, wherein the number of times of logging in to the 2 nd historical login point is 20 times, and the number of times of logging in to the 3 rd historical login point is 30 times. The total number of times of logging in the user account on the 2 nd historical login point is 30, the number of times of logging in the 1 st user account is 10, and the number of times of logging in the 2 nd user account is 20. The weight from the 1 st user account to the 1 st historical login point is 5/15, the weight from the 1 st user account to the 2 nd historical login point is 10/15, the weight from the 1 st user account to the 3 rd historical login point is 0, the weight from the 2 nd user account to the 1 st historical login point is 0, the weight from the 2 nd user account to the 2 nd historical login point is 20/50, the weight from the 2 nd user account to the 3 rd historical login point is 30/50, the weight from the 2 nd historical login point to the 1 st user account is 10/30, and the weight from the 2 nd historical login point to the 2 nd user account is 20/30.
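The weights in the fig. 4C example can be reproduced with a short sketch. The function names `first_matrix` and `second_matrix` and the nested-list layout of the login counts are illustrative assumptions; the sketch also assumes every account and every login point has at least one login, so that no total is zero.

```python
from fractions import Fraction

# counts[i][k]: number of logins of account i at historical login point k,
# using the values from the fig. 4C example.
counts = [
    [5, 10, 0],   # 1st user account, 15 logins in total
    [0, 20, 30],  # 2nd user account, 50 logins in total
]

def first_matrix(counts):
    """Rows are accounts, columns are login points; each count is divided by
    the account's total number of logins."""
    return [[Fraction(c, sum(row)) for c in row] for row in counts]

def second_matrix(counts):
    """Rows are login points, columns are accounts; each count is divided by
    the total number of logins at that login point."""
    n_points = len(counts[0])
    totals = [sum(row[k] for row in counts) for k in range(n_points)]
    return [[Fraction(row[k], totals[k]) for row in counts]
            for k in range(n_points)]

A0 = first_matrix(counts)
B0 = second_matrix(counts)
```

Exact fractions are used here only so that the results can be compared directly with the weights listed in the example, such as 10/15 and 20/30.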
Step 403c, performing iterative operation on the first matrix and the second matrix through a predetermined algorithm to obtain a third matrix with the user account as both the matrix row and the matrix column.
The third matrix is used for representing the association degree between the first user account and the second user account.
Optionally, if the predetermined algorithm is a random walk algorithm, the specific implementation manner of the iterative operation is as follows:
S1, in the ith iteration, performing iterative operation on the first matrix $A_i$ by using a first formula, where the first formula is: $A_{i+1} = s A_i + (1-s) A_i B_i A_i$
S2, in the ith iteration, performing iterative operation on the second matrix $B_i$ by using a second formula, where the second formula is: $B_{i+1} = s B_i + (1-s) B_i A_i B_i$
S3, in the ith iteration, performing iterative operation on the third matrix $R_i$ by using a third formula, where the third formula is: $R_{i+1} = s R_i + (1-s) R_i A_i B_i$
In the three formulas, $i$ represents the number of iterative operations and is an integer greater than 0, and $s$ represents a constant between 0 and 1.
$A_{i+1}$ and $B_{i+1}$ obtained by the iterative calculation of steps S1 and S2 are substituted into the (i + 2)th iteration.
Optionally, in order to prevent the values of the matrix elements from becoming too large, the first matrix and the second matrix obtained after each iteration need to be normalized. The normalization is specifically implemented as: dividing each matrix element by the sum of all matrix elements in its row, or dividing each matrix element by the sum of all matrix elements in its column. Normalization confines the matrix elements to values between 0 and 1. This embodiment does not limit the specific implementation of the normalization.
S4, detecting whether the number of iterative operations reaches a predetermined number.
Optionally, the predetermined number is set according to an empirical value; generally, the greater the number of iterations, the higher the accuracy of the obtained result. In practical applications, the operation is usually iterated 3 to 4 times.
S5, when the number of iterative operations reaches the predetermined number, obtaining the third matrix used for representing the association degree between the first user account and the second user account.
The iterative operation is stopped after the number of iterative operations reaches the predetermined number, and the values of the matrix elements of the matrix obtained after the iterative operation are determined as the association degrees between the user accounts.
For example, the third matrix obtained after n iterations is
$$R_n = \begin{pmatrix} x_1 & x_2 \\ x_3 & x_4 \end{pmatrix}$$
The third matrix $R_n$ is the calculated matrix of association degrees between the first user account and the second user account. With the first user account and the second user account as both the matrix rows and the matrix columns of $R_n$, $x_1$ represents the association degree between the first user account and itself, $x_2$ represents the association degree between the first user account and the second user account, $x_3$ represents the association degree between the second user account and the first user account, and $x_4$ represents the association degree between the second user account and itself.
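Steps S1 to S5 can be sketched as below. Several details are assumptions made only for illustration because the text does not fix them: the third matrix is initialized to the identity matrix, `s` defaults to 0.5, three iterations are run, and row normalization is chosen from the two normalization options. The helper names (`matmul`, `blend`, `normalize_rows`, `random_walk`) are hypothetical.

```python
def matmul(x, y):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)]
            for row in x]

def blend(s, x, y):
    """Element-wise s*x + (1-s)*y for matrices of equal shape."""
    return [[s * a + (1 - s) * b for a, b in zip(rx, ry)]
            for rx, ry in zip(x, y)]

def normalize_rows(m):
    """Divide each element by the sum of its row (all-zero rows stay zero)."""
    out = []
    for row in m:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out

def random_walk(a, b, s=0.5, iterations=3):
    """a: accounts x login points, b: login points x accounts.
    Returns the accounts x accounts association matrix."""
    n = len(a)
    # Assumed starting point for the third matrix: the identity matrix.
    r = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(iterations):
        ab = matmul(a, b)
        next_a = blend(s, a, matmul(ab, a))             # first formula
        next_b = blend(s, b, matmul(matmul(b, a), b))   # second formula
        r = blend(s, r, matmul(r, ab))                  # third formula
        a, b = normalize_rows(next_a), normalize_rows(next_b)
    return r

# First and second matrices from the fig. 4C example, as floats.
a0 = [[1 / 3, 2 / 3, 0.0], [0.0, 0.4, 0.6]]
b0 = [[1.0, 0.0], [1 / 3, 2 / 3], [0.0, 1.0]]
r = random_walk(a0, b0)
```

Here `r[0][1]` plays the role of $x_2$, the association degree between the first and second user accounts, which would then be compared with the predetermined threshold in step 404.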
Step 404, it is detected whether the degree of association is greater than a predetermined threshold.
The preset threshold is set according to actual requirements, and the larger the value set by the preset threshold is, the higher the requirement on the association degree between the first user account and the second user account is.
Step 405, when the degree of association is greater than a predetermined threshold, acquiring real name information of the first user account and real name information of the second user account.
When the correlation degree of the first user account and the second user account calculated according to the historical login point information of the first user account and the historical login point information of the second user account is larger than a preset threshold value, it indicates that the first user account and the second user account log on the same historical login point for a large number of times, and the first user account and the second user account may correspond to the same natural person or a person with close relationship.
Step 406, detecting whether the real name information of the first user account is the same as the real name information of the second user account.
Step 407, when the real name information of the first user account is the same as the real name information of the second user account, determining the first user account and the second user account as corresponding to the same natural person.
Steps 405 to 407 are similar to steps 203 to 205, and are not described herein again.
In practical implementation, the historical login point information may also be any other behavior data generated on the network platform through the user account, such as official accounts visited historically, websites logged in to historically, and news browsed historically. For each of these types of historical login point information, the association degree between the first user account and the second user account may be calculated by the random walk algorithm described above.
In summary, in the user account identification method provided in this embodiment of the present invention, when the association degree between the first user account and the second user account, calculated according to the historical login point information of the two user accounts, is detected to reach a predetermined threshold, the method detects whether the real-name information of the two user accounts is the same and determines from that result whether the two user accounts correspond to the same natural person. Because the identification combines two dimensions, the historical login point information and the real-name information, and the two user accounts are determined to correspond to the same natural person only when the association degree reaches the predetermined threshold and the real-name information is the same, the identification of whether different user accounts correspond to the same natural person is more accurate than in the prior art.
In the above embodiment, the association degree between two user accounts is calculated by a random walk algorithm. In this embodiment, the association degree between two user accounts is calculated according to the vector length of the historical login points; see the steps shown in fig. 4D.
Fig. 4D is a flowchart of a method for calculating the association degree between the first user account and the second user account according to still another embodiment of the present invention. This embodiment is described by taking as an example the application of the method to the account analysis server 130 shown in fig. 1. The attribute information in this embodiment is any one of a historical login device, a historical login wireless network, and a historical login IP address. As shown in fig. 4D, the method includes:
Step 403d, acquiring the number of times the first user account has logged in at the jth historical login point and the number of times the second user account has logged in at the jth historical login point.
Wherein j represents an integer greater than 0.
The history login point information is any one of a history login device, a history login wireless network, and a history login IP address.
Step 403e, the number of times that the first user account logs in the jth historical login point is multiplied by the number of times that the second user account logs in the jth historical login point, so as to obtain the jth product.
The larger the calculated jth product is, the more times the first user account and the second user account have logged on the jth historical login point are, that is, the more times the first user account and the second user account have logged on the same historical login point are.
Since there may be more than one historical login point where the first user account or the second user account has logged in, the product of the login times of the first user account and the second user account at each historical login point needs to be calculated. Obviously, if the historical login points of the first user account and the second user account are completely different, the calculated product is 0.
Step 403f, calculating the sum of the jth products over all historical login points.
Step 403g, acquiring a first number of the historical login points logged in by the first user account and a second number of the historical login points logged in by the second user account.
Step 403h, calculate the product of the first number and the second number.
Step 403i, dividing the sum of the jth products by the product of the first number and the second number to obtain a value representing the association degree between the first user account and the second user account.
The more times the first user account and the second user account have logged in at the same historical login points, the larger the calculated association degree between the first user account and the second user account.
Optionally, the method for calculating the association degree between the first user account and the second user account provided in this embodiment may be represented by the following formula:
the 1st user account: (1st historical login point: $a_1$ times, 2nd historical login point: $a_2$ times, …, nth historical login point: $a_n$ times),
the 2nd user account: (1st historical login point: $b_1$ times, 2nd historical login point: $b_2$ times, …, nth historical login point: $b_n$ times),
$$\text{association degree} = \frac{\sum_{j=1}^{n} a_j b_j}{N_1 N_2}$$
where $N_1$ is the first number, that is, the number of historical login points at which the first user account has logged in, and $N_2$ is the second number, that is, the number of historical login points at which the second user account has logged in.
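Steps 403d to 403i can be sketched as follows. The function name `association_degree` and the dictionary layout (login point identifier mapped to login count, containing only login points with at least one login) are assumptions made for illustration; the counts reuse the values from the fig. 4C example.

```python
def association_degree(logins_a, logins_b):
    """logins_a, logins_b: dicts mapping a historical login point to the
    number of logins of the respective account at that point."""
    # Steps 403d to 403f: sum of the per-point products. Points used by only
    # one of the accounts contribute 0, so only shared points are visited.
    shared = logins_a.keys() & logins_b.keys()
    product_sum = sum(logins_a[p] * logins_b[p] for p in shared)
    # Steps 403g and 403h: number of login points of each account.
    first_number, second_number = len(logins_a), len(logins_b)
    if first_number == 0 or second_number == 0:
        return 0.0
    # Step 403i: divide the sum by the product of the two numbers.
    return product_sum / (first_number * second_number)

account_1 = {"point_1": 5, "point_2": 10}
account_2 = {"point_2": 20, "point_3": 30}
degree = association_degree(account_1, account_2)  # (10 * 20) / (2 * 2)
```

The resulting value is not confined to the range 0 to 1, so the predetermined threshold used in step 404 would have to be chosen on the same scale.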
Optionally, either the method for calculating the association degree between the first user account and the second user account provided in this embodiment or the method provided in the previous embodiment may be selected according to the effect achieved in actual use.
Optionally, the same calculation method may be used when calculating the association degree between the first user account and the second user account according to the historical login device, the historical login wireless network, or the historical login IP address, or different calculation methods may be used.
Optionally, in order to reduce the amount of calculation, when obtaining the historical login point information, the number of times that the user account logs in at each historical login point within a predetermined time period is obtained, for example, the number of times that the user account logs in at each historical login point within a half year or three months is obtained.
In practical implementation, the historical login point information may also be any other behavior data generated on the network platform through the user account, such as official accounts visited historically, websites logged in to historically, and news browsed historically. For each of these types of historical login point information, the association degree between the first user account and the second user account may be calculated by the algorithm provided in this embodiment.
In summary, in the method for calculating the association degree between the first user account and the second user account provided in the embodiment of the present invention, the association degree between the first user account and the second user account is obtained by summing the product of the number of times that the first user account and the second user account have been logged on the same historical login point, and then dividing by the product of the number of the historical login points that the first user account has been logged on and the number of the historical login points that the second user account has been logged on.
Fig. 5 is a flowchart of a method for identifying a user account according to another embodiment of the present invention, which is illustrated by applying the method to the account analysis server 130 shown in fig. 1. For example, the attribute information in the present embodiment is user profile information. As shown in fig. 5, the user account identification method includes:
Step 501, obtaining user profile information of a first user account and user profile information of a second user account from an account server.
The user profile information refers to relevant information actively filled or issued by a user, such as: any one of a nickname of the user, age of the user, sex of the user, address, and issued status information.
Step 502, extracting keywords from the user profile information of the first user account to form a first keyword group.
The keywords extracted from the user profile information of the first user account are arranged into the first keyword group in a predetermined order. For example, the predetermined order is: the nickname of the user, the age of the user, the address, and the issued status information. If the extracted keywords are the nickname cqq, the age 24 years old, the address Nanjing City of Jiangsu Province, and the issued status information "spring outing tomorrow", the obtained first keyword group is {cqq, 24 years old, Nanjing City of Jiangsu Province, spring outing tomorrow}.
The keywords may be extracted by a TextRank-based keyword extraction method, an SDA-based keyword extraction method, or a SparseSVM-based keyword extraction method, which are not described in detail here.
Step 503, extracting keywords from the user profile information of the second user account to form a second keyword group.
The keywords in the user profile information of the second user account are extracted with the same extraction rule as the keywords of the first user account; the extraction rule includes the types of the extracted keywords and the predetermined order. Corresponding to the example in step 502, if the extracted keywords are the nickname ldp, the age 24 years old, the address Nanjing City of Jiangsu Province, and the issued status information "having a big meal", the obtained second keyword group is {ldp, 24 years old, Nanjing City of Jiangsu Province, having a big meal}.
Step 504, determining the number of identical keywords in the first keyword group and the second keyword group.
For example, the first keyword group is {cqq, 24 years old, Nanjing City of Jiangsu Province, spring outing tomorrow} and the second keyword group is {ldp, 24 years old, Nanjing City of Jiangsu Province, having a big meal}. The identical keywords are "24 years old" and "Nanjing City of Jiangsu Province"; each appears once in each keyword group, so the number of identical keywords counted across both groups is 4.
Step 505, dividing the number of identical keywords by the total number of keywords in the first keyword group and the second keyword group to obtain the association degree between the first user account and the second user account.
Continuing the example, the number of identical keywords is 4 and the total number of keywords in the first keyword group and the second keyword group is 8, so the calculated association degree is 4/8.
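Steps 504 and 505 can be sketched as below, counting a shared keyword once in each keyword group so that the result matches the 4/8 of the example. The function name `keyword_association` and the list representation of a keyword group are illustrative assumptions; keyword extraction itself (step 502 and step 503) is not shown.

```python
def keyword_association(group_a, group_b):
    """Number of keywords that also occur in the other group, counted in both
    groups, divided by the total number of keywords in both groups."""
    set_a, set_b = set(group_a), set(group_b)
    identical = (sum(1 for k in group_a if k in set_b)
                 + sum(1 for k in group_b if k in set_a))
    total = len(group_a) + len(group_b)
    return identical / total if total else 0.0

first_group = ["cqq", "24 years old", "Nanjing City of Jiangsu Province",
               "spring outing tomorrow"]
second_group = ["ldp", "24 years old", "Nanjing City of Jiangsu Province",
                "having a big meal"]
degree = keyword_association(first_group, second_group)  # 4 / 8
```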
Step 506, it is detected whether the degree of association is greater than a predetermined threshold.
The preset threshold is set according to actual requirements, and the larger the value set by the preset threshold is, the higher the requirement on the association degree between the first user account and the second user account is.
Step 507, when the degree of association is greater than a predetermined threshold, acquiring real name information of the first user account and real name information of the second user account.
When the degree of association is greater than the predetermined threshold, it indicates that there are more keywords that are the same in the user profile information of the first user account and the user profile information of the second user account, and the first user account and the second user account may correspond to the same natural person.
Step 508, detecting whether the real name information of the first user account is the same as the real name information of the second user account.
Step 509, when the real name information of the first user account is the same as the real name information of the second user account, determining that the first user account and the second user account correspond to the same natural person.
Steps 507 to 509 are similar to steps 203 to 205 and are not described again here.
Optionally, another implementation of calculating the association degree between the first user account and the second user account is as follows: a corresponding score is configured for each of the nickname of the user, the age of the user, the sex of the user, the address, and the issued status information; an association degree value is calculated separately for each of these items; each calculated association degree is multiplied by the corresponding score; and the products are summed to obtain a final score, which is taken as the association degree between the first user account and the second user account. When the association degree is calculated according to the nickname of the user, the age of the user, the sex of the user, or the address, the association degree is 100% if the two user accounts have the same value and 0 if they differ. When the association degree is calculated according to the issued status information, the keywords in the issued status information may be extracted by a keyword extraction method, and the association degree between the first user account and the second user account is then calculated by an algorithm similar to that in this embodiment.
When the attribute information is user profile information, the association degree between the first user account and the second user account may be calculated in either of the two ways described above or by other algorithms.
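The score-weighted variant can be sketched as follows. The field names, the score values, and the `status_similarity` callback (standing in for the keyword-based comparison of the issued status information) are all hypothetical; the text does not specify them.

```python
def profile_association(profile_a, profile_b, scores, status_similarity):
    """Sum over profile items of (configured score) * (association degree).

    For every item except "status" the association degree is 1.0 (100%) when
    both accounts carry the same value and 0.0 otherwise.
    """
    total = 0.0
    for field, score in scores.items():
        if field == "status":
            degree = status_similarity(profile_a.get(field, ""),
                                       profile_b.get(field, ""))
        else:
            value = profile_a.get(field)
            degree = 1.0 if value is not None and value == profile_b.get(field) else 0.0
        total += score * degree
    return total

# Hypothetical scores for each profile item.
scores = {"nickname": 0.2, "age": 0.2, "sex": 0.1, "address": 0.3, "status": 0.2}
a = {"nickname": "cqq", "age": 24, "sex": "F", "address": "Nanjing",
     "status": "spring outing tomorrow"}
b = {"nickname": "ldp", "age": 24, "sex": "F", "address": "Nanjing",
     "status": "having a big meal"}
# Placeholder status comparison that treats the two statuses as unrelated.
final_score = profile_association(a, b, scores, lambda x, y: 0.0)
```

In this example only the age, sex, and address match, so the final score is the sum of their three configured scores.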
It should be noted that the user profile information does not include real name information of the user account.
In summary, in the user account identification method provided in this embodiment of the present invention, when the association degree between the first user account and the second user account, calculated according to the user profile information of the two user accounts, is detected to reach the predetermined threshold, the method detects whether the real-name information of the two user accounts is the same and determines from that result whether the two user accounts correspond to the same natural person. Because the identification combines two dimensions, the user profile information and the real-name information, and the two user accounts are determined to correspond to the same natural person only when the association degree reaches the predetermined threshold and the real-name information is the same, the identification of whether different user accounts correspond to the same natural person is more accurate than in the prior art.
In the embodiments of fig. 2, fig. 3, fig. 4A, fig. 4B, fig. 4D, and fig. 5, the attribute information is any one of a bound third party account, a historical login device, a historical login wireless network, a historical login IP address, and user profile information. In this embodiment, the attribute information may include any two of a bound third party account, a historical login device, a historical login wireless network, a historical login IP address, and user profile information, and whether the first user account and the second user account correspond to the same natural person may be identified by the user account identification method shown in fig. 6.
Fig. 6 is a flowchart of a user account identification method according to another embodiment of the present invention. The user account identification method may be applied to the account analysis server 130 shown in fig. 1, or to another server capable of analyzing, according to the attribute information of user accounts, whether different user accounts correspond to the same natural person. As shown in fig. 6, the user account identification method includes:
Step 601, judging whether the association degree of the first user account and the second user account reaches a threshold value according to the historical login device.
Step 602, when the association degree of the first user account and the second user account obtained according to the historical login device does not reach the threshold, determining whether the association degree of the first user account and the second user account reaches the threshold according to the historical login wireless network.
Step 603, when the association degree of the first user account and the second user account obtained according to the historical login device does not reach a threshold value, and the association degree of the first user account and the second user account obtained according to the historical login wireless network does not reach the threshold value, determining the first user account and the second user account as corresponding different natural persons.
Step 604, when the association degree of the first user account and the second user account obtained according to the historical login device reaches a threshold value, or the association degree of the first user account and the second user account obtained according to the historical login wireless network reaches a threshold value, detecting whether the real name information of the first user account is the same as the real name information of the second user account.
Step 605, when it is detected that the real name information of the first user account is the same as the real name information of the second user account, determining the first user account and the second user account as corresponding to the same natural person.
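The serial flow of steps 601 to 605 can be sketched as below. The function name and the use of zero-argument callables (so that the wireless-network check runs only when the device check fails) are assumptions for illustration. Returning `False` when the real-name information differs is also an assumption, since step 605 only states the outcome for identical real-name information.

```python
def same_person_serial(device_degree, wifi_degree, threshold, names_match):
    """device_degree, wifi_degree, names_match: zero-argument callables that
    compute the respective check only when it is actually needed."""
    # Steps 601 and 602: `or` short-circuits, so the wireless-network degree
    # is evaluated only if the device degree does not reach the threshold.
    if device_degree() >= threshold or wifi_degree() >= threshold:
        # Steps 604 and 605: fall back to the real-name comparison.
        return names_match()
    # Step 603: neither degree reaches the threshold.
    return False

calls = []

def device():
    calls.append("device")
    return 0.9

def wifi():
    calls.append("wifi")
    return 0.1

result = same_person_serial(device, wifi, threshold=0.5, names_match=lambda: True)
```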
Optionally, the attribute information selected in this embodiment is a combination of a history login device and a history login wireless network, in practical application, if the attribute information includes any two of the bound third party account, the historical login device, the historical login wireless network, the historical login IP address and the user profile information, the combination mode of the attribute information further comprises any combination mode of a bound third party account and a historical login device, a bound third party account and a historical login wireless network, a bound third party account and a historical login IP address, a historical login device and a historical login IP address, a historical login wireless network and a historical login IP address, user profile information and a bound third party account, user profile information and a historical login device, a user profile information and a historical login wireless network, and a user profile information and a historical login IP address. The embodiment of the invention does not limit the specific combination mode of the attribute information.
Optionally, in the embodiment of the present invention, when detecting whether the two types of attribute information reach the preset condition, a serial execution mode is adopted, and the embodiment of the present invention does not limit the sequence of executing the two types of attribute information in the serial execution process.
Optionally, the server may also adopt a parallel execution mode when detecting whether the two types of attribute information reach the preset condition. Illustratively, the same base score is configured for each of the two types of attribute information; the base score of each type is multiplied by its corresponding weight to obtain the weighted score of that type; the weighted scores of the two types are added to obtain a final score; the final score is compared with a threshold; and the final score being greater than the threshold is taken as the preset condition. When the attribute information is a bound third party account, the corresponding weight is 100% if the third party account bound to the first user account is the same as the third party account bound to the second user account, and 0 if they are different. When the attribute information is historical login point information, the corresponding weight is the value of the association degree calculated according to the historical login point information. When the attribute information is the user profile information, the corresponding weight is the value of the association degree calculated according to the user profile information. The embodiment of the invention does not limit the specific implementation of parallel execution.
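The parallel scoring described above can be sketched as follows. This is only an illustrative sketch: the function names, the base score of 50, and the dictionary keys are assumptions made for the example and do not appear in the embodiment.

```python
def third_party_weight(bound_a, bound_b):
    """Weight for the bound third party account: 100% if identical, else 0."""
    return 1.0 if bound_a == bound_b else 0.0


def combined_score(weights, base_score=50.0):
    """Multiply the same base score by each attribute's weight and add the results.

    `weights` maps an attribute type to its weight (an association degree,
    or 0/1 for the bound third party account). `base_score` is an assumed value.
    """
    return sum(base_score * weight for weight in weights.values())


def meets_preset_condition(weights, threshold, base_score=50.0):
    """The preset condition holds when the final score is greater than the threshold."""
    return combined_score(weights, base_score) > threshold


# Example with two attribute types (all values are hypothetical).
weights = {
    "bound_third_party_account": third_party_weight("a@example.com", "a@example.com"),
    "historical_login_device": 0.4,  # association degree computed elsewhere
}
print(meets_preset_condition(weights, threshold=60.0))
```

Because every attribute type contributes to one final score, no ordering between the attribute types is needed, which is what distinguishes this mode from the serial one.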
In summary, according to the user account identification method provided in the embodiment of the present invention, when the similarity determined according to neither of the two types of attribute information reaches the preset condition, the two user accounts are determined to correspond to different natural persons; when the similarity determined according to either type of attribute information reaches the preset condition, whether the real name information of the two user accounts is the same is detected. This avoids the inaccuracy of relying on the similarity of only one type of attribute information, so that whether different user accounts correspond to the same natural person is identified more accurately.
In the above embodiment, the attribute information includes any two of the bound third party account, the historical login device, the historical login wireless network, the historical login IP address, and the user profile information. In this embodiment, the attribute information includes any three, any four, or all five of the bound third party account, the historical login device, the historical login wireless network, the historical login IP address, and the user profile information, and the user account identification method shown in fig. 7 may be used to identify whether the first user account and the second user account correspond to the same natural person.
Fig. 7 is a flowchart of a method for identifying a user account according to another embodiment of the present invention, where the method for identifying a user account may be applied to the account analysis server 130 shown in fig. 1, or may be applied to other servers capable of analyzing whether different user accounts correspond to the same natural person according to attribute information of the user accounts. As shown in fig. 7, the user account identification method includes:
Step 701, judging whether the association degree of the first user account and the second user account reaches a threshold value according to the historical login device.
Step 702, when the association degree of the first user account and the second user account obtained according to the historical login device does not reach the threshold, judging whether the association degree of the first user account and the second user account reaches the threshold according to the historical login wireless network.
Step 703, when the association degree of the first user account and the second user account obtained according to the historical login wireless network does not reach the threshold, judging whether the association degree of the first user account and the second user account reaches the threshold according to the historical login IP address.
Step 704, when the association degree of the first user account and the second user account obtained according to the historical login device does not reach the threshold, the association degree of the first user account and the second user account obtained according to the historical login wireless network does not reach the threshold, and the association degree of the first user account and the second user account obtained according to the historical login IP address does not reach the threshold, the first user account and the second user account are determined to correspond to different natural persons.
Step 705, when the association degree of the first user account and the second user account obtained according to the historical login device reaches a threshold, or the association degree of the first user account and the second user account obtained according to the historical login wireless network reaches a threshold, or the association degree of the first user account and the second user account obtained according to the historical login IP address reaches a threshold, detecting whether the real name information of the first user account and the real name information of the second user account are the same.
Step 706, when it is detected that the real name information of the first user account is the same as the real name information of the second user account, determining the first user account and the second user account as corresponding to the same natural person.
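The serial flow of steps 701 to 706 can be sketched as follows. The function name and the use of `>=` for "reaches the threshold" are assumptions; the steps above do not state the outcome when the real name information differs, so the sketch returns `None` in that case rather than guessing.

```python
def identify_same_natural_person(association_degrees, threshold, real_name_a, real_name_b):
    """Return True (same natural person), False (different), or None (not stated).

    `association_degrees` is an iterable of association degrees, one per
    attribute type, in the order they should be checked (for example device,
    wireless network, IP address). A generator may be passed so that later
    degrees are only computed when earlier ones fail to reach the threshold.
    """
    # Steps 701-703: check each attribute type in turn; any() stops at the
    # first association degree that reaches the threshold.
    reached = any(degree >= threshold for degree in association_degrees)
    if not reached:
        # Step 704: no attribute type reached the threshold.
        return False
    # Step 705: compare the real name information.
    if real_name_a == real_name_b:
        # Step 706: same real name information.
        return True
    return None  # outcome not specified by steps 701-706


print(identify_same_natural_person([0.1, 0.7, 0.2], 0.5, "name-1", "name-1"))
```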
Optionally, the attribute information selected in this embodiment is the combination of the historical login device, the historical login wireless network, and the historical login IP address. In practical applications, if the attribute information includes any three of the bound third party account, the historical login device, the historical login wireless network, the historical login IP address, and the user profile information, the combination may also be any of the following: the bound third party account, the historical login device, and the historical login wireless network; the bound third party account, the historical login wireless network, and the historical login IP address; the bound third party account, the historical login device, and the historical login IP address; the user profile information, the bound third party account, and the historical login device; the user profile information, the bound third party account, and the historical login wireless network; the user profile information, the bound third party account, and the historical login IP address; the user profile information, the historical login device, and the historical login wireless network; the user profile information, the historical login device, and the historical login IP address; and the user profile information, the historical login wireless network, and the historical login IP address. The embodiment of the invention does not limit the specific combination of the attribute information.
Optionally, in the embodiment of the present invention, when detecting whether the three types of attribute information reach the preset condition, a serial execution mode is adopted, and the embodiment of the present invention does not limit the sequence of executing the three types of attribute information in the serial execution process.
Optionally, in this embodiment, a parallel execution mode may also be adopted to detect whether the three types of attribute information reach the preset condition. For example, the same base score is configured for each of the three types of attribute information; the base score of each type is multiplied by its corresponding weight to obtain the weighted score of that type; the weighted scores of the three types are added to obtain a final score; the final score is compared with the threshold; and the final score being greater than the threshold is taken as the preset condition. When the attribute information is a bound third party account, the corresponding weight is 100% if the third party account bound to the first user account is the same as the third party account bound to the second user account, and 0 if they are different. When the attribute information is historical login point information, the corresponding weight is the value of the association degree calculated according to the historical login point information. When the attribute information is the user profile information, the corresponding weight is the value of the association degree calculated according to the user profile information. The embodiment of the invention does not limit the specific implementation of parallel execution.
Optionally, when the attribute information includes any four or all five of the bound third party account, the historical login device, the historical login wireless network, the historical login IP address, and the user profile information, the implementation is similar to that for three types of attribute information: whether the similarity of the attribute information of the two user accounts reaches the preset condition is detected according to the four or five types of attribute information, and the detection may likewise be performed in a serial execution manner or in a parallel execution manner.
In summary, according to the user account identification method provided in the embodiment of the present invention, when the similarity determined according to none of the three types of attribute information reaches the preset condition, the two user accounts are determined to correspond to different natural persons; when the similarity determined according to any one type of attribute information reaches the preset condition, whether the real name information of the two user accounts is the same is detected. This avoids the inaccuracy of relying on the similarity of only one type of attribute information, so that whether different user accounts correspond to the same natural person is identified more accurately.
Fig. 8 is a block diagram illustrating a user account identification apparatus according to an embodiment of the present invention, where the user account identification apparatus may be applied to the account analysis server 130 shown in fig. 1, or may be applied to other servers capable of analyzing whether different user accounts correspond to the same natural person according to attribute information of the user accounts. As shown in fig. 8, the user account identification apparatus includes: a first acquisition module 810, a first detection module 820, a second acquisition module 830, a second detection module 840, and a determination module 850.
A first acquisition module 810, configured to implement the above step 201, step 301, step 401, step 501, and any other implicit or disclosed acquisition-related functions.
A first detection module 820, configured to implement the above step 202 and any other implicit or disclosed detection-related functions.
A second acquisition module 830, configured to implement the above step 203, step 303, step 405, step 507, and any other implicit or disclosed acquisition-related functions.
A second detection module 840, configured to implement the above step 204, step 304, step 406, step 508, and any other implicit or disclosed detection-related functions.
A determination module 850, configured to implement the above step 205, step 305, step 407, step 509, and any other implicit or disclosed determination-related functions.
In one possible implementation, the first detection module 820 is further configured to implement the above-mentioned step 302 and any other implicit or disclosed detection-related functions.
In one possible implementation, the first detection module 820 includes an obtaining unit, a first calculating unit, and a first detecting unit.
An obtaining unit, configured to implement the foregoing step 402 and any other implicit or disclosed obtaining related functions.
A first computing unit for implementing the above step 403 and any other implicit or disclosed computing related functions.
A first detection unit for implementing the above step 404 and any other implicit or disclosed detection related functions.
In one possible implementation, the first computing unit includes: a first establishing subunit, a second establishing subunit, and an iterative operation subunit.
A first establishing subunit, configured to implement the above step 403a and any other implicit or disclosed establishing-related functions.
A second establishing subunit, configured to implement the above step 403b and any other implicit or disclosed establishing-related functions.
An iterative operation subunit, configured to implement the foregoing step 403c and any other implicit or disclosed functions related to iterative operations.
In a possible implementation manner, the iterative operation subunit is further configured to implement the above step S1 to step S5 and any other implicit or disclosed functions related to the iterative operation.
In one possible implementation manner, the first computing unit further includes: a first acquiring subunit, a first computing subunit, a second computing subunit, a second acquiring subunit, a third computing subunit, and a fourth computing subunit.
A first acquiring subunit, configured to implement the above step 403d and any other implicit or disclosed acquiring related functions.
A first computing subunit for implementing the above step 403e and any other implicit or disclosed computing-related functions.
A second computing subunit, configured to implement the above step 403f and any other implicit or disclosed computing-related functions.
A second acquiring subunit, configured to implement the foregoing step 403g and any other implicit or disclosed acquiring related functions.
A third computing subunit for implementing the above step 403h and any other implicit or disclosed computing-related functions.
A fourth computing subunit, configured to implement the above step 403i and any other implicit or disclosed computing-related functions.
In one possible implementation manner, the first detection module 820 includes a first extraction unit, a second extraction unit, a determination unit, a second calculation unit, and a second detection unit.
A first extraction unit, configured to implement the above-mentioned step 502 and any other implicit or disclosed extraction-related functions.
A second extraction unit for implementing the above step 503 and any other implicit or disclosed extraction related functions.
A determination unit for implementing the above step 504 and any other implicit or disclosed determination related functions.
A second computing unit, for implementing the above step 505 and any other implicit or disclosed computing related functions.
A second detection unit for implementing the above step 506 and any other implicit or disclosed detection related functions.
In a possible implementation manner, the user account identification apparatus further includes: a setting module and a mining module.
A setting module, configured to set the same credit score for the first user account and the second user account in the credit payment system, and to implement any other implicit or disclosed setting-related functions.
A mining module, configured to mine interest characteristics of the natural person corresponding to the first user account and the second user account by combining the information issued by the first user account and the information issued by the second user account, to provide network service for the first user account and/or the second user account according to the interest characteristics, and to implement any other implicit or disclosed mining-related functions.
In summary, the user account identification apparatus provided in the embodiment of the present invention detects, when the similarity between the attribute information of the first user account and the attribute information of the second user account reaches the preset condition, whether the real name information of the first user account and the real name information of the second user account are the same, and determines according to the detection result of the real name information whether the first user account and the second user account correspond to the same natural person. Compared with the prior art, the apparatus combines two dimensions, the attribute information and the real name information of the two user accounts, to identify whether the two user accounts belong to the same natural person: when the similarity of the attribute information of the two user accounts reaches the preset condition and the real name information of the two user accounts is the same, the two user accounts are determined to correspond to the same natural person, so that the identification of whether different user accounts correspond to the same natural person is more accurate.
It should be noted that: the user account identification device provided in the above embodiment is only illustrated by the division of the functional modules when identifying the user account, and in practical applications, the function distribution may be completed by different functional modules as needed, that is, the internal structure of the server is divided into different functional modules to complete all or part of the functions described above. In addition, the user account identification apparatus and the user account identification method provided by the above embodiments belong to the same concept, and specific implementation processes thereof are detailed in the method embodiments and are not described herein again.
Fig. 9 is a schematic structural diagram of a server provided in one embodiment of the present invention. The server may be the application A server 110 or the application B server 120 shown in fig. 1, or may be the account analysis server 130 or the application class server 150. Specifically, the server 900 includes a Central Processing Unit (CPU) 901, a system memory 904 including a Random Access Memory (RAM) 902 and a Read Only Memory (ROM) 903, and a system bus 905 connecting the system memory 904 and the central processing unit 901. The server 900 also includes a basic input/output system (I/O system) 906 for facilitating information transfer between devices within the computer, and a mass storage device 907 for storing an operating system 913, application programs 911, and other program modules 915.
The basic input/output system 906 includes a display 908 for displaying information and an input device 909 such as a mouse, keyboard, etc. for user input of information. Wherein the display 908 and the input device 909 are connected to the central processing unit 901 through an input/output controller 910 connected to the system bus 905. The basic input/output system 906 may also include an input/output controller 910 for receiving and processing input from a number of other devices, such as a keyboard, mouse, or electronic stylus. Similarly, an input/output controller 910 also provides output to a display screen, a printer, or other type of output device.
The mass storage device 907 is connected to the central processing unit 901 through a mass storage controller (not shown) connected to the system bus 905. The mass storage device 907 and its associated computer-readable media provide non-volatile storage for the server 900. That is, the mass storage device 907 may include a computer-readable medium (not shown) such as a hard disk or CD-ROM drive.
Without loss of generality, the computer-readable media may comprise computer storage media and communication media. Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes RAM, ROM, EPROM, EEPROM, flash memory or other solid state memory technology, CD-ROM, DVD, or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices. Of course, those skilled in the art will appreciate that the computer storage media is not limited to the foregoing. The system memory 904 and mass storage device 907 described above may be collectively referred to as memory.
According to various embodiments of the invention, the server 900 may also run while connected to a remote computer over a network such as the Internet. That is, the server 900 may be connected to the network 912 through the network interface unit 911 coupled to the system bus 905, or the network interface unit 911 may be used to connect to other types of networks or remote computer systems (not shown).
An embodiment of the present invention further provides a computer-readable storage medium, where the computer-readable storage medium may be a computer-readable storage medium contained in the memory in the foregoing embodiment; or it may be a separate computer-readable storage medium not incorporated in the terminal. The computer-readable storage medium stores one or more programs, which are executed by one or more processors to perform the user account identification method.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (15)

1. A method for identifying a user account, the method comprising:
acquiring attribute information of a first user account and attribute information of a second user account from an account server, wherein the attribute information comprises a bound third party account, historical login point information and user profile information;
detecting whether the similarity between the attribute information of the first user account and the attribute information of the second user account reaches a preset condition, including:
acquiring the login times of the first user account on each historical login point and the login times of the second user account on each historical login point; establishing a first matrix which takes the user account as a matrix row and the historical login points as a matrix column, wherein matrix elements in the first matrix are first weights, the first weights are obtained by dividing the login times of the ith user account on the kth historical login point by the total login times of the ith user account on each historical login point, i is equal to 1 or 2, and k represents an integer greater than 0; establishing a second matrix which takes the historical login points as matrix rows and the user accounts as matrix columns, wherein matrix elements in the second matrix are second weights, and the second weights are obtained by dividing the login times of the ith user account on the kth historical login point by the total login times of all the user accounts logged on the kth historical login point; performing iterative operation on the first matrix and the second matrix through a predetermined algorithm to obtain a third matrix which takes the user account as a matrix row and a matrix column, wherein the third matrix is used for representing the association degree between the first user account and the second user account; detecting whether the association degree is greater than a predetermined threshold value;
or,
acquiring the login times of the first user account on each historical login point and the login times of the second user account on each historical login point; acquiring the login times of the first user account on the jth historical login point and the login times of the second user account on the jth historical login point, wherein j represents an integer larger than 0; multiplying the login times of the first user account on the jth historical login point by the login times of the second user account on the jth historical login point to obtain a jth product; calculating the sum of each of said jth products; acquiring a first number of historical login points which are logged in by the first user account and a second number of historical login points which are logged in by the second user account; calculating a product of the first number and the second number; dividing the sum of the j-th products by the product of the first quantity and the second quantity to obtain the association degree between the first user account and the second user account; detecting whether the association degree is greater than a predetermined threshold value;
when the similarity between the attribute information of the first user account and the attribute information of the second user account reaches the preset condition, acquiring real name information of the first user account and real name information of the second user account, wherein the real name information comprises information mined according to social information of the user accounts;
detecting whether the real name information of the first user account is the same as the real name information of the second user account;
and when the real name information of the first user account is the same as that of the second user account, determining the first user account and the second user account as corresponding to the same natural person.
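For illustration only, the second alternative in claim 1 (sum of per-point products of login times, divided by the product of the numbers of login points used by each account) can be sketched as below. The function name and the dictionary representation of login counts are assumptions made for the example.

```python
def login_point_association(logins_a, logins_b):
    """Association degree between two accounts from their login counts.

    `logins_a` and `logins_b` map a historical login point (for example a
    device identifier) to the number of logins of that account at that point.
    """
    points = set(logins_a) | set(logins_b)
    # Sum over every login point j of (logins of A at j) * (logins of B at j).
    product_sum = sum(logins_a.get(p, 0) * logins_b.get(p, 0) for p in points)
    # First and second numbers: how many login points each account has logged in on.
    first_number = sum(1 for count in logins_a.values() if count > 0)
    second_number = sum(1 for count in logins_b.values() if count > 0)
    if first_number == 0 or second_number == 0:
        return 0.0  # assumed convention when an account has no login history
    return product_sum / (first_number * second_number)


# Hypothetical data: the accounts share only "dev1".
a = {"dev1": 3, "wifi1": 2}
b = {"dev1": 1, "dev2": 4}
print(login_point_association(a, b))  # 3 / (2 * 2) = 0.75
```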
2. The method according to claim 1, wherein the detecting whether the similarity between the attribute information of the first user account and the attribute information of the second user account meets a preset condition further comprises:
detecting whether the third party account bound by the first user account is the same as the third party account bound by the second user account;
the third-party account is any one of a social account, a mailbox address, a mobile phone number and a certificate number.
3. The method of claim 1, wherein the iterative operation of the first matrix and the second matrix through a predetermined algorithm to obtain a third matrix with the user account as a matrix row and a matrix column comprises:
performing an iterative operation on the first matrix $A_i$ in the i-th iterative operation by a first formula, wherein the first formula is: $A_{i+1} = s \cdot A_i + (1-s) \cdot A_i \cdot B_i \cdot A_i$;
performing an iterative operation on the second matrix $B_i$ in the i-th iterative operation by a second formula, wherein the second formula is: $B_{i+1} = s \cdot B_i + (1-s) \cdot B_i \cdot A_i \cdot B_i$;
performing an iterative operation on the third matrix $R_i$ in the i-th iterative operation by a third formula, wherein the third formula is: $R_{i+1} = s \cdot R_i + (1-s) \cdot R_i \cdot A_i \cdot B_i$;
detecting whether the number of times of the iterative operation reaches a preset number of times;
when the number of times of the iterative operation reaches the preset number of times, obtaining a third matrix for representing the association degree between the first user account and the second user account;
wherein i represents the number of iterative operations, the value of which is an integer greater than 0, and s represents a constant between 0 and 1.
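For illustration only, the matrix construction of claim 1 and the three update formulas of claim 3 can be sketched in plain Python as below. The helper names, the value s = 0.5, the number of rounds, and in particular the choice of the identity matrix as the initial third matrix are assumptions; the claims do not state how the third matrix is initialized.

```python
def matmul(x, y):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def blend(s, x, y):
    """Element-wise s * x + (1 - s) * y."""
    return [[s * a + (1 - s) * b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


def build_matrices(counts):
    """counts[i][k] = logins of account i at historical login point k."""
    n, m = len(counts), len(counts[0])
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(counts[i][k] for i in range(n)) for k in range(m)]
    # First matrix (accounts x points): share of account i's logins made at point k.
    first = [[counts[i][k] / row_totals[i] if row_totals[i] else 0.0
              for k in range(m)] for i in range(n)]
    # Second matrix (points x accounts): share of point k's logins made by account i.
    second = [[counts[i][k] / col_totals[k] if col_totals[k] else 0.0
               for i in range(n)] for k in range(m)]
    return first, second


def iterate(first, second, s=0.5, rounds=3):
    """Apply the three formulas `rounds` times and return the third matrix."""
    n = len(first)
    a, b = first, second
    r = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # assumed R_0
    for _ in range(rounds):
        next_a = blend(s, a, matmul(matmul(a, b), a))
        next_b = blend(s, b, matmul(matmul(b, a), b))
        next_r = blend(s, r, matmul(matmul(r, a), b))
        a, b, r = next_a, next_b, next_r
    return r


first, second = build_matrices([[1, 1], [1, 1]])
print(iterate(first, second, s=0.5, rounds=1)[0][1])  # 0.25 under the assumed R_0
```

Under these assumptions, the off-diagonal entry of the returned matrix would be the association degree compared against the predetermined threshold; two accounts with no shared login point keep an off-diagonal entry of 0.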
4. The method according to claim 1, wherein the detecting whether the similarity between the attribute information of the first user account and the attribute information of the second user account meets a preset condition further comprises:
extracting keywords in the user profile information of the first user account to form a first keyword group;
extracting keywords in the user profile information of the second user account to form a second keyword group;
determining the number of the same keywords in the first keyword group and the second keyword group;
dividing the number of the same keywords by the total number of the keywords in the first keyword group and the second keyword group to obtain the association degree of the first user account and the second user account;
detecting whether the association degree is greater than a predetermined threshold value.
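For illustration only, the keyword-based association degree of claim 4 can be sketched as below. The function name is hypothetical, keyword extraction itself is left out (the groups are taken as given), and "total number of the keywords in the first keyword group and the second keyword group" is read here as the sum of the two group sizes, which is one possible interpretation.

```python
def keyword_association(first_group, second_group):
    """Shared keywords divided by the total number of keywords in both groups."""
    first, second = set(first_group), set(second_group)
    total = len(first) + len(second)  # assumed reading of "total number"
    if total == 0:
        return 0.0  # assumed convention for empty profiles
    return len(first & second) / total


# Hypothetical keyword groups: 2 shared keywords out of 3 + 3 in total.
degree = keyword_association(["tennis", "shenzhen", "coffee"], ["tennis", "coffee", "jazz"])
print(degree > 0.3)
```

Under this reading, two identical keyword groups yield 0.5 rather than 1, so the predetermined threshold would need to be chosen accordingly.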
5. The method according to any one of claims 1 to 3,
the historical login point information is: any one of a history login device, a history login wireless network and a history login IP address.
6. The method according to any one of claims 1 to 3,
the user profile information is: any one of a nickname of the user, age of the user, sex of the user, address, and issued status information.
7. The method of any of claims 1 to 3, wherein after the determining that the first user account and the second user account correspond to the same natural person, the method further comprises:
setting the same credit score for the first user account and the second user account in a credit payment system;
or,
acquiring information issued by the first user account and information issued by the second user account; and mining interest characteristics of natural people corresponding to the first user account and the second user account by combining the information issued by the first user account and the information issued by the second user account, and providing network service for the first user account and/or the second user account according to the interest characteristics.
8. An apparatus for identifying a user account, the apparatus comprising:
a first acquisition module, configured to acquire attribute information of a first user account and attribute information of a second user account from an account server, wherein the attribute information comprises a bound third party account, historical login point information and user profile information;
a first detection module, configured to detect whether a similarity between the attribute information of the first user account and the attribute information of the second user account, acquired by the first acquisition module, meets a preset condition, including:
acquiring the login times of the first user account on each historical login point and the login times of the second user account on each historical login point;
establishing a first matrix which takes the user account as a matrix row and the historical login points as a matrix column, wherein matrix elements in the first matrix are first weights, the first weights are obtained by dividing the login times of the ith user account on the kth historical login point by the total login times of the ith user account on each historical login point, i is equal to 1 or 2, and k represents an integer greater than 0;
establishing a second matrix which takes the historical login points as matrix rows and the user accounts as matrix columns, wherein matrix elements in the second matrix are second weights, and the second weights are obtained by dividing the login times of the ith user account on the kth historical login point by the total login times of all the user accounts logged on the kth historical login point;
performing iterative operation on the first matrix and the second matrix through a predetermined algorithm to obtain a third matrix which takes the user account as a matrix row and a matrix column, wherein the third matrix is used for representing the association degree between the first user account and the second user account;
detecting whether the association degree is greater than a predetermined threshold value;
or,
acquiring the login times of the first user account on each historical login point and the login times of the second user account on each historical login point;
acquiring the login times of the first user account on the jth historical login point and the login times of the second user account on the jth historical login point, wherein j represents an integer larger than 0;
multiplying the login times of the first user account on the jth historical login point by the login times of the second user account on the jth historical login point to obtain a jth product;
calculating the sum of each of said jth products;
acquiring a first number of historical login points which are logged in by the first user account and a second number of historical login points which are logged in by the second user account;
calculating a product of the first number and the second number;
dividing the sum of the j-th products by the product of the first quantity and the second quantity to obtain the association degree between the first user account and the second user account;
detecting whether the association degree is greater than a predetermined threshold value;
the second obtaining module is used for obtaining real name information of the first user account and real name information of the second user account when the first detecting module detects that the similarity between the attribute information of the first user account and the attribute information of the second user account reaches the preset condition, wherein the real name information comprises information mined according to social information of the user accounts;
the second detection module is used for detecting whether the real name information of the first user account acquired by the second acquisition module is the same as the real name information of the second user account;
the determining module is configured to determine that the first user account and the second user account correspond to the same natural person when the second detecting module detects that the real name information of the first user account is the same as the real name information of the second user account.
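The second alternative recited in claim 8 (sum of per-login-point products divided by the product of the two login-point counts) can be sketched roughly as follows. This is only an illustrative reading of the claim language, not the patented implementation; the function name `product_sum_association`, the dictionary layout, the sample login points, and the threshold value are all assumptions introduced for this example.

```python
from typing import Dict


def product_sum_association(counts_a: Dict[str, int], counts_b: Dict[str, int]) -> float:
    """Association degree following the second alternative of claim 8.

    counts_a / counts_b map a historical login point (device, Wi-Fi, IP)
    to the number of logins of the respective account at that point.
    """
    # Sum of the j-th products; a login point used by only one account
    # contributes zero, so only shared points need to be visited.
    shared_points = set(counts_a) & set(counts_b)
    product_sum = sum(counts_a[p] * counts_b[p] for p in shared_points)

    # First and second number: how many login points each account has used.
    n_a = sum(1 for c in counts_a.values() if c > 0)
    n_b = sum(1 for c in counts_b.values() if c > 0)
    if n_a == 0 or n_b == 0:
        return 0.0  # assumption: no login history means no association

    return product_sum / (n_a * n_b)


# Hypothetical login histories for two accounts.
first = {"device-1": 3, "wifi-home": 2}
second = {"device-1": 1, "wifi-home": 4, "ip-office": 5}

degree = product_sum_association(first, second)
THRESHOLD = 1.0  # hypothetical predetermined threshold
exceeds_threshold = degree > THRESHOLD
```

Here the shared points contribute 3*1 + 2*4 = 11, and the accounts have used 2 and 3 login points respectively, so the association degree is 11/6. The claim does not specify how an account with no login history should be treated; returning 0.0 is a choice made for the sketch.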
9. The apparatus of claim 8, wherein the first detection module is further configured to:
detecting whether the third party account bound by the first user account is the same as the third party account bound by the second user account;
the third-party account is any one of a social account, a mailbox address, a mobile phone number and a certificate number.
10. The apparatus of claim 8, wherein the iterative operation subunit is further configured to:
applying a first formula to the first matrix $A_i$ in the i-th iterative operation, wherein the first formula is: $A_{i+1} = s A_i + (1-s) A_i B_i A_i$;
applying a second formula to the second matrix $B_i$ in the i-th iterative operation, wherein the second formula is: $B_{i+1} = s B_i + (1-s) B_i A_i B_i$;
applying a third formula to the third matrix $R_i$ in the i-th iterative operation, wherein the third formula is: $R_{i+1} = s R_i + (1-s) R_i A_i B_i$;
Detecting whether the number of times of the iterative operation reaches a preset number of times;
when the number of times of the iterative operation reaches the preset number of times, obtaining a third matrix for representing the association degree between the first user account and the second user account;
wherein i represents the number of iterative operations, the value of which is an integer greater than 0, and s represents a constant between 0 and 1.
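As a rough illustration of how the first and second matrices of claim 8 and the three update formulas of claim 10 fit together, the following pure-Python sketch builds both weight matrices from raw login counts and runs a fixed number of iterations. It is a sketch under stated assumptions rather than the patented implementation: the helper names (`matmul`, `blend`, `build_matrices`, `iterate`), the sample counts, and the values of `s` and `rounds` are hypothetical, and the initial third matrix is assumed to be the identity because the claims shown here do not state its starting value.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(x: Matrix, y: Matrix) -> Matrix:
    """Plain matrix product of x (n x m) and y (m x p)."""
    return [[sum(x[r][k] * y[k][c] for k in range(len(y)))
             for c in range(len(y[0]))] for r in range(len(x))]


def blend(x: Matrix, y: Matrix, s: float) -> Matrix:
    """Element-wise s * x + (1 - s) * y."""
    return [[s * a + (1 - s) * b for a, b in zip(rx, ry)]
            for rx, ry in zip(x, y)]


def build_matrices(counts: List[List[int]]) -> Tuple[Matrix, Matrix]:
    """counts[u][k] is the number of logins of account u at login point k."""
    n_users, n_points = len(counts), len(counts[0])
    row_total = [sum(row) for row in counts]
    col_total = [sum(counts[u][k] for u in range(n_users)) for k in range(n_points)]
    # First matrix (accounts x login points): share of the account's own logins.
    a = [[counts[u][k] / row_total[u] if row_total[u] else 0.0
          for k in range(n_points)] for u in range(n_users)]
    # Second matrix (login points x accounts): share of all logins at that point.
    b = [[counts[u][k] / col_total[k] if col_total[k] else 0.0
          for u in range(n_users)] for k in range(n_points)]
    return a, b


def iterate(a: Matrix, b: Matrix, s: float, rounds: int) -> Matrix:
    """Apply the three update formulas `rounds` times and return the third matrix."""
    n = len(a)
    # ASSUMPTION: R starts as the identity matrix; the claims do not say.
    r = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(rounds):
        ab = matmul(a, b)                               # A_i * B_i
        next_a = blend(a, matmul(ab, a), s)             # s*A + (1-s)*A*B*A
        next_b = blend(b, matmul(matmul(b, a), b), s)   # s*B + (1-s)*B*A*B
        next_r = blend(r, matmul(r, ab), s)             # s*R + (1-s)*R*A*B
        a, b, r = next_a, next_b, next_r
    return r


# Hypothetical counts: 2 accounts, 3 historical login points.
A0, B0 = build_matrices([[3, 2, 0], [1, 4, 5]])
R = iterate(A0, B0, s=0.8, rounds=3)
association = R[0][1]  # association degree of account 1 with account 2
```

All three updates are computed from the matrices of the same iteration before any of them is overwritten, which matches the shared index i in the formulas. Note that the claims use i both for the account index in claim 8 and for the iteration count in claim 10; the sketch keeps the two apart by name.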
11. The apparatus of claim 8, wherein the first detection module is further configured to:
extracting key words in the user data information of the first user account to form a first key word group;
extracting key words in the user data information of the second user account to form a second key word group;
determining the number of the same key words in the first key word group and the second key word group;
dividing the number of the same keywords determined by the determining unit by the total number of the keywords in the first keyword group and the second keyword group to obtain the association degree of the first user account and the second user account;
detecting whether the association degree is greater than a predetermined threshold value.
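The keyword-based association degree of claim 11 can be illustrated with a short sketch. Two points are assumptions rather than statements from the claims: keyword extraction is stood in for by a hypothetical lowercase whitespace tokenizer (`extract_keywords`), and "the total number of the keywords in the first keyword group and the second keyword group" is read as the sum of the two group sizes. Under the other plausible reading, the size of the union, the result would be the Jaccard index instead.

```python
from typing import Iterable, Set


def extract_keywords(profile_fields: Iterable[str]) -> Set[str]:
    """Hypothetical keyword extraction: lowercase whitespace tokens."""
    return {token.lower() for field in profile_fields for token in field.split()}


def keyword_association(first_group: Set[str], second_group: Set[str]) -> float:
    """Number of shared keywords divided by the total number of keywords.

    ASSUMPTION: the total is len(first_group) + len(second_group).
    """
    total = len(first_group) + len(second_group)
    if total == 0:
        return 0.0  # assumption: empty profiles yield no association
    same = len(first_group & second_group)
    return same / total


# Hypothetical user data information for two accounts.
first_group = extract_keywords(["Alice Shenzhen", "hiking photos"])
second_group = extract_keywords(["alice", "Shenzhen hiking"])

degree = keyword_association(first_group, second_group)
THRESHOLD = 0.3  # hypothetical predetermined threshold
exceeds_threshold = degree > THRESHOLD
```

With this reading, two identical keyword groups score 0.5 rather than 1.0, so a threshold would need to be chosen with that ceiling in mind.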
12. The apparatus according to any one of claims 8 to 10,
the historical login point information is any one of: a historical login device, a historical login wireless network, and a historical login IP address.
13. The apparatus according to any one of claims 8 to 10,
the user data information is any one of: the nickname of the user, the age of the user, the sex of the user, an address, and published status information.
14. The apparatus of any one of claims 8 to 10, further comprising:
the setting module is used for setting the same credit score for the first user account and the second user account in a credit payment system;
the mining module is used for acquiring the information issued by the first user account and the information issued by the second user account; and mining interest characteristics of natural people corresponding to the first user account and the second user account by combining the information issued by the first user account and the information issued by the second user account, and providing network service for the first user account and/or the second user account according to the interest characteristics.
15. A computer readable storage medium having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, which is loaded and executed by a processor to implement the method of any one of claims 1 to 7.
CN201610933697.7A 2016-10-31 2016-10-31 User account identification method and device Active CN108009168B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610933697.7A CN108009168B (en) 2016-10-31 2016-10-31 User account identification method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610933697.7A CN108009168B (en) 2016-10-31 2016-10-31 User account identification method and device

Publications (2)

Publication Number Publication Date
CN108009168A CN108009168A (en) 2018-05-08
CN108009168B true CN108009168B (en) 2020-12-01

Family

ID=62047251

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610933697.7A Active CN108009168B (en) 2016-10-31 2016-10-31 User account identification method and device

Country Status (1)

Country Link
CN (1) CN108009168B (en)

Also Published As

Publication number Publication date
CN108009168A (en) 2018-05-08

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant