CN109284589A - Cross-social-network entity identity resolution method - Google Patents

Cross-social-network entity identity resolution method

Info

Publication number
CN109284589A
Authority
CN
China
Prior art keywords
user
match
select
account
social networks
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811031997.1A
Other languages
Chinese (zh)
Inventor
王中元
黄志兵
何政
胡瑞敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University (WHU)
Priority to CN201811031997.1A
Publication of CN109284589A
Current legal status: Pending

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/31 User authentication
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a cross-social-network entity identity resolution method that makes full use of username information and link information and identifies user identities through ranking-based cross-matching. First, the username similarity (Username Similarity, UNS) and the user environment score (User Surrounding Score, USS) are used to select the user with the highest current score as the candidate matching user. Then the user matching score (User Matching Score, UMS), which combines the username similarity and the user environment score, is calculated, and the user that matches the candidate user is determined from this score. Finally, cross-matching is used to improve matching accuracy.

Description

Cross-social-network entity identity resolution method
Technical field
The invention belongs to the technical field of identity recognition, relates to a user identification method, and in particular relates to a cross-social-network entity identity resolution method.
Technical background
In recent years, as social networks have become widespread, they have come to play an increasingly important role in people's lives. According to statistics, the number of Chinese internet users who use social networking sites, microblogs, and vertical social applications has reached 530 million, and the usage rates of QQ Zone (Qzone) and Weibo among internet users are 65.1% and 33.5%, respectively. Social networks have massive user bases, but only a small proportion of users have completed real-name authentication. This allows malicious users to spread rumors and harmful information freely, which poses a major challenge to internet monitoring. Because users in social network spaces use virtual account names rather than real names, the accounts of one user in different social spaces have no direct connection by name. Associating and identifying entity users across social networks therefore helps solve the problems of user identification and supervision.
The challenge in associating and identifying entity users across social networks is that the same user has accounts on different social networks and may behave differently on each of them. These accounts cannot be efficiently recognized as belonging to the same user, which prevents further study of the user's behavioral differences across social networks; at the same time, redundant user information complicates network supervision.
People usually have their own accounts on multiple social networks, and because of personal habits, a person's accounts on different social networks tend to share certain similarities, such as the username and the circle of friends. These similar attributes can be used to judge whether accounts on different networks belong to the same person, and thus to identify a person in cyberspace. The person's statements on different social platforms can then be analyzed to judge the tendency of those statements.
Summary of the invention
To judge whether accounts on different networks belong to the same person, and thereby identify a person in cyberspace, the present invention proposes a pioneering cross-social-network entity identity resolution method.
The technical solution adopted by the present invention is a cross-social-network entity identity resolution method, characterized by comprising the following steps:
Step 1: Calculate the username similarity between each user in one network and each user in the other network;
Step 2: Calculate the user environment score of each user from the friend link relationships of the users in each network;
Step 3: Combine the username similarity from Step 1 with the user environment score from Step 2, and sort the users in each network in descending order of score;
Step 4: Based on the result of Step 3, select the highest-scoring user across the two networks and return it as the candidate user account v_select, together with the label of the network to which that user belongs;
Step 5: Based on the candidate user account v_select and the network label returned in Step 4, use the user matching score formula to calculate the user similarity score of each user in the other network;
Step 6: Combine the username similarity from Step 1 with the user similarity from Step 5 to calculate the user matching score, and return the account with the highest-ranked user matching score, v_match, as the possible matching user account;
Step 7: Take the matching user account v_match obtained in Step 6 as the new candidate user account and repeat Steps 5 and 6 to obtain another matching user account v'_match;
Step 8: If v'_match obtained in Step 7 is exactly the candidate user account v_select obtained in Step 4, v_select and v_match are considered successfully matched; otherwise the match is considered to have failed, v_select is reset to v_match, and a new round of matching begins.
Compared with existing methods for identifying social network users, the present invention has the following advantages:
(1) In addition to the account attributes of the social networks, it makes full use of the friend link relationships of the social networks to improve the accuracy of virtual identity recognition;
(2) It uses a ranking-based cross-matching method to ensure matching accuracy.
Brief description of the drawings
Fig. 1: flowchart of an embodiment of the present invention.
Detailed description of the embodiments
To help those of ordinary skill in the art understand and implement the present invention, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the embodiments described here serve only to illustrate and explain the present invention and are not intended to limit it.
Referring to Fig. 1, the present invention provides a cross-social-network entity identity resolution method comprising the following steps:
Step 1: Use the VMN algorithm to calculate the username similarity between each user in one network and each user in the other network;
The VMN algorithm is designed specifically for fuzzy matching between names. It combines exact matching and partial matching to obtain the final matching result.
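The idea of combining exact and partial matching can be illustrated with a small Python sketch. This is not the VMN algorithm itself, whose details are not given here; it is a stand-in that assumes case-insensitive comparison and uses the standard-library `difflib.SequenceMatcher` ratio as the partial-match score. The function names `username_similarity` and `similarity_table` are hypothetical.

```python
from difflib import SequenceMatcher


def username_similarity(name_a: str, name_b: str) -> float:
    """Illustrative stand-in for a name matcher: exact match first, then partial match."""
    a, b = name_a.strip().lower(), name_b.strip().lower()
    if not a or not b:
        return 0.0
    if a == b:
        return 1.0  # exact match
    # Partial match: similarity ratio in [0, 1] computed by difflib.
    return SequenceMatcher(None, a, b).ratio()


def similarity_table(users_0, users_1):
    """Username similarity for every pair (user in network 0, user in network 1)."""
    return {(u0, u1): username_similarity(u0, u1) for u0 in users_0 for u1 in users_1}
```

A table built this way holds one score per cross-network pair, which is the input that Step 3 and Step 6 combine with the link-based scores.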
Step 2: Calculate the user environment score of each user from the friend link relationships of the users in each network;
Within the same network, two users may or may not be friends. If most of the friends of two people are the same, the two should have a close relationship in reality; otherwise, they are likely to be strangers.
This embodiment uses the concept of friend degree to evaluate how close the relationship between two people is.
The friend degree is calculated as follows:
[formula image not available]
In the formula, deg(·) denotes the number of friends of a user; v_a, v_b ∈ V are two users in the same network; v_m is a hypothetical virtual user whose friends are exactly the common friends of v_a and v_b; and f_a ∈ F_a, f_b ∈ F_b, f_m ∈ F_m, where F_a, F_b, and F_m are the friend sets of v_a, v_b, and v_m, respectively.
The friend degree measures the friend relationship between two social network users. Its counterpart is the non-friend degree, which measures the non-friend relationship between two users and is calculated as follows:
$$\mathrm{NFD}(v_a, v_b) = 1 - \left(\deg(f_m)^{-1} + 1\right)\left(\deg(v_a)^{-1} + \deg(v_b)^{-1}\right)$$
The friend degree and the non-friend degree only indicate how close the relationship between two users is; they cannot be used directly to decide whether two accounts are friends or non-friends. Both are calculated for use in the next calculation step.
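As a rough illustration, the non-friend degree formula above can be evaluated from two friend sets. The sketch below rests on two assumptions that the text does not confirm: the "-1" terms are read as exponents (reciprocals), and deg(f_m) is read as the number of common friends, that is, the degree of the virtual user v_m. The function name `non_friend_degree` is hypothetical, and raising an error for zero degrees is a choice made here, not part of the method.

```python
def non_friend_degree(friends_a: set, friends_b: set) -> float:
    """NFD(v_a, v_b) = 1 - (deg(f_m)^-1 + 1) * (deg(v_a)^-1 + deg(v_b)^-1).

    Assumption: deg(f_m) is the number of common friends of v_a and v_b,
    i.e. the degree of the virtual user v_m described in the text.
    """
    deg_a, deg_b = len(friends_a), len(friends_b)
    deg_m = len(friends_a & friends_b)
    if min(deg_a, deg_b, deg_m) == 0:
        # The reciprocal terms are undefined for a zero degree.
        raise ValueError("the formula is undefined when a degree is zero")
    return 1 - (1 / deg_m + 1) * (1 / deg_a + 1 / deg_b)
```

For example, two users with four friends each, two of them shared, give 1 - (1/2 + 1)(1/4 + 1/4) = 0.25 under this reading.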
The user environment score estimates whether a user has an account in the other network. Given a user v ∈ V, the environment score of that user is calculated as follows:
[formula image not available]
In the formula, S_seed is the set of seed users, where a seed user is a user who has accounts in different social networks and can be identified in advance; w is a weight factor used to increase the weight of closeness; η is an increment factor; the friend-relation symbol (not legible in this text) indicates that two users are friends, and "-" indicates that two users are non-friends.
Step 3: Combine the username similarity from Step 1 with the user environment score from Step 2 by weighted addition, and sort the users in each network in descending order of score;
Step 4: Based on the result of Step 3, select the highest-scoring user across the two networks and return it as the candidate user account v_select, together with the label of the network to which that user belongs;
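Steps 3 and 4 amount to a weighted sum followed by a descending sort and a top-1 selection, which can be sketched as below. The sketch makes several assumptions not stated in the text: each user's username similarity is reduced to a single value (for instance the best similarity against the other network), the weights `w_name` and `w_env` default to 0.5, and every network has at least one user. The function names are hypothetical.

```python
def rank_users(best_name_sim: dict, env_score: dict,
               w_name: float = 0.5, w_env: float = 0.5) -> list:
    """Step 3: weighted sum of the two scores, sorted in descending order."""
    scores = {u: w_name * best_name_sim[u] + w_env * env_score[u] for u in env_score}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


def select_candidate(ranked_by_network: dict) -> tuple:
    """Step 4: return (v_select, network label) for the top score across networks."""
    label, ranked = max(ranked_by_network.items(), key=lambda kv: kv[1][0][1])
    return ranked[0][0], label
```

Returning the network label with the account tells the next step which network is "the other network" to search for a match.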
Step 5: Based on the candidate user account v_select and the network label returned in Step 4, use the user matching score formula to calculate the user similarity score of each user in the other network;
User relationships include friend relationships and non-friend relationships, so both must be taken into account when calculating user similarity. In each iteration of the algorithm, a selected user account v_select is given, namely the user with the highest user environment score. The user similarity between v_select (assumed to come from social network V_0) and every unmatched account v_1 ∈ V_1 (social network 1) is then calculated.
Given two users in different networks, v_0 ∈ V_0 and v_1 ∈ V_1, the user relationship similarity is calculated as follows:
[formula image not available]
In the formula, S_seed is the set of seed users, FC(·) is the friend degree score of two users, NFD(·) is the non-friend degree score of two users, the friend-relation symbol (not legible in this text) indicates that two users are friends, and "-" indicates that two users are non-friends.
Step 6: Combine the username similarity from Step 1 with the user similarity from Step 5 to calculate the user matching score, and return the account with the highest-ranked user matching score, v_match, as the possible matching user account;
Combining the username similarity and the user relationship similarity yields a total score that evaluates how well a user of one social network matches a user of another social network. Let v_select ∈ V_0 and v ∈ V_1 be two users of different social networks; the user matching score between them is calculated as follows:
$$\mathrm{UMS}(v_{select}, v) = \mathrm{USS}(v_{select}, v) \cdot |S_{seed}| \cdot w_1 + \mathrm{URS}(v_{select}, v) \cdot w_2$$
where S_seed is the set of seed users, and w_1 and w_2 are weight factors.
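The matching score formula and the selection of the top-ranked account in Step 6 can be sketched as follows. The USS and URS terms are treated as precomputed inputs, since their formulas are not reproduced here; the dictionary layout keyed by `(v_select, v)`, the default weights of 0.5, and the function names are all assumptions made for illustration.

```python
def user_matching_score(uss: float, urs: float, n_seeds: int,
                        w1: float = 0.5, w2: float = 0.5) -> float:
    """UMS = USS * |S_seed| * w1 + URS * w2, with USS and URS precomputed."""
    return uss * n_seeds * w1 + urs * w2


def best_match(v_select, candidates, uss: dict, urs: dict, n_seeds: int,
               w1: float = 0.5, w2: float = 0.5):
    """Step 6: return the unmatched account with the highest UMS against v_select."""
    return max(
        candidates,
        key=lambda v: user_matching_score(
            uss[(v_select, v)], urs[(v_select, v)], n_seeds, w1, w2
        ),
    )
```

Multiplying the first term by the seed-set size, as the formula does, keeps it on a scale comparable to a relationship term that is accumulated over seed users.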
Step 7: Take the matching user account v_match obtained in Step 6 as the new candidate user account and repeat Steps 5 and 6 to obtain another matching user account v'_match;
Step 8: If v'_match obtained in Step 7 is exactly the candidate user account v_select obtained in Step 4, v_select and v_match are considered successfully matched; otherwise the match is considered to have failed, v_select is reset to v_match, and a new round of matching begins;
The algorithm takes the matching user account v_match returned by the account matching process as a new candidate user account and runs the matching algorithm again to obtain a matching account v'_match. If v'_match is exactly the earlier candidate user account v_select, then the most probable match of v_select is v_match and the most probable match of v_match is v_select, so this is a stable match. Otherwise, the match is considered to have failed: v_select is put back among the unmatched accounts to wait for another matching opportunity, v_select is reset to v_match, and a new round of matching begins.
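The cross-matching loop of Steps 5 to 8 can be sketched as a mutual-best-match check. The sketch assumes a hypothetical callable `best_match(v)` that returns the top-ranked account for `v` in the opposite network (Steps 5 and 6 combined). The `max_rounds` cap is an addition made here to guarantee termination and is not part of the described method.

```python
def cross_match(v_select, best_match, max_rounds: int = 100):
    """Return a stable pair (v_select, v_match), or None if none is found."""
    for _ in range(max_rounds):
        v_match = best_match(v_select)   # Steps 5-6: best match for the candidate
        v_back = best_match(v_match)     # Step 7: match again from v_match
        if v_back == v_select:           # Step 8: mutual best match, stable
            return v_select, v_match
        v_select = v_match               # failed: reset candidate, new round
    return None
```

Accepting a pair only when each account is the other's top-ranked match is what makes the result stable in the sense described above.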
The present invention uses Twitter and Facebook account data as the objects of analysis. It uses not only the username attribute of users but also their friend link information, measuring the degree of relationship between two users through the friend degree, the non-friend degree, and related quantities. Matching combines username similarity with link relationships, and a ranking-based cross-matching method is also used to ensure identification accuracy.
It should be understood that the parts not elaborated in this specification belong to the prior art.
It should be understood that the above description of the preferred embodiment is relatively detailed and should not therefore be regarded as limiting the scope of patent protection of the present invention. Those of ordinary skill in the art, inspired by the present invention, may make substitutions or modifications without departing from the scope protected by the claims of the present invention, and all of these fall within the scope of protection of the present invention. The claimed scope of the present invention is determined by the appended claims.

Claims (7)

1. A cross-social-network entity identity resolution method, characterized by comprising the following steps:
Step 1: Calculate the username similarity between each user in one network and each user in the other network;
Step 2: Calculate the user environment score of each user from the friend link relationships of the users in each network;
Step 3: Combine the username similarity from Step 1 with the user environment score from Step 2, and sort the users in each network in descending order of score;
Step 4: Based on the result of Step 3, select the highest-scoring user across the two networks and return it as the candidate user account v_select, together with the label of the network to which that user belongs;
Step 5: Based on the candidate user account v_select and the network label returned in Step 4, use the user matching score formula to calculate the user similarity score of each user in the other network;
Step 6: Combine the username similarity from Step 1 with the user similarity from Step 5 to calculate the user matching score, and return the account with the highest-ranked user matching score, v_match, as the possible matching user account;
Step 7: Take the matching user account v_match obtained in Step 6 as the new candidate user account and repeat Steps 5 and 6 to obtain another matching user account v'_match;
Step 8: If v'_match obtained in Step 7 is exactly the candidate user account v_select obtained in Step 4, v_select and v_match are considered successfully matched; otherwise the match is considered to have failed, v_select is reset to v_match, and a new round of matching begins.
2. The cross-social-network entity identity resolution method according to claim 1, characterized in that: in Step 1, the VMN algorithm is used to calculate the username similarity between each user in one network and each user in the other network.
3. The cross-social-network entity identity resolution method according to claim 1, characterized in that the calculation of the user environment score of each user in Step 2 is implemented as follows:
Within the same network, two users may or may not be friends; the friend degree is used to evaluate how close the relationship between two people is;
The friend degree is calculated as follows:
[formula image not available]
In the formula, deg(·) denotes the number of friends of a user; v_a, v_b ∈ V are two users in the same network; v_m is a hypothetical virtual user whose friends are exactly the common friends of v_a and v_b; and f_a ∈ F_a, f_b ∈ F_b, f_m ∈ F_m, where F_a, F_b, and F_m are the friend sets of v_a, v_b, and v_m, respectively;
The friend degree measures the friend relationship between two social network users; its counterpart is the non-friend degree, which measures the non-friend relationship between two users and is calculated as follows:
$$\mathrm{NFD}(v_a, v_b) = 1 - \left(\deg(f_m)^{-1} + 1\right)\left(\deg(v_a)^{-1} + \deg(v_b)^{-1}\right)$$
The user environment score estimates whether a user has an account in the other network; given a user v ∈ V, the environment score of that user is calculated as follows:
[formula image not available]
In the formula, S_seed is the set of seed users, where a seed user is a user who has accounts in different social networks and can be identified in advance; w is a weight factor used to increase the weight of closeness; η is an increment factor; the friend-relation symbol (not legible in this text) indicates that two users are friends; "-" indicates that two users are non-friends.
4. The cross-social-network entity identity resolution method according to claim 1, characterized in that: in Step 3, the username similarity from Step 1 and the user environment score from Step 2 are combined by weighted addition, and the users are then sorted in descending order.
5. The cross-social-network entity identity resolution method according to claim 3, characterized in that Step 5 is implemented as follows:
In each iteration, a selected user account v_select is given, namely the user with the highest user environment score; the user similarity between v_select and every unmatched account v_1 ∈ V_1 is then calculated, where v_select is assumed to come from social network V_0 and V_1 is social network 1;
Given two users in different networks, v_0 ∈ V_0 and v_1 ∈ V_1, the user relationship similarity is calculated as follows:
[formula image not available]
In the formula, S_seed is the set of seed users, FC(·) is the friend degree score of two users, NFD(·) is the non-friend degree score of two users, the friend-relation symbol (not legible in this text) indicates that two users are friends, and "-" indicates that two users are non-friends.
6. The cross-social-network entity identity resolution method according to claim 5, characterized in that the user matching score in Step 6 is implemented as follows:
Combining the username similarity and the user relationship similarity yields a total score that evaluates how well a user of one social network matches a user of another social network; v_select ∈ V_0 and v ∈ V_1 are two users of different social networks, and the user matching score between them is calculated as follows:
$$\mathrm{UMS}(v_{select}, v) = \mathrm{USS}(v_{select}, v) \cdot |S_{seed}| \cdot w_1 + \mathrm{URS}(v_{select}, v) \cdot w_2$$
where S_seed is the set of seed users, and w_1 and w_2 are weight factors.
7. The cross-social-network entity identity resolution method according to any one of claims 1 to 6, characterized in that: in Step 8, the matching user account v_match returned by the account matching process is taken as a new candidate user account, and the matching algorithm is run again to obtain a matching account v'_match; if v'_match is exactly the earlier candidate user account v_select, then the most probable match of v_select is v_match and the most probable match of v_match is v_select, so this is a stable match; otherwise, the match is considered to have failed, v_select is put back among the unmatched accounts to wait for another matching opportunity, v_select is reset to v_match, and a new round of matching begins.
CN201811031997.1A 2018-09-05 2018-09-05 Cross-social-network entity identity resolution method Pending CN109284589A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811031997.1A CN109284589A (en) 2018-09-05 2018-09-05 Cross-social-network entity identity resolution method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811031997.1A CN109284589A (en) 2018-09-05 2018-09-05 Cross-social-network entity identity resolution method

Publications (1)

Publication Number Publication Date
CN109284589A true CN109284589A (en) 2019-01-29

Family

ID=65184523

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811031997.1A Pending CN109284589A (en) 2018-09-05 2018-09-05 Cross-social-network entity identity resolution method

Country Status (1)

Country Link
CN (1) CN109284589A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110197207A (en) * 2019-05-13 2019-09-03 Tencent Technology (Shenzhen) Co., Ltd. Method and related apparatus for classifying unclassified user groups
CN110222790A (en) * 2019-06-17 2019-09-10 Nanjing Zhongfu Information Technology Co., Ltd. Identity recognition method, device, and server
CN110413900A (en) * 2019-08-01 2019-11-05 University of Electronic Science and Technology of China Multi-social-network account matching method based on the Viterbi algorithm
CN110598126A (en) * 2019-09-05 2019-12-20 Henan University of Science and Technology Cross-social network user identity recognition method based on behavior habits
CN115048563A (en) * 2022-08-15 2022-09-13 The 30th Research Institute of China Electronics Technology Group Corporation Cross-social-network user identity matching method, medium and device based on entropy weight method

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107330798A (en) * 2017-06-05 2017-11-07 Dalian University of Technology Method for user identity recognition across social networks based on seed node propagation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
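孟波 (Meng Bo): "Research on user identity recognition algorithms for multiple social networks" (多社交网络用户身份识别算法研究), China Master's Theses Full-text Database, Information Science and Technology *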

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110197207A (en) * 2019-05-13 2019-09-03 Tencent Technology (Shenzhen) Co., Ltd. Method and related apparatus for classifying unclassified user groups
CN110222790A (en) * 2019-06-17 2019-09-10 Nanjing Zhongfu Information Technology Co., Ltd. Identity recognition method, device, and server
CN110413900A (en) * 2019-08-01 2019-11-05 University of Electronic Science and Technology of China Multi-social-network account matching method based on the Viterbi algorithm
CN110598126A (en) * 2019-09-05 2019-12-20 Henan University of Science and Technology Cross-social network user identity recognition method based on behavior habits
CN110598126B (en) * 2019-09-05 2023-04-18 Henan University of Science and Technology Cross-social network user identity recognition method based on behavior habits
CN115048563A (en) * 2022-08-15 2022-09-13 The 30th Research Institute of China Electronics Technology Group Corporation Cross-social-network user identity matching method, medium and device based on entropy weight method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190129