CN106126654A - A cross-website user association method based on username similarity - Google Patents

A cross-website user association method based on username similarity

Info

Publication number
CN106126654A
Authority
CN
China
Prior art keywords
user
feature
user name
self
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610479968.6A
Other languages
Chinese (zh)
Other versions
CN106126654B (en)
Inventor
柳厅文
王玉斌
时金桥
亚静
李全刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Information Engineering of CAS
Original Assignee
Institute of Information Engineering of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Information Engineering of CAS
Priority to CN201610479968.6A
Publication of CN106126654A
Application granted
Publication of CN106126654B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking

Abstract

The present invention provides a cross-website user association method based on username similarity. The steps are: 1) filter the characters of multiple usernames, retaining only English letters and digits; 2) extract the features of the processed usernames, obtain the self-information value of each feature, and build a self-information vector from these values; 3) compute the similarity between the usernames from their self-information vectors and, if the similarity exceeds a given threshold τ, judge that the usernames belong to the same user. By judging from username similarity whether accounts belong to the same user, the method can associate the accounts that one user holds on different websites.

Description

A cross-website user association method based on username similarity
Technical field
The present invention relates to the field of computing, and specifically to a cross-website user association method based on username similarity.
Background
A growing number of companies run their own websites to offer users network services such as information retrieval, resource downloads, and virtual social networking. To use these services, people usually have to register an account on each website and obtain a username that serves as their public identity. If the accounts that one user holds on different websites could be associated, the user experience of many cross-site applications could be improved. For example, if a user's accounts on shopping websites without social features, such as Dangdang and JD.com, could be associated with the same user's accounts on social networking sites such as Sina Weibo and Renren, then the user's social network structure on the social sites could be used to make the shopping sites' personalized recommendations more precise. Associating the accounts that belong to the same user on different websites is therefore meaningful and valuable.
Existing cross-website user association methods fall into three main classes:
1. The user voluntarily lists links to their personal home pages on each website, either on the site where one of their accounts resides or in a third-party application, so that the accounts belonging to that user on different websites can be associated.
2. The e-mail address or mobile phone number supplied at registration is used for association: if two or more accounts on different websites were registered with the same e-mail address or phone number, they very likely belong to the same user.
3. Profile information (such as gender and age) and published content (such as microblog or forum posts) are first collected through web crawlers or the websites' open interfaces; user-related features are then extracted and modeled, and the model is finally used to identify the accounts that belong to the same user on different websites.
The effectiveness of the first class, in which users voluntarily list their home-page links, depends on the completeness of what they fill in. If a user is unwilling to fill in the links, omits some of them, or registers new accounts on other websites afterwards, the information is incomplete and some accounts that belong to the same user cannot be associated. This class of methods is therefore not very practical.
Protection of personal privacy on the Internet is receiving increasing attention, and the e-mail address or phone number given at registration is among a user's more sensitive private information, so most websites do not disclose them. In other words, on most websites users' e-mail addresses and phone numbers cannot be obtained, so methods that associate users by e-mail address or phone number are not very general.
Methods that extract and model features from profile information and published content depend on the truthfulness and completeness of that information. Websites differ in what they require at registration, and some users deliberately enter false information to protect their privacy. Either can make the information untrue or incomplete and degrade the association results, so this class of methods also has limitations.
Summary of the invention
In view of these shortcomings, the present invention provides a cross-website user association method based on username similarity. It judges from the similarity between multiple usernames whether they belong to the same user, and can thereby associate the accounts that one user holds on different websites.
To solve the above technical problem, the present invention adopts the following technical solution:
A cross-website user association method based on username similarity judges, from the usernames of accounts on any number of different websites, whether those accounts belong to the same user, and associates them if they do. The steps are:
1) filter the characters of multiple usernames, retaining only English letters and digits;
2) extract the features of the processed usernames, obtain the self-information value of each feature, and build a self-information vector from these values;
3) compute the similarity between the usernames from their self-information vectors and, if the similarity exceeds a given threshold τ, judge that the usernames belong to the same user.
Further, the retained English letters are uniformly converted to lower case or upper case.
Further, the features include substring content features, letter-digit combination order features, numeric date features, and keyboard layout features.
Further, the probability that a feature occurs is estimated over a given set of usernames, and the self-information value of the feature is obtained from this probability.
Further, the dimension of the self-information vector equals the number of features.
Further, the cosine similarity of the self-information vectors is computed, and the similarity between the usernames is obtained from this cosine similarity.
Further, the threshold τ is the value of τ at which the F1 score is optimal on a given training set.
Further, the threshold τ is 0.15.
The beneficial effect of the present invention is that the provided method judges whether multiple usernames belong to the same user and thereby achieves cross-website user association. Compared with the prior art, the method uses only the username as user-related information. Usernames are published by the websites and do not involve user privacy, so they are easier to obtain, and the method is not limited by the completeness of other user-related information; it is therefore more general and more practical. In addition, the method uses the concept of self-information to measure features of a username from several angles, such as content and pattern, on a common scale and fuses them in a single model, which can achieve higher accuracy than using only the content features of usernames.
Brief description of the drawings
Fig. 1 is a flow chart of the cross-website user association method based on username similarity in the embodiment.
Detailed description
To make the above features and advantages of the present invention easier to understand, embodiments are described in detail below with reference to the accompanying drawing.
The embodiment provides a cross-website user association method based on username similarity, as shown in Fig. 1. Given the usernames a and b of two accounts on different websites, the method judges whether a and b belong to the same user through the following steps:
1. Username preprocessing
Websites generally require that the username chosen at registration contain only English letters, digits, and a few special characters such as the underscore. Preprocessing removes the special characters from the username, retains only English letters and digits, and converts the letters uniformly to lower case or upper case; this embodiment uses lower case.
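A minimal Python sketch of this preprocessing step follows. The function name `normalize_username` is hypothetical, and lower case is chosen as in this embodiment.

```python
import re


def normalize_username(name: str) -> str:
    """Drop every character that is not an ASCII letter or digit, then lowercase."""
    return re.sub(r"[^A-Za-z0-9]", "", name).lower()


print(normalize_username("Ye_2-Dai"))  # ye2dai
```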
2. Computing the self-information of username features
For a username feature λ, the following indicator function states whether a username u contains λ:
$$J_\lambda(u) = \begin{cases} 1, & \text{if } u \text{ contains feature } \lambda \\ 0, & \text{otherwise} \end{cases}$$
A sufficiently large username set U is chosen to estimate the probability that feature λ occurs:
$$P(\lambda) = \frac{\sum_{u \in U} J_\lambda(u)}{|U|}$$
That is, the probability of λ is the number of usernames in U that contain λ divided by the number of usernames in U. In theory U should contain every username in existence, which is not attainable in practice; instead, a set U of usernames is collected in some way and used to estimate the probability of each feature. U is thus a sample of all usernames, and the whole is estimated from the sample. In general the larger U is the better, because a larger sample is closer to the whole and reflects its regularities more faithfully.
From the estimated probability, the self-information value of each username feature is computed:
$$W_\lambda = -\log P(\lambda)$$
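The estimate described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the patent's implementation: the function names are hypothetical, the natural logarithm is assumed because the text does not state a base, and `toy_bigrams` is a stand-in feature extractor used only for the demonstration.

```python
import math
from collections import Counter
from typing import Callable, Dict, Iterable, Set


def estimate_self_information(
    usernames: Iterable[str],
    extract_features: Callable[[str], Set[str]],
) -> Dict[str, float]:
    """Return {feature: -ln(P(feature))}, with P estimated over the sample."""
    names = list(usernames)
    counts: Counter = Counter()
    for name in names:
        # The extractor returns a set, so each feature counts once per username.
        counts.update(extract_features(name))
    # Assumption: natural logarithm; the source does not state the base.
    return {f: -math.log(c / len(names)) for f, c in counts.items()}


def toy_bigrams(name: str) -> Set[str]:
    """Hypothetical extractor: all two-character substrings of the name."""
    return {name[i:i + 2] for i in range(len(name) - 1)}


weights = estimate_self_information(["ab", "ab", "cd", "ab"], toy_bigrams)
```

Features that never occur in the sample get no entry here; their probability estimate would be zero and their self-information unbounded, a case the text does not address.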
The username features λ considered here are:
Substring content features: whether the username contains a substring of the form αβ, where α and β each stand for any lower-case English letter or digit. Since αβ has 36 × 36 = 1296 possible combinations, this group comprises 1296 features.
Letter-digit combination order features: whether the username follows one of these orders: "letters only", "digits only", "letters+digits", "digits+letters", "letters+digits+letters", and "digits+letters+digits". Other orders are rare enough in real data to be ignored, so this group comprises 6 features.
Numeric date features: a run of digits in a username often denotes a date, so the method checks whether the username contains a date in one of several common formats: "year+month+day", "month+day+year", "day+month+year", "month+day", and "year". This group comprises 5 features.
Keyboard layout features: three features related to keyboard layout are considered: ① all characters lie in the same row of the keyboard, excluding usernames made up entirely of digits; ② every two adjacent characters are adjacent on the keyboard and in different rows; ③ every two adjacent characters occupy the same or adjacent positions on the keyboard. This group comprises 3 features.
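Part of this feature set can be sketched in Python as shown below. The sketch covers the substring features, the six combination orders, and keyboard layout feature ① only; the date features and keyboard features ② and ③ are left out because the text does not specify the date formats or the adjacency rules in enough detail. All names are hypothetical, the input is assumed to be already preprocessed (lower-case letters and digits), and a US QWERTY layout is assumed.

```python
from typing import Optional, Set

# Assumed US QWERTY rows; the text does not say which layout was used.
KEYBOARD_ROWS = ("1234567890", "qwertyuiop", "asdfghjkl", "zxcvbnm")
# L = run of letters, D = run of digits; the six orders listed in the text.
ORDER_PATTERNS = {"L", "D", "LD", "DL", "LDL", "DLD"}


def bigram_features(name: str) -> Set[str]:
    """All two-character substrings of the username."""
    return {name[i:i + 2] for i in range(len(name) - 1)}


def order_feature(name: str) -> Optional[str]:
    """Collapse runs of letters and digits, e.g. 'ye2dai' -> 'LDL'."""
    runs = []
    for ch in name:
        kind = "D" if ch.isdigit() else "L"
        if not runs or runs[-1] != kind:
            runs.append(kind)
    pattern = "".join(runs)
    return pattern if pattern in ORDER_PATTERNS else None


def same_row(name: str) -> bool:
    """Keyboard feature 1: all characters in one row, all-digit names excluded."""
    if not name or name.isdigit():
        return False
    return any(all(ch in row for ch in name) for row in KEYBOARD_ROWS)


def extract_features(name: str) -> Set[str]:
    """Union of the feature groups covered by this sketch."""
    features = bigram_features(name)
    pattern = order_feature(name)
    if pattern is not None:
        features.add("order:" + pattern)
    if same_row(name):
        features.add("keyboard:same_row")
    return features
```

The `order:` and `keyboard:` prefixes keep the pattern features from colliding with two-character substrings such as `ld`.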
3. Representing usernames as self-information vectors
Suppose m features were selected in total. The two given usernames a and b are each represented as the corresponding self-information vector:
$$V_a = \langle W_{\lambda_1} \cdot J_{\lambda_1}(a),\ W_{\lambda_2} \cdot J_{\lambda_2}(a),\ \ldots,\ W_{\lambda_m} \cdot J_{\lambda_m}(a) \rangle$$
$$V_b = \langle W_{\lambda_1} \cdot J_{\lambda_1}(b),\ W_{\lambda_2} \cdot J_{\lambda_2}(b),\ \ldots,\ W_{\lambda_m} \cdot J_{\lambda_m}(b) \rangle$$
4. Computing the similarity between usernames
The cosine similarity of the self-information vectors of a and b is taken as the similarity of the two usernames:
$$\mathrm{sim}(a, b) = \cos(V_a, V_b) = \frac{V_a \cdot V_b}{\lVert V_a \rVert \cdot \lVert V_b \rVert}$$
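Because a username has only a handful of the m features, the vectors are mostly zeros, and the cosine can be computed over sparse dictionaries. The Python sketch below assumes that representation; the function names are hypothetical, and returning 0.0 for an all-zero vector is a convention chosen here, not something the text specifies.

```python
import math
from typing import Dict, Iterable


def self_information_vector(
    features: Iterable[str], weights: Dict[str, float]
) -> Dict[str, float]:
    """Sparse vector: store the weight only for features the username has."""
    return {f: weights[f] for f in features if f in weights}


def cosine_similarity(va: Dict[str, float], vb: Dict[str, float]) -> float:
    """Cosine of two sparse vectors keyed by feature name."""
    dot = sum(w * vb[f] for f, w in va.items() if f in vb)
    norm_a = math.sqrt(sum(w * w for w in va.values()))
    norm_b = math.sqrt(sum(w * w for w in vb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention for this sketch: no features, no similarity
    return dot / (norm_a * norm_b)
```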
5. Judging whether the usernames belong to the same user
For a given threshold τ (0 < τ ≤ 1), if sim(a, b) > τ the two accounts are judged to belong to the same user; otherwise they are judged to belong to different users.
Note that τ can be obtained from a labeled training set: the value of τ that yields the best F1 score on the training set is used as the threshold. A labeled training set is a collection of triples <a, b, c>, where a is a username on one website, b is a username on another website, and c is 0 or 1; 1 means the two usernames belong to the same user and 0 means they do not. In the experiments, τ of about 0.15 gave good results, as detailed below.
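The threshold search described above can be sketched as a scan over candidate values. The Python below is a hedged illustration: it assumes the similarity of each training pair has already been computed, so each item is a (similarity, label) tuple, and both the function names and the toy data are made up for the example.

```python
from typing import Iterable, List, Tuple

Scored = List[Tuple[float, int]]  # (sim(a, b), label c) for each training triple


def f1_at(scored: Scored, tau: float) -> float:
    """F1 score when pairs with similarity > tau are predicted 'same user'."""
    tp = sum(1 for s, c in scored if s > tau and c == 1)
    fp = sum(1 for s, c in scored if s > tau and c == 0)
    fn = sum(1 for s, c in scored if s <= tau and c == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(scored: Scored, candidates: Iterable[float]) -> float:
    """Return the candidate with the highest F1 (first one wins on ties)."""
    return max(candidates, key=lambda tau: f1_at(scored, tau))


# Toy data, invented for illustration only.
toy = [(0.9, 1), (0.6, 1), (0.4, 1), (0.2, 0), (0.05, 0)]
tau = best_threshold(toy, [0.1, 0.3, 0.5, 0.7])
```

On real data the candidate grid would typically be finer, for example steps of 0.01 over (0, 1].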
The experiments use 6,302,988 user records from csdn.net, 2,500,264 from 17173.com, and 3,827,603 from 178.com, referred to as the CSDN, 17173, and 178 account data sets. Each record includes a username, a registration e-mail address, and other information. The data were collected at random and were not preprocessed in any way, to keep the experimental results objective and accurate. Because the records include the registration e-mail address, two accounts on different websites are assumed to belong to the same user if their registration e-mail addresses are identical, and to different users otherwise. The experimental data sets are built on this basis.
From the CSDN and 17173 account data sets, 155,878 username pairs with identical registration e-mail addresses were found and used as positive examples, and 155,878 username pairs with different registration e-mail addresses were drawn at random from the two data sets as negative examples; together they form the CSDN+17173 experimental data set. From the CSDN and 178 account data sets, 112,603 username pairs with identical registration e-mail addresses were found and used as positive examples, and 112,603 username pairs with different registration e-mail addresses were drawn at random as negative examples; together they form the CSDN+178 experimental data set. From the 17173 and 178 account data sets, 145,849 username pairs with identical registration e-mail addresses were found and used as positive examples, and 145,849 username pairs with different registration e-mail addresses were drawn at random as negative examples; together they form the 17173+178 experimental data set.
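The construction of such a data set can be sketched as follows. This Python is an illustration under simplifying assumptions: each site is a dictionary from registration e-mail address to username (so at most one account per address per site), the function name `build_pairs` is hypothetical, and the sample addresses are invented.

```python
import random
from typing import Dict, List, Tuple

Triple = Tuple[str, str, int]  # (username on site A, username on site B, label)


def build_pairs(
    site_a: Dict[str, str], site_b: Dict[str, str], seed: int = 0
) -> Tuple[List[Triple], List[Triple]]:
    """Positives share a registration address; negatives are random mismatches."""
    shared = sorted(set(site_a) & set(site_b))
    positives = [(site_a[e], site_b[e], 1) for e in shared]

    emails_a, emails_b = sorted(site_a), sorted(site_b)
    # Draw as many negatives as positives, capped by how many mismatches exist.
    target = min(len(positives), len(emails_a) * len(emails_b) - len(shared))
    rng = random.Random(seed)
    chosen = set()
    while len(chosen) < target:
        ea, eb = rng.choice(emails_a), rng.choice(emails_b)
        if ea != eb:
            chosen.add((ea, eb))
    negatives = [(site_a[ea], site_b[eb], 0) for ea, eb in sorted(chosen)]
    return positives, negatives


site_a = {"x@example.com": "ye2dai", "y@example.com": "bob"}
site_b = {"x@example.com": "ye8023dai", "z@example.com": "carol"}
positives, negatives = build_pairs(site_a, site_b)
```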
The experiments measure the method with precision, recall, and the F1 score. TP is the number of username pairs that are positive examples in the data set and that the method judges to belong to the same user; FN is the number of positive pairs that the method judges not to belong to the same user; FP is the number of negative pairs that the method judges to belong to the same user; and TN is the number of negative pairs that the method judges not to belong to the same user. The measures are then: precision = TP / (TP + FP), recall = TP / (TP + FN), and F1 = (2 × precision × recall) / (precision + recall).
Ten-fold cross-validation was run on the three experimental data sets. The experiments show that good results are obtained when τ is about 0.15. The results at τ = 0.15 are shown in the table below:
Table 1
In Table 1, Name-Match is the baseline method: two usernames are judged to belong to the same user if they are identical, and to different users otherwise.
The calculation is illustrated in detail with two examples:
Example 1: judging whether usernames a = ye2dai and b = ye8023dai belong to the same user
1,657,320 usernames drawn at random from the data sets serve as the username set U, and the threshold τ is 0.15.
First, username a is represented as a self-information vector. Its substring content features are ye, e2, 2d, da, and ai; its letter-digit combination order feature is "letters+digits+letters"; it has no numeric date feature and no keyboard layout feature. The self-information values of the features of a are shown in the following table:
Table 2
Since most entries of a self-information vector are 0, only the non-zero entries are listed, in the form "feature: self-information value". The self-information vector of username a is:
Va = < ye: 4.660, e2: 5.607, 2d: 8.429, da: 3.915, ai: 3.179, letters+digits+letters: 3.490 >
Next, username b is represented as a self-information vector. Its substring content features are ye, e8, 80, 02, 23, 3d, da, and ai; its letter-digit combination order feature is "letters+digits+letters"; it has no numeric date feature and no keyboard layout feature. The self-information values of the features of b are shown in the following table:
Table 3
The self-information vector of username b is therefore:
Vb = < ye: 4.660, e8: 6.089, 80: 3.595, 02: 3.307, 23: 2.767, 3d: 8.052, da: 3.915, ai: 3.179, letters+digits+letters: 3.490 >
The cosine similarity of Va and Vb is:
cos(Va, Vb) = (4.660*4.660 + 3.915*3.915 + 3.179*3.179 + 3.490*3.490) / [sqrt(4.660^2 + 5.607^2 + 8.429^2 + 3.915^2 + 3.179^2 + 3.490^2) * sqrt(4.660^2 + 6.089^2 + 3.595^2 + 3.307^2 + 2.767^2 + 8.052^2 + 3.915^2 + 3.179^2 + 3.490^2)] = 0.336
where sqrt denotes the square root.
Thus sim(a, b) = 0.336 > τ, and the method judges that usernames a = ye2dai and b = ye8023dai belong to the same user.
Example 2: judging whether usernames a = asdfjk and b = as1001 belong to the same user
1,657,320 usernames drawn at random from the data sets serve as the username set U, and the threshold τ is 0.15.
First, username a is represented as a self-information vector. Its substring content features are as, sd, df, fj, and jk; its letter-digit combination order feature is "letters only"; it has no numeric date feature; and it satisfies keyboard layout feature ①. The self-information values of the features of a are shown in the following table:
Table 4
Since most entries of a self-information vector are 0, only the non-zero entries are listed, in the form "feature: self-information value". The self-information vector of username a is:
Va = < as: 3.847, sd: 4.422, df: 5.359, fj: 6.544, jk: 5.996, letters only: 1.183, keyboard layout feature ①: 5.314 >
Next, username b is represented as a self-information vector. Its substring content features are as, s1, 10, 00, and 01; its letter-digit combination order feature is "letters+digits"; its numeric date feature is "month+day" (1001 can be read as October 1); and it has no keyboard layout feature. The self-information values of the features of b are shown in the following table:
Table 5
The self-information vector of username b is therefore:
Vb = < as: 3.847, s1: 5.281, 10: 2.813, 00: 2.616, 01: 2.955, letters+digits: 0.552, month+day: 3.449 >
The cosine similarity of Va and Vb is:
cos(Va, Vb) = (3.847*3.847) / [sqrt(3.847^2 + 4.422^2 + 5.359^2 + 6.544^2 + 5.996^2 + 1.183^2 + 5.314^2) * sqrt(3.847^2 + 5.281^2 + 2.813^2 + 2.616^2 + 2.955^2 + 0.552^2 + 3.449^2)] = 0.128
where sqrt denotes the square root.
Thus sim(a, b) = 0.128 < τ, and the method judges that usernames a = asdfjk and b = as1001 do not belong to the same user.
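The arithmetic of both examples can be re-checked with a few lines of Python. The self-information values are copied from the vectors listed in the examples; the dictionary keys for the pattern features (`LDL`, `L`, `LD`, `kbd1`, `month+day`) are shorthand chosen for this sketch.

```python
import math
from typing import Dict


def cosine(va: Dict[str, float], vb: Dict[str, float]) -> float:
    """Cosine similarity of two sparse vectors keyed by feature name."""
    dot = sum(va[f] * vb[f] for f in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(w * w for w in va.values()))
    norm_b = math.sqrt(sum(w * w for w in vb.values()))
    return dot / (norm_a * norm_b)


# Example 1: ye2dai vs ye8023dai
va1 = {"ye": 4.660, "e2": 5.607, "2d": 8.429, "da": 3.915, "ai": 3.179, "LDL": 3.490}
vb1 = {"ye": 4.660, "e8": 6.089, "80": 3.595, "02": 3.307, "23": 2.767,
       "3d": 8.052, "da": 3.915, "ai": 3.179, "LDL": 3.490}

# Example 2: asdfjk vs as1001
va2 = {"as": 3.847, "sd": 4.422, "df": 5.359, "fj": 6.544, "jk": 5.996,
       "L": 1.183, "kbd1": 5.314}
vb2 = {"as": 3.847, "s1": 5.281, "10": 2.813, "00": 2.616, "01": 2.955,
       "LD": 0.552, "month+day": 3.449}

sim1 = cosine(va1, vb1)
sim2 = cosine(va2, vb2)
print(round(sim1, 3), round(sim2, 3))  # 0.336 0.128
```

With τ = 0.15 the first pair falls above the threshold and the second below it, matching the conclusions of the two examples.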
As the above examples show, the method provided by the present invention uses only the username as user-related information to judge whether multiple usernames belong to the same user, and thereby achieves cross-website user association. Usernames are published by the websites and do not involve user privacy, so they are easier to obtain, and the method is not limited by the completeness of other user-related information; it is therefore more general and more practical. In addition, the method uses the concept of self-information to measure features of a username from several angles, such as content and pattern, on a common scale and fuses them in a single model, which can achieve higher accuracy than using only the content features of usernames.
Finally, it should be noted that although the present invention is disclosed above by way of embodiments, these embodiments are not intended to limit it. A person of ordinary skill in the art may modify or replace them without departing from the spirit and scope of the present invention, so the scope of protection of the present invention is defined by the claims.

Claims (8)

1. A cross-website user association method based on username similarity, the steps of which comprise:
1) filtering the characters of multiple usernames, retaining only English letters and digits;
2) extracting the features of the processed usernames, obtaining the self-information value of each feature, and building a self-information vector from these values;
3) obtaining the similarity between the usernames from the self-information vectors and, if the similarity is greater than a given threshold τ, judging that the usernames belong to the same user.
2. The method according to claim 1, characterized in that the retained English letters are uniformly converted to lower case or upper case.
3. The method according to claim 1, characterized in that the features include substring content features, letter-digit combination order features, numeric date features, and keyboard layout features.
4. The method according to claim 1, characterized in that the probability that a feature occurs is estimated over a given set of usernames, and the self-information value of the feature is obtained from this probability.
5. The method according to claim 1, characterized in that the dimension of the self-information vector equals the number of features.
6. The method according to claim 1, characterized in that the cosine similarity of the self-information vectors is computed, and the similarity between the usernames is obtained from this cosine similarity.
7. The method according to claim 1, characterized in that the threshold τ is the value of τ at which the F1 score is optimal on a given training set.
8. The method according to claim 1, characterized in that the threshold τ is 0.15.
CN201610479968.6A 2016-06-27 2016-06-27 A cross-website user association method based on username similarity Active CN106126654B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610479968.6A CN106126654B (en) 2016-06-27 2016-06-27 A cross-website user association method based on username similarity

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610479968.6A CN106126654B (en) 2016-06-27 2016-06-27 A cross-website user association method based on username similarity

Publications (2)

Publication Number Publication Date
CN106126654A true CN106126654A (en) 2016-11-16
CN106126654B CN106126654B (en) 2019-10-18

Family

ID=57266694

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610479968.6A Active CN106126654B (en) 2016-06-27 2016-06-27 A cross-website user association method based on username similarity

Country Status (1)

Country Link
CN (1) CN106126654B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107066616A (en) * 2017-05-09 2017-08-18 北京京东金融科技控股有限公司 Method, device and electronic equipment for account processing
CN107070702A (en) * 2017-03-13 2017-08-18 中国人民解放军信息工程大学 User account correlating method and its device based on cooperative game SVMs
CN107358075A (en) * 2017-07-07 2017-11-17 四川大学 A kind of fictitious users detection method based on hierarchical clustering
CN108846422A (en) * 2018-05-28 2018-11-20 中国人民公安大学 Account relating method and system across social networks
CN109087140A (en) * 2018-08-07 2018-12-25 广州航海学院 A kind of closed loop target client's recognition methods based on spark big data

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102768659A (en) * 2011-05-03 2012-11-07 阿里巴巴集团控股有限公司 Method and system for identifying repeated account
CN104239490A (en) * 2014-09-05 2014-12-24 电子科技大学 Multi-account detection method and device for UGC (user generated content) website platform
CN104317784A (en) * 2014-09-30 2015-01-28 苏州大学 Cross-platform user identification method and cross-platform user identification system
CN104765729A (en) * 2014-01-02 2015-07-08 中国人民大学 Cross-platform micro-blogging community account matching method
CN104899267A (en) * 2015-05-22 2015-09-09 中国电子科技集团公司第二十八研究所 Integrated data mining method for similarity of accounts on social network sites

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102768659A (en) * 2011-05-03 2012-11-07 阿里巴巴集团控股有限公司 Method and system for identifying repeated account
CN104765729A (en) * 2014-01-02 2015-07-08 中国人民大学 Cross-platform micro-blogging community account matching method
CN104239490A (en) * 2014-09-05 2014-12-24 电子科技大学 Multi-account detection method and device for UGC (user generated content) website platform
CN104317784A (en) * 2014-09-30 2015-01-28 苏州大学 Cross-platform user identification method and cross-platform user identification system
CN104899267A (en) * 2015-05-22 2015-09-09 中国电子科技集团公司第二十八研究所 Integrated data mining method for similarity of accounts on social network sites

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
刘东,等: "基于用户名特征的用户身份同一性判定方法", 《计算机学报》 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107070702A (en) * 2017-03-13 2017-08-18 中国人民解放军信息工程大学 User account correlating method and its device based on cooperative game SVMs
CN107070702B (en) * 2017-03-13 2019-12-10 中国人民解放军信息工程大学 User account correlation method and device based on cooperative game support vector machine
CN107066616A (en) * 2017-05-09 2017-08-18 北京京东金融科技控股有限公司 Method, device and electronic equipment for account processing
CN107358075A (en) * 2017-07-07 2017-11-17 四川大学 A kind of fictitious users detection method based on hierarchical clustering
CN108846422A (en) * 2018-05-28 2018-11-20 中国人民公安大学 Account relating method and system across social networks
CN108846422B (en) * 2018-05-28 2021-08-31 中国人民公安大学 Account number association method and system across social networks
CN109087140A (en) * 2018-08-07 2018-12-25 广州航海学院 A kind of closed loop target client's recognition methods based on spark big data

Also Published As

Publication number Publication date
CN106126654B (en) 2019-10-18

Similar Documents

Publication Publication Date Title
CN106126654A (en) A cross-website user association method based on username similarity
CN104899273B (en) A kind of Web Personalization method based on topic and relative entropy
CN103176982B (en) The method and system that a kind of e-book is recommended
CN107169873B (en) Multi-feature fusion microblog user authority evaluation method
CN105005594B (en) Abnormal microblog users recognition methods
KR101708508B1 (en) Method for calculating semantic similarities between messages and conversations based on enhanced entity extraction
CN104899267B (en) A kind of integrated data method for digging of social network sites account similarity
CN103886067B (en) Method for recommending books through label implied topic
CN104462547B (en) A kind of method and system of configurable collecting webpage data
CN106484764A (en) User's similarity calculating method based on crowd portrayal technology
CN103823893A (en) User comment-based product search method and system
CN106682152A (en) Recommendation method for personalized information
CN101706812B (en) Method and device for searching documents
CN102750375A (en) Service and tag recommendation method based on random walk
CN104834640A (en) Webpage identification method and apparatus
Ju et al. Relationship strength estimation based on Wechat Friends Circle
CN109905873A (en) A kind of network account correlating method based on signature identification information
CN101493818A (en) Network information searching method based on human relation network
CN104573081B (en) A kind of personal social relationships data digging method based on SNS
CN106202312A (en) A kind of interest point search method for mobile Internet and system
CN102750288B (en) A kind of internet content recommend method and device
CN103593360A (en) Internet information publishing time extraction method based on page analysis
CN102693284A (en) Extraction method of information in personal address list
CN104573076B (en) A kind of Chinese remark names system recommendation method of social network sites user
CN106528595B (en) Realm information based on website homepage content is collected and correlating method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant