CN106126654A - Cross-website user association method based on username similarity - Google Patents
Cross-website user association method based on username similarity
- Publication number
- CN106126654A (application CN201610479968.6A)
- Authority
- CN
- China
- Prior art keywords
- user
- feature
- user name
- self
- similarity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/01—Social networking
Abstract
The invention provides a cross-website user association method based on username similarity. The steps include: 1) filtering the characters in multiple usernames, retaining only English letters and digits; 2) extracting the features of the processed usernames, obtaining the self-information value of each feature, and building self-information vectors from these values; 3) computing the similarity between the usernames from the self-information vectors and, if the similarity is greater than a given threshold τ, determining that the usernames belong to the same user. The method judges from the similarity between usernames whether they belong to the same user, so that accounts on different websites that belong to the same user can be associated.
Description
Technical field
The invention relates to the computer field, and in particular to a cross-website user association method based on username similarity.
Background
A growing number of companies run their own websites to offer users network services such as information retrieval, resource downloads, and virtual social networking. To use these services, people usually have to register an account on each website and obtain a username that serves as their public identity. If the accounts that one user holds on different websites could be associated, the user experience of many web applications could be improved. For example, if a user's accounts on shopping websites without social features, such as Dangdang.com and Jingdong Mall, could be associated with the same user's accounts on social networking sites such as Sina Weibo and Renren, the user's social network structure on the social sites could be used to make the shopping sites' personalized recommendations more precise. Associating the accounts on different websites that belong to the same user is therefore significant and valuable.
Existing cross-website user association methods fall into three main classes:
1. The user actively fills in, on the website of one of their accounts or in a third-party application, links to their personal home pages on each website, which associates the accounts on different websites that belong to the same user.
2. Users are associated through the email address or mobile phone number supplied at registration: if two or more accounts on different websites were registered with the same email address or phone number, the accounts very likely belong to the same user.
3. A web crawler or a website's open API is first used to obtain some of the user's personal information (such as gender and age) and published content (such as microblog posts and forum posts); user-related features are then extracted and modeled, and the model is finally solved to find the accounts on different websites that belong to the same user.
The effectiveness of the first class of method, in which users actively fill in links to their personal home pages, depends on the completeness of the information supplied. If users are unwilling to fill in the links, omit some of them, or register new accounts on other websites afterwards, the information becomes incomplete and some accounts that belong to the same user cannot be associated. This class of method is therefore not very practical.
Protection of personal privacy on the Internet is receiving more and more attention, and the email address or mobile phone number supplied at registration is sensitive private information, so most websites do not disclose them. In other words, a user's email address and phone number cannot be obtained from most websites, so methods that associate users by email address or phone number are not very general.
Methods that extract and model features from users' personal information and published content depend on the authenticity and completeness of that information. Websites require different information at registration, and some users deliberately enter false values to protect their privacy. Either factor can make the user information untrue or incomplete and degrade the association results, so this class of method also has limitations.
Summary of the invention
In view of the above deficiencies, the invention provides a cross-website user association method based on username similarity. The method judges from the similarity between multiple usernames whether they belong to the same user, so that accounts on different websites that belong to the same user can be associated.
To solve the above technical problem, the invention adopts the following technical solution:
A cross-website user association method based on username similarity judges, from the usernames of accounts on any number of different websites, whether the accounts belong to the same user, and associates them if they do. The steps include:
1) filtering the characters in multiple usernames, retaining only English letters and digits;
2) extracting the features of the processed usernames, obtaining the self-information value of each feature, and building self-information vectors from these values;
3) computing the similarity between the usernames from the self-information vectors and, if the similarity is greater than a given threshold τ, determining that the usernames belong to the same user.
Further, the retained English letters are uniformly converted to lowercase or uppercase.
Further, the features include substring content features, letter/digit order features, numeric date features, and keyboard layout features.
Further, the probability that a feature occurs is estimated on a specified username set, and the self-information value of the feature is obtained from this probability.
Further, the dimension of the self-information vector equals the number of features.
Further, the cosine similarity of the self-information vectors is computed, and the similarity between the usernames is obtained from this cosine similarity.
Further, the threshold τ is the τ value at which the F1 score is optimal on a given training set.
Further, the threshold τ is 0.15.
The beneficial effect of the invention is that the method can determine whether multiple usernames belong to the same user and thereby associate users across websites. Compared with the prior art, the method uses only the username as user-related information. Usernames are public on websites and involve no private user data, are easier to obtain, and do not depend on the completeness of other user-related information, so the method is more general and more practical. In addition, the method uses the concept of self-information to measure username features from several angles, such as content and pattern, in a unified way and fuses them in one model, which achieves higher precision than using the content features of usernames alone.
Brief description of the drawings
Fig. 1 is a flow chart of the cross-website user association method based on username similarity in the embodiment.
Detailed description
To make the above features and advantages of the invention clearer, embodiments are described in detail below with reference to the accompanying drawing.
This embodiment provides a cross-website user association method based on username similarity, as shown in Fig. 1. Given the usernames a and b of two accounts on different websites, the method determines whether a and b belong to the same user through the following steps:
1. Username preprocessing
Websites usually require that the username chosen at registration contain only English letters, digits, and a few special characters such as the underscore. Preprocessing removes the special characters from the username, retains only English letters and digits, and uniformly converts the letters to lowercase or uppercase. This embodiment uses lowercase.
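The preprocessing step can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the function name `normalize_username` is hypothetical, and the sketch assumes that "English letters and digits" means the ASCII ranges A-Z, a-z, and 0-9.

```python
import re


def normalize_username(raw: str) -> str:
    """Drop every character that is not an ASCII letter or digit, then lowercase."""
    # Assumption: only ASCII letters and digits count as retained characters.
    return re.sub(r"[^A-Za-z0-9]", "", raw).lower()


print(normalize_username("Ye_2-Dai"))  # ye2dai
```

Lowercasing after filtering mirrors the embodiment, which uses the lowercase form.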
2. Computing the self-information of username features
For a username feature λ, the following indicator function indicates whether a username u contains feature λ:
$$\delta_\lambda(u) = \begin{cases} 1, & \text{if } u \text{ contains feature } \lambda \\ 0, & \text{otherwise} \end{cases}$$
A sufficiently large username set U is selected to estimate the probability that feature λ occurs:
$$P(\lambda) = \frac{\sum_{u \in U} \delta_\lambda(u)}{|U|}$$
That is, the probability that feature λ occurs equals the number of usernames in U that contain λ divided by the total number of usernames in U. In theory U should contain all usernames, but in practice it is not possible to collect them all. Instead, a set U containing part of the usernames is obtained in some way and used to estimate the probability of each feature. U is in effect a sample of all usernames, and the sample is used to estimate the whole. In general, the larger U is the better, because a larger U is closer to the whole population and reflects its regularities more faithfully.
From the estimated probability, the self-information value of each username feature can be computed:
$$I(\lambda) = -\log P(\lambda)$$
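The estimate described above can be sketched as follows, under stated assumptions: the helper name `self_information` and the toy username sample are hypothetical, the natural logarithm is an assumption (the text does not fix the base), and returning infinity for a feature that never occurs in the sample is a choice made only for this sketch.

```python
import math
from typing import Callable, Iterable


def self_information(has_feature: Callable[[str], bool],
                     usernames: Iterable[str]) -> float:
    """Estimate P(feature) on a username sample and return -log P."""
    names = list(usernames)
    count = sum(1 for u in names if has_feature(u))
    if count == 0:
        # Assumption: an unseen feature gets infinite self-information.
        return math.inf
    # Assumption: natural logarithm; the base is not stated in the text.
    return -math.log(count / len(names))


# Toy sample, invented for illustration only.
sample = ["ye2dai", "asdfjk", "as1001", "yeye"]
print(self_information(lambda u: "ye" in u, sample))  # -log(2/4)
```

Rare features receive large self-information values, so a shared rare substring contributes more to the similarity than a shared common one.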
The username features λ considered here include:
Substring content features: whether the username contains a substring of the form αβ, where α and β each denote any lowercase English letter or digit. Since αβ has 36 × 36 = 1296 possible combinations, this group comprises 1296 features.
Letter/digit order features: whether the username has one of the following combination orders: "letters only", "digits only", "letters+digits", "digits+letters", "letters+digits+letters", and "digits+letters+digits". Other orders are rare enough in real data to be ignored, so this group comprises 6 features.
Numeric date features: a string of digits in a username very commonly describes a date, so the method considers whether the username contains a date in a common format, including "year+month+day", "month+day+year", "day+month+year", "month+day", and "year". This group comprises 5 features.
Keyboard layout features: three features of the username related to keyboard layout are considered: ① all characters lie on the same keyboard row, excluding usernames that consist entirely of digits; ② every two adjacent characters are adjacent on the keyboard and lie in different rows; ③ every two adjacent characters occupy the same or adjacent positions on the keyboard. This group comprises 3 features.
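Three of the feature groups above can be sketched as follows. The sketch rests on assumptions: the function names and the `L`/`D` pattern codes are hypothetical, the keyboard is assumed to be a standard QWERTY layout, usernames are assumed to be already preprocessed to lowercase letters and digits, and the date features and keyboard features ② and ③ are left out because the text does not specify them in enough detail.

```python
import re
from typing import Optional, Set

# Assumption: standard QWERTY rows; the text does not name a layout.
KEYBOARD_ROWS = ("1234567890", "qwertyuiop", "asdfghjkl", "zxcvbnm")
ORDER_PATTERNS = {"L", "D", "LD", "DL", "LDL", "DLD"}  # the 6 order features


def bigram_features(name: str) -> Set[str]:
    """Substring content features: every two-character substring of the name."""
    return {name[i:i + 2] for i in range(len(name) - 1)}


def order_feature(name: str) -> Optional[str]:
    """Collapse letter runs to 'L' and digit runs to 'D'; None for rare orders."""
    pattern = re.sub(r"[a-z]+", "L", name)
    pattern = re.sub(r"[0-9]+", "D", pattern)
    return pattern if pattern in ORDER_PATTERNS else None


def same_row_feature(name: str) -> bool:
    """Keyboard feature 1: all characters on one row, all-digit names excluded."""
    if not name or name.isdigit():
        return False
    return any(all(c in row for c in name) for row in KEYBOARD_ROWS)


print(sorted(bigram_features("ye2dai")), order_feature("ye2dai"))
print(same_row_feature("asdfjk"))
```

For `ye2dai` this yields the bigrams ye, e2, 2d, da, ai and the order pattern `LDL`, and `asdfjk` satisfies the same-row feature.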
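3. Representing usernames as self-information vectors
Assume that m features λ_1, ..., λ_m were selected above. The two given usernames a and b are each represented as the corresponding self-information vector:
$$V_a = (v_{a,1}, \ldots, v_{a,m}), \qquad V_b = (v_{b,1}, \ldots, v_{b,m})$$
where the i-th component of a username's vector equals the self-information value of feature λ_i if the username contains that feature, and 0 otherwise.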
4. Computing the similarity between usernames
The cosine similarity of the self-information vectors of usernames a and b is computed and used as the similarity between a and b:
$$\mathrm{sim}(a, b) = \cos(V_a, V_b) = \frac{V_a \cdot V_b}{\lVert V_a \rVert \, \lVert V_b \rVert}$$
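Because most components of a self-information vector are zero, the cosine similarity can be computed on sparse vectors. The sketch below stores each vector as a dictionary from feature name to self-information value. This representation and the name `cosine_similarity` are assumptions for illustration, as is returning 0.0 for an all-zero vector.

```python
import math
from typing import Dict


def cosine_similarity(va: Dict[str, float], vb: Dict[str, float]) -> float:
    """Cosine of two sparse vectors stored as {feature: weight} dictionaries."""
    # Only features present in both vectors contribute to the dot product.
    dot = sum(w * vb[f] for f, w in va.items() if f in vb)
    norm_a = math.sqrt(sum(w * w for w in va.values()))
    norm_b = math.sqrt(sum(w * w for w in vb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: a username with no features matches nothing
    return dot / (norm_a * norm_b)


print(cosine_similarity({"ye": 4.0, "e2": 3.0}, {"ye": 4.0}))  # 0.8
```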
5. Determining whether the usernames belong to the same user
For a given threshold τ (0 < τ ≤ 1), if sim(a, b) > τ, the two accounts are considered to belong to the same user; otherwise they are considered to belong to different users.
The threshold τ can be obtained from an explicitly labeled training set: the value of τ at which the F1 score on the training set is optimal is taken as the threshold. An explicitly labeled training set is a set of triples <a, b, c>, where a is a username on one website, b is a username on another website, and c is 0 or 1; 1 means the two usernames belong to the same user and 0 means they do not. According to the experiments, setting τ to about 0.15 gives good results, as detailed below.
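Threshold selection on a labeled training set can be sketched as a simple search over candidate values. Everything in the sketch is an assumption for illustration: the function names, the candidate grid, and the toy scores, which are invented and are not experimental data. Each training triple <a, b, c> is assumed to have already been reduced to a pair (similarity, c).

```python
from typing import Iterable, List, Tuple

Scored = List[Tuple[float, int]]  # (sim(a, b), label c) for each triple


def f1_at(threshold: float, scored: Scored) -> float:
    """F1 score when pairs with similarity > threshold are predicted positive."""
    tp = sum(1 for s, c in scored if s > threshold and c == 1)
    fp = sum(1 for s, c in scored if s > threshold and c == 0)
    fn = sum(1 for s, c in scored if s <= threshold and c == 1)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(scored: Scored, candidates: Iterable[float]) -> float:
    """Return the candidate threshold with the highest F1 on the training pairs."""
    return max(candidates, key=lambda t: f1_at(t, scored))


# Invented toy scores, for illustration only.
toy = [(0.34, 1), (0.13, 0), (0.25, 1), (0.05, 0), (0.16, 0)]
print(best_threshold(toy, [0.1, 0.2, 0.3]))  # 0.2
```

A finer candidate grid gives a more precise threshold at the cost of more evaluations.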
User data were collected from three websites: 6302988 records from csdn.net, 2500264 records from 17173.com, and 3827603 records from 178.com, denoted the CSDN, 17173, and 178 account datasets respectively. The records include information such as the username and the registration email. The data were collected at random and were not preprocessed in any way, to keep the experimental results objective and accurate. Because the records contain the registration email, two accounts on different websites are considered to belong to the same user if their registration emails are identical and to different users otherwise. The experimental datasets were constructed on this basis.
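The labeling rule above can be sketched as follows. The function names, the dictionary layout (registration email mapped to username, one account per email per site), the fixed random seed, and the sample records are all assumptions made for illustration.

```python
import random
from typing import Dict, List, Tuple

Pair = Tuple[str, str, int]  # (username on site A, username on site B, label)


def positive_pairs(site_a: Dict[str, str], site_b: Dict[str, str]) -> List[Pair]:
    """Pairs of usernames registered with the same email on both sites."""
    shared = site_a.keys() & site_b.keys()
    return [(site_a[e], site_b[e], 1) for e in sorted(shared)]


def negative_pairs(site_a: Dict[str, str], site_b: Dict[str, str],
                   count: int, seed: int = 0) -> List[Pair]:
    """Randomly drawn username pairs whose registration emails differ."""
    emails_a, emails_b = sorted(site_a), sorted(site_b)
    if not emails_a or not emails_b or len(set(emails_a) | set(emails_b)) < 2:
        raise ValueError("no pair with different emails exists")
    rng = random.Random(seed)
    pairs: List[Pair] = []
    while len(pairs) < count:
        ea, eb = rng.choice(emails_a), rng.choice(emails_b)
        if ea != eb:  # different emails are treated as different users
            pairs.append((site_a[ea], site_b[eb], 0))
    return pairs


# Invented records, for illustration only.
a = {"x@example.com": "ye2dai", "y@example.com": "asdfjk"}
b = {"x@example.com": "ye8023dai", "z@example.com": "as1001"}
print(positive_pairs(a, b))  # [('ye2dai', 'ye8023dai', 1)]
```

Drawing as many negative pairs as there are positive pairs gives a balanced dataset, as in the experiments described here.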
From the CSDN and 17173 account datasets, 155878 username pairs with identical registration emails were found and used as positive examples, and 155878 username pairs with different registration emails were drawn at random from the two datasets as negative examples. This is denoted the CSDN+17173 experimental dataset. From the CSDN and 178 account datasets, 112603 username pairs with identical registration emails were found and used as positive examples, and 112603 username pairs with different registration emails were drawn at random from the two datasets as negative examples. This is denoted the CSDN+178 experimental dataset. From the 17173 and 178 account datasets, 145849 username pairs with identical registration emails were found and used as positive examples, and 145849 username pairs with different registration emails were drawn at random from the two datasets as negative examples. This is denoted the 17173+178 experimental dataset.
The experiments use precision, recall, and the F1 score to evaluate the method provided by the invention. The number of username pairs that are positive examples in the experimental dataset and that the method judges to belong to the same user is denoted TP. The number of positive examples that the method judges not to belong to the same user is denoted FN. The number of negative examples that the method judges to belong to the same user is denoted FP. The number of negative examples that the method judges not to belong to the same user is denoted TN. Precision, recall, and F1 are then computed as: precision = TP / (TP + FP), recall = TP / (TP + FN), F1 = (2 × precision × recall) / (precision + recall).
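The three formulas translate directly into code. In the sketch below the function name is hypothetical, the counts in the usage line are invented rather than taken from the experiments, and returning 0.0 when a denominator is zero is an assumption.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall, and F1 from the pair counts defined above."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Invented counts, for illustration only.
print(precision_recall_f1(tp=80, fp=20, fn=20))
```

TN does not enter any of the three formulas, so it is not a parameter.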
Ten-fold cross-validation was performed on the three experimental datasets. The experiments show that good results are obtained when the threshold τ is about 0.15. The results obtained with τ = 0.15 are shown in the following table:
Table 1
In Table 1, Name-Match is the baseline method: two usernames are considered to belong to the same user if they are identical, and to different users otherwise.
The detailed calculation is illustrated below with two examples:
Embodiment one: determining whether usernames a = ye2dai and b = ye8023dai belong to the same user
1657320 usernames are drawn at random from the datasets as the username set U, and the threshold τ is set to 0.15.
First, username a is represented as a self-information vector. The substring content features of username a are ye, e2, 2d, da, and ai; its letter/digit order feature is "letters+digits+letters"; it has no numeric date feature and no keyboard layout feature. The self-information values of the features of username a are shown in the following table:
Table 2
Because most components of the self-information vector are 0, only the non-zero components are listed, in the form "feature: self-information value". The self-information vector of username a is:
Va = <ye: 4.660, e2: 5.607, 2d: 8.429, da: 3.915, ai: 3.179, letters+digits+letters: 3.490>
Next, username b is represented as a self-information vector. The substring content features of username b are ye, e8, 80, 02, 23, 3d, da, and ai; its letter/digit order feature is "letters+digits+letters"; it has no numeric date feature and no keyboard layout feature. The self-information values of the features of username b are shown in the following table:
Table 3
The self-information vector of username b is therefore:
Vb = <ye: 4.660, e8: 6.089, 80: 3.595, 02: 3.307, 23: 2.767, 3d: 8.052, da: 3.915, ai: 3.179, letters+digits+letters: 3.490>
The cosine similarity of Va and Vb is:
$$\cos(V_a, V_b) = \frac{4.660^2 + 3.915^2 + 3.179^2 + 3.490^2}{\sqrt{4.660^2 + 5.607^2 + 8.429^2 + 3.915^2 + 3.179^2 + 3.490^2}\,\sqrt{4.660^2 + 6.089^2 + 3.595^2 + 3.307^2 + 2.767^2 + 8.052^2 + 3.915^2 + 3.179^2 + 3.490^2}} = 0.336$$
Since sim(a, b) = 0.336 > τ, the method concludes that usernames a = ye2dai and b = ye8023dai belong to the same user.
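The arithmetic of embodiment one can be reproduced with a few lines of Python. The self-information values are the ones listed in the text for ye2dai and ye8023dai; the dictionary keys (for example `order:LDL` for the "letters+digits+letters" feature) and the helper name `cosine` are labels chosen only for this sketch.

```python
import math

# Non-zero components of the two self-information vectors from embodiment one.
va = {"ye": 4.660, "e2": 5.607, "2d": 8.429, "da": 3.915, "ai": 3.179,
      "order:LDL": 3.490}
vb = {"ye": 4.660, "e8": 6.089, "80": 3.595, "02": 3.307, "23": 2.767,
      "3d": 8.052, "da": 3.915, "ai": 3.179, "order:LDL": 3.490}


def cosine(x, y):
    """Cosine similarity of two sparse vectors stored as dictionaries."""
    dot = sum(w * y[f] for f, w in x.items() if f in y)
    norm_x = math.sqrt(sum(w * w for w in x.values()))
    norm_y = math.sqrt(sum(w * w for w in y.values()))
    return dot / (norm_x * norm_y)


sim = cosine(va, vb)
print(round(sim, 3), sim > 0.15)  # 0.336 True
```

Only the four shared features (ye, da, ai, and the order feature) contribute to the numerator.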
Embodiment two: determining whether usernames a = asdfjk and b = as1001 belong to the same user
1657320 usernames are drawn at random from the datasets as the username set U, and the threshold τ is set to 0.15.
First, username a is represented as a self-information vector. The substring content features of username a are as, sd, df, fj, and jk; its letter/digit order feature is "letters only"; it has no numeric date feature; it satisfies keyboard layout feature ①. The self-information values of the features of username a are shown in the following table:
Table 4
Because most components of the self-information vector are 0, only the non-zero components are listed, in the form "feature: self-information value". The self-information vector of username a is:
Va = <as: 3.847, sd: 4.422, df: 5.359, fj: 6.544, jk: 5.996, letters only: 1.183, keyboard layout feature ①: 5.314>
Next, username b is represented as a self-information vector. The substring content features of username b are as, s1, 10, 00, and 01; its letter/digit order feature is "letters+digits"; its numeric date feature is "month+day" (1001 can be read as October 1); it has no keyboard layout feature. The self-information values of the features of username b are shown in the following table:
Table 5
The self-information vector of username b is therefore:
Vb = <as: 3.847, s1: 5.281, 10: 2.813, 00: 2.616, 01: 2.955, letters+digits: 0.552, month+day: 3.449>
The cosine similarity of Va and Vb is:
$$\cos(V_a, V_b) = \frac{3.847^2}{\sqrt{3.847^2 + 4.422^2 + 5.359^2 + 6.544^2 + 5.996^2 + 1.183^2 + 5.314^2}\,\sqrt{3.847^2 + 5.281^2 + 2.813^2 + 2.616^2 + 2.955^2 + 0.552^2 + 3.449^2}} = 0.128$$
Since sim(a, b) = 0.128 < τ, the method concludes that usernames a = asdfjk and b = as1001 do not belong to the same user.
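Embodiment two can be checked in the same way, this time making the decision rule sim(a, b) > τ explicit. The self-information values are those listed in the text for asdfjk and as1001; the prefixed dictionary keys and the helper names `cosine` and `same_user` are hypothetical and used only for this sketch.

```python
import math

TAU = 0.15  # threshold used in the embodiment

# Non-zero components of the two self-information vectors from embodiment two.
va = {"as": 3.847, "sd": 4.422, "df": 5.359, "fj": 6.544, "jk": 5.996,
      "order:L": 1.183, "kbd:1": 5.314}
vb = {"as": 3.847, "s1": 5.281, "10": 2.813, "00": 2.616, "01": 2.955,
      "order:LD": 0.552, "date:MD": 3.449}


def cosine(x, y):
    """Cosine similarity of two sparse vectors stored as dictionaries."""
    dot = sum(w * y[f] for f, w in x.items() if f in y)
    norm_x = math.sqrt(sum(w * w for w in x.values()))
    norm_y = math.sqrt(sum(w * w for w in y.values()))
    return dot / (norm_x * norm_y)


def same_user(x, y, tau=TAU):
    """Decision rule of step 5: same user only if the similarity exceeds tau."""
    return cosine(x, y) > tau


shared = set(va) & set(vb)
print(shared, round(cosine(va, vb), 3), same_user(va, vb))  # {'as'} 0.128 False
```

The two usernames share only the common bigram `as`, so the similarity stays below the threshold even though both start with the same letters.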
As the above embodiments show, the method provided by the invention uses only the username as user-related information to determine whether multiple usernames belong to the same user, and thereby associates users across websites. Usernames are public on websites and involve no private user data, are easier to obtain, and do not depend on the completeness of other user-related information, so the method is more general and more practical. In addition, the method uses the concept of self-information to measure username features from several angles, such as content and pattern, in a unified way and fuses them in one model, which achieves higher precision than using the content features of usernames alone.
Finally, although the invention is disclosed above by way of embodiments, these embodiments are not intended to limit the invention. Those of ordinary skill in the art may modify or replace them without departing from the spirit and scope of the invention, so the scope of protection of the invention is defined by the claims.
Claims (8)
1. A cross-website user association method based on username similarity, the steps comprising:
1) filtering the characters in multiple usernames, retaining only English letters and digits;
2) extracting the features of the processed usernames, obtaining the self-information value of each feature, and building self-information vectors from these values;
3) computing the similarity between the usernames from the self-information vectors and, if the similarity is greater than a given threshold τ, determining that the usernames belong to the same user.
2. The method according to claim 1, characterized in that the retained English letters are uniformly converted to lowercase or uppercase.
3. The method according to claim 1, characterized in that the features include substring content features, letter/digit order features, numeric date features, and keyboard layout features.
4. The method according to claim 1, characterized in that the probability that a feature occurs is estimated on a specified username set, and the self-information value of the feature is obtained from this probability.
5. The method according to claim 1, characterized in that the dimension of the self-information vector equals the number of features.
6. The method according to claim 1, characterized in that the cosine similarity of the self-information vectors is computed, and the similarity between the usernames is obtained from this cosine similarity.
7. The method according to claim 1, characterized in that the threshold τ is the τ value at which the F1 score is optimal on a given training set.
8. The method according to claim 1, characterized in that the threshold τ is 0.15.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610479968.6A CN106126654B (en) | 2016-06-27 | 2016-06-27 | Cross-website user association method based on username similarity |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610479968.6A CN106126654B (en) | 2016-06-27 | 2016-06-27 | Cross-website user association method based on username similarity |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106126654A true CN106126654A (en) | 2016-11-16 |
CN106126654B CN106126654B (en) | 2019-10-18 |
Family
ID=57266694
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610479968.6A Active CN106126654B (en) | 2016-06-27 | 2016-06-27 | Cross-website user association method based on username similarity |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106126654B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102768659A (en) * | 2011-05-03 | 2012-11-07 | 阿里巴巴集团控股有限公司 | Method and system for identifying repeated account |
CN104239490A (en) * | 2014-09-05 | 2014-12-24 | 电子科技大学 | Multi-account detection method and device for UGC (user generated content) website platform |
CN104317784A (en) * | 2014-09-30 | 2015-01-28 | 苏州大学 | Cross-platform user identification method and cross-platform user identification system |
CN104765729A (en) * | 2014-01-02 | 2015-07-08 | 中国人民大学 | Cross-platform micro-blogging community account matching method |
CN104899267A (en) * | 2015-05-22 | 2015-09-09 | 中国电子科技集团公司第二十八研究所 | Integrated data mining method for similarity of accounts on social network sites |
Non-Patent Citations (1)
Title |
---|
刘东,等: "基于用户名特征的用户身份同一性判定方法", 《计算机学报》 * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107070702A (en) * | 2017-03-13 | 2017-08-18 | 中国人民解放军信息工程大学 | User account correlating method and its device based on cooperative game SVMs |
CN107070702B (en) * | 2017-03-13 | 2019-12-10 | 中国人民解放军信息工程大学 | User account correlation method and device based on cooperative game support vector machine |
CN107066616A (en) * | 2017-05-09 | 2017-08-18 | 北京京东金融科技控股有限公司 | Method, device and electronic equipment for account processing |
CN107358075A (en) * | 2017-07-07 | 2017-11-17 | 四川大学 | A kind of fictitious users detection method based on hierarchical clustering |
CN108846422A (en) * | 2018-05-28 | 2018-11-20 | 中国人民公安大学 | Account relating method and system across social networks |
CN108846422B (en) * | 2018-05-28 | 2021-08-31 | 中国人民公安大学 | Account number association method and system across social networks |
CN109087140A (en) * | 2018-08-07 | 2018-12-25 | 广州航海学院 | A kind of closed loop target client's recognition methods based on spark big data |
Also Published As
Publication number | Publication date |
---|---|
CN106126654B (en) | 2019-10-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106126654A (en) | Cross-website user association method based on username similarity | |
CN104899273B (en) | A kind of Web Personalization method based on topic and relative entropy | |
CN102929959B (en) | A kind of book recommendation method based on user behavior | |
CN107169873B (en) | Multi-feature fusion microblog user authority evaluation method | |
CN105005594B (en) | Abnormal microblog users recognition methods | |
CN102254038B (en) | System and method for analyzing network comment relevance | |
CN103218400B (en) | Based on link and network community user group's division methods of content of text | |
CN103176982A (en) | Recommending method and recommending system of electronic book | |
CN106033415A (en) | A text content recommendation method and device | |
CN104239331A (en) | Method and device for ranking comment search engines | |
KR20110115543A (en) | Method for calculating entity similarities | |
CN103077190A (en) | Hot event ranking method based on order learning technology | |
CN106682152A (en) | Recommendation method for personalized information | |
CN105630884A (en) | Geographic position discovery method for microblog hot event | |
Ju et al. | Relationship strength estimation based on Wechat Friends Circle | |
CN106302849A (en) | A kind of method carrying out moving solid fusion by carrier data | |
CN109905873A (en) | A kind of network account correlating method based on signature identification information | |
CN101493818A (en) | Network information searching method based on human relation network | |
CN106227866A (en) | A kind of hybrid filtering film based on data mining recommends method | |
CN106202312A (en) | A kind of interest point search method for mobile Internet and system | |
CN102750288B (en) | A kind of internet content recommend method and device | |
CN104111926B (en) | The generation method and device of the concern recommendation list of address list | |
CN103593360A (en) | Internet information publishing time extraction method based on page analysis | |
Feitosa et al. | Social recommendation in location-based social network using text mining | |
CN102693284A (en) | Extraction method of information in personal address list |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||