CN108268762A - Mobile social network user identity identification method based on behavior modeling - Google Patents

Mobile social network user identity identification method based on behavior modeling

Info

Publication number
CN108268762A
Authority
CN
China
Prior art keywords
user
behavior
value
identity
true
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810043919.7A
Other languages
Chinese (zh)
Other versions
CN108268762B (en)
Inventor
王成
洛婧
杨波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tongji University
Original Assignee
Tongji University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tongji University
Priority to CN201810043919.7A
Publication of CN108268762A
Application granted
Publication of CN108268762B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/31 User authentication
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Economics (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • Human Resources & Organizations (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention studies a method, based on behavior modeling, for detecting fraudulent user identities in mobile social networks. The method judges whether a user's behavior is consistent with that user's history and uses this judgment to assess the legitimacy of the user. Drawing on the rich information provided by current mobile social networks, a large amount of user social data is obtained, including location, movement patterns, social relationships, user-generated content, and shopping records. Taking this multi-source data as the object of study, the features of the information are extracted. Methods such as mixture kernel density estimation and LDA text modeling are used to model user behavior in three dimensions: (1) geographic location, (2) user-generated text, and (3) social relationships. The interception rate, disturbance rate, and precision of detection are computed through simulated identity-theft experiments in order to evaluate system performance. The method addresses the shortcomings of traditional identity authentication systems and offers a new approach and analysis method for security problems in the information age.

Description

Mobile social network user identity identification method based on behavior modeling
Technical field
The present invention relates to a method for detecting fraudulent user identities in mobile social networks.
Background art
As Internet applications have become widespread, information security problems have grown more prominent: attackers may attempt to steal user accounts, steal private personal information, or even maliciously attack servers. Traditional identification methods are easily compromised under increasingly severe network attacks, so a system that relies only on a simple username and password cannot meet practical needs.
Now that intelligent Internet services are the development trend, users' sensitive identity information is collected and used extensively, and ensuring the security of that information has become an urgent problem. By contrast, using a user's behavioral features as the basis for identification offers uniqueness, non-reproducibility, and non-repudiation. Provided the method is properly designed, it achieves high accuracy, does not depend on auxiliary equipment, and does not interfere with the user's normal use; authentication can be implemented as a plug-in, and identity theft can be monitored accurately and in real time. Identity fraud detection based on behavioral anomalies offers a new perspective on mobile Internet information security. Using the massive information generated in mobile social networks and considering the features of user behavior across the physical, network, and social spaces: kernel density estimation can model the user's check-in locations, yielding a distribution model of the user's daily geographic locations from the check-in records; text modeling techniques widely used in data mining can extract user interest features, which serve as behavioral data that identify the user; and the user's social relationships can be used to build a social preference model.
A major challenge in modeling the behavior of mobile social network users is data sparsity. Owing to various constraints, the records collected for any specific user behavior are often sparse, which is a key factor limiting the accuracy of user behavior modeling. We intend to fill the spatio-temporal gaps in the behavioral data by exploiting the complementary effect among user behavior projections (the user's behavior in different behavior subspaces), and thereby complete the user behavior model.
To this end, the present invention studies a behavior-modeling-based method for detecting fraudulent user identities in mobile social networks, which judges the consistency of user behavior and thereby the legitimacy of the user.
Summary of the invention
Drawing on the rich information provided by current mobile social networks, a large amount of user social data can be obtained, including location, movement patterns, social relationships, user-generated content, and shopping records. Taking this multi-source data as the object of study, the features of the information are extracted. Methods such as mixture kernel density estimation and LDA text modeling are used to model user behavior in three dimensions: (1) geographic location, (2) user-generated text, and (3) social relationships. The interception rate, disturbance rate, and precision of detection are computed through simulated identity-theft experiments and used to evaluate system performance. The result is an anomaly-based method for detecting fraudulent identities from mobile social network user behavior, which addresses the shortcomings of traditional identity authentication systems and offers a new approach and analysis method for security problems in the information age.
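The three evaluation measures named above can be sketched as follows. The text does not define them, so the definitions used here are assumptions: interception rate is taken as the share of simulated thefts that are flagged, disturbance rate as the share of legitimate users who are wrongly flagged, and precision as the share of flagged cases that are real thefts. The function name and arguments are hypothetical.

```python
def detection_metrics(is_thief, flagged):
    """Return (interception_rate, disturbance_rate, precision).

    is_thief[i] is True when case i is a simulated identity theft;
    flagged[i] is True when the detector judged case i to be fraudulent.
    The three definitions are assumptions, not taken from the patent text.
    """
    pairs = list(zip(is_thief, flagged))
    tp = sum(1 for t, f in pairs if t and f)          # thefts caught
    fn = sum(1 for t, f in pairs if t and not f)      # thefts missed
    fp = sum(1 for t, f in pairs if not t and f)      # genuine users disturbed
    tn = sum(1 for t, f in pairs if not t and not f)  # genuine users left alone
    interception = tp / (tp + fn) if tp + fn else 0.0
    disturbance = fp / (fp + tn) if fp + tn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return interception, disturbance, precision
```

A good detector under these assumed definitions has a high interception rate and precision together with a low disturbance rate.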
The invention aims to overcome the deficiencies of the prior art: by analyzing whether user behavior is consistent with the expected model within a threshold, it determines whether a user account in a mobile social network is abnormal.
To this end, the technical solution provided is:
A user identity fraud detection algorithm based on behavior modeling, characterized in that the process is as follows:
Input: the check-in information, text information, and social relationships generated by the user
Output: the truth value of the user identity
(1) From the user's check-in geographic location information, the text content the user posts in the online behavior space, and the user's social relationships within the whole social network, compute the feature distributions of the user's historical data in the three dimensions. Perform step (2).
(2) Evaluate the user's newly generated behavior records in the three dimensions: (a) compute the log mean of the probability density values of the new check-in data; (b) compute the topic probability distribution of the newly generated text; (c) compute the interest distribution of the newly established social relationships. Perform step (3).
(3) In the check-in dimension, compare the computed log mean of the probability density with the threshold S_c, obtained by training, that distinguishes genuine from fraudulent identities by geographic location. If the value is greater than the threshold, return U_C = true; otherwise return U_C = false. Perform step (4).
(4) In the text dimension, compute the Jensen-Shannon (JS) divergence between the topic probability distribution of the user's historical data and the topic probability distribution of the user's newly generated text. Compare the value with the threshold D_T on the JS divergence between the two distributions, obtained by training. If the value is less than the threshold, return U_T = true; otherwise return U_T = false. Perform step (5).
(5) In the social relationship dimension, extract from the user's social relationships the text in the friends' information that expresses interests, and likewise compute the topic probability distribution of the historical data of the user's friends, the topic probability distribution of the user's newly generated text, and their JS divergence. Compare the value with the threshold D_F. If the value is less than the threshold, return U_F = true; otherwise return U_F = false. Perform step (6).
(6) From the judgments U_C, U_T, U_F of the user identity truth value returned by steps (3), (4), and (5), take the union of the three dimensions' judgments to obtain the final judgment of the user identity, U_I = g(U_C, U_T, U_F).
First, based on the behavior patterns of location-based social network users, normal behavior models are built in multiple dimensions from valid data sets, features are selected, abnormal-behavior thresholds are determined, and their accuracy and validity are verified experimentally. Then, building on this earlier work on feature extraction and normal behavior modeling, offline behavior, online behavior, and social behavior are combined to analyze the user's features in cyberspace; taking into account the correlations within the composite behavior space, an identity fraud detection method and system is designed that exploits the complementary effect of multiple dimensions.
Experiments show that the method outperforms previous work in both accuracy and computation time.
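The text says only that the abnormal-behavior thresholds are determined by training. A minimal sketch of one way to do this, assuming labelled scores from genuine and simulated-fraud sessions are available and assuming balanced accuracy as the selection criterion (both are assumptions, as is the function name):

```python
def train_threshold(genuine_scores, fraud_scores, higher_is_genuine=True):
    """Pick a threshold separating genuine from fraudulent scores.

    Use higher_is_genuine=True for the check-in score (accept when the
    score is greater than the threshold) and False for a JS divergence
    (accept when the divergence is less than the threshold).
    Both score lists must be non-empty. Returns (threshold, balanced_accuracy).
    The selection criterion is an assumption, not the patent's procedure.
    """
    sign = 1.0 if higher_is_genuine else -1.0
    g = [sign * s for s in genuine_scores]
    f = [sign * s for s in fraud_scores]
    best_t, best_acc = None, -1.0
    for t in sorted(set(g + f)):
        accepted_genuine = sum(1 for s in g if s > t) / len(g)
        rejected_fraud = sum(1 for s in f if s <= t) / len(f)
        acc = (accepted_genuine + rejected_fraud) / 2.0
        if acc > best_acc:
            best_t, best_acc = t, acc
    return sign * best_t, best_acc
```

Calling it once per dimension would yield candidate values for S_c, D_T, and D_F; any other criterion, such as a cap on the disturbance rate, could be substituted.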
Brief description of the drawings
Fig. 1 is a system structure diagram of the mobile Internet user identity fraud detection method based on behavior modeling according to the invention.
Fig. 2 is a flowchart of the algorithm of the invention.
Detailed description
(Embodiment)
The system structure of the mobile Internet user identity fraud detection method based on behavior modeling is shown in Fig. 1. The scheme is divided into three stages:
The first stage builds an offline user model from historical data. It collects and processes the user's historical data and generates the distribution of the user's behavior features in each dimension (a routine technique in the field).
The second stage collects behavior records online and generates the current features. It collects and processes the user's current data and computes the distribution of the behavior features in each dimension (a routine technique in the field).
The third stage is identity fraud detection. From the data features delivered by stages one and two, it gives a judgment on the user identity through multi-dimensional fusion.
Implementation steps of the first stage:
Step 1-1: Preprocess the user data and screen for valid users in each dimension. The validity conditions for the three dimensions are: the user's history must contain at least 5 check-in records; the text content the user posts in the online behavior space must be sufficient to train the topic probability distribution of the user's historical data, that is, the user must have at least 200 valid words after stop-word removal; and the user must have at least ten friends in the whole network. On this basis, the text content of the user's friends is recorded to obtain the interest distribution of the friend circle in the user's historical data.
Step 1-2: From the historical check-in data of the user and the user's friends, obtain the probability density function f(x) of the check-in location distribution by kernel density estimation (already prior art in the field).
Step 1-3: For each user, accumulate the historical text data into one document; the text data of all users then forms a large corpus. Build a topic model of the users' historical text data with the LDA document-topic generative model. The topic probability distribution of each document is the topic probability distribution of the corresponding user.
Step 1-4: For the user's social relationship data, obtain the text generated by the user's friends according to the social relationships, and likewise obtain the interest distribution of the user's friend circle.
Steps 1-2, 1-3, and 1-4 above are carried out in parallel.
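Steps 1-1 and 1-2 can be illustrated with a short sketch. The three validity limits come from step 1-1; everything else is an assumption made for illustration: the function names, the isotropic Gaussian kernel, the fixed bandwidth, and the treatment of coordinates as planar. The source mentions mixture kernel density estimation, which this plain estimator does not reproduce.

```python
import math

# Validity limits stated in step 1-1.
MIN_CHECKINS, MIN_VALID_WORDS, MIN_FRIENDS = 5, 200, 10


def valid_dimensions(n_checkins, n_valid_words, n_friends):
    """Report, per dimension, whether a user has enough history to be modeled."""
    return {
        "checkin": n_checkins >= MIN_CHECKINS,
        "text": n_valid_words >= MIN_VALID_WORDS,
        "social": n_friends >= MIN_FRIENDS,
    }


def make_kde(points, bandwidth):
    """Return f(x), a 2-D Gaussian kernel density estimate over check-ins.

    points is a list of (x, y) pairs; bandwidth is a hypothetical fixed
    smoothing parameter in the same units as the coordinates.
    """
    if not points:
        raise ValueError("at least one historical check-in is required")
    norm = 1.0 / (len(points) * 2.0 * math.pi * bandwidth ** 2)

    def f(x):
        total = 0.0
        for px, py in points:
            d2 = (x[0] - px) ** 2 + (x[1] - py) ** 2
            total += math.exp(-d2 / (2.0 * bandwidth ** 2))
        return norm * total

    return f
```

The returned function plays the role of f(x) in step 2-1: locations near the user's usual check-ins receive a higher density than distant ones.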
Implementation steps of the second stage:
Step 2-1: For the user's new check-in data, substitute each point into f(x) to obtain its probability density, and compute the log mean S = (1/n) Σ log f(x).
Step 2-2: For the user's newly generated text data, the topic probability distribution of the new text is computed by formula from n(k) and α_k, where n(k) is the number of words in the text that belong to topic k and α_k is a model parameter obtained from LDA training.
Step 2-3: For the friends newly made by the user, extract the text data generated by the new friends and likewise obtain the interest distribution of the new friends.
Steps 2-1, 2-2, and 2-3 above are carried out in parallel.
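The two computations in steps 2-1 and 2-2 can be sketched as below. The log mean follows the formula S = (1/n) Σ log f(x) given in step 2-1. The formula for step 2-2 is not legible in this text, so the estimator used here, θ_k = (n(k) + α_k) / Σ_j (n(j) + α_j), is an assumption: a standard smoothed estimate that is consistent with the stated definitions of n(k) and α_k. The function names and the density floor are also hypothetical.

```python
import math


def log_mean_density(f, new_points, floor=1e-300):
    """Compute S = (1/n) * sum(log f(x)) over the new check-ins.

    floor is a hypothetical guard that avoids log(0) when a new
    check-in falls where the estimated density underflows to zero.
    """
    if not new_points:
        raise ValueError("at least one new check-in is required")
    logs = [math.log(max(f(x), floor)) for x in new_points]
    return sum(logs) / len(logs)


def topic_distribution(topic_counts, alpha):
    """Assumed estimator: theta_k = (n(k) + alpha_k) / sum_j (n(j) + alpha_j).

    topic_counts[k] is n(k), the number of words assigned to topic k;
    alpha[k] is the LDA prior parameter for topic k (each must be > 0).
    """
    if len(topic_counts) != len(alpha):
        raise ValueError("topic_counts and alpha must have the same length")
    total = sum(topic_counts) + sum(alpha)
    return [(n + a) / total for n, a in zip(topic_counts, alpha)]
```

Because every α_k is positive, the resulting distribution has no zero entries, which keeps later divergence computations well defined.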
Implementation steps of the third stage:
Step 3-1: Obtain by training the threshold S_c on the log mean of the probability density that distinguishes genuine from fraudulent identities by geographic location; obtain by training the thresholds D_T and D_F on the Jensen-Shannon divergence between two probability distributions that distinguish genuine from fraudulent identities by text information.
Step 3-2: For the topic probability distribution of the user's historical text data and the topic probability distribution of the user's newly generated text, compute the JS divergence of the two distributions.
Step 3-3: For the interest probability distribution of the user's historical social relationship data and the interest probability distribution of the user's new friends, compute the JS divergence of the two distributions.
Step 3-4: For all mobile Internet users, using the feature values computed in steps 3-1, 3-2, and 3-3, give the judgments U_C, U_T, U_F of the user identity from the three dimensions of geographic location, text information, and social relationships according to the corresponding criteria S_c, D_T, and D_F, and merge the three judgments to obtain the final judgment of the user identity, U_I = g(U_C, U_T, U_F).
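The third stage can be sketched as a JS divergence routine plus the three threshold tests. The comparison directions follow the algorithm steps (accept when S is greater than S_c, accept when a divergence is less than its threshold). The fusion function g is not specified in the text beyond taking the union of the three judgments; the default used here, a logical OR of the three truth values, is one possible reading and is an assumption, so g is left as a parameter. All function names are hypothetical.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence with natural log; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    m = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def judge_identity(s_new, d_text, d_friend, s_c, d_t, d_f, g=None):
    """Return (U_C, U_T, U_F, U_I) for one user.

    s_new is the log-mean density of new check-ins; d_text and d_friend
    are the JS divergences in the text and social dimensions.
    g fuses the three judgments; the default (logical OR) is an assumption.
    """
    u_c = s_new > s_c
    u_t = d_text < d_t
    u_f = d_friend < d_f
    fuse = g if g is not None else (lambda c, t, f: c or t or f)
    return u_c, u_t, u_f, fuse(u_c, u_t, u_f)
```

A stricter fusion can be supplied without changing the rest, for example `g=lambda c, t, f: c and t and f` to require agreement of all three dimensions.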
Algorithm
User identity fraud detection algorithm based on behavior modeling (the detailed flow is shown in Fig. 2)
Input: the check-in information, text information, and social relationships generated by the user
Output: the truth value of the user identity
(1) From the user's check-in geographic location information, the text content the user posts in the online behavior space, and the user's social relationships within the whole social network, compute the feature distributions of the user's historical data in the three dimensions. Perform step (2).
(2) Evaluate the user's newly generated behavior records in the three dimensions: (a) compute the log mean of the probability density values of the new check-in data; (b) compute the topic probability distribution of the newly generated text; (c) compute the interest distribution of the newly established social relationships. Perform step (3).
(3) In the check-in dimension, compare the computed log mean of the probability density with the threshold S_c, obtained by training, that distinguishes genuine from fraudulent identities by geographic location. If the value is greater than the threshold, return U_C = true; otherwise return U_C = false. Perform step (4).
(4) In the text dimension, compute the Jensen-Shannon (JS) divergence between the topic probability distribution of the user's historical data and the topic probability distribution of the user's newly generated text. Compare the value with the threshold D_T on the JS divergence between the two distributions, obtained by training. If the value is less than the threshold, return U_T = true; otherwise return U_T = false. Perform step (5).
(5) In the social relationship dimension, extract from the user's social relationships the text in the friends' information that expresses interests, and likewise compute the topic probability distribution of the historical data of the user's friends, the topic probability distribution of the user's newly generated text, and their JS divergence. Compare the value with the threshold D_F. If the value is less than the threshold, return U_F = true; otherwise return U_F = false. Perform step (6).
(6) From the judgments U_C, U_T, U_F of the user identity truth value returned by steps (3), (4), and (5), take the union of the three dimensions' judgments to obtain the final judgment of the user identity, U_I = g(U_C, U_T, U_F).
Experiments show that the method outperforms previous work in both accuracy and computation time.
Innovations of the invention
1. A model of the user's normal behavior is built from the user's historical behavior data.
2. The complementary effect of behavior in multiple dimensions is used to obtain a more accurate identity fraud detection method.
3. Unlike previous identity fraud detection approaches, the method does not depend on hardware devices; it uses the user's own behavioral features as a marker of the user's identity, with high confidence.
Note: for the related terms used in the present invention and the main prior techniques, see the following references.
[1] Bao J, Zheng Y, Wilkie D, et al. A survey on recommendations in location-based social networks[J]. ACM Transaction on Intelligent Systems and Technology, 2013.
[2] David M. Blei, Andrew Y. Ng, Michael I. Jordan. Latent Dirichlet Allocation[J]. Journal of Machine Learning Research, 2003: 993-1022.
[3] Wang X, McCallum A, Wei X. Topical N-Grams: Phrase and Topic Discovery, with an Application to Information Retrieval[C]//Data Mining, 2007. ICDM 2007. Seventh IEEE International Conference on. 2007: 697-702.
[4] Jie Bao, Yu Zheng, Mohamed F. Mokbel. Location-based and preference-aware recommendation using sparse geo-social networking data[J]//International Conference on Advances in Geographic Information Systems. ACM, 2012.
[5] Lichman M, Smyth P. Modeling human location data with mixtures of kernel densities[C]//Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2014: 35-44.
[6] Cho E, Myers S A, Leskovec J. Friendship and mobility: user movement in location-based social networks[C]//Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2011: 1082-1090.
[7] Yang Song, Zheng Hu, Xiaoming Leng. Friendship influence on mobile behavior of location based social network users[J]. Journal of Communications and Networks, 2015.

Claims (1)

1. A user identity fraud detection algorithm based on behavior modeling, characterized in that the process is as follows:
Input: the check-in information, text information, and social relationships generated by the user
Output: the truth value of the user identity
(1) From the user's check-in geographic location information, the text content the user posts in the online behavior space, and the user's social relationships within the whole social network, compute the feature distributions of the user's historical data in the three dimensions. Perform step (2).
(2) Evaluate the user's newly generated behavior records in the three dimensions: (a) compute the log mean of the probability density values of the new check-in data; (b) compute the topic probability distribution of the newly generated text; (c) compute the interest distribution of the newly established social relationships. Perform step (3).
(3) In the check-in dimension, compare the computed log mean of the probability density with the threshold S_c, obtained by training, that distinguishes genuine from fraudulent identities by geographic location. If the value is greater than the threshold, return U_C = true; otherwise return U_C = false. Perform step (4).
(4) In the text dimension, compute the Jensen-Shannon (JS) divergence between the topic probability distribution of the user's historical data and the topic probability distribution of the user's newly generated text. Compare the value with the threshold D_T on the JS divergence between the two distributions, obtained by training. If the value is less than the threshold, return U_T = true; otherwise return U_T = false. Perform step (5).
(5) In the social relationship dimension, extract from the user's social relationships the text in the friends' information that expresses interests, and likewise compute the topic probability distribution of the historical data of the user's friends, the topic probability distribution of the user's newly generated text, and their JS divergence. Compare the value with the threshold D_F. If the value is less than the threshold, return U_F = true; otherwise return U_F = false. Perform step (6).
(6) From the judgments U_C, U_T, U_F of the user identity truth value returned by steps (3), (4), and (5), take the union of the three dimensions' judgments to obtain the final judgment of the user identity, U_I = g(U_C, U_T, U_F).
CN201810043919.7A 2018-01-17 2018-01-17 Mobile social network user identity identification method based on behavior modeling Active CN108268762B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810043919.7A CN108268762B (en) 2018-01-17 2018-01-17 Mobile social network user identity identification method based on behavior modeling

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810043919.7A CN108268762B (en) 2018-01-17 2018-01-17 Mobile social network user identity identification method based on behavior modeling

Publications (2)

Publication Number Publication Date
CN108268762A true CN108268762A (en) 2018-07-10
CN108268762B CN108268762B (en) 2021-04-30

Family

ID=62775799

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810043919.7A Active CN108268762B (en) 2018-01-17 2018-01-17 Mobile social network user identity identification method based on behavior modeling

Country Status (1)

Country Link
CN (1) CN108268762B (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150264079A1 (en) * 2013-03-06 2015-09-17 Facebook, Inc. Detection of lockstep behavior
CN103853841A (en) * 2014-03-19 2014-06-11 北京邮电大学 Method for analyzing abnormal behavior of user in social networking site
CN106202488A (en) * 2016-07-19 2016-12-07 西北工业大学 Estimation user is to the method for physical event distance
CN106600052A (en) * 2016-12-12 2017-04-26 西安交通大学 User attribute and social network detection system based on space-time locus

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
郭昊: "基于位置社交网络的用户行为建模与研究", 《中国优秀硕士学位论文全文数据库(电子期刊)》 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109598110A (en) * 2018-12-10 2019-04-09 北京羽扇智信息科技有限公司 A kind of recognition methods of user identity and device
CN111475738A (en) * 2020-05-22 2020-07-31 哈尔滨工程大学 Heterogeneous social network location anchor link identification method based on meta-path
CN111475738B (en) * 2020-05-22 2022-05-17 哈尔滨工程大学 Heterogeneous social network location anchor link identification method based on meta-path
CN116260715A (en) * 2023-05-09 2023-06-13 广东卓柏信息科技有限公司 Account safety early warning method, device, medium and computing equipment based on big data
CN116260715B (en) * 2023-05-09 2023-09-01 国品优选(北京)品牌管理有限公司 Account safety early warning method, device, medium and computing equipment based on big data

Also Published As

Publication number Publication date
CN108268762B (en) 2021-04-30

Similar Documents

Publication Publication Date Title
Dahal et al. Topic modeling and sentiment analysis of global climate change tweets
Zannettou et al. On the origins of memes by means of fringe web communities
Hou et al. Survey on data analysis in social media: A practical application aspect
Steiger et al. Twitter as an indicator for whereabouts of people? Correlating Twitter with UK census data
Nakano et al. Analysis of cyber aggression and cyber-bullying in social networking
CN103324636B (en) The system and method for commending friends in social networks
CN103970752B (en) Independent access person's quantity survey (surveying) method and system
CN106027577A (en) Exception access behavior detection method and device
Wang et al. Confidence-aware truth estimation in social sensing applications
Korayem et al. De-anonymizing users across heterogeneous social computing platforms
CN106874253A (en) Recognize the method and device of sensitive information
CN108268762A (en) The mobile social networking user identity of Behavior-based control modeling knows fake method
CN104166726A (en) Microblog text stream oriented sudden keyword detecting method
CN110134876A (en) A kind of cyberspace Mass disturbance perception and detection method based on gunz sensor
Sun et al. Efficient event detection in social media data streams
Shi et al. Rumor detection of COVID-19 pandemic on online social networks
Wang et al. Fusing behavioral projection models for identity theft detection in online social networks
Elyusufi et al. Social networks fake profiles detection based on account setting and activity
Murakami et al. Privacy-preserving multiple tensor factorization for synthesizing large-scale location traces with cluster-specific features
Qian et al. Social network de-anonymization: More adversarial knowledge, more users re-identified?
WO2022028131A1 (en) Data processing model acquisition method and apparatus based on privacy protection, terminal device, and storage medium
Sun et al. Anomaly subgraph detection with feature transfer
Ajesh et al. A hybrid method for fake profile detection in social networkusing artificial intelligence
Wang et al. Co-location social networks: Linking the physical world and cyberspace
CN116383786B (en) Big data information supervision system and method based on Internet of things

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant