CN103853841A - Method for analyzing abnormal behavior of user in social networking site - Google Patents

Publication number
CN103853841A
Authority
CN
China
Prior art keywords
message
user
abnormal
data
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410101728.3A
Other languages
Chinese (zh)
Inventor
闫丹凤
吴海莉
徐佳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications
Priority to CN201410101728.3A
Publication of CN103853841A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55 Detecting local intrusion or implementing counter-measures
    • G06F21/552 Detecting local intrusion or implementing counter-measures involving long-term monitoring or reporting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking

Abstract

The invention discloses a method for analyzing abnormal user behavior in a social networking site. The method can be used to analyze abnormal events in a social networking site such as advertising from stolen accounts, link spamming, flooding, and defrauding of social networking friends. The method comprises the following steps: acquiring user behavior data with web crawler technology; analyzing and checking the data with user behavior analysis technology; and raising an alarm when an abnormality is detected. Three functional units, namely a data acquisition unit, an analysis and detection unit, and an abnormality alarm unit, together carry out the method. The data acquisition unit acquires the user behavior data with web crawler technology; the analysis and detection unit analyzes and checks the acquired user behavior data with user behavior analysis technology; the abnormality alarm unit sends an alarm message when an abnormality is detected. With this method, the abnormal events that are widespread in social networking sites can be detected conveniently, flexibly, and intelligently, and a social networking site provider can find malicious users in time, so that the losses of Internet users are reduced.

Description

A method for analyzing abnormal behavior of social network users
Technical field
The present invention relates to a method for analyzing abnormal behavior of social network users. It is used to detect abnormal user behavior in social networking sites, such as publishing malicious links, junk advertisements, and fraudulent messages, and belongs to the field of network security detection technology.
Background
According to CNNIC statistics, the number of microblog users in China reached 536 million in 2013, and the number of Renren users exceeded 280 million. The mass of users, the indispensable entity in any social network, drives the development of commerce and of society. As online social networking flourishes, information resources of all kinds are continually exchanged and propagated in the course of social interaction. Because this information may include not only users' private information but also the trade secrets of companies, its value is increasingly recognized. With the surge of social applications such as Weibo and Renren, security problems arising from social networks are becoming more and more prominent; for example, the number of phishing frauds carried out through social networks has risen sharply in recent years.
The trust and recognition that exist between friends in a social network are the starting point for criminals' malicious activity, and they are also the root of the security problems in social networks. Criminals steal user accounts and use them for illegal activities such as stealing user information, luring advertisement clicks, and soliciting fraudulent loans. Reports published by many security firms in recent years show that about one quarter of phishing activities such as loan fraud and fake prize draws spread through social networks, and these firms also predict that comprehensive governance of social network security will become a new problem for network security.
Summary of the invention
In view of this, the object of the present invention is to propose an anomalous event detection method aimed at events in which a normal social network account is stolen and then used to publish malicious messages such as fraud, phishing, and spam. The method crawls user behavior data with web crawler technology, performs behavior modeling and analysis and detection based on user behavior analysis technology and mathematical modeling, and sends an SMS alarm when an abnormal account is detected. It can provide the social network provider with a list of abnormal users, thereby greatly reducing the harm that online fraud, phishing, and spam do to Internet users. As a part of Web security detection, the method also has reference value and guiding significance for research on security problems in the Web environment.
The proposed detection method uses web crawler technology and Web parsing technology to obtain the message data that users publish on the social network, then performs user behavior analysis on this data to detect abnormal users and raise alarms. The method can detect the anomalous events found on a target social networking site (Renren, Weibo, and so on), including advertising from stolen accounts, publishing malicious links, flooding, and defrauding social network friends of money. The invention consists mainly of three functional units: the data acquisition unit, the analysis and detection unit, and the abnormality alarm unit.
The functional characteristics of the data acquisition unit are as follows:
The unit obtains operating permission on the target social network and crawls user message data (published statuses, logs, photos, shares, comments, and similar information) with web crawler technology. The crawled data is parsed, classified by user, and stored in files; these files are the input of the analysis and detection unit.
The unit consists of four subunits: user login, data crawling, data parsing, and data output.
The functional characteristics of the user login subunit are as follows:
A singleton Connector class is created that uses DefaultHttpClient, HttpGet, and HttpPost. HttpGet fetches the Renren entry URL. The Renren login URL is set in HttpPost together with the basic information of the login user (user name, password, Renren domain name, and so on; these parameters can be read from the configuration unit). The login() method is then executed. If the post-login page is reached, the login has succeeded, and the user's credential information is saved as a Cookie for use in subsequent crawls.
The functional characteristics of the data crawling subunit are as follows:
The subunit implements the ICrawler and IParser interfaces, where the IParser interface inherits from HtmlParser. It mainly comprises the CrawlFeeds, CrawlTimelineFeed, FilterOpenUser, and FeedController classes. Strictly speaking, FeedController does not belong to the data crawling unit, because it controls both data crawling and data output storage. After the user logs in, FilterOpenUser starts from the logged-in user's node and obtains all relevant URLs of each user to be crawled. If a user to be crawled is a friend of the logged-in user, that user's data can be crawled directly; if not, some information can be viewed only after the user has been added as a friend. In this way the list of all viewable userIds is obtained. FeedController then takes the userId list produced by FilterOpenUser as input and calls CrawlFeeds or CrawlTimelineFeed to crawl. Crawling is incremental and timer-driven: data is fetched at a fixed time interval, and the interval is set through the configuration unit. Crawling is performed separately for each userId.
The functional characteristics of the data parsing subunit are as follows:
The subunit parses the crawled pages. All data fetched by the crawling subunit is classified by userId and then by type (status, log, shared link, and so on), and information such as the publishing time and the content of each item is extracted; extracting the content also requires parsing the HTML text of the message. The subunit consists mainly of the FeedFilter and HtmlParser classes. HtmlParser is a mature library: a Java HTML parsing library that does not depend on other Java libraries, is mainly used for transforming and extracting HTML, and parses HTML quickly and accurately. The subunit uses HtmlParser to extract the text content of messages. HtmlParser represents the information in an HTML document through Node, AbstractNode, and Tag. In the program, a NodeFilter object is defined to filter the HTML tags that carry the text, which makes it easy to locate the message text.
The functional characteristics of the data output subunit are as follows:
The data obtained by the crawler is written to a file named after the userId. Each item stored in the file has the following format: data ID, data type, content, content language, publishing time.
The functional characteristics of the analysis and detection unit are as follows:
The unit takes the results of the data acquisition unit as input and preprocesses them. The detection method defines 7 user behavior features and builds a model for each. All of a user's historical data is modeled with these 7 feature models to obtain the user's behavior profile. Data published after the last time point of the historical data is first classified according to the 7 behavior features; an anomaly score is then obtained for each feature, and finally a total anomaly score is computed from the 7 scores to judge whether the user is abnormal.
The detection method used by this unit covers four aspects: user behavior modeling, similarity analysis of user messages, calculation of a message's anomaly score, and final detection of anomalous events.
The functional characteristics of user behavior modeling are as follows:
A user behavior profile is derived from the user's historical behavior on the social network and can be used to anticipate the user's normal behavior in the future. Building a user's behavior profile, that is, user behavior modeling, requires the stream of messages the user has published on the social networking site, and this message stream is exactly what the data acquisition unit produces. The results of the data acquisition unit can therefore be used to build the behavior profile.
Given the characteristics of social networks and the needs of detection, the unit defines 7 features for every message and trains one statistical model per feature. Each model reflects one aspect of a message. Once all messages of a user have been analyzed, the user's feature values in these 7 aspects are known, and it becomes possible to anticipate what messages sent by this user should look like. The 7 feature models of a message are described in detail below.
1. Time at which the message is sent (hour of day). This feature model captures the times of day at which an account is active. Many users are inactive during fixed periods of the day, for example at mealtimes or while asleep. The publishing times in a user's message stream show which periods are inactive, and a message published during an inactive period is considered anomalous.
2. Message source, that is, the application used to publish the message. Most social networking sites offer their users access through the traditional web and the mobile web, as well as through applications for mobile platforms such as iOS and Android. Many social networks also offer a variety of applications created independently by third-party developers. By default, a third-party application cannot post messages to a user's account. If the user chooses to, however, he can grant this privilege to the application, which then allows the third-party application to access the user's personal information without the user's credentials. According to related evaluations, third-party applications are in fact often used to send malicious messages.
This model determines whether the user has used a particular application in the past or, conversely, whether this is the first time a message has been sent through that application. Whenever a user publishes a message with a new application, the change may indicate that an attacker has successfully lured the victim into authorizing a malicious application to access his account.
3. Message text (language). Users are free to publish messages in any language. In practice, however, each user publishes in only a small number of languages (usually one or two). Because this feature (the message language) is relatively stable, a sudden change of language indicates that the user's behavior is suspicious.
The language of a message is determined with the libtextcat library, an open-source library that implements an n-gram-based text categorization algorithm.
4. Message topic. The messages users publish often contain a great deal of chatter or mundane information. Many users, however, have a set of topics they talk about frequently, such as a favorite sports team, band, or TV program. When a user's messages usually concentrate on a few topics and the user suddenly publishes a message on a different, unrelated topic, the new message should be rated as anomalous.
In general it is difficult to infer the topic of a message from a short text fragment without context. Social networking platforms, however, allow users to tag messages and so state explicitly which topic a message is about. Where such tags are present, they are a valuable source of information. A well-known example of this tagging mechanism is the topic tag of Renren and Weibo, which is usually written as "##" with the topic placed between the two "#" characters.
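Extracting the topic tags described above can be sketched with a regular expression. The pattern and the `extract_topics` helper are assumptions for illustration; the text states only that the topic sits between two "#" characters.

```python
import re

# A topic tag is written as "#topic#": any run of non-"#" characters
# enclosed by a pair of "#" marks.
TOPIC_PATTERN = re.compile(r"#([^#]+)#")


def extract_topics(text: str) -> list:
    """Return the topics tagged in a message, in order of appearance."""
    return [topic.strip() for topic in TOPIC_PATTERN.findall(text)]
```

A message without any tag yields an empty list, which corresponds to the case in which the topic feature has no value for that message.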
5. Links in the message. Messages published on social networking sites commonly contain links to other resources such as blogs, pictures, videos, or news articles. Links have been widespread in messages ever since social networks appeared, so earlier security research on social networks concentrated on analyzing URLs and treated the URL as the only factor in deciding whether a message is malicious. The present method also uses the URLs in messages as part of the user behavior profile, but only as one feature model among several. Moreover, the behavior feature models are built mainly to capture a user's normal activity. In other words, the detection method does not try to decide whether a URL is itself malicious; it checks whether this user would normally send such a URL.
To characterize the links that appear in messages, the method uses only the domain name of each URL. The reason is that users often refer to content under the same domain name. For example, many users regularly read particular news sites and blogs and often link to interesting articles there. Malicious links, by contrast, point to illegitimate sites. A link whose domain name has not appeared before therefore represents a change. The behavior model also takes into account how frequently messages contain links and how consistently the user links to particular sites.
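The domain-only comparison can be sketched as below. The helper names `link_domain` and `has_unseen_domain` are hypothetical, and the sketch covers only the "domain never seen before" check, not the link frequency or consistency aspects that the text also mentions.

```python
from urllib.parse import urlsplit


def link_domain(url: str):
    """Return the lower-cased host name of a URL, or None if it has none."""
    return urlsplit(url).hostname


def has_unseen_domain(message_urls, known_domains) -> bool:
    """True if any link in the message points to a domain the user never linked to."""
    domains = {link_domain(u) for u in message_urls}
    domains.discard(None)  # ignore strings that are not absolute URLs
    return bool(domains - set(known_domains))
```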
6. Interaction between users. Social networks provide mechanisms for individual users to interact directly with one another. The most common is sending a message directly to a recipient, although different social networks offer different mechanisms. Over time, a user builds up a history of interactions with other users on the social network. This feature captures that interaction history; in effect, it tracks all the accounts with which the user has interacted. Because the purpose of sending a message is to gain the recipient's attention, this kind of direct interaction between users is often used to send spam.
7. Geographic proximity. In many cases, a user's friends on a social network are people who are also close to the user in real life. For example, a Renren user will have many friends who live in the same city, attend the same school, or work at the same company. If such a user suddenly starts contacting people who live on another continent, this may be suspicious. This feature captures whether a message is local or non-local.
Every message of a user is modeled with the 7 feature models above, and the models are then trained and evaluated.
The functional characteristics of model training are as follows:
The input of model training is the series of messages (the message stream) crawled by the data acquisition unit. For each message, the 7 features above are extracted, for example the source application of the message and the links it contains.
Each feature model is represented by a set M. Each element of M is a tuple <fv, c>, where fv is a feature value (for example, English for the language model, or example.com for the link model) and c is the number of messages in which the value fv occurs. In addition, each model stores the total number N of training messages.
The models fall into two classes:
(1) Mandatory models are those in which every message has exactly one feature value and this value is always present. The mandatory models are the message sending time, the message source, geographic proximity, and the message language.
(2) Optional models are those in which a message does not necessarily have a value. Unlike in a mandatory model, a single message can also have several values. The optional models are links, interaction between users, and topics. For example, a message may contain zero, one, or several links. For each optional model, a tuple with fv = null is reserved, and its c value records the number of messages that have no value for the feature (for example, the number of messages without links).
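The two kinds of model can be sketched with a counter per feature. This is an illustrative sketch under assumptions: `train_mandatory` and `train_optional` are hypothetical names, `None` stands in for the reserved null value, and a value that occurs several times in one message is counted once, following the definition of c as a number of messages.

```python
from collections import Counter


def train_mandatory(values):
    """values: exactly one feature value per training message. Returns (M, N)."""
    values = list(values)
    return Counter(values), len(values)


def train_optional(value_lists):
    """value_lists: for each training message, a list of zero or more values.

    Returns (M, N); M[None] is the reserved null entry that counts the
    messages with no value for this feature.
    """
    model = Counter()
    n = 0
    for values in value_lists:
        n += 1
        if not values:
            model[None] += 1
        else:
            model.update(set(values))  # count each value once per message
    return model, n
```

For example, `train_mandatory(["English"] * 12 + ["Chinese"] * 9)` produces the counts 12 and 9 with N = 21.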
Training of the message sending time model is slightly different. Following the description above, the system first extracts the hour at which each message was sent. It then stores, for each hour fv, the number of messages published in that hour. This raises a problem: the hours are discrete rather than continuous, so a message sent shortly before or after the user's usual active hours could be wrongly considered anomalous.
To avoid this problem, an adjustment step is applied after the time model has been trained. For each hour i, the two adjacent hours are also taken into account. That is, for each tuple <i, $c_i$> in M, a new value $c'_i$ is computed as the average of $c_{i-1}$, $c_i$, and $c_{i+1}$, where $c_i$ is the number of messages published in hour i, $c_{i-1}$ is the number sent in the preceding hour, and $c_{i+1}$ is the number sent in the following hour. Once $c'_i$ has been computed, it replaces $c_i$ in the tuple <i, $c_i$>.
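The adjustment step amounts to a three-hour moving average over the hourly counts. The sketch below assumes that the day wraps around, so that hour 23 and hour 0 are neighbors; the text does not say how the first and last hours are handled, and `smooth_hour_counts` is a hypothetical name.

```python
def smooth_hour_counts(counts):
    """counts: 24 message counts indexed by hour of day (0 to 23).

    Returns the adjusted counts c'_i = (c_{i-1} + c_i + c_{i+1}) / 3,
    all computed from the original counts.
    """
    if len(counts) != 24:
        raise ValueError("expected one count per hour of the day")
    return [
        (counts[(i - 1) % 24] + counts[i] + counts[(i + 1) % 24]) / 3
        for i in range(24)
    ]
```

With three messages at 09:00 and none elsewhere, the adjusted counts for 08:00, 09:00, and 10:00 are each 1, so a message posted just before or after the usual hour no longer looks entirely new.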
The functional characteristics of model evaluation are as follows:
Model evaluation computes the anomaly scores of the 7 behavior feature models and then combines these 7 values into a single value, the anomaly score of the message.
Calculation of the anomaly scores of the 7 feature models:
In general, a message is anomalous when its feature value for a mandatory model does not appear in the user's message stream, or when the number of occurrences of that feature value does not match the tuples in M.
For a mandatory model, the anomaly score of a message is calculated as follows:
1. First, the value fv of the feature model under analysis is extracted from the message. If M contains a tuple whose first element is fv, that tuple is extracted from M. If M contains no such tuple, the message is anomalous and the procedure returns an anomaly score of 1.
2. Second, the user's behavior profile is used to decide whether fv is anomalous. The count c is compared with $\bar{M}$, which is given by the formula:
$$\bar{M} = \frac{\sum_{i=1}^{\|M\|} c_i}{N}$$
where $c_i$ is the second element c of each tuple <fv, c> in M. If c is greater than or equal to $\bar{M}$, the message is considered to conform to the user's behavior profile and an anomaly score of 0 is returned. The reason is that this feature value appeared in many of the user's past messages, so it matches the user's habits and is normal behavior.
If c is less than $\bar{M}$, the message is considered anomalous. The system computes the relative frequency f of fv according to the formula
$$f = \frac{c_{fv}}{N}$$
and returns the anomaly score 1 - f.
For an optional model, the anomaly score of a message is calculated as follows:
First, the value fv of the feature model under analysis is extracted from the message. If M contains a tuple whose first element is fv, the message is judged to conform to the user's behavior profile and the system returns an anomaly score of 0.
If M contains no tuple whose first element is fv, the message is judged anomalous. In this case the anomaly score is defined as the probability p that this feature is null for this user in this model. Intuitively, if a user almost never uses a feature on the social network but a message nevertheless carries a value for that feature model, the message is highly anomalous. The probability p is calculated by the following formula:
$$p = \frac{c_{null}}{N}$$
If M contains no tuple whose first element is null, $c_{null}$ is 0. The value p is the anomaly score.
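The optional-model rule translates almost directly into code. In this sketch the model is a plain dictionary from feature value to message count, `None` plays the role of null, and `optional_anomaly_score` is a hypothetical name.

```python
def optional_anomaly_score(model, n, fv):
    """Anomaly score of feature value fv under an optional model.

    model: dict mapping a feature value (or None for null) to a message count.
    n: total number of training messages N.
    """
    if fv in model:
        return 0.0                   # value seen before: conforms to the profile
    return model.get(None, 0) / n    # p = c_null / N, with c_null = 0 if absent
```

For a user whose link model is `{None: 18, "example.com": 2}` with N = 20, a message linking to a new domain scores 18 / 20 = 0.9, reflecting that this user rarely posts links at all.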
As an example, consider detection with the language model. A particular user has published 21 messages, 12 in English and 9 in Chinese. The set M of this user's language model is (<English, 12>, <Chinese, 9>).
The next message this user publishes falls into one of three cases:
The new message is in English. The tuple <English, 12> is first extracted from M, and $\bar{M}$ is computed by formula (4-1). Then c = 12 is compared with $\bar{M}$; because c is greater than $\bar{M}$, the message is normal and an anomaly score of 0 is returned.
The new message is in French. Because the user has never published a message in French before, the message is suspicious and an anomaly score of 1 is returned.
The new message is in Chinese. The tuple <Chinese, 9> is first extracted from M, and $\bar{M}$ is computed by formula (4-1). Then c = 9 is compared with $\bar{M}$; because c is less than $\bar{M}$, the message is anomalous. The relative frequency of Chinese for this user is $f = c_{fv}/N = 9/21$, so the anomaly score $1 - f \approx 0.57$ is returned, which marks the message as anomalous. This value alone is not enough to show that the message is malicious, however, because it may be a single unusual message rather than one of a large number of similar messages.
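The mandatory-model rule and the three cases of the example can be sketched as follows. One point is an assumption and is marked as such in the code: with the threshold formula read literally (dividing by N), the counts of a mandatory model sum to N and the threshold would always be 1, which would not produce the outcome described for the Chinese message. The sketch therefore takes the threshold to be the average count per distinct feature value, here 21 / 2 = 10.5, the reading under which all three cases come out as stated. `mandatory_anomaly_score` is a hypothetical name.

```python
def mandatory_anomaly_score(model, n, fv):
    """Anomaly score of feature value fv under a mandatory model.

    model: dict mapping a feature value to its message count c.
    n: total number of training messages N.
    """
    if fv not in model:
        return 1.0                          # value never seen before
    c = model[fv]
    # ASSUMPTION: threshold = mean count per distinct feature value.
    m_bar = sum(model.values()) / len(model)
    if c >= m_bar:
        return 0.0                          # common value: conforms to the profile
    return 1.0 - c / n                      # rare value: 1 - f, with f = c / N


language_model = {"English": 12, "Chinese": 9}
english_score = mandatory_anomaly_score(language_model, 21, "English")
french_score = mandatory_anomaly_score(language_model, 21, "French")
chinese_score = mandatory_anomaly_score(language_model, 21, "Chinese")
```

Here `english_score` is 0, `french_score` is 1, and `chinese_score` is 1 - 9/21, which is 12/21.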
Calculation of the final anomaly score:
Once an anomaly score has been calculated for each model as described above, the scores must be combined into a single value that serves as the anomaly score of the message. This value is obtained as a weighted sum. The optimal weight of each feature model is obtained by running the SMO algorithm (Sequential Minimal Optimization, proposed in 1998 by John C. Platt of Microsoft Research as a fast method for solving the quadratic programming problem) on a training set of users and user messages labeled as normal or abnormal. Different social networks naturally require different weights for the 7 feature models. If the anomaly score of a message exceeds a threshold (described below), the message violates the user's behavior profile and is anomalous.
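The weighted sum itself is simple; the substance lies in the learned weights, which this sketch does not attempt to reproduce. The function name and the dictionary layout are assumptions, and any weights passed in are placeholders for values that would come from training.

```python
def message_anomaly_score(scores, weights):
    """Combine per-model anomaly scores into one score by weighted sum.

    scores, weights: dicts keyed by model name (for example "time",
    "source", "language"); both must cover the same set of models.
    """
    if scores.keys() != weights.keys():
        raise ValueError("scores and weights must cover the same models")
    return sum(weights[name] * scores[name] for name in scores)
```

With hypothetical weights `{"time": 0.2, "source": 0.5, "language": 0.3}` and scores `{"time": 0.0, "source": 1.0, "language": 0.5}`, the combined score is 0.5 + 0.15 = 0.65.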
The functional characteristics of the similarity analysis of user messages are as follows:
After the user's behavior data has been modeled with the 7 features and the anomaly score of every message has been calculated, all of the user's messages must also be grouped by similarity, because messages such as phishing and fraud rely on being propagated in large numbers. When only a single message is judged anomalous, the corresponding account is therefore not yet considered abnormal; further similar messages must be observed, and only when the number of similar messages reaches a certain level are the accounts that sent them judged to be abnormal accounts.
Content similarity is calculated in two ways: (1) similarity of the text content; (2) similarity of the URLs contained.
Text content similarity: messages with similar content are considered related and are placed in the same group. The grouping is implemented with an n-gram algorithm, with n = 4 in this method: if two messages share at least one 4-gram of words, the two messages are similar. The larger n is, the more precise the similarity calculation. In practice, however, n = 3 or n = 4 is generally used, because for larger n the computation and storage costs become so high that the method is essentially unusable.
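The shared 4-gram test can be sketched as below. Splitting on whitespace is an assumption that suits space-delimited text; Chinese messages would first need a word segmenter, which is outside the scope of this sketch. The function names are hypothetical.

```python
def word_ngrams(text, n=4):
    """Return the set of word n-grams (as tuples) in a message."""
    words = text.split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def similar_by_text(a, b, n=4):
    """True if the two messages share at least one word n-gram."""
    return not word_ngrams(a, n).isdisjoint(word_ngrams(b, n))
```

A message shorter than n words has no n-grams and is therefore never grouped with another message by this test.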
URL similarity: two messages are similar when they contain at least one identical URL. On social networking sites, the URLs in many spam messages contain a query string. The comparison is therefore made on the URLs with the parameter part removed. Many social networking sites also shorten the URLs in user messages with a shortening service of their own, so different short URLs can point to the same page. When a URL is a short URL, it must therefore be expanded to obtain the URL of the final page before matching.
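Removing the parameter part before matching can be sketched with the standard URL parser. Expanding short URLs requires following HTTP redirects over the network and is deliberately left out; the sketch assumes the URLs it receives have already been expanded. The function names are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url):
    """Drop the query string and fragment, keeping scheme, host and path."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path, "", ""))


def similar_by_url(urls_a, urls_b):
    """True if the two messages share at least one URL after stripping parameters."""
    stripped_a = {strip_query(u) for u in urls_a}
    stripped_b = {strip_query(u) for u in urls_b}
    return not stripped_a.isdisjoint(stripped_b)
```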
The functional characteristics of anomaly detection are as follows:
Anomaly detection combines two aspects: the anomaly score of each message, and the grouping of user messages produced by similarity analysis, which ensures that the messages occur in large numbers. The detection rule is: if a group contains messages whose anomaly scores against the user behavior models exceed a certain threshold, the group is judged to be an anomalous message group, and the accounts corresponding to all messages in it are abnormal accounts. The threshold is calculated as:
th(n)=max(0.1,kn+d)
where n is the size of the group. Experiments showed that the results are most accurate with k = -0.005 and d = 0.82.
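The threshold function is small enough to state directly. The default values are the experimentally chosen k and d quoted above; the function name is hypothetical, and n is passed in as whatever group measure the deployment uses.

```python
def threshold(n, k=-0.005, d=0.82):
    """th(n) = max(0.1, k * n + d)."""
    return max(0.1, k * n + d)
```

With these constants the threshold starts at 0.82 for n = 0, falls linearly as n grows (0.77 at n = 10), and stays at the floor of 0.1 from n = 144 onward, since -0.005 × 144 + 0.82 = 0.1.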
The functional characteristics of the alarm unit are as follows:
The unit provides two services, alarming and alarm querying, and comprises an SMS sending subunit and an alarm query subunit. The alarm query function requires a database; MySQL is used.
The SMS sending subunit supports three sending modes, Curl, Thrift, and Json, and uses Java multithreading to handle concurrency. To avoid security problems when sending SMS messages, authentication is required before a message is sent.
Authentication is carried out through an HTTP header.
The following parameters are provided:
1. username: the user name;
2. Timestamp: the user's current timestamp, in the format yyyy-MM-dd hh:ss:mm;
3. Nonce: a random string of letters and digits that prevents replay; it must be different in every call;
4. password: the key that corresponds uniquely to the user name;
5. Signature: the signature, obtained by concatenating username + Timestamp + Nonce + password into one string and processing it with a digest algorithm, for example MD5.
The user passes the string Username + \t + Timestamp + \t + Nonce + \t + md5{Signature} in the HTTP header "Authentication".
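Building the header value can be sketched with the standard library. Several details are assumptions rather than statements from the text: the fields are joined with tab characters, the last field is the hexadecimal MD5 digest of the concatenated string, strings are encoded as UTF-8, and the nonce is a random hexadecimal string. `make_nonce` and `build_auth_header` are hypothetical names.

```python
import hashlib
import secrets


def make_nonce(length=16):
    """Random string of hexadecimal letters and digits, fresh on every call."""
    return secrets.token_hex(length // 2)


def build_auth_header(username, password, timestamp, nonce):
    """Return the value to send in the "Authentication" HTTP header."""
    raw = username + timestamp + nonce + password
    signature = hashlib.md5(raw.encode("utf-8")).hexdigest()
    # The password itself is never sent; only its contribution to the digest is.
    return "\t".join([username, timestamp, nonce, signature])
```

The receiving side can recompute the digest from the three transmitted fields and its stored key, and reject a request whose nonce it has already seen.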
The inquiry subelement of reporting to the police need to design database table, this unit provides by transmitting time, sends the function of the inquiry such as object warning messages for the user of social network accident detection instrument simultaneously, can be by inputting unexpected message type in plain frame, receive mobile phone and note transmitting time is according to condition inquired about searching, also can entirely inquire about, page listings part is shown the relevant abnormal user Id of All Alerts message, unexpected message type, unexpected message issuing time, short message state, note transmitting time, receives mobile phone etc.
Brief description of the drawings
Fig. 1 is a model diagram of the method for analyzing abnormal behavior of social network users according to the present invention.
Fig. 2 is a flowchart of the specific operations of the data acquisition unit.
Fig. 3 is a flowchart of the specific operations of the analysis and detection unit.
Fig. 4 is a flowchart of the specific operations of the abnormal alarm unit.
Detailed description
To make the objectives, technical solutions and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings.
Referring to Fig. 1, the method is composed of three functional units: a data acquisition unit, an analysis and detection unit, and an abnormal alarm unit, wherein:
The data acquisition unit extracts behavior information, user by user, from the massive and complex data of the social network. It first obtains login authorization from the target social networking site and then uses web crawler technology to obtain, starting from the logged-in node, the subset of all users that it has permission to view. For this subset, all user data can be crawled from timeline data. The crawled result set is then analyzed by userId to obtain all news-feed items of the corresponding user, from which the statuses, logs, links and other data published and shared by that userId can be extracted. These data then undergo HTML text parsing and language parsing, and the parsed result is output as a file named after the userId. The file content includes the data ID, publication time, data type, content, language, whether a link is included, the link address, etc.
The analysis and detection unit builds user behavior models from the user data obtained by the data acquisition unit, trains and evaluates them, then classifies each user's behavior data by content-based similarity, and finally performs anomaly detection according to a specific algorithm.
The alarm unit raises an alarm when an abnormal user is detected and provides SMS sending and alarm query functions.
Referring to Fig. 2, the specific operation steps are as follows:
1. Log in with a user name and password registered at the target social networking site;
2. Starting from the logged-in user node, obtain all relevant URLs of each user to be crawled. If the user to be crawled is a friend of the logged-in user, the data can be crawled directly; if not, some information can only be viewed after the user has been added as a friend. The subset of all users to be crawled is obtained in this way;
3. At a certain timer frequency, crawl users and their corresponding URLs incrementally, returning to step 2;
4. Start multiple threads to crawl the timeline data of all users in the user subset. First determine whether access to the user is permitted: if not, the crawling thread for this user ends; if so, continue with the subsequent steps;
5. Parse the pages at the URLs that contain user data, obtaining information such as the message type, message content, publication time and publication source;
6. Repeat steps 3 to 5 until all timeline message data of all users have been crawled;
7. Output each user's message data as a file in a unified data format (data ID, data type, content, content language, publication time, etc.).
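The output of step 7 could be represented as one record per message, written to a file named after the user. The sketch below is only one possible layout: the `MessageRecord` field names, the JSON Lines encoding and the `.jsonl` extension are assumptions, since the text lists the fields but does not fix a file format.

```python
import json
import os
from dataclasses import dataclass, asdict


@dataclass
class MessageRecord:
    data_id: str
    data_type: str        # e.g. "status", "log" or "link"
    content: str
    language: str
    publish_time: str     # e.g. "2014-03-19 10:30:00"
    has_link: bool = False
    link_url: str = ""


def write_user_file(out_dir, user_id, records):
    """Write all records of one user to a file named after the userId."""
    path = os.path.join(out_dir, f"{user_id}.jsonl")
    with open(path, "w", encoding="utf-8") as fh:
        for rec in records:
            # One JSON object per line; ensure_ascii=False keeps non-ASCII text readable.
            fh.write(json.dumps(asdict(rec), ensure_ascii=False) + "\n")
    return path
```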
Referring to Fig. 3, the specific operation steps are as follows:
1. Take the output of the data acquisition unit as the input data of this unit;
2. Group all message data of each user by the time period in which they were published;
3. Model every message in each group according to the 7 feature models, compute the anomaly score of each feature model, and combine them to obtain the final anomaly score of each message;
4. Perform content-based similarity analysis on the messages in each group;
5. If the anomaly score obtained in step 3 is greater than the threshold and the message similarity found in step 4 is also high, the user is an abnormal user.
Referring to Fig. 4, the specific operation steps are as follows:
1. When the analysis and detection unit detects an abnormal user, an SMS sending event is triggered;
2. Because sending SMS messages costs money, security verification through the HTTP header is performed before a message is sent;
3. Verify that the username and password in the HTTP header match and belong to a valid user, and at the same time verify by signature verification whether the user has permission to send SMS messages;
4. If security verification succeeds, send the SMS message asynchronously to the receiving user and notify the sender; if it fails, inform the sender that the user has no permission, that the password is incorrect, or that the user does not exist;
5. SMS messages are sent using multithreading in asynchronous mode.
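The verification and asynchronous sending steps above can be sketched in Python as follows. This is an illustration only: the in-memory `USERS` table, the `deliver` callback standing in for an SMS gateway, the tab-separated header layout and the pool size are all assumptions, and the text itself describes a Java multithreaded implementation. Nonce bookkeeping for replay protection is omitted.

```python
import hashlib
import hmac
from concurrent.futures import ThreadPoolExecutor

# Hypothetical credential store: user name -> secret key.
USERS = {"alice": "secret"}

# Worker threads that send messages without blocking the caller.
_pool = ThreadPoolExecutor(max_workers=4)


def verify_header(header):
    """Check user name and signature of a tab-separated header value."""
    parts = header.split("\t")
    if len(parts) != 4:
        return False
    username, timestamp, nonce, digest = parts
    password = USERS.get(username)
    if password is None:
        return False  # unknown user
    signature = username + timestamp + nonce + password
    expected = hashlib.md5(signature.encode("utf-8")).hexdigest()
    # Constant-time comparison of the two digests.
    return hmac.compare_digest(expected.encode("utf-8"), digest.encode("utf-8"))


def send_alarm(header, phone, text, deliver):
    """Queue an SMS if verification succeeds; return a Future, or None if rejected."""
    if not verify_header(header):
        return None
    return _pool.submit(deliver, phone, text)
```

The caller can inspect the returned future later to learn whether delivery succeeded, which corresponds to notifying the sender asynchronously.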

Claims (8)

1. A method for analyzing abnormal behavior of social network users, capable of detecting abnormal events on a target social networking site (Renren, microblogs, etc.), including account theft for sending advertisements, publishing malicious links, network spamming ("flooding"), and defrauding friends of money, characterized in that: user behavior data is obtained using web crawler technology and serves as the basis for user behavior analysis; the messages published by a user are modeled and trained on to extract the user's behavior profile; each new message is assessed against that profile to determine whether it is abnormal; and an alarm is sent when an abnormal event is detected.
The method is mainly composed of three functional units, namely data acquisition, analysis and detection, and abnormal alarm, wherein:
Data acquisition aims to obtain users' Deep Web data on the social network, i.e. the statuses, logs, links and other data that users publish and share. Obtaining these data requires a web crawler to perform a Deep Web crawl of the social network: using a valid user account registered at the target site, the crawler logs in to the target site to obtain site authorization and then crawls the users' Deep Web data.
Analysis and detection builds user behavior models from the user data obtained by the data acquisition unit, trains and evaluates them, then classifies each user's behavior data by content-based similarity, and finally performs anomaly detection according to a specific algorithm.
Abnormal alarm raises an alarm when an abnormal user is detected and provides SMS sending and alarm query functions.
2. The data acquisition functional unit according to claim 1, characterized in that it obtains the basis of analysis for the method, namely social network user data. It first obtains login authorization from the target social networking site and then uses web crawler technology to obtain, starting from the logged-in node, the subset of all users that it has permission to view. For this subset, all user data can be crawled from timeline data. The crawled result set is then analyzed by userId, the user's unique ID number, to obtain all news-feed items of the corresponding user, from which the statuses, logs, links and other data published and shared by that userId can be extracted. These data then undergo HTML text parsing and language parsing, and the parsed result is output as a file named after the userId. The file content includes the data ID, publication time, data type, content, language, whether a link is included, the link address, etc.
3. The user behavior modeling method in the analysis and detection unit according to claim 1, characterized in that the user's behavior profile is built from the stream of messages the user publishes on the social networking site, and this message stream is exactly the output of the data acquisition unit.
Given the characteristics of social networks and the needs of detection, this unit defines 7 features for every message and trains one statistical model per feature. Each model captures one aspect of a message. After all messages of a user have been analyzed, the user's feature values in these 7 aspects are known, which makes it possible to anticipate the content of the messages this user sends.
4. The 7 features according to claim 3, characterized in that the 7 features correspond to 7 feature models for every message, namely the time of sending (hour of day), the application that published the message, the language, the topic, links, interaction between users, and geographic location, and that these 7 features are divided into two classes:
(1) Mandatory models: every message has one value for the feature, and this value is always present. The mandatory features are the message sending time, the message source, the nearby geographic location and the message language.
(2) Optional models: a message does not necessarily have a value for the feature and, unlike in a mandatory model, one message may have several values. The optional models are links, interaction between users, and topic. For example, a message may contain zero, one or several links. For each optional model a special feature value fv = null is reserved, whose count c records the number of messages in which the feature has no value (for example, the number of messages without a link). Here fv denotes a feature value and c denotes the number of messages in which fv occurs.
5. The training and evaluation of user behavior models in the analysis and detection unit according to claim 1, characterized in that:
Training of the models:
The input is the series of messages (the message stream) crawled by the data acquisition unit. For each message the 7 features above are extracted, for example the application that sent the message and the links the message contains. Each feature model is represented by a set M. Each element of M is a tuple <fv, c>, where fv is a feature value (for example, English for the language model, or example.com for the link model) and c is the number of messages in which the value fv occurs. In addition, each model stores the total number N of training messages.
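A feature model of this kind can be sketched as a counter from feature value to message count. In the minimal example below, the function name `train_model`, the list-of-lists input shape and the use of Python's `None` to stand for the null value of optional features are assumptions made for illustration.

```python
from collections import Counter


def train_model(values_per_message):
    """Train one feature model from a message stream.

    values_per_message holds, for each training message, the list of values
    this feature takes in that message (an empty list means "no value").
    Returns (M, N): M maps fv -> c, N is the number of training messages.
    """
    model = Counter()
    for values in values_per_message:
        if not values:
            # Optional feature without a value: count it under fv = null.
            model[None] += 1
        for fv in set(values):
            # c counts messages, so a value repeated in one message counts once.
            model[fv] += 1
    return model, len(values_per_message)
```

For a link model trained on four messages, two of which contain no link, the call `train_model([["example.com"], [], ["example.com", "foo.org"], []])` yields counts of 2 for example.com, 1 for foo.org and 2 for null, with N = 4.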
Training of the message sending-time model is slightly different. Specifically, for each hour i, the two hours adjacent to it are also taken into account. For each tuple <i, C_i> in M, a new value C'_i is computed as the average number of messages published around hour i, where C_{i-1} is the number of messages sent in the preceding hour and C_{i+1} is the number of messages sent in the following hour. Once C'_i has been computed, it replaces C_i in the tuple <i, C_i>.
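The adjustment of the sending-time model can be sketched as a three-hour moving average. Two points in this example are assumptions, because the text does not spell them out: the average is taken as the plain mean of the three counts, and hours wrap around midnight (hour 23 is adjacent to hour 0).

```python
def smooth_hours(hour_counts):
    """Replace each hourly count C_i by the mean of C_{i-1}, C_i and C_{i+1}.

    hour_counts maps an hour of day (0-23) to a message count; missing hours
    count as 0. Hours whose smoothed value is 0 are left out of the result.
    """
    smoothed = {}
    for i in range(24):
        prev_c = hour_counts.get((i - 1) % 24, 0)   # C_{i-1}
        cur_c = hour_counts.get(i, 0)               # C_i
        next_c = hour_counts.get((i + 1) % 24, 0)   # C_{i+1}
        avg = (prev_c + cur_c + next_c) / 3          # C'_i
        if avg > 0:
            smoothed[i] = avg
    return smoothed
```

The effect is that a message sent one hour earlier or later than the user's usual time is not treated as a completely unseen value.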
Evaluation of the models:
The anomaly score of a message is computed to determine whether the message deviates from the user's behavior profile.
For a mandatory feature model, the anomaly score of a message is computed as follows:
(1) First, the value fv of the feature under analysis is extracted from the message. If M contains a tuple whose first element is fv, that tuple is extracted from M. If M contains no tuple whose first element is fv, the message is abnormal and the procedure returns an anomaly score of 1.
(2) Whether fv is abnormal is then determined from the user's behavior profile by comparing c with the average $\bar{M}$, given by
$$\bar{M} = \frac{\sum_{i=1}^{\|M\|} c_i}{N}$$
where c_i is the second element c of each tuple <fv, c> in M. If c is greater than or equal to $\bar{M}$, the message is considered to comply with the user's behavior profile and an anomaly score of 0 is returned. The rationale is that this feature value has appeared in many of the user's past messages, so it matches the user's habits and is normal behavior.
If c is less than $\bar{M}$, the message is considered abnormal. The relative frequency f of fv is computed as
$$f = \frac{c_{fv}}{N}$$
and the system returns an anomaly score of (1 - f).
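The scoring rule for a mandatory feature can be sketched as below. The names `average_count` and `mandatory_score` are illustrative, and the model is assumed to be a plain mapping from feature value to count. The average is computed exactly as the formula is printed in the text (sum of counts divided by N); if a different normalization is intended, only `average_count` would need to change.

```python
def average_count(model, n_total):
    """Average used for comparison: sum of all counts c_i divided by N."""
    return sum(model.values()) / n_total


def mandatory_score(model, n_total, fv):
    """Anomaly score of feature value fv under a mandatory feature model."""
    if fv not in model:
        return 1.0                      # value never seen for this user
    c = model[fv]
    if c >= average_count(model, n_total):
        return 0.0                      # value is common for this user
    f = c / n_total                     # relative frequency of fv
    return 1.0 - f
```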
For an optional feature model, the anomaly score of a message is computed as follows:
First, the value fv of the feature under analysis is extracted from the message. If M contains a tuple whose first element is fv, the message is judged to comply with the user's behavior profile and the system returns an anomaly score of 0.
If M contains no tuple whose first element is fv, the message is judged abnormal. In this case the anomaly score is defined as the probability p that this feature is null for this user. Intuitively, if a user almost never uses a feature on the social network and a message nevertheless contains a value fv for that feature, the message is highly abnormal. The probability p is computed as
$$p = \frac{c_{null}}{N}$$
If M contains no tuple whose first element is null, c_{null} is 0. The value p is the anomaly score.
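The optional-feature rule translates almost directly into code. In this minimal sketch the model is assumed to be a mapping from feature value to count, with Python's `None` standing for the null value; the function name `optional_score` is illustrative.

```python
def optional_score(model, n_total, fv):
    """Anomaly score of feature value fv under an optional feature model."""
    if fv in model:
        return 0.0                      # value seen before: normal
    c_null = model.get(None, 0)         # messages without this feature
    return c_null / n_total             # p = c_null / N
```

For a user whose link model is `{None: 9, "example.com": 1}` with N = 10, a message linking to an unseen domain scores 0.9, while a message linking to example.com scores 0.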
6. The content-based similarity classification in the analysis and detection unit according to claim 1, characterized in that account anomaly detection requires content-based similarity analysis because phishing, fraud and similar messages need to be propagated in large numbers. Therefore, when only a single message is judged abnormal, the corresponding account is not considered abnormal; further similar messages must be observed, and only when the number of similar messages reaches a certain level are the accounts sending them deemed abnormal accounts.
Content similarity is computed in two ways: one is the similarity of the text content; the other is the similarity of the contained URLs.
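The text names the two similarity criteria but does not specify how either is computed. The sketch below shows one possible realization, not the method of the source: URL similarity is taken to mean that two messages share at least one URL, and text similarity is the Jaccard index over word 3-grams. The regular expression, the n-gram length and the 0.5 cut-off are all assumptions.

```python
import re

URL_RE = re.compile(r"https?://\S+")


def extract_urls(text):
    """Return the set of URLs that appear in a message."""
    return set(URL_RE.findall(text))


def text_similarity(a, b, n=3):
    """Jaccard similarity of the word n-grams of two messages (URLs ignored)."""
    def shingles(s):
        words = URL_RE.sub(" ", s).lower().split()
        if len(words) < n:
            return {tuple(words)} if words else set()
        return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

    sa, sb = shingles(a), shingles(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def similar(a, b, min_text_sim=0.5):
    """Two messages are similar if they share a URL or have similar text."""
    if extract_urls(a) & extract_urls(b):
        return True
    return text_similarity(a, b) >= min_text_sim
```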
7. The anomaly detection in the analysis and detection unit according to claim 1, characterized in that two classes of anomalies are mainly detected: first, groups of suspicious users whose accounts have been compromised; second, suspicious users or applications that have not been compromised. The difference between them is that the former have a normal user behavior profile and only later publish a large number of similar messages, whereas the latter publish large numbers of similar messages from beginning to end.
The data acquisition unit obtains user data for a certain time interval, so the messages classified by content in the analysis and detection unit also fall within a certain time interval. The data in each such time interval is called a group. For each group, the method checks whether the messages of each user account violate that account's behavior profile. Based on this analysis, it can be determined whether an account is abnormal.
The rule for detecting abnormal accounts is: if the anomaly score of the messages in a group, evaluated against the user behavior models, exceeds a certain threshold, the group is judged to be an abnormal message group and the accounts corresponding to all of its messages are judged to be abnormal accounts. The threshold is computed as:
th(n)=max(0.1,kn+d)
Here n is the size of the group (the number of messages it contains). Experiments showed that k = -0.005 and d = 0.82 give the most accurate results. The formula shows that small groups have a higher anomaly decision threshold and large groups have a lower one.
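The threshold formula can be written as a one-line function. The function name `threshold` and the use of default arguments for k and d are choices made for this example; the constants are the experimentally determined values given in the text.

```python
def threshold(n, k=-0.005, d=0.82):
    """Anomaly decision threshold th(n) = max(0.1, k*n + d) for a group of size n."""
    return max(0.1, k * n + d)
```

With the stated constants, a group of 10 messages has a threshold of about 0.77, and the threshold falls linearly until it reaches the floor of 0.1 at n = 144, where it stays for all larger groups.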
8. The abnormal alarm unit according to claim 1, characterized in that the alarm unit provides two services, alarming and alarm querying, and three invocation modes: Curl, Thrift and Json. The alarm is delivered by sending an SMS message.
CN201410101728.3A 2014-03-19 2014-03-19 Method for analyzing abnormal behavior of user in social networking site Pending CN103853841A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410101728.3A CN103853841A (en) 2014-03-19 2014-03-19 Method for analyzing abnormal behavior of user in social networking site

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410101728.3A CN103853841A (en) 2014-03-19 2014-03-19 Method for analyzing abnormal behavior of user in social networking site

Publications (1)

Publication Number Publication Date
CN103853841A true CN103853841A (en) 2014-06-11

Family

ID=50861496

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410101728.3A Pending CN103853841A (en) 2014-03-19 2014-03-19 Method for analyzing abnormal behavior of user in social networking site

Country Status (1)

Country Link
CN (1) CN103853841A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1649311A (en) * 2005-03-23 2005-08-03 北京首信科技有限公司 Detecting system and method for user behaviour abnormal based on machine study
CN101615186A (en) * 2009-07-28 2009-12-30 东北大学 A kind of BBS user's abnormal behaviour auditing method based on Hidden Markov theory
CN102176698A (en) * 2010-12-20 2011-09-07 北京邮电大学 Method for detecting abnormal behaviors of user based on transfer learning
CN103150374A (en) * 2013-03-11 2013-06-12 中国科学院信息工程研究所 Method and system for identifying abnormal microblog users

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
杨斌: "基于聚类的异常检测技术的研究", 《全国优秀硕士论文全文数据库》 *

Similar Documents

Publication Publication Date Title
CN103853841A (en) Method for analyzing abnormal behavior of user in social networking site
Pacheco et al. Uncovering coordinated networks on social media: methods and case studies
US11595430B2 (en) Security system using pseudonyms to anonymously identify entities and corresponding security risk related behaviors
Pacheco et al. Uncovering coordinated networks on social media
Xu et al. Information security in big data: privacy and data mining
CN101971591B (en) System and method of analyzing web addresses
US11755586B2 (en) Generating enriched events using enriched data and extracted features
Brynielsson et al. Analysis of weak signals for detecting lone wolf terrorists
US11080109B1 (en) Dynamically reweighting distributions of event observations
Apte et al. Frauds in online social networks: A review
Ausloos et al. Researching with data rights
Freitas et al. An empirical study of socialbot infiltration strategies in the Twitter social network
Kligienė Digital footprints in the context of professional ethics
Drury et al. A social network of crime: A review of the use of social networks for crime and the detection of crime
Elekar Combination of data mining techniques for intrusion detection system
Zhu et al. Ontology-based approach for the measurement of privacy disclosure
Lee et al. The challenges and concerns of using big data to understand cybercrime
Möller Cyberattacker Profiles, Cyberattack Models and Scenarios, and Cybersecurity Ontology
Rodríguez et al. Webpages classification with phishing content using naive Bayes algorithm
Jansi An Effective Model of Terminating Phishing Websites and Detection Based On Logistic Regression
Lin et al. Novel JavaScript malware detection based on fuzzy Petri nets
Bai et al. Real-time prediction of meme burst
Grojek et al. Ontology-driven artificial intelligence in IoT forensics
Seraj et al. MadDroid: malicious adware detection in Android using deep learning

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20140611