WO2017084205A1 - Network user identity authentication method and system - Google Patents

Network user identity authentication method and system

Info

Publication number
WO2017084205A1
Authority
WO
WIPO (PCT)
Prior art keywords
session
user
browsing
algorithm
identity authentication
Prior art date
Application number
PCT/CN2016/070994
Other languages
English (en)
French (fr)
Inventor
蒋昌俊
闫春钢
陈闳中
丁志军
季梦清
Original Assignee
同济大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 同济大学
Publication of WO2017084205A1 (zh)
Priority to AU2018100671A (AU2018100671A4, en)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/08 Network architectures or network communication protocols for network security for authentication of entities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/535 Tracking the activity of the user

Definitions

  • The present invention relates to network security technology, and in particular to a network user identity authentication method and system.
  • Authenticating network users is an important way to provide security in network transactions.
  • User identity authentication can be divided into one-time authentication and continuous authentication.
  • One-time authentication mainly includes traditional password-based authentication, smart-card-based authentication, and authentication based on the user's biometric and behavioral characteristics.
  • One-time authentication, however, verifies identity only at a single moment.
  • In view of the shortcomings of the prior art, an object of the present invention is to provide a network user identity authentication method and system that address the need to further improve the security of network user identity authentication.
  • To achieve the above and other related objects, the present invention provides a network user identity authentication method, comprising: collecting all web page browsing records of a legitimate user within a set time period, each browsing record including the URL of the browsed page, its text content, and a timestamp; extracting the top-level domain name from the URL, extracting keywords from the text content to determine the content class to which the text content belongs, and processing each browsing record into the form <URL top-level domain name, content class, timestamp>; treating all browsing records obtained within the set time period as one session; and obtaining m sessions of the legitimate user.
  • Each session is processed as follows: from all browsing records in the session, the top-level domain names most frequently visited by the user are counted; a set first algorithm mines the relationship between top-level domain name and content class in the browsing records; a set second algorithm mines the relationship between content class and time period in the browsing records; and n feature values of the user's web browsing are thereby obtained. The feature values are processed with a set third algorithm to obtain a corresponding weight matrix, the score of the session is calculated from the feature values and the weight matrix, and a fourth algorithm calculates the classification threshold of the legitimate user from the scores of the m sessions.
  • Optionally, the method further includes: acquiring a new session and calculating its score; when the score falls within the range of the classification threshold, determining that the current user is the legitimate user; when the score does not fall within that range, determining that the current user is not the legitimate user.
  • The feature values include: the number of elements in the session; the number of frequently visited websites in the session; the number of frequent itemsets matched by the session; the number of frequently visited websites contained in the matched frequent itemsets; the length of the longest matched frequent itemset; the average length of the matched frequent itemsets; the maximum support of the matched frequent itemsets; the average support of the matched frequent itemsets; the number of frequent time periods matched by the session; and the target column.
  • the first algorithm includes an Apriori algorithm.
  • The second algorithm includes: using maximum likelihood estimation to calculate, from the browsing records of the session, the parameter values of the normal distribution followed by the user's browsing time for each content class.
  • The parameter values are given by an equation [equation image not reproduced], where time_i is the relative time at which the user browses content class content_i.
  • The third algorithm includes the LR (logistic regression) algorithm.
  • The fourth algorithm and the resulting classification threshold are given by equations [equation images not reproduced], where score_legal_i is the score of the i-th session, for a total of m sessions.
  • the set time period includes 30 minutes.
  • The present invention further provides a network user identity authentication system, comprising: a user session acquisition module, configured to collect all web page browsing records of a legitimate user within a set time period, each browsing record including the URL of the browsed page, its text content, and a timestamp, to extract the top-level domain name from the URL, to extract keywords from the text content to determine the content class to which the text content belongs, to process each browsing record into the form <URL top-level domain name, content class, timestamp>, and to treat all browsing records obtained within the set time period as one session; a session score calculation module, configured, for a session, to count from all browsing records in the session the top-level domain names most frequently visited by the user, to mine the relationship between top-level domain name and content class with a set first algorithm, to mine the relationship between content class and time period with a set second algorithm, thereby obtaining n feature values of the user's web browsing, to process the feature values with a set third algorithm to obtain a corresponding weight matrix, and to calculate the score of the session from the feature values and the weight matrix; and a classification threshold determination module, configured to obtain multiple session scores of the legitimate user and calculate the classification threshold of the legitimate user with a fourth algorithm.
  • Optionally, the system further includes a user legitimacy judgment module, configured to acquire a new session and calculate its score; when the score falls within the range of the classification threshold, to determine that the current user is the legitimate user; and when the score does not fall within that range, to determine that the current user is not the legitimate user.
  • In the system, the feature values include: the number of elements in the session; the number of frequently visited websites in the session; the number of frequent itemsets matched by the session; the number of frequently visited websites contained in the matched frequent itemsets; the length of the longest matched frequent itemset; the average length of the matched frequent itemsets; the maximum support of the matched frequent itemsets; the average support of the matched frequent itemsets; the number of frequent time periods matched by the session; and the target column.
  • the first algorithm includes an Apriori algorithm.
  • In the system, the second algorithm includes: using maximum likelihood estimation to calculate, from the browsing records of the session, the parameter values of the normal distribution followed by the user's browsing time for each content class.
  • The parameter values are given by an equation [equation image not reproduced], where time_i is the relative time at which the user browses content class content_i.
  • The third algorithm includes the LR (logistic regression) algorithm.
  • The fourth algorithm and the resulting classification threshold are given by equations [equation images not reproduced], where score_legal_i is the score of the i-th session, for a total of m sessions.
  • the set time period includes 30 minutes.
  • As described above, the network user identity authentication method and system of the present invention have the following beneficial effects: 1) The two factor pairs browsed by the user, (URL, content) and (content, time), are mined as sequences instead of considering only one factor, so the authentication method matches the user's browsing habits. 2) Association rules are used to mine the user's browsing habits from (URL, content) jointly, and a normal distribution is used to discover the time periods in which the user frequently accesses each content class. 3) Continuous authentication is achieved while the user browses web pages.
  • FIG. 1 is a schematic flowchart diagram of an embodiment of a network user identity authentication method according to the present invention.
  • FIG. 2 is a schematic flowchart diagram of another embodiment of a network user identity authentication method according to the present invention.
  • FIG. 3 is a block diagram showing an embodiment of a network user identity authentication system according to the present invention.
  • the invention provides a network user identity authentication method.
  • the network user identity authentication method performs identity authentication according to a user browsing behavior.
  • the network user identity authentication method includes:
  • Step S1: Collect all web page browsing records of the legitimate user within the set time period, each browsing record including the URL of the browsed page, its text content, and a timestamp; extract the top-level domain name from the URL; extract keywords from the text content to determine the content class to which the text content belongs; process each browsing record into the form <URL top-level domain name, content class, timestamp>; and treat all browsing records obtained within the set time period as one session.
  • In one embodiment, a user's web browsing records are collected and processed to form the {(domain, content, timestamp)} session structure that serves as the basis for subsequent analysis.
  • Step S1 is performed multiple times, for example m times, to obtain m sessions, which are finally merged into the corresponding session set S. Subsequent authentication is likewise performed per user visit (that is, one session of 30 minutes).
  • In one embodiment, the browsing records of the legitimate user are first collected from the sqlite database built into the Chrome browser.
  • The sqlite database records detailed information about each web page the user browses. For each user, the URL (Uniform Resource Locator, that is, the web page address), the text content, and the timestamp of each browsed page are collected as the original browsing record. A browsing record is denoted r, and its attributes are shown in Table 1:
  • After the original data is obtained, it is processed. First, for each browsing record in the session, the top-level domain name is extracted from its URL. Then the text classification samples of the Sogou lab, together with articles from the web, are used to extract keywords for each class, and these keywords are matched against the title of the web page to be classified to obtain the content class of that page.
  • After this processing, the first original browsing record in Table 1, for example, becomes (news.163.com, society, timestamp); data in this form is denoted web page p(domain, content, timestamp).
  • Step S2: Obtain the m sessions of the legitimate user and process each session as follows: from all browsing records in the session, count the top-level domain names most frequently visited by the user.
  • The set first algorithm mines the relationship between top-level domain name and content class in the browsing records, and the set second algorithm mines the relationship between content class and time period in the browsing records, thereby obtaining n feature values of the user's web browsing.
  • In one embodiment, the collected browsing records are divided at 30-minute intervals, yielding one session every 30 minutes; step S1 is performed m times to obtain m sessions, which are finally merged into the corresponding session set S.
  • The feature values include: the number of elements in the session; the number of frequently visited websites in the session; the number of frequent itemsets matched by the session; the number of frequently visited websites contained in the matched frequent itemsets; the length of the longest matched frequent itemset; the average length of the matched frequent itemsets; the maximum support of the matched frequent itemsets; the average support of the matched frequent itemsets; the number of frequent time periods matched by the session; and the target column.
  • the first algorithm includes an Apriori algorithm.
  • The Apriori algorithm is a frequent-itemset algorithm for mining association rules. Its core idea is to mine frequent itemsets in two stages: candidate set generation and downward-closure checking. The algorithm has been widely applied in fields such as business and network security.
  • The second algorithm includes: using maximum likelihood estimation to calculate, from the browsing records of the session, the parameter values of the normal distribution followed by the user's browsing time for each content class.
  • The parameter values are given by an equation [equation image not reproduced], where time_i is the relative time at which the user browses content class content_i; the parameters are used to count the number of frequent time periods matched by the session.
  • The third algorithm includes the LR (logistic regression) algorithm. Logistic regression is a typical binary classification algorithm. The model it produces is relatively intuitive and easy to interpret, and is not prone to overfitting. It is in effect a process of learning a mapping f: X -> Y.
  • In one embodiment, different users browse specific content on different websites at different times. Based on this browsing characteristic, features are extracted from three aspects: frequently visited URLs, (URL, content), and (content, time period). From the {(domain, content, timestamp)} session set, features are extracted through frequent URL statistics, frequent itemset mining, and frequent time period mining to obtain the user's browsing features. In one embodiment, multiple users can be processed at the same time:
  • Frequent itemset mining: the Apriori algorithm is used to mine the sequential relationships between (URL, content). In this process, a suitable support threshold δ is selected by experiment. For a frequent itemset fc: X, Y, if support(fc) > δ, the itemset is added to the web browsing frequent itemset collection FC_j of the corresponding user j.
  • Frequent time period mining: for a user, the time at which a given content class is browsed is assumed to follow a normal distribution.
  • The (content, time) data obtained by processing S* are used to build, for the user, the normal distribution model followed by the browsing time of each content class. Since the parameters of the normal distribution cannot be obtained exactly, maximum likelihood estimation is used to calculate them from the sessions [equation image not reproduced].
  • The corresponding feature values are then obtained: each session in the session set has its own feature values, denoted fv_ji.
  • In one embodiment, fv_ji = <length_i, pun_i, mrn_i, rpun_i, mrml_i, mral_i, mrms_i, mras_i, mtn_i, target_i>; the meaning of each value is shown in Table 2.
  • After the set of session feature values is obtained, the authentication method based on user browsing features (hereinafter UBFAA) is carried out with the LR logistic regression algorithm; the specific procedure is shown in Algorithm 1.
  • Algorithm 1: Authentication Method Based on User Browsing Features (UBFAA)
  • mrtl = 0; // total length of the frequent itemsets matched by session s*_i
  • mrts = 0; // total support of the frequent itemsets matched by session s*_i
  • length = the number of elements included in the session set S*
  • Step S3: From the scores of the m sessions, calculate the classification threshold of the legitimate user using a fourth algorithm.
  • In one embodiment, the fourth algorithm and the resulting classification threshold are given by equations [equation images not reproduced], where score_legal_i is the score of the i-th session, for a total of m sessions.
  • In one embodiment, as shown in FIG. 2, the network user identity authentication method further includes:
  • Step S4: Acquire a new session and calculate its score; when the score falls within the range of the classification threshold, determine that the current user is the legitimate user; when the score does not fall within that range, determine that the current user is not the legitimate user.
  • The method of step S1 is used to obtain the current (new) session, the method of step S2 is used to calculate its score, and the classification threshold from step S3 is then used to judge whether the user to whom the current session belongs is the legitimate user.
  • When the score of the new session falls within the range of the classification threshold, the current user is determined to be the legitimate user; when it does not, the current user is determined not to be the legitimate user.
  • the invention provides a network user identity authentication system.
  • the network user identity authentication system may employ the network user identity authentication method as described above.
  • In one embodiment, as shown in FIG. 3, the network user identity authentication system 1 includes a user session acquisition module 11, a session score calculation module 12, and a classification threshold determination module 13, where:
  • The user session acquisition module 11 is configured to collect all web page browsing records of the user within a set time period, each browsing record including the URL of the browsed page, its text content, and a timestamp; to extract the top-level domain name from the URL; to extract keywords from the text content to determine the content class to which the text content belongs; to process each browsing record into the form <URL top-level domain name, content class, timestamp>; and to treat all browsing records obtained within the set time period as one session.
  • the set time period comprises 30 minutes.
  • The session score calculation module 12 is connected to the user session acquisition module 11 and is configured, for a session, to count from all browsing records in the session the top-level domain names most frequently visited by the user; to mine the relationship between top-level domain name and content class in the browsing records with the set first algorithm; to mine the relationship between content class and time period in the browsing records with the set second algorithm, thereby obtaining n feature values of the user's web browsing; to process the obtained feature values with the set third algorithm to obtain a weight matrix corresponding to the feature values; and to calculate the score of the session from the feature values and the corresponding weight matrix.
  • In one embodiment, the feature values include: the number of elements in the session; the number of frequently visited websites in the session; the number of frequent itemsets matched by the session; the number of frequently visited websites contained in the matched frequent itemsets; the length of the longest matched frequent itemset; the average length of the matched frequent itemsets; the maximum support of the matched frequent itemsets; the average support of the matched frequent itemsets; the number of frequent time periods matched by the session; and the target column.
  • the first algorithm includes an Apriori algorithm.
  • The second algorithm includes: using maximum likelihood estimation to calculate, from the browsing records of the session, the parameter values of the normal distribution followed by the user's browsing time for each content class.
  • The parameter values are given by an equation [equation image not reproduced], where time_i is the relative time at which the user browses content class content_i; the parameters are used to count the number of frequent time periods matched by the session.
  • The third algorithm includes: a gradient descent method.
  • The classification threshold determination module 13 is connected to the session score calculation module 12 and is configured to obtain multiple session scores of the legitimate user and calculate the classification threshold of the legitimate user using a fourth algorithm.
  • In one embodiment, the fourth algorithm and the resulting classification threshold are given by equations [equation images not reproduced], where score_legal_i is the score of the i-th session, for a total of m sessions.
  • In one embodiment, the network user identity authentication system 1 further includes a user legitimacy judgment module 14.
  • The user legitimacy judgment module 14 is connected to the classification threshold determination module 13, the session score calculation module 12, and the user session acquisition module 11. It acquires a new session from the user session acquisition module 11 and has the session score calculation module 12 calculate the score of the new session. When the score falls within the range of the legitimate user's classification threshold obtained by the classification threshold determination module 13, the current user is determined to be the legitimate user; when the score does not fall within that range, the current user is determined not to be the legitimate user.
  • The technical solution of the invention distinguishes the browsing behavior of different users along three aspects, namely the sequence of browsed URLs, the content, and the browsing time, and thereby helps protect the security of the user's account.
  • According to the description, the network identity authentication of the present invention achieves a detection rate of 93.6% at a false alarm rate of 10%, which is a good verification result and helps protect the security of user accounts.
  • In summary, the network user identity authentication method and system of the present invention have the following beneficial effects: 1) The two factor pairs browsed by the user, (URL, content) and (content, time), are mined as sequences instead of considering only one factor, so the authentication method matches the user's browsing habits. 2) Association rules are used to mine the user's browsing habits from (URL, content) jointly, and a normal distribution is used to discover the time periods in which the user frequently accesses each content class. 3) Continuous authentication is achieved while the user browses web pages. The present invention therefore effectively overcomes various shortcomings of the prior art and has high industrial value.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Storage Device Security (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

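The present invention provides a network user identity authentication method and system. The method includes: collecting all web page browsing records of a legitimate user within a set time period as one session, and processing each browsing record into the form <URL top-level domain name, content class, timestamp>; obtaining m sessions of the legitimate user and processing each session as follows: obtaining n feature values of the user's web browsing from the session, processing the obtained feature values with a set third algorithm to obtain a weight matrix corresponding to the feature values, and calculating the score of the session from the feature values and the corresponding weight matrix; and calculating the classification threshold of the legitimate user from the session scores using a fourth algorithm. By performing continuous authentication based on three aspects of the browsing records, namely URL, content, and time, the technical solution of the present invention improves the authentication effect.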

Description

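Network user identity authentication method and system
Technical Field
The present invention relates to network security technology, and in particular to a network user identity authentication method and system.
Background
With the development of information technology and the Internet, the number of Internet users in China keeps growing, and online shopping and transactions are increasingly frequent. Going online has become an indispensable part of many people's lives. At the same time, fraud in online shopping transactions has risen sharply in recent years, and a new type of online fraud that combines human deception with technical means has become the primary security threat to users' online lives. Authenticating network users is an important way to provide security in network transactions. User identity authentication can be divided into one-time authentication and continuous authentication. One-time authentication currently includes traditional password-based authentication, smart-card-based authentication, and authentication based on the user's biometric and behavioral characteristics. One-time authentication, however, verifies identity only at a single moment: once authentication passes, the user is judged legitimate, which does not protect the user well, and continuous authentication was therefore proposed. Research on continuous authentication is still relatively scarce, and existing approaches mainly study either the user's URL sequence or the relationships among the content the user browses. They do not consider user browsing behavior comprehensively enough, and their authentication effect needs improvement.
In view of this, finding a technical solution that further improves the security of network user identity authentication has become an urgent problem for those skilled in the art.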
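Summary of the Invention
In view of the shortcomings of the prior art described above, an object of the present invention is to provide a network user identity authentication method and system that address the need to further improve the security of network user identity authentication in the prior art.
To achieve the above and other related objects, the present invention provides a network user identity authentication method, comprising: collecting all web page browsing records of a legitimate user within a set time period, each browsing record including the URL of the browsed page, its text content, and a timestamp; extracting the top-level domain name from the URL, extracting keywords from the text content to determine the content class to which the text content belongs, processing each browsing record into the form <URL top-level domain name, content class, timestamp>, and treating all browsing records obtained within the set time period as one session; obtaining m sessions of the legitimate user and processing each session as follows: from all browsing records in the session, counting the top-level domain names most frequently visited by the user, mining the relationship between top-level domain name and content class in the browsing records with a set first algorithm, and mining the relationship between content class and time period in the browsing records with a set second algorithm, thereby obtaining n feature values of the user's web browsing; processing the obtained feature values with a set third algorithm to obtain a weight matrix corresponding to the feature values; calculating the score of the session from the feature values and the corresponding weight matrix; and calculating the classification threshold of the legitimate user from the scores of the m sessions using a fourth algorithm.
Optionally, the method further includes: acquiring a new session and calculating its score; when the score falls within the range of the classification threshold, determining that the current user is the legitimate user; when the score does not fall within that range, determining that the current user is not the legitimate user.
Optionally, the feature values include: the number of elements in the session; the number of frequently visited websites in the session; the number of frequent itemsets matched by the session; the number of frequently visited websites contained in the matched frequent itemsets; the length of the longest matched frequent itemset; the average length of the matched frequent itemsets; the maximum support of the matched frequent itemsets; the average support of the matched frequent itemsets; the number of frequent time periods matched by the session; and the target column.
Optionally, the first algorithm includes the Apriori algorithm.
Optionally, the second algorithm includes: using maximum likelihood estimation to calculate, from the browsing records of the session, the parameter values of the normal distribution followed by the user's browsing time for each content class.
Optionally, the parameter values include:
Figure PCTCN2016070994-appb-000001
where time_i is the relative time at which the user browses content class content_i.
Optionally, the third algorithm includes the LR (logistic regression) algorithm.
Optionally, the fourth algorithm includes:
Figure PCTCN2016070994-appb-000002
and the classification threshold is
Figure PCTCN2016070994-appb-000003
where score_legal_i is the score of the i-th session, for a total of m sessions.
Optionally, the set time period includes 30 minutes.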
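The present invention further provides a network user identity authentication system, comprising: a user session acquisition module, configured to collect all web page browsing records of a legitimate user within a set time period, each browsing record including the URL of the browsed page, its text content, and a timestamp, to extract the top-level domain name from the URL, to extract keywords from the text content to determine the content class to which the text content belongs, to process each browsing record into the form <URL top-level domain name, content class, timestamp>, and to treat all browsing records obtained within the set time period as one session; a session score calculation module, configured, for a session, to count from all browsing records in the session the top-level domain names most frequently visited by the user, to mine the relationship between top-level domain name and content class in the browsing records with a set first algorithm, to mine the relationship between content class and time period in the browsing records with a set second algorithm, thereby obtaining n feature values of the user's web browsing, to process the obtained feature values with a set third algorithm to obtain a weight matrix corresponding to the feature values, and to calculate the score of the session from the feature values and the corresponding weight matrix; and a classification threshold determination module, configured to obtain multiple session scores of the legitimate user and calculate the classification threshold of the legitimate user using a fourth algorithm.
Optionally, the system further includes a user legitimacy judgment module, configured to acquire a new session and calculate its score; when the score falls within the range of the classification threshold, to determine that the current user is the legitimate user; and when the score does not fall within that range, to determine that the current user is not the legitimate user.
Optionally, the feature values include: the number of elements in the session; the number of frequently visited websites in the session; the number of frequent itemsets matched by the session; the number of frequently visited websites contained in the matched frequent itemsets; the length of the longest matched frequent itemset; the average length of the matched frequent itemsets; the maximum support of the matched frequent itemsets; the average support of the matched frequent itemsets; the number of frequent time periods matched by the session; and the target column.
Optionally, the first algorithm includes the Apriori algorithm.
Optionally, the second algorithm includes: using maximum likelihood estimation to calculate, from the browsing records of the session, the parameter values of the normal distribution followed by the user's browsing time for each content class.
Optionally, the parameter values include:
Figure PCTCN2016070994-appb-000004
where time_i is the relative time at which the user browses content class content_i.
Optionally, the third algorithm includes the LR (logistic regression) algorithm.
Optionally, the fourth algorithm includes:
Figure PCTCN2016070994-appb-000005
and the classification threshold is
Figure PCTCN2016070994-appb-000006
where score_legal_i is the score of the i-th session, for a total of m sessions.
Optionally, the set time period includes 30 minutes.
As described above, the network user identity authentication method and system of the present invention have the following beneficial effects: 1) The two factor pairs browsed by the user, (URL, content) and (content, time), are mined as sequences instead of considering only one factor, so the authentication method matches the user's browsing habits. 2) Association rules are used to mine the user's browsing habits from (URL, content) jointly, and a normal distribution is used to discover the time periods in which the user frequently accesses each content class. 3) Continuous authentication is achieved while the user browses web pages.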
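Brief Description of the Drawings
FIG. 1 is a schematic flowchart of one embodiment of the network user identity authentication method of the present invention.
FIG. 2 is a schematic flowchart of another embodiment of the network user identity authentication method of the present invention.
FIG. 3 is a schematic block diagram of one embodiment of the network user identity authentication system of the present invention.
Description of Reference Numerals
1                Network user identity authentication system
11               User session acquisition module
12               Session score calculation module
13               Classification threshold determination module
14               User legitimacy judgment module
S1~S4           Steps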
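Detailed Description
The embodiments of the present invention are described below through specific examples; those skilled in the art can readily understand other advantages and effects of the invention from the content disclosed in this specification. The invention may also be implemented or applied through other specific embodiments, and the details in this specification may be modified or changed in various ways, based on different viewpoints and applications, without departing from the spirit of the invention.
It should be noted that the drawings provided with this embodiment illustrate the basic concept of the invention only schematically. The drawings therefore show only the components related to the invention and are not drawn according to the number, shape, and size of components in an actual implementation; in an actual implementation the form, number, and proportion of each component may vary freely, and the component layout may be more complex.
The present invention provides a network user identity authentication method. The method performs identity authentication based on the user's browsing behavior. In one embodiment, as shown in FIG. 1, the method includes: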
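Step S1: Collect all web page browsing records of the legitimate user within the set time period, each browsing record including the URL of the browsed page, its text content, and a timestamp; extract the top-level domain name from the URL; extract keywords from the text content to determine the content class to which the text content belongs; process each browsing record into the form <URL top-level domain name, content class, timestamp>; and treat all browsing records obtained within the set time period as one session. In one embodiment, a user's web browsing records are collected and processed to form the {(domain, content, timestamp)} session structure that serves as the basis for subsequent analysis. The collected browsing records are divided at 30-minute intervals, yielding one session every 30 minutes. Step S1 is performed multiple times, for example m times, to obtain m sessions, which are finally merged into the corresponding session set S. Subsequent authentication is likewise performed per user visit (that is, one session of 30 minutes).
In one embodiment, the browsing records of the legitimate user are first collected from the sqlite database built into the Chrome browser. The sqlite database records detailed information about each web page the user browses. For each user, the URL (Uniform Resource Locator, that is, the web page address), the text content, and the timestamp of each browsed page are collected as the original browsing record. A browsing record is denoted r, and its attributes are shown in Table 1:
Figure PCTCN2016070994-appb-000007
After the original data is obtained, it is processed. First, for each browsing record in the session, the top-level domain name is extracted from its URL. Then the text classification samples of the Sogou lab, together with articles from the web, are used to extract keywords for each class, and these keywords are matched against the title of the web page to be classified to obtain the content class of that page. After this processing, the first original browsing record in Table 1, for example, becomes (news.163.com, society, timestamp); data in this form is denoted web page p(domain, content, timestamp).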
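The preprocessing in step S1 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the keyword table `CLASS_KEYWORDS`, the helper names, the use of Unix timestamps in seconds, and fixed windows counted from the first record are all assumptions. The text speaks of a "top-level domain name" but its example is `news.163.com`, so the sketch simply takes the URL's host name.

```python
from collections import defaultdict
from urllib.parse import urlparse

SESSION_SECONDS = 30 * 60  # the 30-minute window named in the text

# Hypothetical keyword table; the text derives its keywords from Sogou lab samples.
CLASS_KEYWORDS = {
    "society": ["court", "police", "community"],
    "sports": ["match", "league", "goal"],
}


def classify_title(title):
    """Return the first content class with a keyword in the title, else 'other'."""
    lowered = title.lower()
    for content_class, keywords in CLASS_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return content_class
    return "other"


def to_page(url, title, timestamp):
    """Turn a raw record into a (domain, content, timestamp) page tuple."""
    domain = urlparse(url).hostname or ""
    return (domain, classify_title(title), timestamp)


def split_sessions(pages):
    """Group pages into fixed 30-minute windows counted from the earliest page."""
    pages = sorted(pages, key=lambda page: page[2])
    if not pages:
        return []
    start = pages[0][2]
    buckets = defaultdict(list)
    for page in pages:
        buckets[(page[2] - start) // SESSION_SECONDS].append(page)
    return [buckets[index] for index in sorted(buckets)]
```

Empty windows are skipped here; whether an idle half hour should count as an empty session is not stated in the text.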
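Step S2: Obtain the m sessions of the legitimate user and process each session as follows: from all browsing records in the session, count the top-level domain names most frequently visited by the user; mine the relationship between top-level domain name and content class in the browsing records with the set first algorithm; mine the relationship between content class and time period in the browsing records with the set second algorithm, thereby obtaining n feature values of the user's web browsing; process the obtained feature values with the set third algorithm to obtain a weight matrix corresponding to the feature values; and calculate the score of the session from the feature values and the corresponding weight matrix. In one embodiment, the collected browsing records are divided at 30-minute intervals, yielding one session every 30 minutes; step S1 is performed m times to obtain m sessions, which are finally merged into the corresponding session set S. The feature values include: the number of elements in the session; the number of frequently visited websites in the session; the number of frequent itemsets matched by the session; the number of frequently visited websites contained in the matched frequent itemsets; the length of the longest matched frequent itemset; the average length of the matched frequent itemsets; the maximum support of the matched frequent itemsets; the average support of the matched frequent itemsets; the number of frequent time periods matched by the session; and the target column. The first algorithm includes the Apriori algorithm. The Apriori algorithm is a frequent-itemset algorithm for mining association rules; its core idea is to mine frequent itemsets in two stages, candidate set generation and downward-closure checking. The algorithm has been widely applied in fields such as business and network security.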
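The two Apriori stages named above, candidate generation and the downward-closure check, can be sketched in a few lines. This is a generic textbook version, not the patent's code. How a transaction is formed is an assumption: here each transaction is a set of string items such as `"d:news.163.com"` and `"c:society"`, and an itemset is kept when its support is strictly greater than `min_support`, matching the support(fc) > δ condition in the description.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for itemsets with support > min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    if n == 0:
        return {}

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in transactions) / n

    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # Keep only the candidates whose support exceeds the threshold.
        level = {}
        for candidate in current:
            s = support(candidate)
            if s > min_support:
                level[candidate] = s
        frequent.update(level)
        # Candidate generation: join frequent k-itemsets into (k+1)-itemsets.
        k += 1
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Downward closure: every (k-1)-subset of a candidate must be frequent.
        current = {
            c for c in candidates
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        }
    return frequent
```

The returned supports can be reused directly as the "support of the matched frequent itemsets" features.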
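The second algorithm includes: using maximum likelihood estimation to calculate, from the browsing records of the session, the parameter values of the normal distribution followed by the user's browsing time for each content class. The parameter values are
Figure PCTCN2016070994-appb-000008
where time_i is the relative time at which the user browses content class content_i; the parameters are used to count the number of frequent time periods matched by the session. The third algorithm includes the LR (logistic regression) algorithm. Logistic regression is a typical binary classification algorithm; the model it produces is relatively intuitive and simple, easy to interpret, and not prone to overfitting. It is in effect a process of learning a mapping f: X -> Y. We are given in advance an n-tuple variable vector X = <X1, X2, ..., Xn> and an m-element target vector Y = <Y1, Y2, ..., Ym>, and logistic regression learns a function f(X) such that the learned function fits the target vector Y as closely as possible from the given variable values.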
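In one embodiment, different users browse specific content on different websites at different times. Based on this browsing characteristic, features are extracted from three aspects: frequently visited URLs, (URL, content), and (content, time period). From the {(domain, content, timestamp)} session set, features are extracted through frequent URL statistics, frequent itemset mining, and frequent time period mining to obtain the user's browsing features. In one embodiment, multiple users can be processed at the same time:
Frequent website statistics: because each user frequently browses different web pages, the 15 top-level domain names most frequently visited by each user are counted and placed in the frequently visited URL class FU_j of the corresponding user j.
Frequent itemset mining: the Apriori algorithm is used to mine the sequential relationships between (URL, content). In this process, a suitable support threshold δ is selected by experiment. For a frequent itemset fc: X, Y, if support(fc) > δ, the itemset is added to the web browsing frequent itemset collection FC_j of the corresponding user j.
Frequent time period mining: for a user, the time at which a given content class is browsed is assumed to follow a normal distribution. The (content, time) data obtained by processing S* are used to build, for the user, the normal distribution model followed by the browsing time of each content class. Since the parameters of the normal distribution cannot be obtained exactly, maximum likelihood estimation is used to calculate, from the sessions, the parameter values of the normal distribution followed by the user's browsing time for each content class, where
Figure PCTCN2016070994-appb-000009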
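The frequent time period step can be illustrated with the standard maximum likelihood estimates for a normal distribution, the sample mean and the variance with divisor n. Whether the equation image above shows exactly these estimates is an assumption, as are the function names, the use of hour-of-day floats as the "relative time", and the interval [mu - k*sigma, mu + k*sigma] used to decide whether a visit falls in a frequent period; the interval in the original is an image and its width is not recoverable here, so `k` is left as a parameter.

```python
import math
from collections import defaultdict


def fit_time_models(pages):
    """Fit one normal distribution per content class; returns {content: (mu, sigma)}."""
    times = defaultdict(list)
    for _domain, content, time_of_day in pages:
        times[content].append(time_of_day)
    models = {}
    for content, values in times.items():
        n = len(values)
        mu = sum(values) / n
        # The maximum likelihood variance divides by n, not n - 1.
        sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / n)
        models[content] = (mu, sigma)
    return models


def in_frequent_period(models, content, time_of_day, k=1.0):
    """True if time_of_day lies in [mu - k*sigma, mu + k*sigma] (assumed interval)."""
    if content not in models:
        return False
    mu, sigma = models[content]
    return mu - k * sigma <= time_of_day <= mu + k * sigma
```

A plain normal model ignores the wrap-around at midnight, so a habit that spans 23:00 to 01:00 would need a shifted time origin or a circular distribution.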
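The corresponding feature values of each session are then obtained: every session in the session set has its own feature values, denoted fv_ji. In one embodiment, fv_ji = <length_i, pun_i, mrn_i, rpun_i, mrml_i, mral_i, mrms_i, mras_i, mtn_i, target_i>; the meaning of each value is shown in Table 2.
Figure PCTCN2016070994-appb-000010
Figure PCTCN2016070994-appb-000011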
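After the feature value set of the sessions is obtained, the LR logistic regression algorithm is used to carry out the authentication method based on user browsing features (hereinafter UBFAA); the procedure is shown in Algorithm 1.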
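Algorithm 1: Authentication method based on user browsing features (UBFAA)
Input: the legitimate user's session set S*, the legitimate user's frequent itemsets FC, the legitimate user's frequently visited URL set FU, and the frequent time period set FT
Output: feature weight matrix w, array score_legal
1) For each session s*_i in the legitimate user's session set S*:
mrtl=0; // total length of the frequent itemsets matched by session s*_i
mrts=0; // total support of the frequent itemsets matched by session s*_i
pun=0;
mrn=0; rpun=0; mrml=0; mrms=0; mtn=0;
length = number of elements in session s*_i;
target=1;
1.1) For each entry of the legitimate user's frequently visited URL set FU:
if FU contains an fu_j equal to the top-level domain of a page in the current session, increment pun;
1.2) For each entry of the legitimate user's frequent itemsets FC:
if the current session contains frequent itemset fc_j:
1.2.1) increment mrn, add the length of fc_j to mrtl, and add the support of fc_j to mrts;
1.2.2) store the maximum support among the rules matched by the current session in mrms;
1.2.3) store the maximum length among the rules matched by the current session in mrml;
1.2.4) count the frequently visited websites contained in fc_j and store the count in rpun;
1.3) Compute the average support mras and the average length mral of the rules matched by the current session;
1.4) For each entry of the legitimate user's frequent time period set FT:
if FT contains an ft_j such that (current session page).content = ft_j.content and (current session page).time lies within the interval
Figure PCTCN2016070994-appb-000012
then increment mtn;
1.5) Write the attributes of session s*_i into the ten-tuple set FV_i;
2) For the ten-tuple set FV_i:
2.1) create the matrix datas, set its first column to all 1s, and store the feature data of FV_i in the matrix;
2.2) create the matrix labels and store the last column of FV_i in labels;
2.3) create a 10*1 weight matrix w whose values are all 1;
3) Set the LR logistic regression learning rate alpha=0.01 and the maximum number of LR iterations maxCycles=500;
4) While the number of iterations is less than maxCycles, repeatedly update the weight matrix w by gradient descent;
5) Use the weight matrix w to compute the score of each session and store it in the array score_legal;
6) Return the weight matrix w and the legitimate session score array score_legal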
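Step 1 of Algorithm 1 (feature extraction for one session) could be sketched as follows. The data layout is an assumption: a session is a list of (domain, content, time) pages, FU is a set of domains, FC maps frozensets of (domain, content) items to their support, and FT maps a content class to (mu, sigma). The interval used for mtn is given only as a figure in the source, so [mu - sigma, mu + sigma] is used here purely as a placeholder assumption.

```python
def extract_features(session, FU, FC, FT, target=1):
    """Return (length, pun, mrn, rpun, mrml, mral, mrms, mras, mtn, target)."""
    items = {(d, c) for d, c, _ in session}
    domains = {d for d, _, _ in session}
    length = len(session)
    # 1.1) frequently visited sites that appear in this session
    pun = sum(1 for fu in FU if fu in domains)
    # 1.2) frequent itemsets fully contained in this session
    mrn = rpun = mrml = 0
    mrms = mrtl = mrts = 0.0
    for fc, support in FC.items():
        if fc <= items:
            mrn += 1
            mrtl += len(fc)
            mrts += support
            mrms = max(mrms, support)
            mrml = max(mrml, len(fc))
            rpun += sum(1 for d, _ in fc if d in FU)
    # 1.3) averages over the matched itemsets
    mral = mrtl / mrn if mrn else 0.0
    mras = mrts / mrn if mrn else 0.0
    # 1.4) pages browsed inside the assumed frequent time interval
    mtn = 0
    for _, content, t in session:
        if content in FT:
            mu, sigma = FT[content]
            if mu - sigma <= t <= mu + sigma:
                mtn += 1
    return (length, pun, mrn, rpun, mrml, mral, mrms, mras, mtn, target)
```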
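Then, from the weight matrix w produced by the algorithm above and the feature vector fv_i of session i, the corresponding score_i is computed with the following formula:
For fv_i ∈ FV,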
score_i = w0 + w1*fv_i.length + w2*fv_i.pun + ... + w9*fv_i.mtn
For the m sessions of the legitimate user, this yields the score array score_legal = {score_legal_1, score_legal_2, ..., score_legal_m}.
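Steps 2 to 5 of Algorithm 1 and the score formula can be sketched in plain Python as below. This is a generic batch gradient descent on the logistic loss with the stated settings (weights initialised to 1, alpha = 0.01, 500 cycles); the function names are hypothetical, and the toy data in the usage note mixes targets 1 and 0 only so that the example is well posed.

```python
from math import exp


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    ez = exp(z)
    return ez / (1.0 + ez)


def train_weights(feature_vectors, alpha=0.01, max_cycles=500):
    """Fit weights by gradient descent; each vector ends with its target."""
    datas = [[1.0] + [float(v) for v in fv[:-1]] for fv in feature_vectors]
    labels = [float(fv[-1]) for fv in feature_vectors]
    w = [1.0] * len(datas[0])
    for _ in range(max_cycles):
        errors = [sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
                  for x, y in zip(datas, labels)]
        for j in range(len(w)):
            gradient = sum(e * x[j] for e, x in zip(errors, datas))
            w[j] -= alpha * gradient
    return w


def session_score(w, fv):
    """Linear score w0 + w1*x1 + ... over the non-target features of fv."""
    x = [1.0] + [float(v) for v in fv[:-1]]
    return sum(wi * xi for wi, xi in zip(w, x))
```

For example, `w = train_weights([(2.0, 1.0, 1), (3.0, 2.0, 1), (0.5, 0.0, 0), (0.2, 0.1, 0)])` returns three weights, and `session_score(w, fv)` then gives the score of any feature vector with the same layout.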
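Step S3: from the scores of the m sessions, compute the classification threshold of the legitimate user with a fourth algorithm. In one embodiment, the fourth algorithm includes:
Figure PCTCN2016070994-appb-000013
The classification threshold is
Figure PCTCN2016070994-appb-000014
where score_legal_i is the score of the i-th session, out of m sessions in total.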
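In one embodiment, as shown in FIG. 2, the network user identity authentication method further includes:
Step S4: obtain a new session and compute its score. When the score falls within the range of the classification threshold, the current user is determined to be the legitimate user; when the score does not fall within that range, the current user is determined not to be the legitimate user. Specifically, the current (new) session is obtained with the method of step S1, its score is computed with the method of step S2, and the classification threshold from step S3 is then used to decide whether the user to whom the current session belongs is the legitimate user.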
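The decision in step S4 reduces to a range check, sketched below. The threshold range is taken as an input (low, high) because its formula is given only as a figure in the source; treating the range as closed, and the function names, are assumptions.

```python
def is_legitimate(score, threshold_range):
    """Return True when the score lies inside the closed threshold range."""
    low, high = threshold_range
    return low <= score <= high


def authenticate_sessions(scores, threshold_range):
    """Check each new session score in turn, one decision per session."""
    return [is_legitimate(s, threshold_range) for s in scores]
```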
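The present invention also provides a network user identity authentication system, which may use the network user identity authentication method described above. In one embodiment, as shown in FIG. 3, the network user identity authentication system 1 includes a user session acquisition module 11, a session score calculation module 12, and a classification threshold determination module 13, where:
The user session acquisition module 11 is used to collect all of a user's web browsing records within a set time period, each browsing record including the URL of the visited page, the text content, and a timestamp; to extract the top-level domain from the URL and extract keywords from the text content to determine the content class to which the text content belongs; to process each browsing record into the form <top-level domain, content class, timestamp>; and to treat all browsing records obtained within the set time period as one session. In one embodiment, the set time period is 30 minutes.
The session score calculation module 12 is connected to the user session acquisition module 11 and, for a given session, is used to count, from all browsing records in the session, the top-level domains that the user visits most frequently; to mine the relationship between top-level domains and content classes in the browsing records with a set first algorithm; and to mine the relationship between content classes and time periods in the browsing records with a set second algorithm, thereby obtaining n feature values of the user's web browsing. It processes the obtained feature values with a set third algorithm to obtain the weight matrix corresponding to the feature values, and computes the score of the session from the feature values and the corresponding weight matrix. In one embodiment, the feature values include: the number of elements in the session; the number of frequently visited websites in the session; the number of frequent itemsets matched by the session; the number of frequently visited websites contained in the frequent itemsets matched by the session; the length of the longest frequent itemset matched by the session; the average length of the frequent itemsets matched by the session; the maximum support of the frequent itemsets matched by the session; the average support of the frequent itemsets matched by the session; the number of frequent time periods matched by the session; and the target column. The first algorithm includes the Apriori algorithm. The second algorithm includes: using maximum likelihood estimation to compute, from the browsing records of the session, the parameter values of the normal distribution followed by the user's browsing time for each content class. The parameter values are: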
Figure PCTCN2016070994-appb-000015
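where time_i is the relative time at which the user browses content class content_i; the parameters are used to count the number of frequent time periods matched by the session. The third algorithm includes gradient descent.
The classification threshold determination module 13 is connected to the session score calculation module 12 and is used to obtain multiple session scores of the legitimate user and to compute the legitimate user's classification threshold with a fourth algorithm. In one embodiment, the fourth algorithm includes: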
Figure PCTCN2016070994-appb-000016
The classification threshold is then
Figure PCTCN2016070994-appb-000017
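where score_legal_i is the score of the i-th session, out of m sessions in total.
In one embodiment, as shown in FIG. 3, the network user identity authentication system 1 further includes a user legitimacy judgment module 14. The user legitimacy judgment module 14 is connected to the classification threshold determination module 13, the session score calculation module 12, and the user session acquisition module 11. It obtains a new session from the user session acquisition module 11 and computes the score of the new session through the session score calculation module 12. When the score falls within the range of the legitimate user's classification threshold obtained by the classification threshold determination module 13, the current user is determined to be the legitimate user; when the score does not fall within that range, the current user is determined not to be the legitimate user. The technical solution of the present invention distinguishes the browsing behavior of different users from three aspects, namely the sequence of visited URLs, the content, and the browsing time, and thereby provides a reliable safeguard for the security of user accounts. In experimental tests, the network identity authentication of the present invention achieves a detection rate of 93.6% at a false alarm rate of 10%, which is a good verification result and helps protect user account security.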
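The two rates quoted above can be computed from labelled test sessions as in the sketch below. The source does not spell out its definitions, so the common ones are assumed here: the detection rate is the share of non-legitimate sessions that are rejected, and the false alarm rate is the share of legitimate sessions that are rejected. The function name is hypothetical.

```python
def detection_and_false_alarm_rates(is_impostor, rejected):
    """Both arguments are equal-length lists of booleans, one per session."""
    impostor_total = sum(1 for imp in is_impostor if imp)
    legit_total = len(is_impostor) - impostor_total
    detected = sum(1 for imp, rej in zip(is_impostor, rejected) if imp and rej)
    false_alarms = sum(1 for imp, rej in zip(is_impostor, rejected)
                       if not imp and rej)
    detection_rate = detected / impostor_total if impostor_total else 0.0
    false_alarm_rate = false_alarms / legit_total if legit_total else 0.0
    return detection_rate, false_alarm_rate
```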
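In summary, the network user identity authentication method and system of the present invention have the following beneficial effects: 1) Sequences are mined over two factors together, the (URL, content) pairs and the (content, time) pairs browsed by the user, rather than over a single factor alone, so that the authentication method of the present invention matches the user's browsing habits. 2) Association rules are used to mine the user's browsing habits over (URL, content) jointly, and a normal distribution is used to discover the time periods in which the user frequently visits each kind of content. 3) Continuous authentication is achieved while the user browses the web. The present invention therefore effectively overcomes various shortcomings of the prior art and has high industrial utility value.
The above embodiments merely illustrate the principles and effects of the present invention and are not intended to limit it. Anyone familiar with this technology may modify or change the above embodiments without departing from the spirit and scope of the present invention. Accordingly, all equivalent modifications or changes made by persons of ordinary skill in the art without departing from the spirit and technical ideas disclosed by the present invention shall still be covered by the claims of the present invention.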

Claims (10)

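  1. A network user identity authentication method, characterized in that the network user identity authentication method comprises:
    collecting all web browsing records of a legitimate user within a set time period, each browsing record including the URL of the visited page, the text content, and a timestamp; extracting the top-level domain from the URL, extracting keywords from the text content to determine the content class to which the text content belongs, processing each browsing record into the form <top-level domain, content class, timestamp>, and treating all browsing records obtained within the set time period as one session;
    obtaining m sessions of the legitimate user and processing each session as follows: from all browsing records in the session, counting the top-level domains that the user visits most frequently, mining the relationship between top-level domains and content classes in the browsing records with a set first algorithm, and mining the relationship between content classes and time periods in the browsing records with a set second algorithm, thereby obtaining n feature values of the user's web browsing; processing the obtained feature values with a set third algorithm to obtain the weight matrix corresponding to the feature values; and computing the score of the session from the feature values and the corresponding weight matrix;
    computing the classification threshold of the legitimate user from the scores of the m sessions with a fourth algorithm.
  2. The network user identity authentication method according to claim 1, characterized in that the method further comprises: obtaining a new session and computing the score of the new session; when the score falls within the range of the classification threshold, determining that the current user is the legitimate user; and when the score does not fall within the range of the classification threshold, determining that the current user is not the legitimate user.
  3. The network user identity authentication method according to claim 1, characterized in that the feature values include: the number of elements in the session; the number of frequently visited websites in the session; the number of frequent itemsets matched by the session; the number of frequently visited websites contained in the frequent itemsets matched by the session; the length of the longest frequent itemset matched by the session; the average length of the frequent itemsets matched by the session; the maximum support of the frequent itemsets matched by the session; the average support of the frequent itemsets matched by the session; the number of frequent time periods matched by the session; and the target column.
  4. The network user identity authentication method according to claim 1, characterized in that the first algorithm includes the Apriori algorithm.
  5. The network user identity authentication method according to claim 1, characterized in that the second algorithm includes: using maximum likelihood estimation to compute, from the browsing records of the session, the parameter values of the normal distribution followed by the user's browsing time for each content class.
  6. The network user identity authentication method according to claim 5, characterized in that the parameter values include: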
    Figure PCTCN2016070994-appb-100001
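    where time_i is the relative time at which the user browses content class content_i.
  7. The network user identity authentication method according to claim 1, characterized in that the third algorithm includes the LR logistic regression algorithm.
  8. The network user identity authentication method according to claim 1, characterized in that the fourth algorithm includes: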
    Figure PCTCN2016070994-appb-100002
    the classification threshold then being
    Figure PCTCN2016070994-appb-100003
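    where score_legal_i is the score of the i-th session.
  9. A network user identity authentication system, characterized in that the network user identity authentication system comprises:
    a user session acquisition module, used to collect all web browsing records of a legitimate user within a set time period, each browsing record including the URL of the visited page, the text content, and a timestamp; to extract the top-level domain from the URL, extract keywords from the text content to determine the content class to which the text content belongs, process each browsing record into the form <top-level domain, content class, timestamp>, and treat all browsing records obtained within the set time period as one session;
    a session score calculation module, used, for a given session, to count from all browsing records in the session the top-level domains that the user visits most frequently, to mine the relationship between top-level domains and content classes in the browsing records with a set first algorithm, and to mine the relationship between content classes and time periods in the browsing records with a set second algorithm, thereby obtaining n feature values of the user's web browsing; to process the obtained feature values with a set third algorithm to obtain the weight matrix corresponding to the feature values; and to compute the score of the session from the feature values and the corresponding weight matrix;
    a classification threshold determination module, used to obtain multiple session scores of the legitimate user and to compute the classification threshold of the legitimate user with a fourth algorithm.
  10. The network user identity authentication system according to claim 9, characterized in that the system further comprises a user legitimacy judgment module, used to obtain a new session and compute the score of the new session; when the score falls within the range of the classification threshold, the current user is determined to be the legitimate user; and when the score does not fall within the range of the classification threshold, the current user is determined not to be the legitimate user.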
PCT/CN2016/070994 2015-11-20 2016-01-15 Network user identity authentication method and system WO2017084205A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2018100671A AU2018100671A4 (en) 2015-11-20 2018-05-18 Network user identity authentication method and system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510810443.1 2015-11-20
CN201510810443.1A CN105337987B (zh) 2015-11-20 2015-11-20 Network user identity authentication method and system

Related Child Applications (1)

Application Number Title Priority Date Filing Date
AU2018100671A Division AU2018100671A4 (en) 2015-11-20 2018-05-18 Network user identity authentication method and system

Publications (1)

Publication Number Publication Date
WO2017084205A1 true WO2017084205A1 (zh) 2017-05-26

Family

ID=55288270

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/070994 WO2017084205A1 (zh) 2015-11-20 2016-01-15 Network user identity authentication method and system

Country Status (2)

Country Link
CN (1) CN105337987B (zh)
WO (1) WO2017084205A1 (zh)

Also Published As

Publication number Publication date
CN105337987B (zh) 2018-07-03
CN105337987A (zh) 2016-02-17

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16865388

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 17/10/2018)

122 Ep: pct application non-entry in european phase

Ref document number: 16865388

Country of ref document: EP

Kind code of ref document: A1