CN105337987A - Network user identity authentication method and system - Google Patents

Network user identity authentication method and system

Info

Publication number
CN105337987A
Authority
CN
China
Prior art keywords
session
user
algorithm
content
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510810443.1A
Other languages
Chinese (zh)
Other versions
CN105337987B (en
Inventor
蒋昌俊
闫春钢
陈闳中
丁志军
季梦清
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tongji University
Original Assignee
Tongji University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tongji University filed Critical Tongji University
Priority to CN201510810443.1A priority Critical patent/CN105337987B/en
Priority to PCT/CN2016/070994 priority patent/WO2017084205A1/en
Publication of CN105337987A publication Critical patent/CN105337987A/en
Application granted granted Critical
Publication of CN105337987B publication Critical patent/CN105337987B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/08 Network architectures or network communication protocols for network security for authentication of entities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/535 Tracking the activity of the user

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Storage Device Security (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention provides a network user identity authentication method and system. The method comprises the following steps: collecting all web browsing records of a legitimate user within a set time period as one session, and processing each browsing record into the form <website top-level domain, content class, timestamp>; acquiring m sessions of the legitimate user and, for each session: obtaining n feature values of the user's web browsing from the session; processing the obtained feature values with a set third algorithm to obtain a weight matrix corresponding to the feature values; and calculating the score of the session from the feature values and the corresponding weight matrix; and calculating a classification threshold of the legitimate user from the session scores with a fourth algorithm. The technical scheme authenticates the user continuously on the basis of three aspects of the browsing records, namely website, content and time, which improves the authentication effect.

Description

Network user identity authentication method and system
Technical field
The present invention relates to network security technology, and in particular to a network user identity authentication method and system.
Background art
With the development of information technology and the Internet, the number of Internet users in China keeps growing, and online shopping and transactions are increasingly frequent; going online has become an indispensable part of many people's lives. At the same time, fraud in online shopping transactions has risen sharply in recent years, and new forms of online fraud that combine social deception with technical means have become the primary security threat to Internet users' online lives. Authenticating the identity of network users is an important way to secure online transactions. User identity authentication can be divided into two classes: one-time authentication and continuous authentication. One-time authentication currently includes traditional password-based authentication, smart-card-based authentication, and authentication based on the user's biometric and behavioral features. However, one-time authentication verifies the user only at a single moment: once authentication passes, the user's identity is deemed legitimate thereafter, so it cannot protect the user well and provides no continuous authentication. Research on continuous authentication is still relatively scarce, and existing work mainly studies either the user's URL sequence or the relationships among the content the user browses. User browsing behavior is not considered comprehensively enough, and the authentication effect leaves room for improvement.
In view of this, how to find a technical scheme that further improves the security of network user identity authentication has become a problem that those skilled in the art urgently need to solve.
Summary of the invention
In view of the above shortcomings of the prior art, the object of the present invention is to provide a network user identity authentication method and system, so as to solve the problem that the security of network user identity authentication in the prior art needs further improvement.
To achieve the above and other related objects, the invention provides a network user identity authentication method, comprising: collecting all web page browsing records of a legitimate user within a set time period, each browsing record comprising the URL of the browsed page, its text content and a timestamp; extracting the website top-level domain from the URL, extracting keywords from the text content to determine the content class to which the text content belongs, processing each browsing record into the form <website top-level domain, content class, timestamp>, and taking all browsing records obtained within the set time period as one session; acquiring m sessions of the legitimate user and processing each session as follows: based on all browsing records in the session, counting the website top-level domains that the user visits most frequently, mining the relationship between website top-level domain and content class in the browsing records with a set first algorithm, and mining the relationship between content class and time period in the browsing records with a set second algorithm, thereby obtaining n feature values of the user's web browsing; processing the obtained feature values with a set third algorithm to obtain a weight matrix corresponding to the feature values; and calculating the score of the session from the feature values and the corresponding weight matrix; and calculating a classification threshold of the legitimate user from the scores of the m sessions with a fourth algorithm.
Optionally, the method further comprises: acquiring a new session and calculating its score; when the score falls within the range of the classification threshold, determining that the current user is the legitimate user; and when the score does not fall within the range of the classification threshold, determining that the current user is not the legitimate user.
Optionally, the feature values comprise: the number of elements the session contains; the number of frequently visited websites the session contains; the number of frequent itemsets the session matches; the number of frequently visited websites contained in the matched frequent itemsets; the length of the longest matched frequent itemset; the average length of the matched frequent itemsets; the maximum support of the matched frequent itemsets; the average support of the matched frequent itemsets; the number of frequent time periods the session matches; and the target column.
Optionally, the first algorithm comprises the Apriori algorithm.
Optionally, the second algorithm comprises: using maximum likelihood estimation to calculate, from the browsing records of the sessions, the parameter values of the normal distribution that the user's browsing time for each content class follows.
Optionally, the parameter values comprise: $\hat{\mu} = \frac{1}{m}\sum_{i=1}^{m} time_i$ and $\hat{\sigma}^2 = \frac{1}{m-1}\sum_{i=1}^{m} (time_i - \hat{\mu})^2$, where $time_i$ is the relative time at which the user browses content class $content_i$.
Optionally, the third algorithm comprises the logistic regression (LR) algorithm.
Optionally, the fourth algorithm comprises: [formula omitted in source]; the classification threshold is then [formula omitted in source], where score_legal_i is the score of the i-th session and m is the total number of sessions.
Optionally, the set time period comprises 30 minutes.
The invention further provides a network user identity authentication system, comprising: a user session acquisition module, configured to collect all web page browsing records of a legitimate user within a set time period, each browsing record comprising the URL of the browsed page, its text content and a timestamp; to extract the website top-level domain from the URL, extract keywords from the text content to determine the content class to which the text content belongs, and process each browsing record into the form <website top-level domain, content class, timestamp>; and to take all browsing records obtained within the set time period as one session; a session score calculation module, configured to, for one session and based on all browsing records in the session, count the website top-level domains that the user visits most frequently, mine the relationship between website top-level domain and content class in the browsing records with a set first algorithm, and mine the relationship between content class and time period in the browsing records with a set second algorithm, thereby obtaining n feature values of the user's web browsing; to process the obtained feature values with a set third algorithm to obtain a weight matrix corresponding to the feature values; and to calculate the score of the session from the feature values and the corresponding weight matrix; and a classification threshold determination module, configured to obtain a plurality of session scores of the legitimate user and calculate the classification threshold of the legitimate user with a fourth algorithm.
Optionally, the system further comprises a user legitimacy judgment module, configured to acquire a new session and calculate its score; when the score falls within the range of the classification threshold, to determine that the current user is the legitimate user; and when the score does not fall within the range of the classification threshold, to determine that the current user is not the legitimate user.
Optionally, the feature values comprise: the number of elements the session contains; the number of frequently visited websites the session contains; the number of frequent itemsets the session matches; the number of frequently visited websites contained in the matched frequent itemsets; the length of the longest matched frequent itemset; the average length of the matched frequent itemsets; the maximum support of the matched frequent itemsets; the average support of the matched frequent itemsets; the number of frequent time periods the session matches; and the target column.
Optionally, the first algorithm comprises the Apriori algorithm.
Optionally, the second algorithm comprises: using maximum likelihood estimation to calculate, from the browsing records of the sessions, the parameter values of the normal distribution that the user's browsing time for each content class follows.
Optionally, the parameter values comprise: $\hat{\mu} = \frac{1}{m}\sum_{i=1}^{m} time_i$ and $\hat{\sigma}^2 = \frac{1}{m-1}\sum_{i=1}^{m} (time_i - \hat{\mu})^2$, where $time_i$ is the relative time at which the user browses content class $content_i$.
Optionally, the third algorithm comprises the logistic regression (LR) algorithm.
Optionally, the fourth algorithm comprises: [formula omitted in source]; the classification threshold is then [formula omitted in source], where score_legal_i is the score of the i-th session and m is the total number of sessions.
Optionally, the set time period comprises 30 minutes.
As described above, the network user identity authentication method and system of the present invention have the following beneficial effects: 1) Sequence mining is performed on two factor pairs browsed by the user, (website, content) and (content, time), rather than on a single factor alone, so that the authentication method of the invention fits the user's browsing habits. 2) Association rules are used to mine the user's browsing habits from the (website, content) combination, and a normal distribution is used to find the time periods in which the user frequently accesses each content class. 3) Continuous authentication is achieved while the user browses web pages.
Brief description of the drawings
Fig. 1 is a schematic flow chart of one embodiment of the network user identity authentication method of the present invention.
Fig. 2 is a schematic flow chart of another embodiment of the network user identity authentication method of the present invention.
Fig. 3 is a schematic module diagram of one embodiment of the network user identity authentication system of the present invention.
Description of reference numerals
1 Network user identity authentication system
11 User session acquisition module
12 Session score calculation module
13 Classification threshold determination module
14 User legitimacy judgment module
S1 ~ S4 Steps
Detailed description of the embodiments
The embodiments of the present invention are described below by way of specific examples; those skilled in the art can readily understand other advantages and effects of the invention from the content disclosed in this specification. The invention can also be implemented or applied through other, different embodiments, and the details in this specification can be modified or changed in various ways, based on different viewpoints and applications, without departing from the spirit of the invention.
It should be noted that the drawings provided in the embodiments illustrate the basic concept of the invention only schematically. They show only the components related to the invention and are not drawn according to the number, shape and size of the components in an actual implementation; in an actual implementation the form, quantity and proportion of each component may vary freely, and the component layout may be more complex.
The invention provides a network user identity authentication method that authenticates identity according to the user's browsing behavior. In one embodiment, as shown in Fig. 1, the method comprises:
Step S1: collect all web page browsing records of a legitimate user within a set time period, each browsing record comprising the URL of the browsed page, its text content and a timestamp; extract the website top-level domain from the URL, extract keywords from the text content to determine the content class to which the text content belongs, and process each browsing record into the form <website top-level domain, content class, timestamp>; take all browsing records obtained within the set time period as one session. In one embodiment, the web browsing records of a user are collected and processed into the session structure {(domain, content, timestamp)}, which serves as the basis for subsequent analysis. The collected browsing records are divided at intervals of 30 minutes, so that every 30 minutes yields one session. Step S1 is performed repeatedly, for example m times, to obtain m sessions, and the m sessions are finally merged into the corresponding session set S. Subsequent authentication is likewise performed per access behavior of the user (one session of 30 minutes).
In one embodiment, the browsing records of the legitimate user are first collected from the SQLite database that ships with the Chrome browser. This database records the details of each web page the user browses. For each page, the URL (Uniform Resource Locator, i.e. the web page address), the text content and the timestamp are collected as the original browsing record. A browsing record is denoted r, and its attributes are listed in Table 1.
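As an illustrative sketch only, the collection step could be tried in Python against a copy of Chrome's History file. The table and column names used below (urls.url, urls.title, visits.visit_time, with visit_time counted in microseconds since 1601-01-01) are assumptions about Chrome's history schema, which may differ between browser versions, and read_history is a hypothetical helper name. The history file holds page titles rather than full page text, so the text content would have to be obtained separately.

```python
import sqlite3
from datetime import datetime, timedelta

# Assumption: Chrome stores visit_time as microseconds since 1601-01-01 (UTC).
WEBKIT_EPOCH = datetime(1601, 1, 1)


def read_history(db_path):
    """Return [(url, title, visit_datetime)] ordered by visit time."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT urls.url, urls.title, visits.visit_time "
            "FROM visits JOIN urls ON visits.url = urls.id "
            "ORDER BY visits.visit_time"
        ).fetchall()
    finally:
        conn.close()
    return [
        (url, title, WEBKIT_EPOCH + timedelta(microseconds=t))
        for url, title, t in rows
    ]
```

Working on a copy of the file is usually necessary, because the browser keeps the original locked while it is running.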
After the raw data are obtained, they can be processed as follows. First, each browsing record in the session is processed and the top-level domain is extracted from its URL. Then, using the text classification corpus of Sogou Labs together with articles commonly found on the web, keywords are extracted from the text content of each class, and the title of the web page to be classified is matched against these keywords to obtain the content class of the page. After this processing, the first original browsing record in Table 1, for example, becomes (news.163.com, society, timestamp); data in this form are denoted as a web page p(domain, content, timestamp).
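The record processing and the 30-minute division described above might look as follows in Python. This is a minimal sketch under stated assumptions: timestamps are plain seconds, the content class has already been determined (the keyword matching is not shown), the host name of the URL stands in for the "top-level domain" as in the news.163.com example, and to_record and split_sessions are hypothetical helper names.

```python
from collections import defaultdict
from urllib.parse import urlparse

SESSION_SECONDS = 30 * 60  # the 30-minute window named in the text


def to_record(url, content_class, timestamp):
    """Turn one raw browsing record into (domain, content, timestamp)."""
    domain = urlparse(url).hostname or ""
    return (domain, content_class, timestamp)


def split_sessions(records):
    """Group records into sessions using fixed 30-minute windows."""
    buckets = defaultdict(list)
    for rec in sorted(records, key=lambda r: r[2]):
        buckets[rec[2] // SESSION_SECONDS].append(rec)
    # Windows without any record produce no session.
    return [buckets[k] for k in sorted(buckets)]
```

Fixed windows are only one way to cut the record stream; a window that starts at the first record of each session would be an equally plausible reading of the text.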
Step S2: acquire m sessions of the legitimate user and process each session as follows: based on all browsing records in the session, count the website top-level domains that the user visits most frequently, mine the relationship between website top-level domain and content class in the browsing records with a set first algorithm, and mine the relationship between content class and time period in the browsing records with a set second algorithm, thereby obtaining n feature values of the user's web browsing; process the obtained feature values with a set third algorithm to obtain a weight matrix corresponding to the feature values; calculate the score of the session from the feature values and the corresponding weight matrix. In one embodiment, the collected browsing records are divided at intervals of 30 minutes, so that every 30 minutes yields one session; step S1 is performed m times to obtain m sessions, and the m sessions are finally merged into the corresponding session set S. The feature values comprise: the number of elements the session contains; the number of frequently visited websites the session contains; the number of frequent itemsets the session matches; the number of frequently visited websites contained in the matched frequent itemsets; the length of the longest matched frequent itemset; the average length of the matched frequent itemsets; the maximum support of the matched frequent itemsets; the average support of the matched frequent itemsets; the number of frequent time periods the session matches; and the target column. The first algorithm comprises the Apriori algorithm. Apriori is a frequent itemset algorithm for mining association rules; its core idea is to mine frequent itemsets in two stages, candidate generation and downward-closure checking. The algorithm has been widely used in fields such as business and network security.
The second algorithm comprises: using maximum likelihood estimation to calculate, from the browsing records of the sessions, the parameter values of the normal distribution that the user's browsing time for each content class follows. The parameter values are $\hat{\mu} = \frac{1}{m}\sum_{i=1}^{m} time_i$ and $\hat{\sigma}^2 = \frac{1}{m-1}\sum_{i=1}^{m} (time_i - \hat{\mu})^2$, where $time_i$ is the relative time at which the user browses content class $content_i$; the parameters are used to count the number of frequent time periods the session matches. The third algorithm comprises the logistic regression (LR) algorithm. Logistic regression is a typical binary classification algorithm; the model it produces is relatively simple and easy to interpret, and it is not prone to overfitting. In essence it learns a function f: X -> Y. Given an n-dimensional variable vector X = <X1, X2, ..., Xn> and an m-dimensional target vector Y = <Y1, Y2, ..., Ym>, logistic regression learns a function f(X) that, from the given variable values, fits the target vector Y as closely as possible.
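For reference, the standard logistic regression model and a batch gradient descent update can be written as follows. The symbols $\sigma$, $x_i$, $y_i$ and the learning rate $\alpha$ are notation assumed here for illustration; the text itself states only that logistic regression is used.

```latex
P(y = 1 \mid x) = \sigma(w^{\top} x) = \frac{1}{1 + e^{-w^{\top} x}},
\qquad
w \leftarrow w - \alpha \sum_{i} \bigl(\sigma(w^{\top} x_i) - y_i\bigr)\, x_i
```

The update is gradient descent on the negative log-likelihood, so each step moves w toward weights under which the labelled sessions are more probable.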
In one embodiment, different users browse particular content on different websites at different times. Based on this browsing characteristic, features are extracted mainly from three aspects: frequently visited websites, (website, content) and (content, time period). From the session set {(domain, content, timestamp)}, feature extraction is carried out through frequent website statistics, frequent itemset mining and frequent time period mining to obtain the user's browsing features. In one embodiment, statistics can be computed for multiple users at the same time:
Frequent website statistics: because the web pages each user browses frequently differ, the 15 website top-level domains that each user visits most frequently are counted and put into the frequently visited website set FUj of the corresponding user j.
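The frequent website statistics can be sketched with a simple counter. The function name frequent_sites is hypothetical, and sessions are assumed to be lists of (domain, content, timestamp) tuples.

```python
from collections import Counter


def frequent_sites(sessions, top_n=15):
    """Return the set of the top_n most frequently visited domains."""
    counts = Counter(
        domain for session in sessions for (domain, _content, _ts) in session
    )
    return {domain for domain, _count in counts.most_common(top_n)}
```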
Frequent itemset mining: the Apriori algorithm is used to mine the sequence relations that exist between (website, content). In this process, a suitable support threshold δ is chosen by experiment. For a frequent itemset fc: X, Y, if support(fc) > δ, the itemset is added to the web browsing frequent itemset collection FCj of the corresponding user j.
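A compact Apriori sketch is given below to make the two stages (candidate generation and downward-closure checking) concrete. It is an illustration, not the implementation used in the invention: each session is assumed to be one transaction whose items are its domains and content classes, and min_support plays the role of δ.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for itemsets with support > min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s > min_support:
            current[frozenset([item])] = s
    frequent = dict(current)
    k = 2
    while current:
        # Candidate generation: join frequent (k-1)-itemsets into k-itemsets.
        prev = list(current)
        candidates = {a | b for a, b in combinations(prev, 2) if len(a | b) == k}
        # Downward-closure check: every (k-1)-subset must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        current = {}
        for c in candidates:
            s = support(c)
            if s > min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent
```

The strict comparison mirrors the condition support(fc) > δ in the text.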
Frequent time period mining: for a given user, the time at which the user browses a particular content class is assumed to follow a normal distribution. The (content, time) data obtained from the session set S* are used to build, for each content class, the normal distribution model that the user's browsing time follows. Because the parameters of the normal distribution cannot be obtained exactly, maximum likelihood estimation is used to calculate from the sessions the parameter values of the normal distribution for each content class, where $\hat{\mu} = \frac{1}{m}\sum_{i=1}^{m} time_i$ and $\hat{\sigma}^2 = \frac{1}{m-1}\sum_{i=1}^{m} (time_i - \hat{\mu})^2$.
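The two estimates can be computed per content class with the Python standard library. In this sketch the browsing time is assumed to be a plain number (for example the hour of day), and fit_time_models is a hypothetical name.

```python
from collections import defaultdict
from statistics import mean, variance


def fit_time_models(sessions):
    """Estimate (mu, sigma^2) of the browsing time for each content class."""
    times = defaultdict(list)
    for session in sessions:
        for _domain, content, t in session:
            times[content].append(t)
    # statistics.variance divides by (m - 1), matching the formula in the text;
    # classes with fewer than two observations are skipped.
    return {c: (mean(ts), variance(ts)) for c, ts in times.items() if len(ts) >= 2}
```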
The corresponding feature values of each session are then obtained; each session in the session set has a corresponding feature value vector, denoted fv_ji. In one embodiment, fv_ji = <length_i, pun_i, mrn_i, rpun_i, mrml_i, mral_i, mrms_i, mras_i, mtn_i, target_i>; the meaning of each value is shown in Table 2.
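One possible way to assemble this ten-tuple for a single session is sketched below. Several points are assumptions made for illustration: a session is taken to "match" an itemset when every item of the itemset occurs among the session's domains and content classes, rpun counts distinct frequent sites across the matched itemsets, and each frequent time period in ft is given as a (low, high) interval per content class, since the text does not spell out the interval bounds.

```python
def session_features(session, fu, fc, ft, target=1):
    """Build (length, pun, mrn, rpun, mrml, mral, mrms, mras, mtn, target).

    session: list of (domain, content, time)
    fu: set of frequently visited domains
    fc: dict mapping frozenset itemsets to their support
    ft: dict mapping content class to an assumed (low, high) time interval
    """
    domains = {d for d, _c, _t in session}
    items = domains | {c for _d, c, _t in session}
    matched = [(itemset, sup) for itemset, sup in fc.items() if itemset <= items]
    lengths = [len(itemset) for itemset, _sup in matched]
    supports = [sup for _itemset, sup in matched]
    matched_items = set().union(*(itemset for itemset, _sup in matched))

    length = len(session)                      # elements in the session
    pun = len(domains & fu)                    # frequent sites in the session
    mrn = len(matched)                         # matched frequent itemsets
    rpun = len(matched_items & fu)             # frequent sites inside matched itemsets
    mrml = max(lengths, default=0)             # longest matched itemset
    mral = sum(lengths) / mrn if mrn else 0.0  # average matched length
    mrms = max(supports, default=0.0)          # maximum matched support
    mras = sum(supports) / mrn if mrn else 0.0 # average matched support
    mtn = sum(
        1 for _d, c, t in session if c in ft and ft[c][0] <= t <= ft[c][1]
    )                                          # records inside a frequent time period
    return (length, pun, mrn, rpun, mrml, mral, mrms, mras, mtn, target)
```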
After the feature value set of the sessions is obtained, the logistic regression (LR) algorithm is used to carry out the authentication method based on user browsing features (hereinafter UBFAA); the detailed process is shown in Algorithm 1.
Algorithm 1: Authentication method based on user browsing features (UBFAA)
Input: session set S* of the legitimate user, frequent itemset collection FC of the legitimate user, frequently visited website set FU of the legitimate user, and frequent access time period set FT
Output: feature weight matrix w, array score_legal
1) Traverse each session s*i of the legitimate user's session set S*
mrtl = 0; // total length of the frequent itemsets matched by session s*i
mrts = 0; // total support of the frequent itemsets matched by session s*i
pun = 0;
length = number of elements contained in session s*i;
target = 1;
1.1) Traverse the frequently visited website set FU of the legitimate user
if FU contains an fuj equal to the top-level domain of a web page in the current session, then pun is incremented by 1;
1.2) Traverse the frequent itemset collection FC of the legitimate user
if the current session contains the frequent itemset fcj
1.2.1) mrn is incremented by 1, mrtl accumulates the length of the current frequent itemset, and mrts accumulates the support of the current frequent itemset;
1.2.2) the maximum support of the rules matched by the current session is stored in mrms;
1.2.3) the maximum length of the rules matched by the current session is stored in mrml;
1.2.4) the number of frequently visited websites contained in fcj is counted and stored in rpun;
1.3) Compute the average support mras and the average length mral of the rules matched by the current session;
1.4) Traverse the frequent time period set FT of the legitimate user
if FT contains an ftj such that, for a web page in the current session, page.content = ftj.content and page.time falls within the interval of ftj,
then mtn is incremented by 1;
1.5) Write the attributes of session s*i into the ten-tuple set FVi;
2) Traverse the ten-tuple set FVi
2.1) create the matrix datas, set its entire first column to 1, and store the feature data in the matrix;
2.2) create the matrix labels, and store the last column of FVi in labels;
2.3) create a weight matrix w of size 10*1 with all values equal to 1;
3) Set the learning rate of LR to alpha = 0.01 and the maximum number of iterations to maxCycles = 500;
4) While the iteration count is less than maxCycles, repeatedly use gradient descent to update the weight matrix w;
5) Use the weight matrix w to calculate the score of each session, and store it in the array score_legal;
6) Return the weight matrix w and the legitimate session score array score_legal.
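Steps 2) to 4) of Algorithm 1 can be sketched in plain Python as batch gradient descent on the logistic loss, using the stated learning rate 0.01 and 500 iterations. The sketch assumes that the training data also contain sessions labelled 0 (for example sessions of other users), since a classifier cannot be fitted from target = 1 rows alone; train_lr and sigmoid are hypothetical names.

```python
import math


def sigmoid(z):
    # Clamp the input so math.exp cannot overflow on extreme values.
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train_lr(features, labels, alpha=0.01, max_cycles=500):
    """Return weights [w0, w1, ...] fitted by batch gradient descent."""
    data = [[1.0] + list(row) for row in features]  # first column is all ones
    w = [1.0] * len(data[0])                        # weights start at all ones
    for _ in range(max_cycles):
        errors = [
            sigmoid(sum(wj * xj for wj, xj in zip(w, x))) - y
            for x, y in zip(data, labels)
        ]
        # Compute the full gradient first so all weights update together.
        grads = [sum(e * x[j] for e, x in zip(errors, data)) for j in range(len(w))]
        w = [wj - alpha * g for wj, g in zip(w, grads)]
    return w
```

With nine feature columns plus the column of ones, the returned list has ten entries, which corresponds to the 10*1 weight matrix in step 2.3.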
Then, from the weight matrix w obtained by the above algorithm and the feature value vector fv_j of session j, the corresponding score score_j is calculated by the following formula:
For fv_i ∈ FV,
score = w_0 + w_1 * fv_i.length + w_2 * fv_i.pun + ... + w_10 * fv_i.mtn
For the m sessions of the legitimate user, the score array score_legal = {score_legal_1, score_legal_2, ..., score_legal_m} is obtained.
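The score is a weighted sum of the feature values plus a bias weight. A minimal sketch, assuming fv holds only the feature values (without the target column) and w holds one bias weight followed by one weight per feature; session_score is a hypothetical name.

```python
def session_score(w, fv):
    """Return w[0] + w[1] * fv[0] + w[2] * fv[1] + ... for one session."""
    if len(w) != len(fv) + 1:
        raise ValueError("expected one weight per feature plus a bias weight")
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], fv))
```

Applying the function to each of the m legitimate sessions yields the array score_legal.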
Step S3: calculate the classification threshold of the legitimate user from the scores of the m sessions with a fourth algorithm. In one embodiment, the fourth algorithm comprises: the classification threshold is [formula omitted in source], where score_legal_i is the score of the i-th session and m is the total number of sessions.
In one embodiment, as shown in Fig. 2, the method further comprises:
Step S4: acquire a new session and calculate its score; when the score falls within the range of the classification threshold, determine that the current user is the legitimate user; when the score does not fall within the range of the classification threshold, determine that the current user is not the legitimate user. Specifically, a current session (the new session) is obtained by the method of step S1, its score is calculated by the method of step S2, and the classification threshold from step S3 is then used to judge whether the user to whom the current session belongs is the legitimate user.
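The decision in step S4 reduces to a range check that is repeated for every new session. In the sketch below the threshold range is passed in as a (low, high) pair, which is an assumption: how the range is derived from the legitimate scores is left to the fourth algorithm. Both function names are hypothetical.

```python
def authenticate(score, threshold_range):
    """Return True when the session score lies inside the threshold range."""
    low, high = threshold_range
    return low <= score <= high


def continuous_authentication(session_scores, threshold_range):
    """Yield one decision per session, in arrival order."""
    for score in session_scores:
        yield authenticate(score, threshold_range)
```

Because a decision is produced for each 30-minute session, the check keeps running for as long as the user keeps browsing.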
The invention further provides a network user identity authentication system, which can adopt the network user identity authentication method described above. In one embodiment, as shown in Fig. 3, the network user identity authentication system 1 comprises a user session acquisition module 11, a session score calculation module 12 and a classification threshold determination module 13, wherein:
The user session acquisition module 11 is configured to collect all web page browsing records of a user within a set time period, each browsing record comprising the URL of the browsed page, its text content and a timestamp; to extract the website top-level domain from the URL, extract keywords from the text content to determine the content class to which the text content belongs, and process each browsing record into the form <website top-level domain, content class, timestamp>; and to take all browsing records obtained within the set time period as one session. In one embodiment, the set time period comprises 30 minutes.
The session score computing module 12 is connected to the user session acquisition module 11. For a given session, it uses all browsing records in the session to count the URL top-level domains the user visits most frequently, uses a set first algorithm to mine the relationship between URL top-level domain and content class in the browsing records, and uses a set second algorithm to mine the relationship between content class and time period in the browsing records, thereby obtaining n feature values of the user's web browsing. It then processes the obtained feature values according to a set third algorithm to obtain the weight matrix corresponding to the feature values, and calculates the score of the session from the feature values and the corresponding weight matrix. In one embodiment, the feature values comprise: the number of elements contained in the session; the number of frequently visited websites contained in the session; the number of frequent itemsets matched by the session; the number of frequently visited websites contained in the matched frequent itemsets; the length of the longest matched frequent itemset; the average length of the matched frequent itemsets; the maximum support of the matched frequent itemsets; the average support of the matched frequent itemsets; the number of frequent time periods matched by the session; and the target column. The first algorithm comprises the Apriori algorithm. The second algorithm comprises: using maximum likelihood estimation to calculate, from the browsing records of the session, the parameter values of the normal distribution that the user's browsing time for each content class follows. The parameter values are $\hat{\mu} = \frac{\sum_{i=1}^{m} time_i}{m}$ and $\hat{\sigma}^2 = \frac{\sum_{i=1}^{m} [time_i - \hat{\mu}]^2}{m - 1}$, where $time_i$ is the relative time at which the user browses content class $content_i$. These parameters are used to count the number of frequent time periods matched by the session. The third algorithm comprises gradient descent.
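As a rough illustration of the first and second algorithms, the sketch below mines frequent itemsets with a small level-wise Apriori and estimates the normal-distribution parameters using the formulas in the text. It assumes that each session is treated as one transaction whose items are strings of the hypothetical form "domain|content class", and that browsing times are given as numbers such as hour of day; neither encoding is stated in the patent.

```python
from itertools import combinations
from statistics import mean, variance


def apriori(transactions, min_support):
    """Level-wise Apriori: return {frozenset(itemset): support} for frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Support = fraction of transactions that contain the candidate itemset.
        level = {}
        for c in candidates:
            support = sum(1 for t in transactions if c <= t) / n
            if support >= min_support:
                level[c] = support
        frequent.update(level)
        # Join step: build (k+1)-itemsets from frequent k-itemsets, then prune
        # any candidate that has an infrequent k-subset.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def normal_params(times):
    """Estimate (mu, sigma^2) as in the text; note the m - 1 denominator.

    Requires at least two observations.
    """
    return mean(times), variance(times)
```

Features such as the number of matched frequent itemsets or their maximum support could then be derived by comparing a new session's items against the keys and values of the returned dictionary.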
The classification threshold determination module 13 is connected to the session score computing module 12. It obtains multiple session scores of the legitimate user and uses a fourth algorithm to calculate the classification threshold of the legitimate user. In one embodiment, the fourth algorithm comprises: [formula not shown]; the classification threshold is then [formula not shown], where $score_{legal,i}$ is the score of the i-th session and there are m sessions in total.
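The formula of the fourth algorithm is not available in this text, so the following sketch substitutes a different, commonly used technique: an empirical quantile of the legitimate user's session scores. It is not the patent's formula. The function name, the target false alarm rate parameter, and the convention that higher scores mean "more likely legitimate" are all assumptions.

```python
import math


def quantile_threshold(legit_scores, false_alarm_rate=0.10):
    """Return a cutoff such that about `false_alarm_rate` of the legitimate
    user's own session scores fall strictly below it (stand-in technique)."""
    if not legit_scores:
        raise ValueError("at least one session score is required")
    ordered = sorted(legit_scores)
    # Index of the first score that is kept; clamp so the index stays valid.
    k = min(math.floor(false_alarm_rate * len(ordered)), len(ordered) - 1)
    return ordered[k]
```

With ten legitimate scores and a 10% target, the lowest score falls below the returned cutoff and the remaining nine are accepted.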
In one embodiment, as shown in Figure 3, the network user identity authentication system 1 further comprises a user legitimacy judgment module 14, which is connected to the classification threshold determination module 13, the session score computing module 12, and the user session acquisition module 11. It obtains a new session from the user session acquisition module 11 and has the session score computing module 12 calculate the score of the new session. When the score falls within the range of the legitimate user's classification threshold obtained by the classification threshold determination module 13, the current user is judged to be the legitimate user; when the score does not fall within the range of the classification threshold, the current user is judged not to be the legitimate user. The technical solution of the present invention distinguishes the browsing behavior of different users in three respects, namely the sequence of URLs browsed, the content, and the browsing time, thereby providing a reliable guarantee for the security of user accounts. In experimental tests, at a false alarm rate of 10%, the network identity authentication of the present invention reaches a detection rate of 93.6%, which shows good authentication performance and helps ensure user account security.
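The scoring and judgment steps can be sketched as below, assuming the weights are those of a logistic regression fitted by batch gradient descent, that labeled training sessions (1 for the legitimate user, 0 for others) are available through the target column, and that a score at or above the threshold counts as falling within its range. The function names, learning rate, and epoch count are hypothetical choices for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def session_score(weights, x):
    """Score in (0, 1): sigmoid of the bias plus the weighted sum of feature values."""
    return sigmoid(weights[0] + sum(w * xj for w, xj in zip(weights[1:], x)))


def train_weights(features, labels, lr=0.1, epochs=500):
    """Fit logistic-regression weights (bias first) by batch gradient descent."""
    weights = [0.0] * (len(features[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(features, labels):
            error = session_score(weights, x) - y  # gradient of the log-loss
            grad[0] += error
            for j, xj in enumerate(x):
                grad[j + 1] += error * xj
        weights = [w - lr * g / len(features) for w, g in zip(weights, grad)]
    return weights


def is_legitimate(weights, x, threshold):
    """Accept the session when its score reaches the threshold (assumed direction)."""
    return session_score(weights, x) >= threshold
```

In practice the feature values would usually be normalized before training so that a single learning rate suits all of them.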
In summary, the network user identity authentication method and system of the present invention have the following beneficial effects: 1) sequences are mined over two factor pairs of the user's browsing, (URL, content) and (content, time), rather than over a single factor alone, so that the authentication method of the present invention matches the user's browsing habits; 2) association rules are used to mine the user's browsing habits over the combined (URL, content) pair, and a normal distribution is used to find the time periods in which the user frequently accesses each content class; 3) continuous authentication is achieved while the user browses web pages. The present invention therefore effectively overcomes various shortcomings of the prior art and has high industrial utility value.
The above embodiments merely illustrate the principles and effects of the present invention and are not intended to limit it. Anyone skilled in the art may modify or change the above embodiments without departing from the spirit and scope of the present invention. Accordingly, all equivalent modifications or changes made by persons of ordinary skill in the art without departing from the spirit and technical ideas disclosed by the present invention shall still be covered by the claims of the present invention.

Claims (10)

1. A network user identity authentication method, characterized in that the method comprises:
collecting all web page browsing records of a legitimate user within a set time period, each browsing record comprising the URL of the browsed page, its text content, and a timestamp; extracting the top-level domain from the URL, extracting keywords from the text content to determine the content class to which the text content belongs, converting each browsing record into the form <URL top-level domain, content class, timestamp>, and taking all browsing records obtained within the set time period as one session;
obtaining m sessions of the legitimate user and processing each session as follows: according to all browsing records in the session, counting the URL top-level domains the user visits most frequently, using a set first algorithm to mine the relationship between URL top-level domain and content class in the browsing records, and using a set second algorithm to mine the relationship between content class and time period in the browsing records, thereby obtaining n feature values of the user's web browsing; processing the obtained feature values according to a set third algorithm to obtain a weight matrix corresponding to the feature values; and calculating the score of the session from the feature values and the corresponding weight matrix;
calculating the classification threshold of the legitimate user from the scores of the m sessions using a fourth algorithm.
2. The network user identity authentication method according to claim 1, characterized in that the method further comprises: obtaining a new session and calculating the score of the new session; when the score falls within the range of the classification threshold, judging that the current user is the legitimate user; when the score does not fall within the range of the classification threshold, judging that the current user is not the legitimate user.
3. The network user identity authentication method according to claim 1, characterized in that the feature values comprise: the number of elements contained in the session; the number of frequently visited websites contained in the session; the number of frequent itemsets matched by the session; the number of frequently visited websites contained in the matched frequent itemsets; the length of the longest matched frequent itemset; the average length of the matched frequent itemsets; the maximum support of the matched frequent itemsets; the average support of the matched frequent itemsets; the number of frequent time periods matched by the session; and the target column.
4. The network user identity authentication method according to claim 1, characterized in that the first algorithm comprises the Apriori algorithm.
5. The network user identity authentication method according to claim 1, characterized in that the second algorithm comprises: using maximum likelihood estimation to calculate, from the browsing records of the session, the parameter values of the normal distribution that the user's browsing time for each content class follows.
6. The network user identity authentication method according to claim 5, characterized in that the parameter values comprise: $\hat{\mu} = \frac{\sum_{i=1}^{m} time_i}{m}$ and $\hat{\sigma}^2 = \frac{\sum_{i=1}^{m} [time_i - \hat{\mu}]^2}{m - 1}$, where $time_i$ is the relative time at which the user browses content class $content_i$.
7. The network user identity authentication method according to claim 1, characterized in that the third algorithm comprises the logistic regression (LR) algorithm.
8. The network user identity authentication method according to claim 1, characterized in that the fourth algorithm comprises: [formula not shown]; the classification threshold is then [formula not shown], where $score_{legal,i}$ is the score of the i-th session.
9. A network user identity authentication system, characterized in that the system comprises:
a user session acquisition module for collecting all web page browsing records of a legitimate user within a set time period, each browsing record comprising the URL of the browsed page, its text content, and a timestamp; extracting the top-level domain from the URL, extracting keywords from the text content to determine the content class to which the text content belongs, converting each browsing record into the form <URL top-level domain, content class, timestamp>, and taking all browsing records obtained within the set time period as one session;
a session score computing module for, given a session and according to all browsing records in the session, counting the URL top-level domains the user visits most frequently, using a set first algorithm to mine the relationship between URL top-level domain and content class in the browsing records, and using a set second algorithm to mine the relationship between content class and time period in the browsing records, thereby obtaining n feature values of the user's web browsing; processing the obtained feature values according to a set third algorithm to obtain a weight matrix corresponding to the feature values; and calculating the score of the session from the feature values and the corresponding weight matrix;
a classification threshold determination module for obtaining multiple session scores of the legitimate user and calculating the classification threshold of the legitimate user using a fourth algorithm.
10. The network user identity authentication system according to claim 9, characterized in that the system further comprises a user legitimacy judgment module for obtaining a new session and calculating the score of the new session; when the score falls within the range of the classification threshold, judging that the current user is the legitimate user; when the score does not fall within the range of the classification threshold, judging that the current user is not the legitimate user.
CN201510810443.1A 2015-11-20 2015-11-20 A kind of method for authentication of identification of network user and system Active CN105337987B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201510810443.1A CN105337987B (en) 2015-11-20 2015-11-20 A kind of method for authentication of identification of network user and system
PCT/CN2016/070994 WO2017084205A1 (en) 2015-11-20 2016-01-15 Network user identity authentication method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510810443.1A CN105337987B (en) 2015-11-20 2015-11-20 A kind of method for authentication of identification of network user and system

Publications (2)

Publication Number Publication Date
CN105337987A true CN105337987A (en) 2016-02-17
CN105337987B CN105337987B (en) 2018-07-03

Family

ID=55288270

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510810443.1A Active CN105337987B (en) 2015-11-20 2015-11-20 A kind of method for authentication of identification of network user and system

Country Status (2)

Country Link
CN (1) CN105337987B (en)
WO (1) WO2017084205A1 (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106776895A (en) * 2016-11-29 2017-05-31 天津大学 Interpersonal relationships automation portrait method based on person-to-person session information
CN107368718A (en) * 2017-07-06 2017-11-21 同济大学 A kind of user browsing behavior authentication method and system
CN107408115A (en) * 2015-01-13 2017-11-28 微软技术许可有限责任公司 web site access control
CN108632087A (en) * 2018-04-26 2018-10-09 四川斐讯信息技术有限公司 A kind of online management method and system based on router
CN109903067A (en) * 2017-12-08 2019-06-18 北京京东尚科信息技术有限公司 Information processing method and device
CN110324292A (en) * 2018-03-30 2019-10-11 富泰华工业(深圳)有限公司 Authentication means, auth method and computer storage medium
CN110414212A (en) * 2019-08-05 2019-11-05 国网电子商务有限公司 A kind of multidimensional characteristic dynamic identity authentication method and system towards power business

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109918873B (en) * 2019-03-05 2022-12-06 西安电子科技大学 Continuous identity authentication method for acquiring user interaction behavior by using mobile terminal
CN117040923B (en) * 2023-09-28 2024-03-19 联通(广东)产业互联网有限公司 User behavior anomaly detection method and system based on Apriori algorithm

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130132366A1 (en) * 2006-04-24 2013-05-23 Working Research Inc. Interest Keyword Identification
CN104270358A (en) * 2014-09-25 2015-01-07 同济大学 Trusted network transaction system client side monitor and implementation method thereof
CN104618372A (en) * 2015-02-02 2015-05-13 同济大学 Device and method for authenticating user identity based on WEB browsing habits
CN104731914A (en) * 2015-03-24 2015-06-24 浪潮集团有限公司 Method for detecting user abnormal behavior based on behavior similarity

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103544150B (en) * 2012-07-10 2016-03-09 腾讯科技(深圳)有限公司 For browser of mobile terminal provides the method and system of recommendation information
EP2929671B1 (en) * 2012-12-07 2017-02-22 Microsec Szamitastechnikai Fejlesztö Zrt. Method and system for authenticating a user using a mobile device and by means of certificates

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107408115A (en) * 2015-01-13 2017-11-28 微软技术许可有限责任公司 web site access control
CN107408115B (en) * 2015-01-13 2020-10-09 微软技术许可有限责任公司 Web site filter, method and medium for controlling access to content
CN106776895A (en) * 2016-11-29 2017-05-31 天津大学 Interpersonal relationships automation portrait method based on person-to-person session information
CN106776895B (en) * 2016-11-29 2019-05-14 天津大学 Interpersonal relationships based on person-to-person session information automates portrait method
CN107368718A (en) * 2017-07-06 2017-11-21 同济大学 A kind of user browsing behavior authentication method and system
CN107368718B (en) * 2017-07-06 2022-08-16 同济大学 User browsing behavior authentication method and system
CN109903067A (en) * 2017-12-08 2019-06-18 北京京东尚科信息技术有限公司 Information processing method and device
CN109903067B (en) * 2017-12-08 2021-07-16 北京京东尚科信息技术有限公司 Information processing method and device
CN110324292A (en) * 2018-03-30 2019-10-11 富泰华工业(深圳)有限公司 Authentication means, auth method and computer storage medium
CN110324292B (en) * 2018-03-30 2022-01-07 富泰华工业(深圳)有限公司 Authentication device, authentication method, and computer storage medium
CN108632087A (en) * 2018-04-26 2018-10-09 四川斐讯信息技术有限公司 A kind of online management method and system based on router
CN110414212A (en) * 2019-08-05 2019-11-05 国网电子商务有限公司 A kind of multidimensional characteristic dynamic identity authentication method and system towards power business

Also Published As

Publication number Publication date
CN105337987B (en) 2018-07-03
WO2017084205A1 (en) 2017-05-26

Similar Documents

Publication Publication Date Title
CN105337987A (en) Network user identity authentication method and system
Mitra et al. Comparing person- and process-centric strategies for obtaining quality data on amazon mechanical turk
CN103745000B (en) Hot topic detection method of Chinese micro-blogs
CN102170446A (en) Fishing webpage detection method based on spatial layout and visual features
CN110781308B (en) Anti-fraud system for constructing knowledge graph based on big data
CN104077396A (en) Method and device for detecting phishing website
CN106056407A (en) Online banking user portrait drawing method and equipment based on user behavior analysis
CN102609407B (en) Fine-grained semantic detection method of harmful text contents in network
US20120314941A1 (en) Accurate text classification through selective use of image data
CN107577682A (en) Users' Interests Mining and user based on social picture recommend method and system
Feng et al. Patterns and pace: Quantifying diverse exploration behavior with visualizations on the web
CN107368718A (en) A kind of user browsing behavior authentication method and system
CN104899229A (en) Swarm intelligence based behavior clustering system
Zhang et al. Anomaly detection in bitcoin information networks with multi-constrained meta path
CN107885857A (en) A kind of search results pages user's behavior pattern mining method, apparatus and system
CN103440328B (en) A kind of user classification method based on mouse behavior
CN108009215B (en) A kind of search results pages user behavior pattern assessment method, apparatus and system
Rahman et al. New biostatistics features for detecting web bot activity on web applications
CN103023874A (en) Phishing website detection method
CN105512224A (en) Search engine user satisfaction automatic assessment method based on cursor position sequence
de Moura et al. Using structural information to improve search in Web collections
CN106330861A (en) Website detection method and apparatus
CN111125747B (en) Commodity browsing privacy protection method and system for commercial website user
CN111612531A (en) Click fraud detection method and system
Jin et al. Graph-based identification and authentication: A stochastic kronecker approach

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant