CN105337987B - A kind of method for authentication of identification of network user and system - Google Patents

Method and system for network user identity authentication

Info

Publication number
CN105337987B
Authority
CN
China
Prior art keywords
session
user
score
algorithm
browsing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510810443.1A
Other languages
Chinese (zh)
Other versions
CN105337987A (en)
Inventor
蒋昌俊
闫春钢
陈闳中
丁志军
季梦清
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tongji University
Original Assignee
Tongji University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tongji University
Priority to CN201510810443.1A
Priority to PCT/CN2016/070994 (WO2017084205A1)
Publication of CN105337987A
Application granted
Publication of CN105337987B
Legal status: Active
Anticipated expiration

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/08Network architectures or network communication protocols for network security for authentication of entities
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/02Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/50Network services
    • H04L67/535Tracking the activity of the user

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Storage Device Security (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The present invention provides a network user identity authentication method and system. The method includes: collecting all web browsing records of a legitimate user within a set time period as one session, and processing each browsing record into the form <website top-level domain name, content class, timestamp>; obtaining m sessions of the legitimate user and, for each session, obtaining n feature values of the user's web browsing from the session, processing the obtained feature values with a set third algorithm to obtain a weight matrix corresponding to the feature values, and calculating the score of the session from the feature values and the corresponding weight matrix; and calculating the classification threshold of the legitimate user from the session scores using a fourth algorithm. By performing continuous authentication based on three aspects of the browsing records (website address, content and time), the technical solution of the present invention improves authentication performance.

Description

Method and system for network user identity authentication
Technical field
The present invention relates to network security technology, and in particular to a network user identity authentication method and system.
Background
With the development of information and Internet technology, the number of Internet users in China keeps growing, and online shopping and online transactions are increasingly frequent. Being online has become an indispensable part of many people's lives. At the same time, fraud in online shopping transactions has risen sharply in recent years, and new forms of network fraud that combine human deception with technical means have become the primary security threat to people's online lives. Authenticating the identity of network users is an important way to secure online transactions. User identity authentication can be divided into two classes: one-time authentication and continuous authentication. One-time authentication currently includes traditional password-based authentication, smart-card-based authentication, and authentication based on the user's biometric and behavioral features. However, one-time authentication verifies the user only at a single moment and treats the identity as legitimate once that check passes, so it cannot adequately protect the user; continuous authentication was proposed for this reason. Research on continuous authentication is still relatively scarce, and existing work mainly studies either the user's website address sequence or the relationships among the content the user browses. User browsing behavior is not considered comprehensively enough, and authentication performance needs to be improved.
In view of this, finding a technical solution that further improves the security of network user identity authentication has become a problem that those skilled in the art urgently need to solve.
Summary of the invention
In view of the above shortcomings of the prior art, the object of the present invention is to provide a network user identity authentication method and system that address the need to further improve the security of network user identity authentication in the prior art.
To achieve the above and other related objects, the present invention provides a network user identity authentication method. The method includes: collecting all web browsing records of a legitimate user within a set time period, each browsing record including the address of the browsed web page, the text content and a timestamp; extracting the website top-level domain name from the web page address, extracting keywords from the text content to determine the content class to which the text content belongs, processing each browsing record into the form <website top-level domain name, content class, timestamp>, and treating all browsing records obtained within the set time period as one session; obtaining m sessions of the legitimate user and, for each session, processing it as follows: from all browsing records in the session, counting the website top-level domain names the user visits most frequently, mining the relationship between website top-level domain name and content class in the browsing records using a set first algorithm, and mining the relationship between content class and time period in the browsing records using a set second algorithm, thereby obtaining n feature values of the user's web browsing; processing the obtained feature values with a set third algorithm to obtain a weight matrix corresponding to the feature values; and calculating the score of the session from the feature values and the corresponding weight matrix; and calculating the classification threshold of the legitimate user from the scores of the m sessions using a fourth algorithm.
Optionally, the network user identity authentication method further includes: obtaining a new session and calculating the score of the new session; when the score falls within the range of the classification threshold, determining that the current user is the legitimate user; and when the score does not fall within the range of the classification threshold, determining that the current user is not the legitimate user.
Optionally, the feature values include: the number of elements the session contains; the number of frequently visited websites the session contains; the number of frequent itemsets matched by the session; the number of frequently visited websites contained in the frequent itemsets matched by the session; the length of the longest frequent itemset matched by the session; the average length of the frequent itemsets matched by the session; the maximum support of the frequent itemsets matched by the session; the average support of the frequent itemsets matched by the session; the number of frequent periods matched by the session; and a target column.
Optionally, the first algorithm includes the Apriori algorithm.
Optionally, the second algorithm includes: using maximum likelihood estimation to calculate, from the browsing records of the session, the parameter values of the normal distribution that the user's browsing time for each content class follows.
Optionally, the third algorithm includes the LR (logistic regression) algorithm.
Optionally, the fourth algorithm includes: [formula not reproduced in the source text]; the classification threshold is then [formula not reproduced in the source text], where score_legal,i is the score of the i-th session and there are m sessions in total.
Optionally, the set time period includes 30 minutes.
The present invention also provides a network user identity authentication system. The system includes: a user session acquisition module, configured to collect all web browsing records of a legitimate user within a set time period, each browsing record including the address of the browsed web page, the text content and a timestamp, to extract the website top-level domain name from the web page address, to extract keywords from the text content and thereby determine the content class to which the text content belongs, to process each browsing record into the form <website top-level domain name, content class, timestamp>, and to treat all browsing records obtained within the set time period as one session; a session score computing module, configured, for one session, to count from all browsing records in the session the website top-level domain names the user visits most frequently, to mine the relationship between website top-level domain name and content class in the browsing records using a set first algorithm, and to mine the relationship between content class and time period in the browsing records using a set second algorithm, thereby obtaining n feature values of the user's web browsing, to process the obtained feature values with a set third algorithm to obtain a weight matrix corresponding to the feature values, and to calculate the score of the session from the feature values and the corresponding weight matrix; and a classification threshold determining module, configured to obtain multiple session scores of the legitimate user and to calculate the classification threshold of the legitimate user using a fourth algorithm.
Optionally, the network user identity authentication system further includes a user legitimacy judgment module, configured to obtain a new session and calculate the score of the new session; when the score falls within the range of the classification threshold, to determine that the current user is the legitimate user; and when the score does not fall within the range of the classification threshold, to determine that the current user is not the legitimate user.
Optionally, the feature values include: the number of elements the session contains; the number of frequently visited websites the session contains; the number of frequent itemsets matched by the session; the number of frequently visited websites contained in the frequent itemsets matched by the session; the length of the longest frequent itemset matched by the session; the average length of the frequent itemsets matched by the session; the maximum support of the frequent itemsets matched by the session; the average support of the frequent itemsets matched by the session; the number of frequent periods matched by the session; and a target column.
Optionally, the first algorithm includes the Apriori algorithm.
Optionally, the second algorithm includes: using maximum likelihood estimation to calculate, from the browsing records of the session, the parameter values of the normal distribution that the user's browsing time for each content class follows.
Optionally, the third algorithm includes the LR (logistic regression) algorithm.
Optionally, the fourth algorithm includes: [formula not reproduced in the source text]; the classification threshold is then [formula not reproduced in the source text], where score_legal,i is the score of the i-th session and there are m sessions in total.
Optionally, the set time period includes 30 minutes.
As described above, the network user identity authentication method and system of the present invention have the following advantages: 1) The two factor pairs of a user's browsing, (website, content) and (content, time), are both mined, rather than only one factor being considered in isolation, so that the authentication method of the present invention matches the user's browsing habits. 2) Association rules are used to mine the user's browsing habits from (website, content) jointly, and a normal distribution is used to find the periods during which the user frequently accesses each content class. 3) Continuous authentication is achieved while the user browses web pages.
Description of the drawings
Fig. 1 is a flow diagram of one embodiment of the network user identity authentication method of the present invention.
Fig. 2 is a flow diagram of another embodiment of the network user identity authentication method of the present invention.
Fig. 3 is a module diagram of one embodiment of the network user identity authentication system of the present invention.
Component labels
1 network user identity authentication system
11 user session acquisition module
12 session score computing module
13 classification threshold determining module
14 user legitimacy judgment module
S1~S4 steps
Detailed description of the embodiments
Embodiments of the present invention are described below through specific examples. Those skilled in the art can readily understand other advantages and effects of the present invention from the content disclosed in this specification. The present invention can also be implemented or applied through other, different embodiments, and the details in this specification can be modified or changed in various ways based on different viewpoints and applications without departing from the spirit of the present invention.
It should be noted that the drawings provided in these embodiments illustrate the basic concept of the present invention only schematically. They show only the components related to the present invention rather than the number, shape and size of the components in an actual implementation. In an actual implementation the form, quantity and proportion of each component may be changed freely, and the component layout may be more complex.
The present invention provides a network user identity authentication method, which authenticates identity according to the user's browsing behavior. In one embodiment, as shown in Fig. 1, the network user identity authentication method includes:
Step S1: collect all web browsing records of the legitimate user within a set time period, each browsing record including the address of the browsed web page, the text content and a timestamp; extract the website top-level domain name from the web page address, extract keywords from the text content and thereby determine the content class to which the text content belongs, process each browsing record into the form <website top-level domain name, content class, timestamp>, and treat all browsing records obtained within the set time period as one session. In one embodiment, the web browsing records of one user are collected and processed into a session structure of the form {(domain, content, timestamp)}, which is the basis for the subsequent analysis. The collected browsing records are divided at 30-minute intervals, so that every 30 minutes yields one session. Step S1 is performed multiple times, for example m times to obtain m sessions, and the m sessions are finally merged into the corresponding session set S. Subsequent authentication likewise takes one access by the user (that is, one 30-minute session) as its unit.
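The 30-minute division described in step S1 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes timestamps are Unix seconds and that windows are aligned to multiples of the window length, neither of which the text specifies. The function name `split_into_sessions` is hypothetical.

```python
from collections import defaultdict

SESSION_SECONDS = 30 * 60  # the 30-minute window named in the text


def split_into_sessions(records, window=SESSION_SECONDS):
    """Group (domain, content, timestamp) tuples into fixed time windows.

    Assumption: timestamps are Unix seconds and windows start at multiples
    of `window`; the source does not say how windows are aligned.
    """
    buckets = defaultdict(list)
    for domain, content, ts in records:
        buckets[int(ts // window)].append((domain, content, ts))
    # Sessions in chronological order, each sorted by timestamp.
    return [sorted(buckets[k], key=lambda r: r[2]) for k in sorted(buckets)]


records = [
    ("news.163.com", "society", 100),
    ("news.163.com", "sports", 1700),
    ("www.example.com", "finance", 1900),
]
sessions = split_into_sessions(records)  # the session set S
```

Fixed, aligned windows keep the sketch short; a gap-based split (a new session after a period of inactivity) would be an equally plausible reading of the text.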
In one embodiment, the browsing records of the legitimate user are first collected from the SQLite database that ships with the Chrome browser. This database records the details of each web page the user browses. For each browsed page, the URL (Uniform Resource Locator, that is, the web page address), the text content and the timestamp are collected as the original browsing record. A browsing record is denoted r, and its attributes are as shown in Table 1 below:
After the raw data is obtained, it can be processed as follows. First, each browsing record in the session is processed by extracting the top-level domain name from its URL. Next, text classification samples from Sogou Labs, together with articles from the web, are used to extract the keywords of the text content under each class; these keywords are then matched against the title of the web page to be classified to obtain the content class to which the page belongs. After this processing, the first original browsing record of Table 1, for example, becomes (news.163.com, society, timestamp). Data in this form is denoted as web page p (domain, content, timestamp).
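The preprocessing above can be sketched as below. The keyword lists are hypothetical stand-ins, since the Sogou-derived keywords are not reproduced in the text, and picking the class with the most keyword hits in the title is an assumed matching rule. The function name `to_page` is likewise illustrative. Following the example (news.163.com), the "top-level domain name" is taken to be the URL's host name.

```python
from urllib.parse import urlsplit

# Hypothetical keyword lists; the source derives them from the Sogou
# text-classification corpus, which is not reproduced here.
CLASS_KEYWORDS = {
    "sports": ["football", "match", "league"],
    "finance": ["stock", "market", "bank"],
}


def to_page(url, title, timestamp, keywords=CLASS_KEYWORDS):
    """Convert a raw record into a (domain, content, timestamp) tuple."""
    domain = urlsplit(url).hostname or ""
    title_lower = title.lower()
    # Assumed rule: choose the class whose keywords occur most in the title.
    best_class, best_hits = "unknown", 0
    for content_class, words in keywords.items():
        hits = sum(title_lower.count(w) for w in words)
        if hits > best_hits:
            best_class, best_hits = content_class, hits
    return (domain, best_class, timestamp)


page = to_page("http://news.163.com/15/1120/a.html",
               "Stock market rallies", 1447977600)
```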
Step S2: obtain m sessions of the legitimate user and, for each session, process it as follows: from all browsing records in the session, count the website top-level domain names the user visits most frequently, mine the relationship between website top-level domain name and content class in the browsing records using a set first algorithm, and mine the relationship between content class and time period in the browsing records using a set second algorithm, thereby obtaining n feature values of the user's web browsing; process the obtained feature values with a set third algorithm to obtain a weight matrix corresponding to the feature values; and calculate the score of the session from the feature values and the corresponding weight matrix. In one embodiment, the collected browsing records are divided at 30-minute intervals, so that every 30 minutes yields one session; step S1 is performed m times to obtain m sessions, and the m sessions are finally merged into the corresponding session set S. The feature values include: the number of elements the session contains; the number of frequently visited websites the session contains; the number of frequent itemsets matched by the session; the number of frequently visited websites contained in the frequent itemsets matched by the session; the length of the longest frequent itemset matched by the session; the average length of the frequent itemsets matched by the session; the maximum support of the frequent itemsets matched by the session; the average support of the frequent itemsets matched by the session; the number of frequent periods matched by the session; and a target column. The first algorithm includes the Apriori algorithm. Apriori is a frequent itemset algorithm for mining association rules; its core idea is to mine frequent itemsets in two stages, candidate generation and downward-closure checking. The algorithm has been widely used in fields such as business and network security.
The second algorithm includes: using maximum likelihood estimation to calculate, from the browsing records of the session, the parameter values of the normal distribution that the user's browsing time for each content class follows. The parameter values are [formula not reproduced in the source text], where time_i is the relative time at which the user browses content class content_i. These parameters are used to count the number of frequent periods matched by the session. The third algorithm includes the LR (logistic regression) algorithm. Logistic regression is a typical binary classification algorithm; the model it produces is relatively simple, easy to interpret and not prone to overfitting. It is in effect a process of learning a mapping f: X -> Y. Given an n-tuple variable vector X = <X1, X2, ..., Xn> and an m-element target vector Y = <Y1, Y2, ..., Ym>, logistic regression learns a function f(X) that fits the target vector Y as closely as possible from the given variable values.
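For reference, the textbook maximum likelihood estimates for a normal distribution are given below. They are the standard result and are offered only as a plausible reading of the second algorithm; the patent's own formula is not reproduced in this text and may differ. The symbols c (a content class) and N_c (the number of browsing records in that class) are introduced here for illustration.

```latex
\hat{\mu}_c = \frac{1}{N_c}\sum_{i=1}^{N_c} time_i,
\qquad
\hat{\sigma}_c^{2} = \frac{1}{N_c}\sum_{i=1}^{N_c}\left(time_i - \hat{\mu}_c\right)^{2}
```

Under this reading, a frequent access period for class c would be an interval around \hat{\mu}_c whose width depends on \hat{\sigma}_c; the exact interval used by the patent is not recoverable from the text.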
In one embodiment, different users browse specific content on different websites at different times. Based on this browsing characteristic, features are extracted from three aspects: frequently visited websites, (website, content) and (content, period). From the {(domain, content, timestamp)} session set, user browsing features are obtained through frequent website statistics, frequent itemset mining and frequent period mining. In one embodiment, the statistics can be computed for multiple users at the same time:
Frequently visited website statistics: since each user frequently browses different web pages, the 15 website top-level domain names each user visits most frequently are counted and put into the frequently visited website class FUj of the corresponding user j.
Frequent itemset mining: the Apriori algorithm is used to mine the sequence relationships that exist between (website, content). In this process, the present invention selects a suitable support threshold δ by experiment. For a frequent itemset fc: X, Y, if support(fc) > δ, the frequent itemset is added to the web browsing frequent itemset set FCj of the corresponding user j.
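The two statistics above can be sketched as follows. This is a compact illustrative Apriori, not the patent's code: it assumes each session is turned into one transaction containing `site:` and `class:` items, which is only one possible encoding of the (website, content) relation. The names `frequent_domains` and `apriori` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_domains(sessions, k=15):
    """Return the k most frequently visited domains (the set FU)."""
    counts = Counter(d for s in sessions for d, _content, _ts in s)
    return [d for d, _ in counts.most_common(k)]


def apriori(transactions, min_support):
    """Return {itemset: support} for itemsets with support > min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    def keep_frequent(candidates):
        scored = {c: support(c) for c in candidates}
        return {c: s for c, s in scored.items() if s > min_support}

    current = keep_frequent({frozenset([i]) for t in transactions for i in t})
    frequent, size = {}, 1
    while current:
        frequent.update(current)
        size += 1
        # Candidate generation: join frequent itemsets of the previous size.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == size}
        # Downward closure: every (size-1)-subset must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(sub) in current
                             for sub in combinations(c, size - 1))}
        current = keep_frequent(candidates)
    return frequent


sessions = [
    [("news.163.com", "society", 10), ("www.example.com", "sports", 20)],
    [("news.163.com", "society", 2000)],
    [("news.163.com", "finance", 4000)],
]
# Assumed encoding: one transaction per session, with tagged items.
transactions = [{x for d, c, _ in s for x in ("site:" + d, "class:" + c)}
                for s in sessions]
fu = frequent_domains(sessions, k=1)
fc = apriori(transactions, min_support=0.5)
```

With δ = 0.5 only the itemsets present in more than half of the sessions survive, which here includes the pair linking news.163.com with the society class.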
The corresponding feature values of each session are then obtained. Each session in the session set has its own feature values, denoted fvji. In one embodiment, fvji = <lengthi, puni, mrni, rpuni, mrmli, mrali, mrmsi, mrasi, mtni, targeti>, where the meaning of each value is as shown in Table 2.
After the feature value set of the sessions is obtained, the LR (logistic regression) algorithm is used to perform the authentication method based on user browsing features (hereinafter UBFAA). The detailed procedure is shown in Algorithm 1.
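Since Table 2 is not reproduced in this text, the ten-tuple can be sketched as a small record type. The field meanings in the comments are inferred by pairing the tuple order with the feature list given earlier in the description; that pairing is an assumption, and the class name `SessionFeatures` is hypothetical.

```python
from dataclasses import dataclass, astuple


@dataclass
class SessionFeatures:
    length: int   # number of elements in the session
    pun: int      # frequently visited websites contained in the session
    mrn: int      # frequent itemsets matched by the session
    rpun: int     # frequently visited websites inside the matched itemsets
    mrml: int     # length of the longest matched frequent itemset
    mral: float   # average length of the matched frequent itemsets
    mrms: float   # maximum support among the matched frequent itemsets
    mras: float   # average support of the matched frequent itemsets
    mtn: int      # frequent periods matched by the session
    target: int   # label column (1 for a legitimate-user session)

    def inputs(self):
        """Return the nine numeric features without the target label."""
        return list(astuple(self))[:-1]


fv = SessionFeatures(5, 3, 2, 2, 2, 2.0, 0.6, 0.5, 1, 1)
```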
Algorithm 1: Authentication method based on user browsing features (UBFAA)
Input: session set S* of the legitimate user, frequent itemset set FC of the legitimate user, frequently accessed address set FU of the legitimate user,
and frequent access period set FT
Output: feature value weight matrix w, array score_legal
1) Traverse each session s*i of the legitimate user's session set S*
  mrtl = 0; // total length of the frequent itemsets matched by session s*i
  mrts = 0; // total support of the frequent itemsets matched by session s*i
  pun = 0;
  length = number of elements contained in session s*i;
  target = 1;
  1.1) Traverse the frequently accessed address set FU of the legitimate user
    if FU contains an fuj equal to the top-level domain name of a web page in the current session, then increment pun by 1;
  1.2) Traverse the frequent itemset set FC of the legitimate user
    if the current session contains frequent itemset fcj
      1.2.1) increment mrn by 1, add the length of the current frequent itemset to mrtl, and add the support of the current frequent itemset to mrts;
      1.2.2) store the maximum support of the rules matched by the current session in mrms;
      1.2.3) store the maximum length of the rules matched by the current session in mrml;
      1.2.4) count the frequently visited websites contained in fcj and store the count in rpun;
  1.3) Obtain the average support mras and the average length mral of the rules matched by the current session;
  1.4) Traverse the frequent access period set FT of the legitimate user
    if FT contains an ftj such that the current session web page's content = ftj.content and the current session web page's time lies within the interval [interval not reproduced in the source text],
    then increment mtn by 1;
  1.5) Write the attributes of session s*i into the ten-tuple set FVi;
2) Traverse the ten-tuple set FVi
  2.1) create the matrix data, set its entire first column to 1, and store the features in the matrix;
  2.2) create the matrix labels, and store the last column of FVi in labels;
  2.3) create a weight matrix w of size 10*1 with all values equal to 1;
3) Set the learning rate of the LR algorithm alpha = 0.01 and the maximum number of LR cycles maxCycles = 500;
4) While the number of iterations is less than maxCycles, repeatedly apply gradient descent to compute the value of the weight matrix w;
5) Use the weight matrix w to calculate the score of each session, and store the scores in the array score_legal;
6) Return the weight matrix w and the legitimate session score array score_legal
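Steps 2 to 6 of Algorithm 1 can be sketched in plain Python as below. This is a minimal batch-gradient logistic regression under stated assumptions, not the patent's implementation: the text does not say where negative (target = 0) examples come from, so the toy data simply includes both labels, and the single centered feature is invented for illustration. The update rule is gradient descent on the negative log-likelihood, with alpha and max_cycles taken from step 3.

```python
import math


def sigmoid(z):
    # Clamp z to avoid overflow in math.exp for extreme scores.
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train_lr(features, labels, alpha=0.01, max_cycles=500):
    """Fit logistic-regression weights by batch gradient descent.

    features: list of feature lists (without the constant column)
    labels:   list of 0/1 targets
    Returns the weight list w, where w[0] multiplies the constant 1.
    """
    data = [[1.0] + list(row) for row in features]  # step 2.1: column of 1s
    w = [1.0] * len(data[0])                        # step 2.3: all-ones start
    for _ in range(max_cycles):                     # steps 3 and 4
        errors = [y - sigmoid(sum(wj * xj for wj, xj in zip(w, x)))
                  for x, y in zip(data, labels)]
        for j in range(len(w)):
            w[j] += alpha * sum(e * x[j] for e, x in zip(errors, data))
    return w


def score(w, row):
    """Linear score w0 + w1*x1 + ... for one session (step 5)."""
    return w[0] + sum(wj * xj for wj, xj in zip(w[1:], row))


# Toy, centered one-feature data; labels 1 = legitimate, 0 = other users.
toy_features = [[1.0], [2.0], [-1.0], [-2.0]]
toy_labels = [1, 1, 0, 0]
w = train_lr(toy_features, toy_labels)
score_legal = [score(w, row)
               for row, y in zip(toy_features, toy_labels) if y == 1]
```

With the nine input features of the ten-tuple plus the constant column, `w` would have ten entries, matching the 10*1 weight matrix of step 2.3.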
Then, from the weight matrix w obtained by the above algorithm and the feature value vector fv_j corresponding to session j, the corresponding score_j is calculated. The calculation formula is as follows:
For fv_i ∈ FV,
Score = w0 + w1*fvi.length + w2*fvi.pun + ... + w10*fvi.mtn
For the m sessions of the legitimate user, the score array score_legal = {score_legal,1, score_legal,2, ..., score_legal,m} is obtained.
Step S3: calculate the classification threshold of the legitimate user from the scores of the m sessions using a fourth algorithm. In one embodiment, the fourth algorithm includes: [formula not reproduced in the source text]; the classification threshold is then [formula not reproduced in the source text], where score_legal,i is the score of the i-th session and there are m sessions in total.
In one embodiment, as shown in Fig. 2, the network user identity authentication method further includes:
Step S4: obtain a new session and calculate the score of the new session; when the score falls within the range of the classification threshold, determine that the current user is the legitimate user; when the score does not fall within the range of the classification threshold, determine that the current user is not the legitimate user. Specifically, a current session (the new session) is obtained using the method of step S1, its score is calculated using the method of step S2, and the classification threshold from step S3 is then used to judge whether the user to whom the current session belongs is the legitimate user.
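The decision in step S4 can be sketched as a range check. Because the threshold formula is not reproduced in this text, the sketch takes the threshold range as a given input; treating it as a closed interval (low, high) is an assumption, and both function names are hypothetical.

```python
def is_legitimate(session_score, threshold_range):
    """Return True when the score lies inside the closed range (low, high)."""
    low, high = threshold_range
    return low <= session_score <= high


def authenticate_stream(scores, threshold_range):
    """Yield one decision per session score, in arrival order."""
    for s in scores:
        yield is_legitimate(s, threshold_range)


# Example values are invented; one score per 30-minute session.
decisions = list(authenticate_stream([2.4, -0.7, 3.1], (1.0, 4.0)))
```

Evaluating one decision per session is what makes the scheme continuous: the check repeats for every new 30-minute session rather than only at login.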
The present invention also provides a network user identity authentication system, which can use the network user identity authentication method described above. In one embodiment, as shown in Fig. 3, the network user identity authentication system 1 includes a user session acquisition module 11, a session score computing module 12 and a classification threshold determining module 13, where:
The user session acquisition module 11 is configured to collect all web browsing records of the user within a set time period, each browsing record including the address of the browsed web page, the text content and a timestamp; to extract the website top-level domain name from the web page address; to extract keywords from the text content and thereby determine the content class to which the text content belongs; to process each browsing record into the form <website top-level domain name, content class, timestamp>; and to treat all browsing records obtained within the set time period as one session. In one embodiment, the set time period includes 30 minutes.
The classification threshold determining module 13 is connected to the session score computing module 12 and is configured to obtain multiple session scores of the legitimate user and to calculate the classification threshold of the legitimate user using the fourth algorithm. In one embodiment, the fourth algorithm includes: [formula not reproduced in the source text]; the classification threshold is then [formula not reproduced in the source text], where score_legal,i is the score of the i-th session and there are m sessions in total.
In one embodiment, as shown in Fig. 3, the network user identity authentication system 1 further includes a user legitimacy judgment module 14. The user legitimacy judgment module 14 is connected to the classification threshold determining module 13, the session score computing module 12 and the user session acquisition module 11. It is configured to obtain a new session from the user session acquisition module 11 and to calculate the score of the new session through the session score computing module 12. When the score falls within the range of the legitimate user's classification threshold obtained by the classification threshold determining module 13, the current user is determined to be the legitimate user; when the score does not fall within the range of the classification threshold, the current user is determined not to be the legitimate user. The technical solution of the present invention distinguishes the browsing patterns of different users from three aspects (the browsed website address sequence, the content and the browsing time), and thereby provides a reliable safeguard for the security of the user's account. In experimental tests, at a false alarm rate of 10%, the network identity authentication of the present invention reaches a detection rate of 93.6%, which is a good authentication result and helps ensure user account security.
In conclusion a kind of method for authentication of identification of network user and system of the present invention have the advantages that:1) will (network address, content) and (content, time) two factors that user is browsed carry out the excavation of sequence rather than only examine merely Wherein some factor is considered, so that the authentication method of the present invention meets the browsing custom of user.It 2), will using correlation rule (network address, content) joint carries out the excavation that user browses custom;Based on normal distribution, to find frequency of the user to each content Numerous access time section.3) certification of duration has been achieved the effect that during user browses webpage.So the present invention is effectively It overcomes various shortcoming of the prior art and has high industrial utilization.
The above embodiments merely illustrate the principles and effects of the present invention and are not intended to limit it. Anyone skilled in the art may modify or change the above embodiments without departing from the spirit and scope of the present invention. Therefore, all equivalent modifications or changes made by those of ordinary skill in the art without departing from the spirit and technical ideas disclosed by the present invention shall still be covered by the claims of the present invention.

Claims (7)

1. A network user identity authentication method, characterized in that the network user identity authentication method comprises:
collecting all web browsing records of a legitimate user within a set time period, each browsing record including the URL of the visited page, its text content, and a timestamp; extracting the top-level domain name from the URL, extracting keywords from the text content to determine the content class to which the text content belongs, converting each browsing record into the form <top-level domain name, content class, timestamp>, and treating all browsing records obtained within the set time period as one session;
obtaining m sessions of the legitimate user and processing each session as follows: from all browsing records in the session, counting the multiple top-level domain names that the user visits most frequently, mining the relationship between top-level domain name and content class in the browsing records using a set first algorithm, and mining the relationship between content class and time period in the browsing records using a set second algorithm, thereby obtaining n feature values of the user's web browsing; processing the obtained feature values according to a set third algorithm to obtain a weight matrix corresponding to the feature values; and calculating the score of the session from the feature values and the corresponding weight matrix;
calculating the classification threshold of the legitimate user from the scores of the m sessions using a fourth algorithm;
the network user identity authentication method further comprising: obtaining a new session and calculating the score of the new session; when the score falls within the range of the classification threshold, judging that the current user is the legitimate user; when the score does not fall within the range of the classification threshold, judging that the current user is not the legitimate user;
the fourth algorithm including: [formula omitted in source]; the classification threshold is then [formula omitted in source], where $\mathrm{score}^{\mathrm{legal}}_i$ is the score of the i-th session.
2. The network user identity authentication method according to claim 1, characterized in that the feature values include: the number of elements contained in the session; the number of frequently visited websites contained in the session; the number of frequent itemsets matched by the session; the number of frequently visited websites contained in the frequent itemsets matched by the session; the length of the longest frequent itemset matched by the session; the average length of the frequent itemsets matched by the session; the maximum support of the frequent itemsets matched by the session; the average support of the frequent itemsets matched by the session; the number of frequent time periods matched by the session; and a target column.
3. The network user identity authentication method according to claim 1, characterized in that the first algorithm includes the Apriori algorithm.
4. The network user identity authentication method according to claim 1, characterized in that the second algorithm includes: using maximum likelihood estimation to calculate, from the browsing records of the session, the parameter values of the normal distribution followed by the user's browsing time for each content class.
5. The network user identity authentication method according to claim 4, characterized in that the parameter values include: [formula omitted in source], where time_i is the relative time at which the user browses content class content_i.
6. The network user identity authentication method according to claim 1, characterized in that the third algorithm includes the logistic regression (LR) algorithm.
7. A network user identity authentication system, characterized in that the network user identity authentication system comprises:
a user session acquisition module, for collecting all web browsing records of a legitimate user within a set time period, each browsing record including the URL of the visited page, its text content, and a timestamp; extracting the top-level domain name from the URL, extracting keywords from the text content to determine the content class to which the text content belongs, converting each browsing record into the form <top-level domain name, content class, timestamp>, and treating all browsing records obtained within the set time period as one session;
a session score computing module, for processing one session as follows: from all browsing records in the session, counting the multiple top-level domain names that the user visits most frequently, mining the relationship between top-level domain name and content class in the browsing records using a set first algorithm, and mining the relationship between content class and time period in the browsing records using a set second algorithm, thereby obtaining n feature values of the user's web browsing; processing the obtained feature values according to a set third algorithm to obtain a weight matrix corresponding to the feature values; and calculating the score of the session from the feature values and the corresponding weight matrix;
a classification threshold determining module, for obtaining the scores of multiple sessions of the legitimate user and calculating the classification threshold of the legitimate user using a fourth algorithm;
the network user identity authentication system further comprising a user legitimacy judgment module, for obtaining a new session and calculating the score of the new session; when the score falls within the range of the classification threshold, judging that the current user is the legitimate user; when the score does not fall within the range of the classification threshold, judging that the current user is not the legitimate user;
the fourth algorithm including: [formula omitted in source]; the classification threshold is then [formula omitted in source], where $\mathrm{score}^{\mathrm{legal}}_i$ is the score of the i-th session.
CN201510810443.1A 2015-11-20 2015-11-20 A kind of method for authentication of identification of network user and system Active CN105337987B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201510810443.1A CN105337987B (en) 2015-11-20 2015-11-20 A kind of method for authentication of identification of network user and system
PCT/CN2016/070994 WO2017084205A1 (en) 2015-11-20 2016-01-15 Network user identity authentication method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510810443.1A CN105337987B (en) 2015-11-20 2015-11-20 A kind of method for authentication of identification of network user and system

Publications (2)

Publication Number Publication Date
CN105337987A CN105337987A (en) 2016-02-17
CN105337987B true CN105337987B (en) 2018-07-03

Family

ID=55288270

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510810443.1A Active CN105337987B (en) 2015-11-20 2015-11-20 A kind of method for authentication of identification of network user and system

Country Status (2)

Country Link
CN (1) CN105337987B (en)
WO (1) WO2017084205A1 (en)

Also Published As

Publication number Publication date
WO2017084205A1 (en) 2017-05-26
CN105337987A (en) 2016-02-17

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant