CN109063431A - Identity recognition method based on the weighted keystroke characteristic curve diversity factor - Google Patents


Info

Publication number
CN109063431A
Authority
CN
China
Prior art keywords
keystroke
diversity factor
data set
interval time
temporal characteristics
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810644782.0A
Other languages
Chinese (zh)
Other versions
CN109063431B (en)
Inventor
王林
贺冰清
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian University of Technology
Original Assignee
Xian University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian University of Technology filed Critical Xian University of Technology
Priority to CN201810644782.0A priority Critical patent/CN109063431B/en
Publication of CN109063431A publication Critical patent/CN109063431A/en
Application granted granted Critical
Publication of CN109063431B publication Critical patent/CN109063431B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/31User authentication
    • G06F21/316User authentication by observing the pattern of computer usage, e.g. typical user behaviour

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Social Psychology (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Collating Specific Patterns (AREA)
  • Complex Calculations (AREA)

Abstract

The invention discloses an identity recognition method based on the weighted keystroke characteristic curve diversity factor. The method proceeds as follows. First, a keystroke interval time data set and a half temporal feature data set are extracted. Next, the mean and standard deviation of each data set are calculated, followed by the upper and lower boundaries of the keystroke interval time characteristic curve and of the half temporal feature curve, and then the weighted characteristic curve diversity factor of the keystroke interval times and the characteristic curve diversity factor of the half temporal features. Finally, the user's identity is recognized using the weighted curve diversity factor and the characteristic curve diversity factor. Compared with traditional keystroke authentication algorithms that use only the keystroke duration and the keystroke time interval, the authentication algorithm based on the characteristic curve diversity factor performs better: it reduces the false rejection rate, the false acceptance rate and the equal error rate, and improves recognition accuracy.

Description

Identity recognition method based on the weighted keystroke characteristic curve diversity factor
Technical field
The invention belongs to the technical field of biometric authentication methods and relates to a user identity recognition method that uses the weighted keystroke characteristic curve diversity factor.
Background
In recent years, people have come to use a large number of online web applications, including social media platforms (such as Facebook, Twitter and Weibo), cloud storage services (such as Dropbox and Google Drive) and online games. At the same time, the cybercrime that accompanies these applications has quietly spread around the world. In serious cases, offenders break into a victim's account over the Internet and steal sensitive information such as passwords and financial assets. To address this theft, an additional biometric authentication mechanism can be built into the online program or device to improve the security of user accounts. Among current computer security measures, one approach is traditional password-based identity verification, but passwords are easily leaked. Another replaces the simple password with a physical token (a smart card, for example), but this requires the system to be equipped with corresponding hardware, which raises cost, and the token itself can be lost, stolen or copied. Because human biometric features are hard to copy and hard to change, biometric recognition has become a research hotspot. Common biometric technologies include fingerprint recognition, face recognition and iris recognition. All of these, however, require relatively costly hardware, which makes them inconvenient to apply and difficult to popularize.
Keystroke dynamics identity authentication is a biometric technique that recognizes identity from keystroke features (such as keystroke latency and keystroke pressure). It collects keystroke data by monitoring the user's keyboard input and builds a classification model of the user's keystroke behavior, from which the user's identity is determined. Keystroke dynamics authentication addresses the security problems of traditional password-based authentication and, compared with other biometric technologies, needs no additional expensive hardware, so it is low in cost and highly flexible.
Summary of the invention
The object of the present invention is to provide an identity recognition method that uses the weighted keystroke characteristic curve diversity factor. It solves the problem that existing authentication methods recognize identity only from the magnitude of each keystroke feature in the keystroke feature vector, without using the rate of change between adjacent feature values, which leads to low accuracy.
The technical scheme adopted by the invention is an identity recognition method based on the weighted keystroke characteristic curve diversity factor, implemented according to the following steps:
Step 1, collect data and establish the half temporal feature data set and the keystroke interval time data set;
Step 2, calculate the mean and standard deviation of the keystroke interval time data set and the mean and standard deviation of the half temporal feature data set;
Step 3, calculate the upper and lower boundaries of the keystroke interval time characteristic curve from the mean and standard deviation of the keystroke interval time data set, and the upper and lower boundaries of the half temporal feature curve from the mean and standard deviation of the half temporal feature data set;
Step 4, calculate the weighted characteristic curve diversity factor of the keystroke interval times from the upper and lower boundaries of the keystroke interval time characteristic curve, and the half temporal feature curve diversity factor from the upper and lower boundaries of the half temporal feature curve;
Step 5, recognize the user's identity using the weighted curve diversity factor and the characteristic curve diversity factor.
The present invention is further characterized as follows.
Step 1 is implemented as follows:
1.1, k representative specific double-key character sequences are selected from the original keystroke information of free text to form the specific character sequence set SK;
1.2, the frequency of use $\lambda_j$, j = 1, 2, …, k, of each double-key is calculated, and the user's keystroke interval time data set $S_{pp}$ and half temporal feature data set $S_{st}$ are constructed. $S_{pp}$ and $S_{st}$ are expressed as follows:
$S_{pp} = \{ V_i^{pp} = [t_{i,1}, t_{i,2}, \ldots, t_{i,k}] \mid i = 1, 2, \ldots, m \}$ (1)
$S_{st} = \{ V_i^{st} = [WPM_i, P_{i,N\_UD}, P_{i,error}, P_{i,CapsLock}, P_{i,Shift}] \mid i = 1, 2, \ldots, n \}$ (2)
where k is the number of specific double-key character sequences selected; $V_i^{pp} \in R^k$ is the i-th keystroke interval time vector sample; $t_{i,k}$ is the keystroke interval time of the last specific double-key character sequence in the i-th sample; $t_{i,j}$ is the keystroke interval time of the j-th specific double-key character sequence in the i-th sample (j = 1, …, k); and m is the number of keystroke interval time vector samples collected. $V_i^{st} \in R^5$ is the i-th half temporal feature vector sample; $WPM_i$, $P_{i,N\_UD}$, $P_{i,error}$, $P_{i,CapsLock}$ and $P_{i,Shift}$ are, respectively, the average keystroke speed of the i-th sample, the frequency of occurrence of negative interval times (RP), the input error rate, the frequency of use of the Caps Lock key and the frequency of use of the Shift key. $P_{N\_UD}$, $P_{error}$, $P_{Shift}$ and $P_{CapsLock}$ range over [0, 1], whereas the average keystroke speed WPM ranges over [0, +∞). WPM is normally of the order of $10^2$, which differs markedly from the magnitude of the other half temporal features. n is the number of half temporal feature vector samples collected;
1.3, the average keystroke speed WPM in the half temporal feature data set $S_{st}$ is normalized by the formula
$v_{i,1} = WPM_i / WPM_{max}$ (3)
where $WPM_{max} = \max\{ WPM_i \mid i = 1, \ldots, n \}$ is the largest average keystroke speed among the samples. After normalization, the half temporal feature data set $S_{st}$ is abbreviated as
$S_{st} = \{ V_i^{st} = [v_{i,1}, v_{i,2}, v_{i,3}, v_{i,4}, v_{i,5}] \mid i = 1, 2, \ldots, n \}$ (4)
where $v_{i,1} = WPM_i / WPM_{max}$, $v_{i,2} = P_{i,N\_UD}$, $v_{i,3} = P_{i,error}$, $v_{i,4} = P_{i,CapsLock}$ and $v_{i,5} = P_{i,Shift}$.
In step 2, the means and standard deviations of the keystroke interval time data set and of the half temporal feature data set are calculated as follows:
The mean of all elements in data set $S_{pp}$ and the mean of all elements in data set $S_{st}$ are taken component by component, as given by formulas (5) and (6).
The standard deviation of all elements in data set $S_{pp}$ and the standard deviation of the elements in data set $S_{st}$ are likewise taken component by component, as given by formulas (7) and (8).
In step 3, the upper and lower boundaries of the keystroke interval time characteristic curve and of the half temporal feature curve are calculated as follows:
Each data set has an upper boundary vector and a lower boundary vector for the elements it contains. The upper boundary $t_{u,j}$ and lower boundary $t_{d,j}$ of the keystroke interval time characteristic curve (j = 1, …, k) are calculated by formula (9), and the upper boundary $v_{u,l}$ and lower boundary $v_{d,l}$ of the half temporal feature curve (l = 1, …, 5) by formula (10), in each case from the corresponding mean and standard deviation.
Formulas (9) and (10) each contain an adjustable threshold.
In step 4, the weighted characteristic curve diversity factor of the keystroke interval times and the characteristic curve diversity factor of the half temporal features are calculated as follows:
For any keystroke interval time vector sample $V_s^{pp} \in R^k$, its weighted characteristic curve diversity factor in data set $S_{pp}$ is calculated by formula (11),
in which the distance by which each element lies beyond the upper or lower boundary is weighted,
where the weight $\lambda_j$ is the frequency of use of each specific double-key character sequence, j = 1, 2, …, k;
For any half temporal feature vector sample $V_s^{st} \in R^5$, its characteristic curve diversity factor in data set $S_{st}$ is calculated by formula (12),
which uses the unweighted distances beyond the boundaries.
The weighted characteristic curve diversity factor of each element of the keystroke interval time data set $S_{pp}$ is calculated from the frequency of use of each double-key in set SK and formula (11), and these values form the keystroke interval time characteristic curve diversity factor set $Q_{pp}$. The characteristic curve diversity factor of each element of the half temporal feature data set $S_{st}$ is calculated by formula (12), and these values form the half temporal feature curve diversity factor set $Q_{st}$. The two sets are defined by formulas (13) and (14),
whose members are, respectively, the weighted characteristic curve diversity factor of each element $V_i^{pp} \in R^k$ of $S_{pp}$ and the characteristic curve diversity factor of each element $V_i^{st} \in R^5$ of $S_{st}$.
In step 5, the user's identity is recognized from the weighted curve diversity factor and the characteristic curve diversity factor as follows:
A test sample is judged according to inequalities (15) and (16),
each of which contains an adjustable threshold;
If inequalities (15) and (16) both hold, the test sample is judged to belong to the user; otherwise, the test sample is judged not to belong to the user.
The thresholds in step 3 take values in the range 0 to 3.
The thresholds in step 5 take values not less than 0.
The beneficial effect of the invention is that, compared with traditional keystroke authentication algorithms that use only the keystroke duration and the keystroke time interval, the identity recognition method based on the weighted keystroke characteristic curve diversity factor performs better: it reduces the false rejection rate (FRR), the false acceptance rate (FAR) and the equal error rate (EER), and improves recognition accuracy.
Description of the drawings
Fig. 1 is the keystroke duration characteristic curve of the identity recognition method of the present invention;
Fig. 2 is the half temporal feature curve of the identity recognition method of the present invention;
Fig. 3 shows the upper and lower boundary curves of the keystroke features of data set $S_{pp}$;
Fig. 4 shows the upper and lower boundary curves of the keystroke features of data set $S_{st}$;
Fig. 5 shows how the performance index EER of the free-text keystroke feature authentication algorithm varies with TP;
Fig. 6 is a schematic diagram of the region division of a keystroke data set;
Fig. 7 is a schematic diagram of an internal sample;
Fig. 8 is a schematic diagram of an external sample;
Fig. 9 is a schematic diagram of the weighted characteristic curve diversity factor.
Specific embodiments
The present invention is described in detail below with reference to the drawings and specific embodiments.
The identity recognition method of the present invention, based on the weighted keystroke characteristic curve diversity factor, is implemented according to the following steps:
Step 1, collect data and establish the half temporal feature data set and the keystroke interval time data set. The specific steps are as follows:
1.1, k representative specific double-key character sequences are selected from the original keystroke information of free text to form the specific character sequence set SK;
1.2, the frequency of use $\lambda_j$, j = 1, 2, …, k, of each double-key is calculated, and the user's keystroke interval time data set $S_{pp}$ and half temporal feature data set $S_{st}$ are constructed. $S_{pp}$ and $S_{st}$ are expressed as follows:
$S_{pp} = \{ V_i^{pp} = [t_{i,1}, t_{i,2}, \ldots, t_{i,k}] \mid i = 1, 2, \ldots, m \}$ (1)
$S_{st} = \{ V_i^{st} = [WPM_i, P_{i,N\_UD}, P_{i,error}, P_{i,CapsLock}, P_{i,Shift}] \mid i = 1, 2, \ldots, n \}$ (2)
where k is the number of specific double-key character sequences selected; $V_i^{pp} \in R^k$ is the i-th keystroke interval time vector sample; $t_{i,k}$ is the keystroke interval time of the last specific double-key character sequence in the i-th sample; $t_{i,j}$ is the keystroke interval time of the j-th specific double-key character sequence in the i-th sample (j = 1, …, k); and m is the number of keystroke interval time vector samples collected. $V_i^{st} \in R^5$ is the i-th half temporal feature vector sample; $WPM_i$, $P_{i,N\_UD}$, $P_{i,error}$, $P_{i,CapsLock}$ and $P_{i,Shift}$ are, respectively, the average keystroke speed of the i-th sample, the frequency of occurrence of negative interval times (RP), the input error rate, the frequency of use of the Caps Lock key and the frequency of use of the Shift key. $P_{N\_UD}$, $P_{error}$, $P_{Shift}$ and $P_{CapsLock}$ range over [0, 1], whereas the average keystroke speed WPM ranges over [0, +∞). WPM is normally of the order of $10^2$, which differs markedly from the magnitude of the other half temporal features. n is the number of half temporal feature vector samples collected;
1.3, the average keystroke speed WPM in the half temporal feature data set $S_{st}$ is normalized by the formula
$v_{i,1} = WPM_i / WPM_{max}$ (3)
where $WPM_{max} = \max\{ WPM_i \mid i = 1, \ldots, n \}$ is the largest average keystroke speed among the samples. After normalization, the half temporal feature data set $S_{st}$ is abbreviated as
$S_{st} = \{ V_i^{st} = [v_{i,1}, v_{i,2}, v_{i,3}, v_{i,4}, v_{i,5}] \mid i = 1, 2, \ldots, n \}$ (4)
where $v_{i,1} = WPM_i / WPM_{max}$, $v_{i,2} = P_{i,N\_UD}$, $v_{i,3} = P_{i,error}$, $v_{i,4} = P_{i,CapsLock}$ and $v_{i,5} = P_{i,Shift}$.
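The normalization in step 1.3 can be sketched in Python as follows. This is a minimal illustration rather than the patent's implementation: the function name `normalize_wpm` and the sample values are hypothetical, and each sample is assumed to be a list ordered as [WPM, P_N_UD, P_error, P_CapsLock, P_Shift].

```python
from typing import List


def normalize_wpm(samples: List[List[float]]) -> List[List[float]]:
    """Divide the first component (WPM) of every sample by the largest WPM.

    The remaining four components already lie in [0, 1] and are copied as-is.
    """
    wpm_max = max(sample[0] for sample in samples)
    if wpm_max <= 0:
        raise ValueError("at least one sample must have a positive WPM")
    return [[sample[0] / wpm_max] + list(sample[1:]) for sample in samples]


# Hypothetical half temporal feature samples (values invented for illustration).
raw = [
    [120.0, 0.10, 0.02, 0.00, 0.05],
    [90.0, 0.08, 0.03, 0.01, 0.04],
    [150.0, 0.12, 0.01, 0.00, 0.06],
]
normalized = normalize_wpm(raw)
```

After this step all five components are on a comparable scale, which is the reason the text gives for normalizing WPM.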
Step 2, calculate the mean and standard deviation of the keystroke interval time data set and the mean and standard deviation of the half temporal feature data set. The calculation is as follows:
Any element $V_i^{pp}$ of data set $S_{pp}$ can be drawn as a curve whose abscissa is j and whose ordinate is the keystroke interval time of the j-th double-key, where j = 1, …, k. Similarly, any element $V_i^{st}$ of data set $S_{st}$ can be drawn as a curve whose abscissa is l and whose ordinate is $v_{i,l}$, where l = 1, …, 5. For ease of description, the curve of an element of $S_{pp}$ is called a keystroke interval time characteristic curve and the curve of an element of $S_{st}$ is called a half temporal feature curve; both may also be called keystroke characteristic curves.
The mean of all elements in data set $S_{pp}$ and the mean of all elements in data set $S_{st}$ are taken component by component, as given by formulas (5) and (6).
The standard deviation of all elements in data set $S_{pp}$ and the standard deviation of the elements in data set $S_{st}$ are likewise taken component by component, as given by formulas (7) and (8).
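A component-wise mean and standard deviation of this kind might be computed as in the sketch below. The function name `column_stats` and the sample data are hypothetical. The text does not say whether the standard deviation divides by the number of samples or by one less, so the population form (`statistics.pstdev`) is an assumption made here.

```python
from statistics import fmean, pstdev
from typing import List, Tuple


def column_stats(samples: List[List[float]]) -> Tuple[List[float], List[float]]:
    """Return (means, stds), one value per vector component.

    Assumption: population standard deviation (divide by the sample count).
    """
    columns = list(zip(*samples))  # transpose: one tuple per component
    means = [fmean(column) for column in columns]
    stds = [pstdev(column) for column in columns]
    return means, stds


# Hypothetical keystroke interval times in milliseconds for k = 2 double-keys.
s_pp = [[100.0, 200.0], [120.0, 180.0], [140.0, 220.0]]
means, stds = column_stats(s_pp)
```

The same function applies unchanged to the five-component half temporal feature samples.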
Step 3, calculate the upper and lower boundaries of the keystroke interval time characteristic curve from the mean and standard deviation of the keystroke interval time data set, and the upper and lower boundaries of the half temporal feature curve from the mean and standard deviation of the half temporal feature data set. The calculation is as follows:
Each data set has an upper boundary vector and a lower boundary vector for the elements it contains. The upper boundary $t_{u,j}$ and lower boundary $t_{d,j}$ of the keystroke interval time characteristic curve (j = 1, …, k) are calculated by formula (9), and the upper boundary $v_{u,l}$ and lower boundary $v_{d,l}$ of the half temporal feature curve (l = 1, …, 5) by formula (10).
Formulas (9) and (10) each contain an adjustable threshold, and both thresholds take values in the range 0 to 3;
This range is determined from the central limit theorem (that is, on the assumption that the collected keystroke time features follow a normal distribution). The larger the thresholds, the wider the band between the upper and lower boundaries and the more likely a sample is to fall inside it, so the FRR decreases and the FAR increases. The smaller the thresholds, the narrower the band and the less likely a sample is to fall inside it, so the FRR increases and the FAR decreases. The thresholds should be chosen so that the EER is as small as possible; within the range 0 to 3, a value of 2 is generally suitable.
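The boundary calculation can be illustrated with the sketch below. Formulas (9) and (10) are not legible in this text, so the form used here (mean plus or minus threshold times standard deviation) is an assumption inferred from the surrounding description of the normal-distribution reasoning and the 0 to 3 range. The function name `boundaries` and the parameter name `eps` are hypothetical.

```python
from typing import List, Tuple


def boundaries(
    means: List[float], stds: List[float], eps: float = 2.0
) -> Tuple[List[float], List[float]]:
    """Return (lower, upper) boundary vectors.

    Assumed form: lower = mean - eps * std, upper = mean + eps * std.
    eps is the adjustable threshold; the text restricts it to 0..3.
    """
    if not 0.0 <= eps <= 3.0:
        raise ValueError("eps is expected to lie in the range 0 to 3")
    lower = [m - eps * s for m, s in zip(means, stds)]
    upper = [m + eps * s for m, s in zip(means, stds)]
    return lower, upper


# Hypothetical means and standard deviations for two components.
lower, upper = boundaries([120.0, 0.5], [10.0, 0.1], eps=2.0)
```

A larger `eps` widens the band, which matches the trade-off described above: fewer genuine samples are rejected, but more impostor samples fall inside the boundaries.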
Step 4, calculate the weighted characteristic curve diversity factor of the keystroke interval times from the upper and lower boundaries of the keystroke interval time characteristic curve, and the half temporal feature curve diversity factor from the upper and lower boundaries of the half temporal feature curve. The calculation is as follows:
The upper and lower boundary curves of data sets $S_{pp}$ and $S_{st}$ divide the whole two-dimensional plane into an interior region and an exterior region, as shown in Fig. 6. For any keystroke interval time vector sample $V_s^{pp} = [t_{s,1}, \ldots, t_{s,k}]$, if $t_{d,j} \le t_{s,j} \le t_{u,j}$ holds for every j = 1, …, k, the sample lies entirely in the interior region of $S_{pp}$ and is called an internal sample of data set $S_{pp}$, as shown in Fig. 7; otherwise it is called an external sample of data set $S_{pp}$, as shown in Fig. 8. Similarly, for any half temporal feature vector sample $V_s^{st} = [v_{s,1}, \ldots, v_{s,5}]$, if $v_{d,l} \le v_{s,l} \le v_{u,l}$ holds for every l = 1, …, 5, the sample is called an internal sample of data set $S_{st}$; otherwise it is called an external sample of data set $S_{st}$.
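The internal/external distinction amounts to a component-wise range check, as in this small sketch (the function name `is_internal` and the numbers are hypothetical):

```python
from typing import Sequence


def is_internal(
    sample: Sequence[float], lower: Sequence[float], upper: Sequence[float]
) -> bool:
    """True if every component lies between its lower and upper boundary."""
    if not (len(sample) == len(lower) == len(upper)):
        raise ValueError("sample and boundaries must have the same length")
    return all(lo <= x <= hi for x, lo, hi in zip(sample, lower, upper))


lower = [100.0, 100.0, 100.0]
upper = [200.0, 200.0, 200.0]
inside = is_internal([120.0, 150.0, 200.0], lower, upper)
outside = is_internal([120.0, 150.0, 210.0], lower, upper)
```

A single component outside its band is enough to make the sample external.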
It follows from this definition that, when a sample is an external sample of a data set, its characteristic curve and the upper or lower boundary curve of that data set necessarily enclose several closed regions in the exterior region, as shown by the shaded areas in Fig. 8. The larger the total area of these closed regions, the more the sample differs from the data set and the more likely it is that the sample does not belong to it. Taking the properties of free-text keystroke feature information into account, this chapter modifies the keystroke characteristic curve diversity factor used for fixed text and proposes the concept of a weighted keystroke characteristic curve diversity factor, which links the diversity factor to the frequency of use of the specific double-key character sequences.
In research on fixed-text keystroke features, the physical meaning of the keystroke characteristic curve diversity factor of a sample is the sum of the areas of all closed regions formed in the exterior region by the sample's keystroke characteristic curve and the upper or lower boundary curve of the corresponding data set. The area of each closed region depends mainly on the distance by which each element of the feature vector lies beyond the upper or lower boundary, such as $d_2$, $d_4$ and $d_7$ in Fig. 9. Because the specific double-key character sequences selected from free text differ in frequency of use, the distance by which each element lies beyond the boundary is multiplied by a corresponding weight coefficient, such as $\lambda_2 d_2$, $\lambda_4 d_4$ and $\lambda_7 d_7$ in Fig. 8, so that the fluctuation tolerance of a specific double-key keystroke interval time is inversely proportional to its frequency of use. The weight coefficient of each element is proportional to the frequency of use of the corresponding double-key character sequence; in general, the frequency of use can be taken directly as the weight coefficient.
Compared with the keystroke characteristic curve diversity factor for fixed text, the weighted keystroke characteristic curve diversity factor obtained in this way has one distinctive property: when two elements of the feature vector lie beyond the upper or lower boundary by the same absolute distance, the element with the larger weight coefficient changes the diversity factor more than the element with the smaller weight coefficient. Because the keystroke interval time of a frequently used double-key is more stable and fluctuates less than that of a rarely used one, the two should be treated differently, so that the fluctuation tolerance of a frequently used double-key is smaller than that of a rarely used one. The weighted keystroke characteristic curve diversity factor is therefore better suited to free-text keystroke feature authentication.
For any keystroke interval time vector sample $V_s^{pp} \in R^k$, its weighted characteristic curve diversity factor in data set $S_{pp}$ is calculated by formula (11),
in which the distance by which each element lies beyond the upper or lower boundary is weighted,
where the weight $\lambda_j$ is the frequency of use of each specific double-key character sequence, j = 1, 2, …, k;
For any half temporal feature vector sample $V_s^{st} \in R^5$, its characteristic curve diversity factor in data set $S_{st}$ is calculated by formula (12),
which uses the unweighted distances beyond the boundaries.
The weighted characteristic curve diversity factor of each element of the keystroke interval time data set $S_{pp}$ is calculated from the frequency of use of each double-key in set SK and formula (11), and these values form the keystroke interval time characteristic curve diversity factor set $Q_{pp}$. The characteristic curve diversity factor of each element of the half temporal feature data set $S_{st}$ is calculated by formula (12), and these values form the half temporal feature curve diversity factor set $Q_{st}$. The two sets are defined by formulas (13) and (14),
whose members are, respectively, the weighted characteristic curve diversity factor of each element $V_i^{pp} \in R^k$ of $S_{pp}$ and the characteristic curve diversity factor of each element $V_i^{st} \in R^5$ of $S_{st}$.
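One way to read the description above is sketched below. Formulas (11) and (12) are not legible in this text, so this is an assumption rather than the patent's formula: the diversity factor is taken as the sum, over all components, of the distance beyond the nearer boundary, multiplied by the weight of that component. The function names `exceedance` and `curve_diversity` and all numbers are hypothetical; the true formula may compute enclosed areas differently.

```python
from typing import Optional, Sequence


def exceedance(x: float, lower: float, upper: float) -> float:
    """Distance by which x lies outside [lower, upper]; 0.0 if inside."""
    if x > upper:
        return x - upper
    if x < lower:
        return lower - x
    return 0.0


def curve_diversity(
    sample: Sequence[float],
    lower: Sequence[float],
    upper: Sequence[float],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Assumed form: sum of weight * exceedance over all components.

    With weights=None every weight is 1.0 (the unweighted half temporal case).
    """
    if weights is None:
        weights = [1.0] * len(sample)
    if not (len(sample) == len(lower) == len(upper) == len(weights)):
        raise ValueError("all inputs must have the same length")
    return sum(
        w * exceedance(x, lo, hi)
        for x, lo, hi, w in zip(sample, lower, upper, weights)
    )


lower = [100.0, 100.0, 100.0]
upper = [200.0, 200.0, 200.0]
lam = [0.5, 0.3, 0.2]  # hypothetical frequencies of use of three double-keys
d_weighted = curve_diversity([90.0, 150.0, 210.0], lower, upper, lam)
d_plain = curve_diversity([90.0, 150.0, 210.0], lower, upper)
d_internal = curve_diversity([150.0, 150.0, 150.0], lower, upper, lam)
```

In this example the first and third components both lie 10 beyond a boundary, but the first contributes more to `d_weighted` because its double-key is used more often, which is the behavior the text attributes to the weighting.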
Step 5, recognize the user's identity using the weighted curve diversity factor and the characteristic curve diversity factor. The specific method is as follows:
If samples $V^{pp}$ and $V^{st}$ are internal samples of sets $S_{pp}$ and $S_{st}$ respectively, their characteristic curve diversity factor is defined as zero; otherwise, the characteristic curve diversity factor of a sample equals the sum of the areas of all closed regions formed in the exterior region by the sample's characteristic curve and the upper or lower boundary curve of the corresponding data set.
The test sample is then judged according to inequalities (15) and (16),
in which the thresholds are adjustable and take values not less than 0;
If inequalities (15) and (16) both hold, the test sample is judged to belong to the user; otherwise, the test sample is judged not to belong to the user.
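The decision rule can be sketched as below. Inequalities (15) and (16) are not legible in this text, so the comparison used here (each diversity factor must not exceed its own threshold) is an assumption; the real inequalities may also involve the sets $Q_{pp}$ and $Q_{st}$. The function and parameter names are hypothetical.

```python
def belongs_to_user(
    d_pp: float, d_st: float, threshold_pp: float, threshold_st: float
) -> bool:
    """Accept the test sample only if both assumed inequalities hold.

    d_pp: weighted diversity factor of the keystroke interval time sample.
    d_st: diversity factor of the half temporal feature sample.
    Both thresholds are adjustable and must not be negative.
    """
    if threshold_pp < 0 or threshold_st < 0:
        raise ValueError("thresholds must not be less than 0")
    return d_pp <= threshold_pp and d_st <= threshold_st


accepted = belongs_to_user(d_pp=3.5, d_st=0.1, threshold_pp=5.0, threshold_st=0.2)
rejected = belongs_to_user(d_pp=3.5, d_st=0.4, threshold_pp=5.0, threshold_st=0.2)
```

Requiring both conditions means a sample that matches on interval times but not on the half temporal features is still rejected.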
Embodiment 1
A specific example of user identity recognition is described below.
Step 1: collect data and establish the half temporal feature data set and the keystroke interval time data set
The experimental data were collected mainly on PCs running Windows, with a conventional mechanical keyboard as the keystroke collection device. A user keystroke information collection program was written in the VC++6.0 development environment; it stores the keystroke information of a user typing freely on the keyboard in a specified file. Before data collection began, the program was installed on the computer of each user taking part in the experiment. During data collection, users were asked to run the program each time they turned on the computer; the program interface is shown in Fig. 9. After the user clicks the [Start] button, the program collects the user's free keystroke information in the background and stores it in the file key_record.txt. The program does not interfere with the user's normal use of the computer during collection. Before shutting down the computer, the user clicks the [End] button to exit the program.
After the raw data of all participants had been collected, the double-key character sequences each participant habitually uses, together with their numbers of uses (frequencies), were extracted; the statistics are shown in Table 1.
Table 1
Table 1 lists, for each participant, the 15 most frequently used double-key character sequences during the experiment (sorted from most to least frequent) and their numbers of uses. Analysis shows that the double-key character sequences "in", "an", "ng", "zh", "wo", "en", "sh" and "ji" are common to all participants and are used frequently, which also indicates that these sequences are reasonably general. These 8 double-key character sequences are therefore chosen to form the specific character sequence set SK in the experiments of this chapter, that is, SK = {in, an, ng, zh, wo, en, sh, ji}.
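The selection of double-keys shared by all participants might look like the following sketch. It assumes each participant's keystroke log has already been reduced to a plain string of typed characters, which the text does not state. The function names are hypothetical, and normalizing the frequencies of use over the selected set only is also an assumption.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set


def digraph_counts(typed: str) -> Counter:
    """Count adjacent letter pairs (double-keys) in a string of typed characters."""
    typed = typed.lower()
    pairs = (a + b for a, b in zip(typed, typed[1:]) if a.isalpha() and b.isalpha())
    return Counter(pairs)


def shared_top_digraphs(logs: Iterable[str], top_n: int) -> Set[str]:
    """Double-keys that appear among the top_n most frequent of every log."""
    tops = [{d for d, _ in digraph_counts(log).most_common(top_n)} for log in logs]
    return set.intersection(*tops) if tops else set()


def usage_frequencies(counts: Counter, selected: List[str]) -> Dict[str, float]:
    """Relative frequency of each selected double-key (assumed normalization)."""
    total = sum(counts[d] for d in selected)
    if total == 0:
        raise ValueError("none of the selected double-keys occurs in the log")
    return {d: counts[d] / total for d in selected}


counts = digraph_counts("an an ng")
shared = shared_top_digraphs(["an an ng", "an an wa"], top_n=1)
lam = usage_frequencies(counts, ["an", "ng"])
```

In the experiment described above the equivalent of `top_n` is 15, and the shared set has 8 members.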
Once the specific character sequence set SK has been chosen, the frequency of use of each double-key in SK is calculated from the raw data and denoted $\lambda_j$, the frequency of use of the j-th double-key in SK, j = 1, 2, …, 8. While each participant types freely, one double-key keystroke interval time vector sample and one half temporal feature vector sample are collected at regular intervals. Together with the data in Table 1, this gives each participant at least 200 double-key keystroke interval time vector samples and 200 half temporal feature vector samples.
Step 2, calculate the mean and standard deviation of the keystroke interval time data set and the mean and standard deviation of the half temporal feature data set
The experimental scheme for free-text keystroke feature authentication is essentially the same as that for fixed text; only the keystroke feature information and the authentication algorithms used in the experiment differ.
The first 20%, 40%, 60% and 80% of the samples in each participant's keystroke data set are taken in turn as training samples to build that participant's keystroke feature model. For convenience in analysing the results, the variable TP denotes the percentage of the total number of samples that these training samples account for.
Then the remaining last 80%, 60%, 40% and 20% of the samples in each participant's keystroke data set are taken as test samples to calculate that participant's false rejection rate FRR.
Next, all samples of the other 9 participants are used as test samples to attack the participant's model, and the participant's false acceptance rate FAR is calculated.
This process is repeated until the FRR and FAR of all 10 users have been calculated. Finally, the averages of all participants' FRR and FAR are taken as the performance indices of the identity authentication algorithm.
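The evaluation protocol above can be sketched as follows. The function names are hypothetical, and each decision list is assumed to hold one boolean per test sample, `True` meaning the sample was accepted as belonging to the user.

```python
from typing import List, Sequence, Tuple


def split_by_tp(samples: Sequence, tp: float) -> Tuple[list, list]:
    """Use the first tp fraction of samples for training and the rest for testing."""
    if not 0.0 < tp < 1.0:
        raise ValueError("tp must lie strictly between 0 and 1")
    cut = round(len(samples) * tp)
    return list(samples[:cut]), list(samples[cut:])


def false_rejection_rate(genuine_accepted: List[bool]) -> float:
    """Fraction of the user's own test samples that were rejected."""
    return sum(1 for a in genuine_accepted if not a) / len(genuine_accepted)


def false_acceptance_rate(impostor_accepted: List[bool]) -> float:
    """Fraction of other participants' samples that were accepted."""
    return sum(1 for a in impostor_accepted if a) / len(impostor_accepted)


train, test = split_by_tp(list(range(10)), tp=0.2)
frr = false_rejection_rate([True, True, False, True])
far = false_acceptance_rate([False, False, True, False, False])
```

Averaging `frr` and `far` over all participants gives the overall performance indices described in the text.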
Step 4, user identity identification
Experiment has obtained user's sample size to account for the percentage TP of total number of samples amount being respectively 20%, 40%, 60% and In the case of 80%, false rejection rate (FRR), false acceptance rate (FAR) and the equal error rate (EER) of various algorithms, experiment knot Fruit is shown in Table 2.Through the experimental result in table 2 it is found that in the case where TP value is different, based on weighting keystroke characteristic curve difference The equal error rate (EER) of the identifying algorithm of degree is respectively 20.11%, 16.28%, 13.48% and 10.32%, significant excellent In other 2 kinds of alignment algorithms, accuracy height is authenticated, it is more preferable to the certification effect of characteristics of user keystroke.This is primarily due to Man Ha Keystroke interval time and half temporal characteristics are used only as keystroke characteristic progress user's body in distance algorithm and relative distance algorithm Part certification, and the identifying algorithm based on weighting keystroke characteristic curve diversity factor that this chapter is proposed is calculating weighting keystroke characteristic song Traditional keystroke interval time is not only contained during line diversity factor, also introduces the change rate and double bond of interval time The information such as the frequency of use of character string.Therefore the mentioned algorithm of this chapter can more accurately describe the keystroke characteristic of user, into And the accuracy rate of authentication can be improved.
The performance indicators of the free-text keystroke characteristic authentication algorithms are listed in Table 2, and the variation of the EER with TP is shown in Figure 5.
Table 2. Performance indicators of the free-text keystroke characteristic authentication algorithms
The above experimental results show that, compared with traditional keystroke authentication algorithms that use only the keystroke duration and the keystroke time interval, the user identity recognition method based on weighted keystroke characteristic curve diversity factor performs better: it reduces the false rejection rate (FRR), the false acceptance rate (FAR) and the equal error rate (EER), and improves the accuracy of identification.
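For reference, the equal error rate is the operating point at which FAR and FRR coincide. The Python sketch below approximates it by sweeping an acceptance threshold. It assumes, as a simplification of the two-threshold decision used by the method, that each sample carries a single difference score and is accepted when the score does not exceed the threshold. The function name and the scores are illustrative only.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Approximate the EER by trying every observed score as the threshold.

    Scores are difference values: a sample is accepted when score <= threshold.
    Returns the mean of FAR and FRR at the threshold where they are closest.
    """
    best_gap, best_eer = None, None
    for t in sorted(set(genuine_scores) | set(impostor_scores)):
        frr = sum(s > t for s in genuine_scores) / len(genuine_scores)
        far = sum(s <= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer

# Illustrative scores: genuine samples tend to differ less from the template.
genuine = [0.1, 0.2, 0.3, 0.6]
impostor = [0.4, 0.7, 0.8, 0.9]
eer = equal_error_rate(genuine, impostor)
```

With few samples the two rates rarely meet exactly, which is why the sketch reports the midpoint at the closest threshold rather than an exact crossing.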

Claims (8)

1. A user identity recognition method based on weighted keystroke characteristic curve diversity factor, characterized in that it is implemented according to the following steps:
Step 1: collect data and establish a half temporal characteristics data set and a keystroke interval time data set;
Step 2: calculate the mean and standard deviation of the keystroke interval time data set and the mean and standard deviation of the half temporal characteristics data set;
Step 3: calculate the upper and lower boundaries of the keystroke interval time characteristic curve from the mean and standard deviation of the keystroke interval time data set, and the upper and lower boundaries of the half temporal characteristic curve from the mean and standard deviation of the half temporal characteristics data set;
Step 4: calculate the weighted keystroke interval time characteristic curve diversity factor from the upper and lower boundaries of the keystroke interval time characteristic curve, and the half temporal characteristic curve diversity factor from the upper and lower boundaries of the half temporal characteristic curve;
Step 5: identify the user using the weighted characteristic curve diversity factor and the characteristic curve diversity factor.
2. The user identity recognition method based on weighted keystroke characteristic curve diversity factor according to claim 1, characterized in that step 1 is implemented as follows:
1.1. k representative specific double-key character strings (digraphs) are selected from the original keystroke information of the free text to form the specific character string set SK;
1.2. the frequency of use $\lambda_j$ ($j = 1, 2, \ldots, k$) of each double-key character string is calculated, and the user's keystroke interval time data set $S_{pp}$ and half temporal characteristics data set $S_{st}$ are constructed: $S_{pp}$ consists of the vector samples $V_i^{pp}$ ($i = 1, \ldots, m$), and $S_{st}$ consists of the vector samples $V_i^{st} = [WPM_i, P_{i,N\_UD}, P_{i,error}, P_{i,CapsLock}, P_{i,Shift}]$ ($i = 1, \ldots, n$),
where: k is the number of selected specific double-key character strings; $V_i^{pp} \in R^k$ is the i-th keystroke interval time vector sample, whose j-th component is the keystroke interval time of the j-th specific double-key character string in the i-th sample ($j = 1, \ldots, k$); m is the number of collected keystroke interval time vector samples; $V_i^{st} \in R^5$ is the i-th half temporal characteristics vector sample; $WPM_i$, $P_{i,N\_UD}$, $P_{i,error}$, $P_{i,CapsLock}$ and $P_{i,Shift}$ are respectively the average keystroke speed, the frequency of occurrence of negative interval times (RP), the input error rate, the Caps Lock key usage frequency and the Shift key usage frequency of the i-th sample; $P_{N\_UD}$, $P_{error}$, $P_{Shift}$ and $P_{CapsLock}$ vary within $[0, 1]$, while the average keystroke speed WPM varies within $[0, +\infty)$; in general the magnitude of WPM is $10^2$, which differs significantly from the magnitudes of the other half temporal characteristics; n is the number of collected half temporal characteristics vector samples;
1.3. the average keystroke speed WPM in the half temporal characteristics data set $S_{st}$ is normalized; the normalization formula is:
$$v_{i,1} = \frac{WPM_i}{\max\{WPM_i \mid i = 1, \ldots, n\}} \qquad (3)$$
where $\max\{WPM_i \mid i = 1, \ldots, n\}$ is the maximum average keystroke speed among the samples, denoted $WPM_{max}$. After normalization, the half temporal characteristics data set $S_{st}$ is abbreviated as
$$S_{st} = \{V_i^{st} = [v_{i,1}, v_{i,2}, v_{i,3}, v_{i,4}, v_{i,5}] \mid i = 1, 2, \ldots, n\} \qquad (4)$$
where $v_{i,1} = WPM_i / WPM_{max}$, $v_{i,2} = P_{i,N\_UD}$, $v_{i,3} = P_{i,error}$, $v_{i,4} = P_{i,CapsLock}$ and $v_{i,5} = P_{i,Shift}$.
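The normalization in step 1.3 can be sketched in Python as follows, assuming that formula (3) divides each sample's WPM by $WPM_{max}$; the formula itself is not reproduced in this text, so that reading is an assumption. The function name `normalize_wpm` and the sample values are hypothetical.

```python
def normalize_wpm(samples):
    """Scale the first component (WPM) of every 5-element sample by the largest WPM.

    Assumption: normalization is division by the maximum WPM over all samples.
    The remaining four components already lie in [0, 1] and are left unchanged.
    """
    wpm_max = max(s[0] for s in samples)
    if wpm_max <= 0:
        raise ValueError("at least one sample must have a positive WPM")
    return [[s[0] / wpm_max] + list(s[1:]) for s in samples]

# Each row: [WPM, P_N_UD, P_error, P_CapsLock, P_Shift] (illustrative values).
s_st = [
    [120.0, 0.10, 0.02, 0.00, 0.05],
    [90.0, 0.12, 0.03, 0.01, 0.04],
    [150.0, 0.08, 0.01, 0.00, 0.06],
]
normalized = normalize_wpm(s_st)
```

After this step all five components share the range [0, 1], so no single characteristic dominates the later diversity factor calculation merely because of its magnitude.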
3. The user identity recognition method based on weighted keystroke characteristic curve diversity factor according to claim 1, characterized in that, in step 2, the means and standard deviations of the keystroke interval time data set and of the half temporal characteristics data set are calculated as follows:
the mean of all elements in data set $S_{pp}$ and the mean of all elements in data set $S_{st}$ are calculated by formulas (5) and (6);
the standard deviation of all elements in data set $S_{pp}$ and the standard deviation of the elements in data set $S_{st}$ are calculated by formulas (7) and (8).
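A minimal Python sketch of the step 2 statistics follows. It assumes that the mean and standard deviation are taken component by component over the samples and that the population form of the standard deviation is intended; the corresponding formulas are not reproduced in this text, so both points are assumptions. `column_stats` is a hypothetical helper name.

```python
from statistics import fmean, pstdev

def column_stats(samples):
    """Return per-component means and population standard deviations.

    samples is a list of equal-length vectors (one vector per collected sample).
    """
    columns = list(zip(*samples))  # transpose: one tuple per component
    means = [fmean(c) for c in columns]
    stds = [pstdev(c) for c in columns]
    return means, stds

# Illustrative keystroke interval times (seconds) for k = 3 digraphs, m = 3 samples.
s_pp = [
    [0.10, 0.20, 0.30],
    [0.12, 0.18, 0.34],
    [0.14, 0.22, 0.32],
]
mu_pp, sigma_pp = column_stats(s_pp)
```

The same helper applies unchanged to the five-component half temporal characteristics samples. If the sample form of the standard deviation were intended instead, `statistics.stdev` would replace `pstdev`.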
4. The user identity recognition method based on weighted keystroke characteristic curve diversity factor according to claim 1, characterized in that, in step 3, the upper and lower boundaries of the keystroke interval time characteristic curve and of the half temporal characteristic curve are calculated as follows:
an upper boundary vector and a lower boundary vector are defined for the elements of data set $S_{pp}$ and for the elements of data set $S_{st}$; the upper and lower boundaries of the keystroke interval time characteristic curve are calculated by formula (9), and the upper boundary $v_{u,l}$ and lower boundary $v_{d,l}$ of the half temporal characteristic curve are calculated by formula (10);
the thresholds that appear in formulas (9) and (10) are adjustable.
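One plausible reading of step 3, offered only as an assumption because formulas (9) and (10) are not reproduced in this text, is that each boundary lies a threshold multiple of the standard deviation above or below the mean. The Python sketch below illustrates that reading; `curve_bounds` and the parameter name `theta` are hypothetical.

```python
def curve_bounds(means, stds, theta):
    """Return (upper, lower) boundary vectors under the assumed form mean ± theta * std.

    theta plays the role of the adjustable threshold; it must not be negative,
    otherwise the upper boundary would fall below the lower one.
    """
    if theta < 0:
        raise ValueError("theta must be non-negative")
    upper = [m + theta * s for m, s in zip(means, stds)]
    lower = [m - theta * s for m, s in zip(means, stds)]
    return upper, lower

# Illustrative means and standard deviations for two components.
upper, lower = curve_bounds([0.12, 0.20], [0.02, 0.01], theta=2.0)
```

Under this reading, a larger threshold widens the band between the two boundary curves, so fewer components of a test sample fall outside it.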
5. The user identity recognition method based on weighted keystroke characteristic curve diversity factor according to claim 1, characterized in that, in step 4, the weighted keystroke interval time characteristic curve diversity factor and the half temporal characteristic curve diversity factor are calculated as follows:
for any keystroke interval time vector sample, its weighted characteristic curve diversity factor in data set $S_{pp}$ is calculated by formula (11), where $\lambda_j$ is the frequency of use of each specific double-key character string, $j = 1, 2, \ldots, k$;
for any half temporal characteristics vector sample $V_s^{st} = [v_{s,1}, v_{s,2}, \ldots, v_{s,5}]$, its characteristic curve diversity factor in data set $S_{st}$ is calculated by formula (12);
the weighted characteristic curve diversity factor of each element in the keystroke interval time data set $S_{pp}$ is calculated from the frequency of use of each double-key character string in set SK and formula (11), and these values form the keystroke interval time characteristic curve diversity factor set $Q_{pp}$; the characteristic curve diversity factor of each element in the half temporal characteristics data set $S_{st}$ is calculated by formula (12), and these values form the half temporal characteristic curve diversity factor set $Q_{st}$;
each entry of $Q_{pp}$ is the weighted characteristic curve diversity factor of an element $V_i^{pp} \in R^k$ of data set $S_{pp}$, and each entry of $Q_{st}$ is the characteristic curve diversity factor of an element $V_i^{st} \in R^5$ of data set $S_{st}$.
6. The user identity recognition method based on weighted keystroke characteristic curve diversity factor according to claim 1, characterized in that, in step 5, the user is identified using the weighted characteristic curve diversity factor and the characteristic curve diversity factor as follows:
the test sample is judged according to inequalities (15) and (16), in which the thresholds are adjustable;
if inequalities (15) and (16) both hold, the test sample is determined to belong to the user; otherwise, the test sample is determined not to belong to the user.
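The decision rule can be sketched in Python as follows. Inequalities (15) and (16) are not reproduced in this text, so the sketch assumes that each one requires a diversity factor to stay at or below its adjustable threshold; the function and parameter names are hypothetical.

```python
def belongs_to_user(d_pp, d_st, theta_pp, theta_st):
    """Accept the test sample only if both assumed inequalities hold.

    d_pp: weighted characteristic curve diversity factor of the test sample
    d_st: half temporal characteristic curve diversity factor of the test sample
    theta_pp, theta_st: adjustable thresholds, required to be non-negative
    """
    if theta_pp < 0 or theta_st < 0:
        raise ValueError("thresholds must not be negative")
    return d_pp <= theta_pp and d_st <= theta_st

# Illustrative values: the first sample passes both checks, the second fails one.
accepted = belongs_to_user(d_pp=0.3, d_st=0.1, theta_pp=0.5, theta_st=0.2)
rejected = belongs_to_user(d_pp=0.3, d_st=0.4, theta_pp=0.5, theta_st=0.2)
```

Requiring both conditions at once means that a sample matching the user on interval times alone is still rejected when its half temporal characteristics deviate too far, and vice versa.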
7. The user identity recognition method based on weighted keystroke characteristic curve diversity factor according to claim 1, characterized in that the thresholds in step 4 take values in the range 0 to 3.
8. The user identity recognition method based on weighted keystroke characteristic curve diversity factor according to claim 5, characterized in that the thresholds in step 5 take values not less than 0.
CN201810644782.0A 2018-06-21 2018-06-21 User identity recognition method for weighting keystroke characteristic curve difference degree Expired - Fee Related CN109063431B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810644782.0A CN109063431B (en) 2018-06-21 2018-06-21 User identity recognition method for weighting keystroke characteristic curve difference degree

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810644782.0A CN109063431B (en) 2018-06-21 2018-06-21 User identity recognition method for weighting keystroke characteristic curve difference degree

Publications (2)

Publication Number Publication Date
CN109063431A true CN109063431A (en) 2018-12-21
CN109063431B CN109063431B (en) 2021-10-22

Family

ID=64821322

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810644782.0A Expired - Fee Related CN109063431B (en) 2018-06-21 2018-06-21 User identity recognition method for weighting keystroke characteristic curve difference degree

Country Status (1)

Country Link
CN (1) CN109063431B (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7649478B1 (en) * 2005-11-03 2010-01-19 Hyoungsoo Yoon Data entry using sequential keystrokes
CN101478401A (en) * 2009-01-21 2009-07-08 东北大学 Authentication method and system based on key stroke characteristic recognition
US20100257212A1 (en) * 2009-04-06 2010-10-07 Caption Colorado L.L.C. Metatagging of captions
CN103703433A (en) * 2011-05-16 2014-04-02 触摸式有限公司 User input prediction
CN104809377A (en) * 2015-04-29 2015-07-29 西安交通大学 Method for monitoring network user identity based on webpage input behavior characteristics
CN105429937A (en) * 2015-10-22 2016-03-23 同济大学 Identity authentication method and system based on keystroke behaviors

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
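ARWA ALSULTAN et al.: "Non-conventional keystroke dynamics for user authentication", Pattern Recognition Letters *
H. DAVOUDI et al.: "A New Distance Measure for Free Text Keystroke Authentication", 2009 14th International CSI Computer Conference *
SONG MENGLING et al.: "Free-text keystroke characteristic authentication and recognition method based on weighted relative distance" (基于加权相对距离的自由文本击键特征认证识别方法), Modern Computer (现代计算机) *
WANG LIN et al.: "User identity authentication method using keystroke characteristic curve diversity factor" (采用击键特征曲线差异度的用户身份认证方法), Computer Engineering and Applications (计算机工程与应用) *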

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111988294A (en) * 2020-08-10 2020-11-24 中国平安人寿保险股份有限公司 User identity recognition method, device, terminal and medium based on artificial intelligence
CN111988294B (en) * 2020-08-10 2022-04-12 中国平安人寿保险股份有限公司 User identity recognition method, device, terminal and medium based on artificial intelligence

Also Published As

Publication number Publication date
CN109063431B (en) 2021-10-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20211022