CN109063431A - Identity recognition method using the weighted keystroke characteristic curve diversity factor - Google Patents
- Publication number: CN109063431A
- Application number: CN201810644782.0A
- Authority: CN (China)
- Prior art keywords: keystroke, diversity factor, data set, interval time, temporal characteristics
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/30—Authentication, i.e. establishing the identity or authorisation of security principals
- G06F21/31—User authentication
- G06F21/316—User authentication by observing the pattern of computer usage, e.g. typical user behaviour
Landscapes
- Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Computer Hardware Design (AREA)
- General Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Social Psychology (AREA)
- Health & Medical Sciences (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Collating Specific Patterns (AREA)
- Complex Calculations (AREA)
Abstract
The invention discloses an identity recognition method using the weighted keystroke characteristic curve diversity factor. The specific steps are as follows: first, a keystroke interval time data set and a half temporal feature data set are extracted; then the mean and standard deviation of the keystroke interval time data set and of the half temporal feature data set are calculated, followed by the upper and lower boundaries of the keystroke interval time characteristic curve, the upper and lower boundaries of the half temporal characteristic curve, the weighted characteristic curve diversity factor of the keystroke interval time, and the half temporal characteristic curve diversity factor; finally, the user's identity is recognized using the weighted curve diversity factor and the characteristic curve diversity factor. Compared with traditional keystroke authentication algorithms that use only the keystroke duration and the keystroke time interval, the authentication algorithm based on the characteristic curve diversity factor performs better: it reduces the false rejection rate, the false acceptance rate and the equal error rate, and improves recognition accuracy.
Description
Technical field
The invention belongs to the technical field of biometric authentication methods and relates to a user identity recognition method using the weighted keystroke characteristic curve diversity factor.
Background art
In recent years, people have come to use a large number of online web applications, including social media platforms (such as Facebook, Twitter and Weibo), cloud storage services (such as Dropbox and Google Drive) and online games. However, the cybercrime that accompanies these web applications has quietly spread all over the world. In serious cases, offenders break into victims' accounts over the Internet and steal sensitive information, including passwords and financial assets. To address such theft, an additional biometric authentication mechanism can be built into online programs or devices to improve the security of user accounts. Among current computer security measures, one approach is traditional password-based identity verification, but passwords are easily leaked. Another replaces simple passwords with physical tokens (smart cards and the like), but this requires the system to be equipped with corresponding hardware, which increases cost, and physical tokens can be lost, stolen or copied. Because human biometric features are hard to replicate and hard to change, biometric recognition has become a research hotspot. Common biometric technologies include fingerprint recognition, face recognition and iris recognition. However, these technologies all require relatively expensive hardware, which makes them inconvenient to apply and difficult to popularize.
Keystroke dynamics authentication is a biometric technique that recognizes identity from keystroke features (such as keystroke latency and keystroke force). It collects keystroke data by monitoring the user's keyboard input and builds a classification model of the user's keystroke behavior features, from which the user's identity is determined. Keystroke dynamics authentication not only addresses the security problems of traditional password-based authentication; compared with other biometric technologies, it also needs no additional expensive hardware, and so offers low cost and high flexibility.
Summary of the invention
The object of the present invention is to provide an identity recognition method using the weighted keystroke characteristic curve diversity factor. It solves the problem that existing authentication methods recognize identity only from the magnitude of each keystroke feature contained in the keystroke feature vector, without using the rate of change between two adjacent feature values, which results in low accuracy.
The technical solution adopted by the invention is an identity recognition method using the weighted keystroke characteristic curve diversity factor, implemented according to the following steps:
Step 1: collect data and establish the half temporal feature data set and the keystroke interval time data set;
Step 2: calculate the mean and standard deviation of the keystroke interval time data set and the mean and standard deviation of the half temporal feature data set;
Step 3: calculate the upper and lower boundaries of the keystroke interval time characteristic curve from the mean and standard deviation of the keystroke interval time data set, and the upper and lower boundaries of the half temporal characteristic curve from the mean and standard deviation of the half temporal feature data set;
Step 4: calculate the weighted characteristic curve diversity factor of the keystroke interval time from the upper and lower boundaries of the keystroke interval time characteristic curve, and the half temporal characteristic curve diversity factor from the upper and lower boundaries of the half temporal characteristic curve;
Step 5: recognize the user's identity using the weighted curve diversity factor and the characteristic curve diversity factor.
The present invention is further characterized in that:
Step 1 is implemented as follows:
1.1. From the raw keystroke information of free text, screen out k representative specific two-key character sequences to form the specific character sequence set SK;
1.2. Calculate the frequency of use λ_j (j = 1, 2, ..., k) of each two-key sequence and construct the user's keystroke interval time data set S_pp and half temporal feature data set S_st, which are expressed as follows:
S_st = {V_i^st = [WPM_i, P_{i,N_UD}, P_{i,error}, P_{i,CapsLock}, P_{i,Shift}] | i = 1, 2, ..., n} (2)
where k is the number of specific two-key character sequences screened out; V_i^pp ∈ R^k is the i-th keystroke interval time vector sample, whose j-th component (j = 1, ..., k) is the keystroke interval time of the j-th specific two-key character sequence in the i-th sample, the last component being that of the k-th sequence; m is the number of keystroke interval time vector samples collected; V_i^st ∈ R^5 is the i-th half temporal feature vector sample; WPM_i, P_{i,N_UD}, P_{i,error}, P_{i,CapsLock} and P_{i,Shift} are, respectively, the average keystroke speed, the frequency of occurrence of negative RP interval times, the input error rate, the Caps Lock key usage frequency and the Shift key usage frequency of the i-th sample. P_{N_UD}, P_{error}, P_{Shift} and P_{CapsLock} vary over [0, 1], while the average keystroke speed WPM varies over [0, +∞). WPM is normally on the order of 10^2, which differs markedly from the magnitudes of the other half temporal features. n is the number of half temporal feature vector samples collected;
1.3. Normalize the average keystroke speed WPM in the half temporal feature data set S_st using formula (3), in which max{WPM_i | i = 1, ..., n} is the maximum average keystroke speed among the samples, denoted WPM_max. After normalization, the half temporal feature data set S_st is abbreviated as
S_st = {V_i^st = [v_{i,1}, v_{i,2}, v_{i,3}, v_{i,4}, v_{i,5}] | i = 1, 2, ..., n} (4)
where v_{i,1} is the normalized average keystroke speed, v_{i,2} = P_{i,N_UD}, v_{i,3} = P_{i,error}, v_{i,4} = P_{i,CapsLock}, v_{i,5} = P_{i,Shift}.
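Formulas (1) and (3) did not survive in this text. A plausible reconstruction from the surrounding definitions is sketched below; the component symbol t_{i,j}^{pp} is an assumed name, since the original symbol cannot be recovered here.

```latex
% Assumed notation: t_{i,j}^{pp} is the keystroke interval time of the
% j-th specific two-key sequence in the i-th sample.
S_{pp} = \left\{ V_i^{pp} = \left[ t_{i,1}^{pp}, t_{i,2}^{pp}, \ldots, t_{i,k}^{pp} \right]
         \;\middle|\; i = 1, 2, \ldots, m \right\} \tag{1}

v_{i,1} = \frac{WPM_i}{\max\{ WPM_i \mid i = 1, \ldots, n \}}
        = \frac{WPM_i}{WPM_{\max}} \tag{3}
```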
In Step 2, the mean and standard deviation of the keystroke interval time data set and of the half temporal feature data set are calculated as follows:
The mean of all elements in data set S_pp and the mean of all elements in data set S_st are given by formulas (5) and (6).
The standard deviation of all elements in data set S_pp and the standard deviation of the elements contained in data set S_st are given by formulas (7) and (8).
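Formulas (5) to (8) are missing from this text. One natural reading, given as a sketch only, is the component-wise mean and standard deviation below. The symbols \bar{t}_j^{pp}, \bar{v}_l, \sigma_j^{pp} and \sigma_l^{st} are assumed names, and the 1/m and 1/n normalization is also an assumption (the original may use m - 1 and n - 1).

```latex
\bar{t}_j^{pp} = \frac{1}{m} \sum_{i=1}^{m} t_{i,j}^{pp}, \quad j = 1, \ldots, k \tag{5}

\bar{v}_l = \frac{1}{n} \sum_{i=1}^{n} v_{i,l}, \quad l = 1, \ldots, 5 \tag{6}

\sigma_j^{pp} = \sqrt{ \frac{1}{m} \sum_{i=1}^{m} \left( t_{i,j}^{pp} - \bar{t}_j^{pp} \right)^2 } \tag{7}

\sigma_l^{st} = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} \left( v_{i,l} - \bar{v}_l \right)^2 } \tag{8}
```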
In Step 3, the upper and lower boundaries of the keystroke interval time characteristic curve and of the half temporal characteristic curve are calculated as follows:
For the elements contained in data set S_pp, define an upper boundary vector and a lower boundary vector; for the elements contained in data set S_st, likewise define an upper boundary vector and a lower boundary vector. The upper and lower boundaries of the keystroke interval time characteristic curve are then calculated by formula (9), and the upper boundary v_{u,l} and lower boundary v_{d,l} of the half temporal characteristic curve by formula (10). The two coefficients appearing in these formulas are adjustable thresholds.
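Formulas (9) and (10) are not reproduced in this text. Since the boundaries are computed from the mean and standard deviation with an adjustable threshold in the range 0 to 3, a likely form is "mean plus or minus threshold times standard deviation", sketched below. The symbols t_{u,j}^{pp}, t_{d,j}^{pp}, \theta_{pp}, \theta_{st}, and the mean and deviation symbols are assumed names.

```latex
% \bar{t}_j^{pp}, \sigma_j^{pp}: mean and standard deviation of the j-th
% component over S_pp; \bar{v}_l, \sigma_l^{st}: the same for S_st.
t_{u,j}^{pp} = \bar{t}_j^{pp} + \theta_{pp}\, \sigma_j^{pp}, \qquad
t_{d,j}^{pp} = \bar{t}_j^{pp} - \theta_{pp}\, \sigma_j^{pp}, \qquad j = 1, \ldots, k \tag{9}

v_{u,l} = \bar{v}_l + \theta_{st}\, \sigma_l^{st}, \qquad
v_{d,l} = \bar{v}_l - \theta_{st}\, \sigma_l^{st}, \qquad l = 1, \ldots, 5 \tag{10}
```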
In Step 4, the weighted characteristic curve diversity factor of the keystroke interval time and the half temporal characteristic curve diversity factor are calculated as follows:
For any keystroke interval time vector sample, its weighted characteristic curve diversity factor in data set S_pp is calculated by formula (11), where λ_j is the frequency of use of each specific two-key character sequence, j = 1, 2, ..., k.
For any half temporal feature vector sample, its characteristic curve diversity factor in data set S_st is calculated by formula (12).
Using the frequency of use of each two-key sequence in set SK and formula (11), calculate the weighted characteristic curve diversity factor of each element of the keystroke interval time data set S_pp; these values form the keystroke interval time characteristic curve diversity factor set Q_pp. Using formula (12), calculate the characteristic curve diversity factor of each element of the half temporal feature data set S_st; these values form the half temporal characteristic curve diversity factor set Q_st. The two sets are defined by formulas (13) and (14), whose members are, respectively, the weighted characteristic curve diversity factor of element V_i^pp ∈ R^k of S_pp and the characteristic curve diversity factor of element V_i^st ∈ R^5 of S_st.
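Formulas (11) and (12) are missing from this text. Based on the description that each out-of-bound distance is multiplied by the frequency of use of its two-key sequence, one simple form consistent with the text is sketched below. It is an assumption, not the patent's exact expression; the symbols Q_s^{pp}, Q_s^{st}, d_j^{pp} and d_l^{st} are assumed names.

```latex
% d_j^{pp}: distance by which the j-th component of the test sample
% exceeds the upper or lower boundary (zero when it lies inside).
Q_s^{pp} = \sum_{j=1}^{k} \lambda_j\, d_j^{pp}, \qquad
d_j^{pp} =
\begin{cases}
t_{s,j}^{pp} - t_{u,j}^{pp}, & t_{s,j}^{pp} > t_{u,j}^{pp} \\
t_{d,j}^{pp} - t_{s,j}^{pp}, & t_{s,j}^{pp} < t_{d,j}^{pp} \\
0, & \text{otherwise}
\end{cases} \tag{11}

Q_s^{st} = \sum_{l=1}^{5} d_l^{st}, \qquad
d_l^{st} =
\begin{cases}
v_{s,l} - v_{u,l}, & v_{s,l} > v_{u,l} \\
v_{d,l} - v_{s,l}, & v_{s,l} < v_{d,l} \\
0, & \text{otherwise}
\end{cases} \tag{12}
```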
In Step 5, the user's identity is recognized using the weighted curve diversity factor and the characteristic curve diversity factor as follows:
A test sample is judged according to inequalities (15) and (16), in which the two coefficients are adjustable thresholds.
If inequalities (15) and (16) both hold, the test sample is judged to belong to the user; otherwise, the test sample is judged not to belong to the user.
The thresholds in formulas (9) and (10) take values in the range 0 to 3.
The thresholds in inequalities (15) and (16) take values not less than 0.
The beneficial effect of the invention is that, compared with traditional keystroke authentication algorithms that use only the keystroke duration and the keystroke time interval, the identity recognition method using the weighted keystroke characteristic curve diversity factor performs better as a user authentication algorithm: it reduces the false rejection rate (FRR), the false acceptance rate (FAR) and the equal error rate (EER), and improves recognition accuracy.
Brief description of the drawings
Fig. 1 is the keystroke duration characteristic curve of the identity recognition method of the present invention;
Fig. 2 is the half temporal characteristic curve of the identity recognition method of the present invention;
Fig. 3 shows the upper and lower boundary curves of the keystroke features of data set S_pp;
Fig. 4 shows the upper and lower boundary curves of the keystroke features of data set S_st;
Fig. 5 is the curve of the performance indicator EER of the free-text keystroke feature authentication algorithm as a function of TP;
Fig. 6 is a schematic diagram of the region division of the keystroke data set;
Fig. 7 is a schematic diagram of an internal sample;
Fig. 8 is a schematic diagram of an external sample;
Fig. 9 is a schematic diagram of the weighted characteristic curve diversity factor.
Specific embodiment
The following describes the present invention in detail with reference to the accompanying drawings and specific embodiments.
The identity recognition method using the weighted keystroke characteristic curve diversity factor of the present invention is implemented according to the following steps:
Step 1: collect data and establish the half temporal feature data set and the keystroke interval time data set, as follows:
1.1. From the raw keystroke information of free text, screen out k representative specific two-key character sequences to form the specific character sequence set SK;
1.2. Calculate the frequency of use λ_j (j = 1, 2, ..., k) of each two-key sequence and construct the user's keystroke interval time data set S_pp and half temporal feature data set S_st, which are expressed as follows:
S_st = {V_i^st = [WPM_i, P_{i,N_UD}, P_{i,error}, P_{i,CapsLock}, P_{i,Shift}] | i = 1, 2, ..., n} (2)
where k is the number of specific two-key character sequences screened out; V_i^pp ∈ R^k is the i-th keystroke interval time vector sample, whose j-th component (j = 1, ..., k) is the keystroke interval time of the j-th specific two-key character sequence in the i-th sample, the last component being that of the k-th sequence; m is the number of keystroke interval time vector samples collected; V_i^st ∈ R^5 is the i-th half temporal feature vector sample; WPM_i, P_{i,N_UD}, P_{i,error}, P_{i,CapsLock} and P_{i,Shift} are, respectively, the average keystroke speed, the frequency of occurrence of negative RP interval times, the input error rate, the Caps Lock key usage frequency and the Shift key usage frequency of the i-th sample. P_{N_UD}, P_{error}, P_{Shift} and P_{CapsLock} vary over [0, 1], while the average keystroke speed WPM varies over [0, +∞). WPM is normally on the order of 10^2, which differs markedly from the magnitudes of the other half temporal features. n is the number of half temporal feature vector samples collected;
1.3. Normalize the average keystroke speed WPM in the half temporal feature data set S_st using formula (3), in which max{WPM_i | i = 1, ..., n} is the maximum average keystroke speed among the samples, denoted WPM_max. After normalization, the half temporal feature data set S_st is abbreviated as
S_st = {V_i^st = [v_{i,1}, v_{i,2}, v_{i,3}, v_{i,4}, v_{i,5}] | i = 1, 2, ..., n} (4)
where v_{i,1} is the normalized average keystroke speed, v_{i,2} = P_{i,N_UD}, v_{i,3} = P_{i,error}, v_{i,4} = P_{i,CapsLock}, v_{i,5} = P_{i,Shift}.
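The normalization in step 1.3 can be sketched in a few lines of Python. This is a minimal illustration, assuming each half temporal sample is stored as a list `[WPM, P_N_UD, P_error, P_CapsLock, P_Shift]` and that normalization divides WPM by the sample maximum WPM_max; the function name `normalize_half_temporal` and the sample values are hypothetical.

```python
def normalize_half_temporal(samples):
    """Scale the WPM component of each sample by the maximum WPM.

    samples: list of [WPM, P_N_UD, P_error, P_CapsLock, P_Shift].
    The other four components already lie in [0, 1] and are left unchanged.
    """
    wpm_max = max(s[0] for s in samples)
    if wpm_max <= 0:
        raise ValueError("maximum WPM must be positive")
    return [[s[0] / wpm_max] + list(s[1:]) for s in samples]


# Illustrative (made-up) samples.
raw = [
    [120.0, 0.10, 0.02, 0.00, 0.05],
    [60.0, 0.20, 0.04, 0.01, 0.03],
]
normalized = normalize_half_temporal(raw)
```

After this step every component of a sample lies in [0, 1], so WPM no longer dominates the other features by two orders of magnitude.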
Step 2: calculate the mean and standard deviation of the keystroke interval time data set and the mean and standard deviation of the half temporal feature data set, as follows:
Any element V_i^pp of data set S_pp can be represented as a curve whose abscissa is j and whose ordinate is the j-th keystroke interval time of the sample, where j = 1, ..., k. Similarly, any element V_i^st of data set S_st can be represented as a curve whose abscissa is l and whose ordinate is v_{i,l}, where l = 1, ..., 5. For ease of description, the curve of an element V_i^pp of S_pp is called a keystroke interval time characteristic curve and the curve of an element V_i^st of S_st is called a half temporal characteristic curve; both may also be referred to as keystroke characteristic curves.
The mean of all elements in data set S_pp and the mean of all elements in data set S_st are given by formulas (5) and (6).
The standard deviation of all elements in data set S_pp and the standard deviation of the elements contained in data set S_st are given by formulas (7) and (8).
Step 3: calculate the upper and lower boundaries of the keystroke interval time characteristic curve from the mean and standard deviation of the keystroke interval time data set, and the upper and lower boundaries of the half temporal characteristic curve from the mean and standard deviation of the half temporal feature data set, as follows:
For the elements contained in data set S_pp, define an upper boundary vector and a lower boundary vector; for the elements contained in data set S_st, likewise define an upper boundary vector and a lower boundary vector. The upper and lower boundaries of the keystroke interval time characteristic curve are then calculated by formula (9), and the upper boundary v_{u,l} and lower boundary v_{d,l} of the half temporal characteristic curve by formula (10). The two coefficients appearing in these formulas are adjustable thresholds with values in the range 0 to 3.
The value range of the thresholds is determined according to the central limit theorem (that is, on the assumption that the collected keystroke timing features follow a normal distribution). The larger the thresholds, the wider the band between the upper and lower boundaries and the higher the probability that a sample falls inside it, so FRR decreases and FAR increases. The smaller the thresholds, the narrower the band and the lower the probability that a sample falls inside it, so FRR increases and FAR decreases. The thresholds should be chosen so that EER is as small as possible; their value range is 0 to 3, and 2 is generally a suitable choice.
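Steps 2 and 3 can be illustrated together. The sketch below assumes the boundaries take the form "mean plus or minus threshold times standard deviation", computed per component; the exact formulas (5) to (10) are not reproduced in this text, so this form, the use of the population standard deviation (`statistics.pstdev`), and the names `boundaries` and `theta` are all assumptions.

```python
from statistics import mean, pstdev


def boundaries(samples, theta=2.0):
    """Return (lower, upper) boundary vectors for a list of feature vectors.

    Assumed form: mean -/+ theta * standard deviation, per component.
    theta is the adjustable threshold (the text suggests 0 to 3, typically 2).
    """
    columns = list(zip(*samples))          # one tuple per feature component
    means = [mean(c) for c in columns]
    stds = [pstdev(c) for c in columns]    # population std is an assumption
    lower = [m - theta * s for m, s in zip(means, stds)]
    upper = [m + theta * s for m, s in zip(means, stds)]
    return lower, upper


# Illustrative (made-up) keystroke interval samples in milliseconds, k = 2.
s_pp = [[100.0, 200.0], [120.0, 180.0], [110.0, 190.0]]
lower, upper = boundaries(s_pp, theta=2.0)
```

A larger `theta` widens the band, which matches the trade-off described above: fewer genuine samples fall outside (lower FRR) but more impostor samples fall inside (higher FAR).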
Step 4: calculate the weighted characteristic curve diversity factor of the keystroke interval time from the upper and lower boundaries of the keystroke interval time characteristic curve, and the half temporal characteristic curve diversity factor from the upper and lower boundaries of the half temporal characteristic curve, as follows:
The upper and lower boundary curves of data sets S_pp and S_st divide the whole two-dimensional plane into an interior region and an exterior region, as shown in Fig. 6. For any keystroke interval time vector sample, if every component lies between the corresponding lower and upper boundary values, the sample lies entirely within the interior region of S_pp and is called an internal sample of data set S_pp, as shown in Fig. 7; otherwise, it is called an external sample of data set S_pp, as shown in Fig. 8. Similarly, for any half temporal feature vector sample, if v_{d,l} ≤ v_{s,l} ≤ v_{u,l} holds for every l, it is called an internal sample of data set S_st; otherwise, it is called an external sample of data set S_st.
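The internal/external classification reduces to a component-wise range check. A minimal sketch follows, assuming the boundary vectors are plain Python lists of the same length as the sample; the function name `is_internal` and the numbers are illustrative.

```python
def is_internal(sample, lower, upper):
    """True if every component lies within [lower, upper], inclusive."""
    return all(lo <= x <= hi for x, lo, hi in zip(sample, lower, upper))


# Illustrative (made-up) boundary vectors and samples.
lower = [90.0, 170.0, 50.0]
upper = [130.0, 210.0, 80.0]

inside = is_internal([100.0, 200.0, 60.0], lower, upper)    # all within range
outside = is_internal([140.0, 200.0, 60.0], lower, upper)   # first exceeds upper
```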
From the above definitions, if a sample is an external sample of some data set, then in the corresponding exterior region the characteristic curve of the sample necessarily forms several closed regions with the upper or lower boundary curve of that data set, as shown by the shaded regions in Fig. 8. The larger the total area of all closed regions, the greater the difference between the sample and the data set, and the more likely it is that the sample does not belong to the data set. Taking into account the characteristics of free-text keystroke feature information, this chapter modifies the fixed-text keystroke characteristic curve diversity factor and proposes the concept of the weighted keystroke characteristic curve diversity factor, which is associated with the frequency of use of the specific two-key character sequences.
In the study of fixed-text keystroke features, the physical meaning of the keystroke characteristic curve diversity factor of a sample is the sum of the areas of all closed regions formed, in the exterior region, by the keystroke characteristic curve of the sample and the upper or lower boundary curve of the corresponding data set. The area of each closed region depends mainly on the distance by which each element of the feature vector exceeds the upper or lower boundary, such as d_2, d_4 and d_7 in Fig. 9. Because the specific two-key character sequences screened out of free text differ in frequency of use, the distance by which each element of the feature vector exceeds the upper or lower boundary is multiplied by a corresponding weight coefficient, such as λ_2 d_2, λ_4 d_4 and λ_7 d_7 in Fig. 8, so that the fluctuation tolerance of the keystroke interval time of a specific two-key sequence is inversely proportional to its frequency of use. The weight coefficient applied to each element of the feature vector is directly proportional to the frequency of use of the corresponding two-key character sequence; in general, the frequency of use can be taken directly as the weight coefficient.
Compared with the keystroke characteristic curve diversity factor for fixed text, the weighted keystroke characteristic curve diversity factor obtained in this way is distinctive in that, when any two elements of the feature vector exceed the upper or lower boundary by the same absolute distance, the element with the larger weight coefficient changes the characteristic curve diversity factor more than the element with the smaller weight coefficient. The keystroke interval times of frequently used two-key sequences are more stable and fluctuate less than those of rarely used ones, so the two should be treated differently: the fluctuation tolerance for frequently used two-key sequences should be smaller than for rarely used ones. The weighted keystroke characteristic curve diversity factor is therefore better suited to free-text keystroke feature authentication.
For any keystroke interval time vector sample, its weighted characteristic curve diversity factor in data set S_pp is calculated by formula (11), where λ_j is the frequency of use of each specific two-key character sequence, j = 1, 2, ..., k.
For any half temporal feature vector sample, its characteristic curve diversity factor in data set S_st is calculated by formula (12).
Using the frequency of use of each two-key sequence in set SK and formula (11), calculate the weighted characteristic curve diversity factor of each element of the keystroke interval time data set S_pp; these values form the keystroke interval time characteristic curve diversity factor set Q_pp. Using formula (12), calculate the characteristic curve diversity factor of each element of the half temporal feature data set S_st; these values form the half temporal characteristic curve diversity factor set Q_st. The two sets are defined by formulas (13) and (14), whose members are, respectively, the weighted characteristic curve diversity factor of each element of S_pp and the characteristic curve diversity factor of each element V_i^st ∈ R^5 of S_st.
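The weighting idea can be made concrete with a small sketch. Formulas (11) and (12) are not reproduced in this text, so the code below assumes the simplest form consistent with the description: the diversity factor is the sum of the distances by which each component exceeds its boundary, each multiplied by the frequency of use of the corresponding two-key sequence. The names `exceedances` and `weighted_diversity` are hypothetical.

```python
def exceedances(sample, lower, upper):
    """Distance by which each component lies outside [lower, upper]; 0 inside."""
    distances = []
    for x, lo, hi in zip(sample, lower, upper):
        if x > hi:
            distances.append(x - hi)
        elif x < lo:
            distances.append(lo - x)
        else:
            distances.append(0.0)
    return distances


def weighted_diversity(sample, lower, upper, weights=None):
    """Assumed form: sum of weight * out-of-bound distance per component.

    With weights=None every component counts equally, which corresponds to
    the unweighted diversity factor used for the half temporal features.
    """
    d = exceedances(sample, lower, upper)
    if weights is None:
        weights = [1.0] * len(d)
    return sum(w * x for w, x in zip(weights, d))


# Illustrative (made-up) values: two components out of bounds by 10 each.
lower = [90.0, 170.0, 50.0]
upper = [130.0, 210.0, 80.0]
sample = [140.0, 160.0, 60.0]
lam = [0.5, 0.25, 0.25]          # hypothetical frequencies of use

q_weighted = weighted_diversity(sample, lower, upper, lam)
q_plain = weighted_diversity(sample, lower, upper)
```

Here both out-of-bound distances are equal, yet the component with weight 0.5 contributes twice as much as the one with weight 0.25, which is the behavior the text describes for frequently used two-key sequences.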
Step 5: recognize the user's identity using the weighted curve diversity factor and the characteristic curve diversity factor, as follows:
If a keystroke interval time sample and a half temporal feature sample V^st are internal samples of sets S_pp and S_st respectively, their characteristic curve diversity factors are defined as zero. Otherwise, the characteristic curve diversity factor of a sample equals the sum of the areas of all closed regions formed, in the exterior region, by the characteristic curve of the sample and the upper or lower boundary curve of the corresponding data set.
The test sample is then judged according to inequalities (15) and (16), in which the two coefficients are adjustable thresholds with values not less than 0.
If inequalities (15) and (16) both hold, the test sample is judged to belong to the user; otherwise, the test sample is judged not to belong to the user.
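The decision rule can be sketched as follows. Inequalities (15) and (16) are not reproduced in this text, so the form used here (each diversity factor compared directly against a non-negative threshold) is an assumption; the original inequalities may scale the thresholds by statistics of the sets Q_pp and Q_st. The function name `belongs_to_user` and the parameter names are hypothetical.

```python
def belongs_to_user(q_pp, q_st, threshold_pp, threshold_st):
    """Accept only if BOTH diversity factors are within their thresholds.

    q_pp: weighted diversity factor of the keystroke interval sample.
    q_st: diversity factor of the half temporal sample.
    Assumed form of inequalities (15) and (16): q <= threshold.
    """
    if threshold_pp < 0 or threshold_st < 0:
        raise ValueError("thresholds must be not less than 0")
    return q_pp <= threshold_pp and q_st <= threshold_st


# Internal samples have diversity factor 0 and pass for any valid threshold.
accepted = belongs_to_user(0.0, 0.0, threshold_pp=0.0, threshold_st=0.0)
# One factor over its threshold is enough to reject.
rejected = belongs_to_user(7.5, 0.1, threshold_pp=5.0, threshold_st=0.5)
```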
Embodiment 1
This embodiment presents a specific example of user identity recognition.
Step 1: collect data and establish the half temporal feature data set and the keystroke interval time data set
Experimental data were collected mainly on PCs running Windows, with conventional mechanical keyboards as the keystroke acquisition devices. In addition, a user keystroke information acquisition program was written in the VC++ 6.0 development environment; it stores the keystroke information produced as the user types freely on the keyboard in a specified file. Before data collection began, the acquisition program was installed on the computer of each user taking part in the experiment. During data collection, users were asked to run the acquisition program each time they started their computer; the program's interface is shown in Fig. 9. Once the user clicks the [Start] button, the program collects the user's free keystroke information in the background and stores it in the file key_record.txt. The acquisition program does not disturb the user's normal use of the computer during data collection. Before shutting down the computer, the user clicks the [End] button to exit the acquisition program.
After the raw data of all participants had been collected, the two-key character sequences habitually used by each participant and their numbers of uses (frequencies) were extracted; the statistics are shown in Table 1.
Table 1
Table 1 lists, for each participant, the 15 most frequently used two-key character sequences during the experiment (sorted by frequency of use from high to low) and their numbers of uses. Analysis shows that the two-key character sequences "in", "an", "ng", "zh", "wo", "en", "sh" and "ji" are shared by all participants and are used frequently, which also indicates that these sequences have a certain generality. The experiments in this chapter therefore select these 8 two-key character sequences to form the specific character sequence set SK, i.e. SK = {in, an, ng, zh, wo, en, sh, ji}.
After the specific character sequence set SK is selected, the frequency of use of each two-key sequence in SK is calculated from the raw collected data and denoted λ_j, the frequency of use of the j-th two-key sequence in SK, j = 1, 2, ..., 8. While each participant types freely, one two-key keystroke interval time vector sample and one half temporal feature vector sample are collected at regular intervals. Combined with the data in Table 1, each participant has at least 200 two-key keystroke interval time vector samples and 200 half temporal feature vector samples.
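Counting two-key sequences and deriving λ_j can be sketched as below. This is an illustration only: the patent's acquisition program is not shown, the text does not say how frequencies are normalized, and normalizing over the sequences in SK (so that the λ_j sum to 1) is an assumption. The function names and the sample string are hypothetical.

```python
from collections import Counter

SK = ["in", "an", "ng", "zh", "wo", "en", "sh", "ji"]


def digraph_counts(text):
    """Count adjacent letter pairs; pairs spanning non-letters are skipped."""
    letters = text.lower()
    return Counter(
        a + b
        for a, b in zip(letters, letters[1:])
        if a.isalpha() and b.isalpha()
    )


def usage_frequencies(text, sk=SK):
    """Frequency of use of each sequence in sk, normalized over sk (assumed)."""
    counts = digraph_counts(text)
    total = sum(counts[d] for d in sk)
    if total == 0:
        return {d: 0.0 for d in sk}
    return {d: counts[d] / total for d in sk}


# Illustrative pinyin-style input.
lam = usage_frequencies("zhong wen shu ru")
```

In this toy input the sequences "zh", "ng", "en" and "sh" each occur once, so each gets frequency 0.25 and the other four get 0.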
Step 2: calculate the mean and standard deviation of the keystroke interval time data set and the mean and standard deviation of the half temporal feature data
The experimental scheme for free-text keystroke feature authentication is essentially the same as that for fixed text; only the keystroke feature information and the authentication algorithm used in the experiment differ.
The first 20%, 40%, 60% and 80% of the samples in each participant's keystroke data set are taken in turn to build the participant's keystroke feature model. For convenience in analyzing the results, the variable TP denotes the percentage of the participant's total samples used for this purpose.
Then the last 80%, 60%, 40% and 20% of the samples in each participant's keystroke data set are used, respectively, as test samples to calculate the participant's false rejection rate FRR.
Next, all samples of the other 9 participants are used as test samples to attack the participant's model, and the participant's false acceptance rate FAR is calculated.
This process is repeated until the FRR and FAR of all 10 users have been calculated. Finally, the averages of FRR and FAR over all participants are taken as the performance indicators of the authentication algorithm.
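The evaluation protocol can be sketched as follows. The code assumes a model has already produced accept/reject decisions for each test sample (True meaning "accepted as the user"); the helper names `split_by_tp`, `frr` and `far`, and the example decisions, are hypothetical and not taken from the patent.

```python
def split_by_tp(samples, tp):
    """First tp fraction of samples for training, the rest for testing."""
    cut = int(len(samples) * tp)
    return samples[:cut], samples[cut:]


def frr(genuine_decisions):
    """False rejection rate: fraction of genuine samples that were rejected."""
    return sum(1 for accepted in genuine_decisions if not accepted) / len(genuine_decisions)


def far(impostor_decisions):
    """False acceptance rate: fraction of impostor samples that were accepted."""
    return sum(1 for accepted in impostor_decisions if accepted) / len(impostor_decisions)


# Illustrative split with TP = 20% of 10 samples.
train, test = split_by_tp(list(range(10)), 0.2)

# Made-up decisions: 1 of 4 genuine samples rejected, 1 of 5 impostors accepted.
genuine = [True, True, False, True]
impostor = [False, False, True, False, False]
rates = (frr(genuine), far(impostor))
```

Averaging `frr` and `far` over all participants then gives the overall indicators described above.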
Step 4: user identification
The experiment yielded the false rejection rate (FRR), false acceptance rate (FAR) and equal error rate (EER) of each algorithm for TP, the percentage of a user's samples used for modelling, equal to 20%, 40%, 60% and 80%; the results are shown in Table 2. Table 2 shows that, for these four values of TP, the equal error rate (EER) of the authentication algorithm based on the weighted keystroke characteristic curve diversity factor is 20.11%, 16.28%, 13.48% and 10.32% respectively, which is significantly better than the other 2 comparison algorithms: authentication accuracy is higher and users' keystroke features are authenticated more effectively. The main reason is that the Manhattan distance algorithm and the relative distance algorithm use only the keystroke interval time and the half temporal features as keystroke features for user identity authentication, whereas the authentication algorithm proposed here, when calculating the weighted keystroke characteristic curve diversity factor, incorporates not only the traditional keystroke interval time but also information such as the rate of change of the interval time and the usage frequency of the digraph character strings. The proposed algorithm can therefore describe a user's keystroke features more accurately and thus improve the accuracy of identity authentication.
The performance indicators of the free-text keystroke feature authentication algorithms are shown in Table 2, and the variation of the performance indicator EER with TP is shown in Fig. 5.
Table 2: Performance indicators of the free-text keystroke feature authentication algorithms
The experimental results above show that, compared with traditional keystroke authentication algorithms that use only keystroke duration and keystroke time interval, the user identification method based on the weighted keystroke characteristic curve diversity factor performs better: it reduces the false rejection rate (FRR), false acceptance rate (FAR) and equal error rate (EER) and improves identification accuracy.
Claims (8)
1. A user identification method based on the weighted keystroke characteristic curve diversity factor, characterized in that it is implemented according to the following steps:
Step 1: collect data and establish a half temporal feature data set and a keystroke interval time data set;
Step 2: calculate the mean and standard deviation of the keystroke interval time data set and the mean and standard deviation of the half temporal feature data set;
Step 3: calculate the upper and lower boundaries of the keystroke interval time characteristic curve from the mean and standard deviation of the keystroke interval time data set, and the upper and lower boundaries of the half temporal feature curve from the mean and standard deviation of the half temporal feature data set;
Step 4: calculate the weighted characteristic curve diversity factor of the keystroke interval time from the upper and lower boundaries of the keystroke interval time characteristic curve, and the half temporal feature curve diversity factor from the upper and lower boundaries of the half temporal feature curve;
Step 5: identify the user using the weighted characteristic curve diversity factor and the characteristic curve diversity factor.
2. The user identification method based on the weighted keystroke characteristic curve diversity factor according to claim 1, characterized in that step 1 is implemented as follows:
1.1. k representative specific digraph (double-key) character strings are selected from the original free-text keystroke information to form the specific character string set SK;
1.2. The usage frequency $\lambda_j$, j = 1, 2, …, k, of each digraph is calculated, and the user's keystroke interval time data set $S_{pp}$ and half temporal feature data set $S_{st}$ are constructed. $S_{pp}$ and $S_{st}$ are expressed as follows:
$S_{pp} = \{V_i^{pp} \mid i = 1, 2, \ldots, m\}$
$S_{st} = \{V_i^{st} = [WPM_i, P_{i,N\_UD}, P_{i,error}, P_{i,CapsLock}, P_{i,Shift}] \mid i = 1, 2, \ldots, n\}$
where k is the number of specific digraph character strings selected; $V_i^{pp} \in R^k$ is the i-th keystroke interval time vector sample, whose j-th component is the keystroke interval time of the j-th specific digraph character string in the i-th sample (j = 1, …, k), the last component being that of the k-th digraph; m is the number of keystroke interval time vector samples collected; $V_i^{st} \in R^5$ is the i-th half temporal feature vector sample; $WPM_i$, $P_{i,N\_UD}$, $P_{i,error}$, $P_{i,CapsLock}$ and $P_{i,Shift}$ are, respectively, the average keystroke speed of the i-th sample, the frequency of occurrence of negative interval times (RP), the input error rate, the usage frequency of the Caps Lock key and the usage frequency of the Shift key. $P_{N\_UD}$, $P_{error}$, $P_{Shift}$ and $P_{CapsLock}$ vary within [0, 1], while the average keystroke speed WPM varies within [0, +∞). WPM is normally on the order of $10^2$, which differs significantly from the magnitude of the other half temporal features; n is the number of half temporal feature vector samples collected;
1.3. The average keystroke speed WPM in the half temporal feature data set $S_{st}$ is normalized by the formula:
$v_{i,1} = WPM_i / \max\{WPM_i \mid i = 1, \ldots, n\}$
where $\max\{WPM_i \mid i = 1, \ldots, n\}$ is the maximum average keystroke speed in the samples, denoted $WPM_{max}$. After normalization, the half temporal feature data set $S_{st}$ is abbreviated as
$S_{st} = \{V_i^{st} = [v_{i,1}, v_{i,2}, v_{i,3}, v_{i,4}, v_{i,5}] \mid i = 1, 2, \ldots, n\}$ (4)
where $v_{i,1}$ is the normalized average keystroke speed, $v_{i,2} = P_{i,N\_UD}$, $v_{i,3} = P_{i,error}$, $v_{i,4} = P_{i,CapsLock}$ and $v_{i,5} = P_{i,Shift}$.
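As an illustration of the normalization in step 1.3, the sketch below divides the average keystroke speed of every sample by the largest value in the data set, so that all five features lie in [0, 1]. The function name `normalize_wpm` and the sample values are hypothetical.

```python
def normalize_wpm(samples):
    """Normalize the first feature (WPM) of each half temporal feature
    vector by the maximum WPM in the data set.

    samples: list of [WPM, P_N_UD, P_error, P_CapsLock, P_Shift]
    """
    wpm_max = max(s[0] for s in samples)
    # The other four features already lie in [0, 1] and are kept as-is.
    return [[s[0] / wpm_max] + list(s[1:]) for s in samples]

# Two made-up samples, for illustration only.
raw = [[120, 0.10, 0.02, 0.00, 0.05],
       [60, 0.20, 0.01, 0.01, 0.03]]
normalized = normalize_wpm(raw)
```

After the call, the fastest sample has a first feature of 1.0 and the other samples are scaled relative to it.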
3. The user identification method based on the weighted keystroke characteristic curve diversity factor according to claim 1, characterized in that in step 2 the mean and standard deviation of the keystroke interval time data set and the mean and standard deviation of the half temporal feature data set are calculated as follows:
The mean of all elements in data set $S_{pp}$ and the mean of all elements in data set $S_{st}$ are calculated;
the standard deviation of all elements in data set $S_{pp}$ and the standard deviation of the elements contained in data set $S_{st}$ are calculated.
4. The user identification method based on the weighted keystroke characteristic curve diversity factor according to claim 1, characterized in that in step 3 the upper and lower boundaries of the keystroke interval time characteristic curve and of the half temporal feature curve are calculated as follows:
An upper boundary vector and a lower boundary vector are defined for the elements contained in data set $S_{pp}$, and likewise for the elements contained in data set $S_{st}$. The upper and lower boundaries of the keystroke interval time characteristic curve are calculated by formula (9), and the upper boundary $v_{u,l}$ and lower boundary $v_{d,l}$ of the half temporal feature curve are calculated by formula (10).
The two thresholds appearing in these formulas are adjustable.
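Formulas (9) and (10) are not reproduced in this text, so the following is only a plausible sketch under a stated assumption: each boundary is taken as the per-feature mean plus or minus an adjustable threshold times the per-feature standard deviation. The function name `curve_boundaries`, the parameter `theta` and the use of the population standard deviation are all assumptions, not the patent's definitions.

```python
from statistics import mean, pstdev

def curve_boundaries(samples, theta):
    """Per-feature upper and lower boundaries of a characteristic curve.

    Assumed form: boundary = mean +/- theta * standard deviation, with
    the population standard deviation (an assumption).
    """
    columns = list(zip(*samples))          # one tuple per feature
    mu = [mean(c) for c in columns]
    sigma = [pstdev(c) for c in columns]
    upper = [m + theta * s for m, s in zip(mu, sigma)]
    lower = [m - theta * s for m, s in zip(mu, sigma)]
    return upper, lower

# Made-up interval times (ms) for two digraphs over three samples.
interval_samples = [[100, 200], [120, 220], [140, 240]]
upper, lower = curve_boundaries(interval_samples, theta=1.0)
```

With `theta = 0` both boundaries collapse onto the mean curve; larger values widen the band within which a sample counts as typical for the user.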
5. The user identification method based on the weighted keystroke characteristic curve diversity factor according to claim 1, characterized in that in step 4 the weighted characteristic curve diversity factor of the keystroke interval time and the half temporal feature curve diversity factor are calculated as follows:
For any keystroke interval time vector sample, its weighted characteristic curve diversity factor in data set $S_{pp}$ is calculated by formula (11), where $\lambda_j$ is the usage frequency of each specific digraph character string, j = 1, 2, …, k;
For any half temporal feature vector sample $V_s^{st} = [v_{s,1}, v_{s,2}, \ldots, v_{s,5}]$, its characteristic curve diversity factor in data set $S_{st}$ is calculated by formula (12);
The weighted characteristic curve diversity factor of each element of the keystroke interval time data set $S_{pp}$ is calculated from the usage frequency of each digraph in set SK and formula (11), and these values form the keystroke interval time characteristic curve diversity factor set $Q_{pp}$; the characteristic curve diversity factor of each element of the half temporal feature data set $S_{st}$ is calculated by formula (12), and these values form the half temporal feature curve diversity factor set $Q_{st}$.
The elements of $Q_{pp}$ are the weighted characteristic curve diversity factors of the elements $V_i^{pp} \in R^k$ of data set $S_{pp}$, and the elements of $Q_{st}$ are the characteristic curve diversity factors of the elements $V_i^{st} \in R^5$ of data set $S_{st}$.
6. The user identification method based on the weighted keystroke characteristic curve diversity factor according to claim 1, characterized in that in step 5 the user is identified using the weighted characteristic curve diversity factor and the characteristic curve diversity factor as follows:
A test sample is judged according to inequalities (15) and (16), in which the two thresholds are adjustable;
If inequalities (15) and (16) both hold, the test sample is deemed to belong to the user; otherwise, the test sample is deemed not to belong to the user.
7. The user identification method based on the weighted keystroke characteristic curve diversity factor according to claim 1, characterized in that the two thresholds in step 4 take values in the range 0 to 3.
8. The user identification method based on the weighted keystroke characteristic curve diversity factor according to claim 5, characterized in that the two thresholds in step 5 take values not less than 0.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810644782.0A CN109063431B (en) | 2018-06-21 | 2018-06-21 | User identity recognition method for weighting keystroke characteristic curve difference degree |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109063431A true CN109063431A (en) | 2018-12-21 |
CN109063431B CN109063431B (en) | 2021-10-22 |
Family
ID=64821322
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810644782.0A Expired - Fee Related CN109063431B (en) | 2018-06-21 | 2018-06-21 | User identity recognition method for weighting keystroke characteristic curve difference degree |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109063431B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111988294A (en) * | 2020-08-10 | 2020-11-24 | 中国平安人寿保险股份有限公司 | User identity recognition method, device, terminal and medium based on artificial intelligence |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101478401A (en) * | 2009-01-21 | 2009-07-08 | 东北大学 | Authentication method and system based on key stroke characteristic recognition |
US7649478B1 (en) * | 2005-11-03 | 2010-01-19 | Hyoungsoo Yoon | Data entry using sequential keystrokes |
US20100257212A1 (en) * | 2009-04-06 | 2010-10-07 | Caption Colorado L.L.C. | Metatagging of captions |
CN103703433A (en) * | 2011-05-16 | 2014-04-02 | 触摸式有限公司 | User input prediction |
CN104809377A (en) * | 2015-04-29 | 2015-07-29 | 西安交通大学 | Method for monitoring network user identity based on webpage input behavior characteristics |
CN105429937A (en) * | 2015-10-22 | 2016-03-23 | 同济大学 | Identity authentication method and system based on keystroke behaviors |
Non-Patent Citations (4)
Title |
---|
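ARWA ALSULTAN et al.: "Non-conventional keystroke dynamics for user authentication", 《PATTERN RECOGNITION LETTERS》 * |
H. DAVOUDI et al.: "A New Distance Measure for Free Text Keystroke Authentication", 《2009 14TH INTERNATIONAL CSI COMPUTER CONFERENCE》 * |
SONG Mengling (宋梦玲) et al.: "Free-text keystroke feature authentication and recognition method based on weighted relative distance", 《Modern Computer (现代计算机)》 * |
WANG Lin (王林) et al.: "User identity authentication method using keystroke characteristic curve diversity factor", 《Computer Engineering and Applications (计算机工程与应用)》 * |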
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111988294A (en) * | 2020-08-10 | 2020-11-24 | 中国平安人寿保险股份有限公司 | User identity recognition method, device, terminal and medium based on artificial intelligence |
CN111988294B (en) * | 2020-08-10 | 2022-04-12 | 中国平安人寿保险股份有限公司 | User identity recognition method, device, terminal and medium based on artificial intelligence |
Also Published As
Publication number | Publication date |
---|---|
CN109063431B (en) | 2021-10-22 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
Granted publication date: 20211022 |