CN109063431B - User identity recognition method for weighting keystroke characteristic curve difference degree - Google Patents
- Publication number
- CN109063431B (application number CN201810644782.0A)
- Authority
- CN
- China
- Prior art keywords
- keystroke
- data set
- characteristic curve
- time
- interval time
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/30—Authentication, i.e. establishing the identity or authorisation of security principals
- G06F21/31—User authentication
- G06F21/316—User authentication by observing the pattern of computer usage, e.g. typical user behaviour
Landscapes
- Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Computer Hardware Design (AREA)
- General Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Social Psychology (AREA)
- Health & Medical Sciences (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Collating Specific Patterns (AREA)
- Complex Calculations (AREA)
Abstract
The invention discloses a user identity recognition method based on the weighted keystroke characteristic curve difference degree, comprising the following steps: first, extracting a keystroke interval time data set and a half-time feature data set; then calculating the mean and standard deviation of the keystroke interval time data set and of the half-time feature data set, the upper/lower boundaries of the keystroke interval time characteristic curve and of the half-time characteristic curve, and the keystroke interval time weighted characteristic curve difference degree and the half-time characteristic curve difference degree; and finally recognizing the user identity using the weighted curve difference degree and the characteristic curve difference degree. Compared with traditional keystroke authentication algorithms that use only keystroke duration and keystroke time interval, the method performs better: it reduces the false rejection rate, the false acceptance rate and the equal error rate, and improves recognition accuracy.
Description
Technical Field
The invention belongs to the technical field of biometric authentication methods, and relates to a user identity identification method adopting a weighted keystroke characteristic curve difference degree.
Background
In recent years, people have come to use a large number of online web applications, including social media platforms (e.g., Facebook, Twitter, Weibo), cloud storage services (e.g., Dropbox, Google Drive), and online games. However, cybercrime arising from these web applications has quietly spread around the world. In serious cases, criminals use the Internet to break into a victim's account and steal sensitive information, including passwords and financial assets. To address such theft, additional biometric authentication mechanisms are introduced into online programs or devices to improve the security of user accounts. Among current computer security measures, one is traditional password-based authentication, but passwords are easily leaked; another is to replace simple passwords with physical tokens (smart cards, etc.), but this requires the system to be equipped with corresponding hardware, which increases cost, and the tokens themselves can be lost, stolen or duplicated. Because human biometric features are hard to reproduce and hard to change, biometric identification has become a research hotspot. Common biometric technologies include fingerprint recognition, face recognition and iris recognition. However, these technologies all require costly hardware, which makes them inconvenient to apply and difficult to popularize.
Keystroke dynamics identity authentication is a biometric authentication technology that recognizes identity from keystroke features (such as keystroke latency and keystroke force): it monitors the user's keyboard input, collects keystroke data, and builds a classification model of the user's keystroke behavior to identify the user. Compared with other biometric technologies, keystroke dynamics authentication has the advantages of low cost and high flexibility and needs no additional expensive hardware.
Disclosure of Invention
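The invention aims to provide a user identity recognition method using the weighted keystroke characteristic curve difference degree, which solves the problem that existing authentication methods perform identity recognition using only the magnitude of each keystroke feature contained in the keystroke feature vector, without using the rate of change between two adjacent feature values, and therefore have low accuracy.
The technical scheme adopted by the invention is a user identity recognition method for weighting the keystroke characteristic curve difference degree, implemented according to the following steps:
step 1, collecting data, and establishing a half-time feature data set and a keystroke interval time data set;
step 2, respectively calculating the mean and standard deviation of the keystroke interval time data set and the mean and standard deviation of the half-time feature data set;
step 3, calculating the upper/lower boundaries of the keystroke interval time characteristic curve and the upper/lower boundaries of the half-time characteristic curve;
step 4, calculating the keystroke interval time weighted characteristic curve difference degree and the half-time characteristic curve difference degree;
step 5, recognizing the user identity using the weighted curve difference degree and the characteristic curve difference degree.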
The present invention is also characterized in that,
the step 1 comprises the following concrete implementation steps:
1.1, screening k representative specific double-key character sequences from original keystroke information of a free text to form a specific character sequence set SK;
1.2, calculating the usage frequency $\lambda_j$ ($j = 1, 2, \ldots, k$) of each double key, and constructing the user's keystroke interval time data set $S_{pp}$ and half-time feature data set $S_{st}$, expressed as follows:
$S_{pp} = \{ V_i^{pp} = [t_{i,1}^{pp}, t_{i,2}^{pp}, \ldots, t_{i,k}^{pp}] \mid i = 1, 2, \ldots, m \}$ (1)
$S_{st} = \{ V_i^{st} = [WPM_i, P_{i,N\_UD}, P_{i,error}, P_{i,CapsLock}, P_{i,Shift}] \mid i = 1, 2, \ldots, n \}$ (2)
wherein: $k$ is the number of selected specific double-key character sequences; $V_i^{pp} \in R^k$ is the $i$-th keystroke interval time vector sample; $t_{i,k}^{pp}$ is the keystroke interval time of the last specific double-key character sequence in the $i$-th sample; $t_{i,j}^{pp}$ is the keystroke interval time of the $j$-th specific double-key character sequence in the $i$-th sample ($j = 1, \ldots, k$); $m$ is the number of collected keystroke interval time vector samples; $V_i^{st} \in R^5$ is the $i$-th half-time feature vector sample; $WPM_i$, $P_{i,N\_UD}$, $P_{i,error}$, $P_{i,CapsLock}$ and $P_{i,Shift}$ are, respectively, the average keystroke speed, the occurrence frequency of negative interval times (RP), the input error rate, the CapsLock key usage frequency and the Shift key usage frequency of the $i$-th sample; $P_{N\_UD}$, $P_{error}$, $P_{Shift}$ and $P_{CapsLock}$ range over $[0, 1]$, whereas the average keystroke speed WPM ranges over $[0, +\infty)$ and is typically of the order of $10^2$, a magnitude clearly different from that of the other half-time features; $n$ is the number of collected half-time feature vector samples;
1.3, normalizing the average keystroke speed WPM in the half-time feature data set $S_{st}$ according to the following formula:
$v_{i,1} = WPM_i / WPM_{max}$ (3)
wherein $WPM_{max} = \max\{WPM_i \mid i = 1, \ldots, n\}$ is the maximum average keystroke speed among the samples; after normalization, the half-time feature data set $S_{st}$ is written concisely as
$S_{st} = \{ V_i^{st} = [v_{i,1}, v_{i,2}, v_{i,3}, v_{i,4}, v_{i,5}] \mid i = 1, 2, \ldots, n \}$ (4)
the method for calculating the mean and standard deviation of the keystroke interval time data set and the mean and standard deviation of the half-time feature data set in step 2 is as follows:
the mean of all elements contained in data set $S_{pp}$ and the mean of all elements contained in data set $S_{st}$ are given by equations (5) and (6), respectively;
the standard deviation of all elements contained in data set $S_{pp}$ and the standard deviation of the elements contained in data set $S_{st}$ are given by equations (7) and (8), respectively.
The method for calculating the upper/lower boundaries of the keystroke interval time characteristic curve and the upper/lower boundaries of the half-time characteristic curve in step 3 is as follows:
the upper and lower boundary vectors of the elements contained in data set $S_{pp}$, that is, the upper and lower boundaries of the keystroke interval time characteristic curve, are calculated by equation (9); the upper and lower boundary vectors of the elements contained in data set $S_{st}$, that is, the upper boundary $v_{u,l}$ and lower boundary $v_{d,l}$ of the half-time characteristic curve, are calculated by equation (10).
The method for calculating the keystroke interval time weighted characteristic curve difference degree and the half-time characteristic curve difference degree in step 4 is as follows:
for any keystroke interval time vector sample, its weighted characteristic curve difference degree in data set $S_{pp}$ is calculated by equation (11), in which $\lambda_j$ is the usage frequency of each specific double-key character sequence, $j = 1, 2, \ldots, k$;
for any half-time feature vector sample, its characteristic curve difference degree in data set $S_{st}$ is calculated by equation (12);
the weighted characteristic curve difference degree of each element of the keystroke interval time data set $S_{pp}$ is calculated from the usage frequency of each double key in set SK and equation (11), and these values form the keystroke interval time characteristic curve difference degree set $Q_{pp}$; the characteristic curve difference degree of each element of the half-time feature data set $S_{st}$ is calculated from equation (12), and these values form the half-time characteristic curve difference degree set $Q_{st}$. The two sets are defined by equations (13) and (14); their members are, respectively, the weighted characteristic curve difference degree of each element $V_i^{pp} \in R^k$ of $S_{pp}$ and the characteristic curve difference degree of each element $V_i^{st} \in R^5$ of $S_{st}$.
The method for identifying the user identity by using the difference degree of the weighting curve and the difference degree of the characteristic curve in the step 5 comprises the following steps:
the test sample is judged according to the following inequality
if inequality (15) and equation (16) are both true, the test sample is determined to belong to the user; otherwise, the test sample is deemed not to belong to the user.
Compared with traditional keystroke authentication algorithms that use only keystroke duration and keystroke time interval, the user identity authentication and recognition algorithm based on the characteristic curve difference degree performs better: it reduces the false rejection rate (FRR), the false acceptance rate (FAR) and the equal error rate (EER), and improves recognition accuracy.
Drawings
FIG. 1 is a key stroke duration characteristic curve of a user identification method using weighted key stroke characteristic curve disparity according to the present invention;
FIG. 2 is a half-time characteristic curve of a user identification method using weighted keystroke characteristic curve diversity in accordance with the present invention;
FIG. 3 is a graph of the upper and lower boundaries of the keystroke features of data set $S_{pp}$ in the user identification method using weighted keystroke characteristic curve difference degree according to the present invention;
FIG. 4 is a graph of the upper and lower boundaries of the keystroke features of data set $S_{st}$ in the user identification method using weighted keystroke characteristic curve difference degree according to the present invention;
FIG. 5 is a graph showing the variation with TP of the performance index EER of the free-text keystroke feature authentication algorithm in the user identification method using weighted keystroke characteristic curve difference degree according to the present invention;
FIG. 6 is a schematic diagram of the division of keystroke data set regions in the user identification method using weighted keystroke profile differences according to the present invention;
FIG. 7 is a schematic diagram of an internal sample of the method for identifying a user identity using a weighted keystroke profile difference according to the present invention;
FIG. 8 is a schematic diagram of an external sample of the method for identifying a user identity using a weighted keystroke profile difference according to the present invention;
FIG. 9 is a diagram of the difference of the weighted key characteristic curves of the user identification method using the difference of the weighted key characteristic curves according to the present invention.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
The invention relates to a user identity recognition method for weighting the difference degree of a keystroke characteristic curve, which is implemented according to the following steps:
1.1, screening k representative specific double-key character sequences from original keystroke information of a free text to form a specific character sequence set SK;
1.2, calculating the usage frequency $\lambda_j$ ($j = 1, 2, \ldots, k$) of each double key, and constructing the user's keystroke interval time data set $S_{pp}$ and half-time feature data set $S_{st}$, expressed as follows:
$S_{pp} = \{ V_i^{pp} = [t_{i,1}^{pp}, t_{i,2}^{pp}, \ldots, t_{i,k}^{pp}] \mid i = 1, 2, \ldots, m \}$ (1)
$S_{st} = \{ V_i^{st} = [WPM_i, P_{i,N\_UD}, P_{i,error}, P_{i,CapsLock}, P_{i,Shift}] \mid i = 1, 2, \ldots, n \}$ (2)
wherein: $k$ is the number of selected specific double-key character sequences; $V_i^{pp} \in R^k$ is the $i$-th keystroke interval time vector sample; $t_{i,k}^{pp}$ is the keystroke interval time of the last specific double-key character sequence in the $i$-th sample; $t_{i,j}^{pp}$ is the keystroke interval time of the $j$-th specific double-key character sequence in the $i$-th sample ($j = 1, \ldots, k$); $m$ is the number of collected keystroke interval time vector samples; $V_i^{st} \in R^5$ is the $i$-th half-time feature vector sample; $WPM_i$, $P_{i,N\_UD}$, $P_{i,error}$, $P_{i,CapsLock}$ and $P_{i,Shift}$ are, respectively, the average keystroke speed, the occurrence frequency of negative interval times (RP), the input error rate, the CapsLock key usage frequency and the Shift key usage frequency of the $i$-th sample; $P_{N\_UD}$, $P_{error}$, $P_{Shift}$ and $P_{CapsLock}$ range over $[0, 1]$, whereas the average keystroke speed WPM ranges over $[0, +\infty)$ and is typically of the order of $10^2$, a magnitude clearly different from that of the other half-time features; $n$ is the number of collected half-time feature vector samples;
1.3, normalizing the average keystroke speed WPM in the half-time feature data set $S_{st}$ according to the following formula:
$v_{i,1} = WPM_i / WPM_{max}$ (3)
wherein $WPM_{max} = \max\{WPM_i \mid i = 1, \ldots, n\}$ is the maximum average keystroke speed among the samples; after normalization, the half-time feature data set $S_{st}$ is written concisely as
$S_{st} = \{ V_i^{st} = [v_{i,1}, v_{i,2}, v_{i,3}, v_{i,4}, v_{i,5}] \mid i = 1, 2, \ldots, n \}$ (4)
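The normalization of step 1.3 can be sketched in Python as follows. This is only a minimal illustration: it assumes each half-time sample is stored as a list in the order [WPM, P_N_UD, P_error, P_CapsLock, P_Shift], and the function name `normalize_wpm` and the sample values are hypothetical, not taken from the patent.

```python
def normalize_wpm(samples):
    """Divide the WPM entry (index 0) of every sample by the largest WPM.

    `samples` is a list of 5-element lists laid out as
    [WPM, P_N_UD, P_error, P_CapsLock, P_Shift] (layout assumed for illustration).
    """
    wpm_max = max(s[0] for s in samples)
    if wpm_max <= 0:
        raise ValueError("maximum WPM must be positive")
    # Only the first component is rescaled; the others already lie in [0, 1].
    return [[s[0] / wpm_max] + list(s[1:]) for s in samples]


# Hypothetical samples, not data from the patent.
raw = [
    [120.0, 0.10, 0.02, 0.00, 0.05],
    [150.0, 0.12, 0.03, 0.01, 0.04],
    [90.0, 0.08, 0.01, 0.00, 0.06],
]
normalized = normalize_wpm(raw)
```

After this step every component of a half-time sample lies in [0, 1], so the WPM entry no longer dominates the other four features by magnitude.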
Any element $V_i^{pp}$ of data set $S_{pp}$ can be represented by a curve whose abscissa is $j$ and whose ordinate is the $j$-th component of $V_i^{pp}$ (the keystroke interval time of the $j$-th specific double-key character sequence), where $j = 1, \ldots, k$; likewise, any element $V_i^{st}$ of data set $S_{st}$ can be represented by a curve whose abscissa is $l$ and whose ordinate is $v_{i,l}$, where $l = 1, \ldots, 5$. For convenience of description, the curve of any element $V_i^{pp}$ of data set $S_{pp}$ is called a keystroke interval time characteristic curve, and the curve of any element $V_i^{st}$ of data set $S_{st}$ is called a half-time characteristic curve; the two may also be referred to collectively as keystroke characteristic curves.
The mean of all elements contained in data set $S_{pp}$ and the mean of all elements contained in data set $S_{st}$ are given by equations (5) and (6), respectively.
The standard deviation of all elements contained in data set $S_{pp}$ and the standard deviation of the elements contained in data set $S_{st}$ are given by equations (7) and (8), respectively.
The upper and lower boundary vectors of the elements contained in data set $S_{pp}$, that is, the upper and lower boundaries of the keystroke interval time characteristic curve, are calculated by equation (9); the upper and lower boundary vectors of the elements contained in data set $S_{st}$, that is, the upper boundary $v_{u,l}$ and lower boundary $v_{d,l}$ of the half-time characteristic curve, are calculated by equation (10).
In these formulas, the two adjustable thresholds both take values in the range 0 to 3.
The value range of the thresholds is determined on the basis of the central limit theorem (that is, the collected keystroke time feature quantities are assumed to follow a normal distribution). The larger the thresholds, the wider the range between the upper and lower boundaries and the higher the probability that a sample lies within the boundaries, so the FRR decreases and the FAR increases; the smaller the thresholds, the narrower the range and the lower that probability, so the FRR increases and the FAR decreases. The thresholds should be chosen so that the EER is as small as possible; they lie in the range 0 to 3, and a value of 2 can generally be chosen.
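As a rough illustration of steps 2 and 3, the sketch below computes component-wise means and standard deviations and then places the boundaries at mean plus or minus a threshold times the standard deviation. Equations (5) to (10) are not reproduced in this text, so this form is an assumption inferred from the description (thresholds between 0 and 3, normality assumption); the use of the population standard deviation, the single shared threshold `beta`, the function name `boundaries` and the sample values are likewise hypothetical.

```python
import statistics


def boundaries(samples, beta=2.0):
    """Return (lower, upper) boundary vectors as mean -/+ beta * std per component.

    Assumptions: population standard deviation, one threshold for all components.
    """
    columns = list(zip(*samples))  # component-wise view of the data set
    means = [statistics.fmean(c) for c in columns]
    stds = [statistics.pstdev(c) for c in columns]
    lower = [m - beta * s for m, s in zip(means, stds)]
    upper = [m + beta * s for m, s in zip(means, stds)]
    return lower, upper


# Hypothetical keystroke interval times (milliseconds) for k = 3 double keys.
s_pp = [
    [100.0, 200.0, 150.0],
    [110.0, 220.0, 150.0],
    [90.0, 180.0, 150.0],
]
lower, upper = boundaries(s_pp, beta=2.0)
```

Raising `beta` widens the band between the two boundary curves, which matches the trade-off described above: more genuine samples fall inside (lower FRR), but so do more impostor samples (higher FAR).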
The upper/lower boundary curves of data sets $S_{pp}$ and $S_{st}$ divide the entire two-dimensional plane into an inner region and an outer region, as shown in FIG. 6. For any keystroke interval time vector sample, if every component lies between the corresponding lower and upper boundary values, the sample lies entirely within the inner region of $S_{pp}$ and is called an internal sample of data set $S_{pp}$, as shown in FIG. 7; otherwise, it is called an external sample of data set $S_{pp}$, as shown in FIG. 8. Similarly, for any half-time feature vector sample, if $v_{d,l} \le v_{s,l} \le v_{u,l}$ holds for every $l$, the sample is called an internal sample of data set $S_{st}$; otherwise, it is called an external sample of data set $S_{st}$.
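The internal/external distinction reduces to a component-wise range check, which might be sketched as follows. The function name `is_internal` and the boundary and sample values are hypothetical and chosen only for illustration.

```python
def is_internal(sample, lower, upper):
    """True if every component of `sample` lies within [lower_j, upper_j]."""
    return all(lo <= x <= hi for x, lo, hi in zip(sample, lower, upper))


# Hypothetical boundary vectors for k = 3 double keys (milliseconds).
lower = [80.0, 170.0, 140.0]
upper = [120.0, 230.0, 160.0]

inside = is_internal([100.0, 200.0, 150.0], lower, upper)   # all within bounds
outside = is_internal([100.0, 240.0, 150.0], lower, upper)  # 2nd component too large
```

A single component outside its band is enough to make the sample external, because its characteristic curve then crosses one of the boundary curves.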
According to the above definition, if a sample is an external sample of a data set, its characteristic curve must form several closed regions with the upper or lower boundary curve of the data set in the corresponding outer region, as shown by the shaded regions in FIG. 8. The larger the total area of all closed regions, the greater the difference between the sample and the data set, and the more likely it is that the sample does not belong to the data set. Taking the characteristics of free-text keystroke information into account, the present method adapts the keystroke characteristic curve difference degree used for fixed text and introduces the concept of the weighted keystroke characteristic curve difference degree, which ties the difference degree to the usage frequency of the specific double-key character sequences.
In fixed-text keystroke feature research, the physical meaning of the keystroke characteristic curve difference degree of a sample is the sum of the areas of all closed regions formed, in the outer region, by the keystroke characteristic curve of the sample and the upper or lower boundary curve of the corresponding data set. The area of each closed region depends mainly on the distance by which each element of the feature vector exceeds the upper or lower boundary, such as $d_2$, $d_4$ and $d_7$ in FIG. 9. Because the specific double-key character sequences screened from free text differ in usage frequency, the distance by which each element of the feature vector exceeds the upper or lower boundary is multiplied by a corresponding weight coefficient, such as $\lambda_2 d_2$, $\lambda_4 d_4$ and $\lambda_7 d_7$ in FIG. 8, so that the allowable fluctuation range of the keystroke interval time of a specific double key is inversely proportional to its usage frequency. The weight coefficient applied to each element of the feature vector is directly proportional to the usage frequency of the corresponding double-key character sequence, and in general the usage frequency itself can be chosen as the weight coefficient.
Compared with the keystroke characteristic curve difference degree for fixed text, the weighted keystroke characteristic curve difference degree has the following property: when two elements of the feature vector exceed the upper or lower boundary by distances of equal absolute value, the element with the larger weight coefficient changes the characteristic curve difference degree more than the element with the smaller weight coefficient. The keystroke interval time of a frequently used double key is more stable and fluctuates less than that of a rarely used double key, so the two should be treated differently: the allowable fluctuation range of the keystroke interval time of a frequently used double key should be smaller than that of a rarely used one. It is therefore more appropriate to use the weighted keystroke characteristic curve difference degree in free-text keystroke feature authentication.
For any keystroke interval time vector sample, its weighted characteristic curve difference degree in data set $S_{pp}$ is calculated by equation (11), in which $\lambda_j$ is the usage frequency of each specific double-key character sequence, $j = 1, 2, \ldots, k$.
For any half-time feature vector sample, its characteristic curve difference degree in data set $S_{st}$ is calculated by equation (12).
The weighted characteristic curve difference degree of each element of the keystroke interval time data set $S_{pp}$ is calculated from the usage frequency of each double key in set SK and equation (11), and these values form the keystroke interval time characteristic curve difference degree set $Q_{pp}$; the characteristic curve difference degree of each element of the half-time feature data set $S_{st}$ is calculated from equation (12), and these values form the half-time characteristic curve difference degree set $Q_{st}$. The two sets are defined by equations (13) and (14); their members are, respectively, the weighted characteristic curve difference degree of each element of $S_{pp}$ and the characteristic curve difference degree of each element $V_i^{st} \in R^5$ of $S_{st}$.
If a sample is an internal sample of set $S_{pp}$ or $S_{st}$, its characteristic curve difference degree is defined as zero; otherwise, the characteristic curve difference degree of the sample equals the sum of the areas of all closed regions formed, in the outer region, by the characteristic curve of the sample and the upper/lower boundary curves of the corresponding data set.
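The idea of weighting each boundary exceedance by the usage frequency can be illustrated with the simplified sketch below. Equations (11) and (12) are not reproduced in this text, so the code does not compute the enclosed areas; it uses the weighted sum of the distances $d_j$ beyond the boundaries as a stand-in, on the grounds that the description says the areas depend mainly on those distances. The function names, weights and sample values are hypothetical.

```python
def exceedances(sample, lower, upper):
    """Distance d_j by which each component lies outside [lower_j, upper_j] (0 inside)."""
    return [max(lo - x, 0.0, x - hi) for x, lo, hi in zip(sample, lower, upper)]


def weighted_difference(sample, lower, upper, weights=None):
    """Sum of weights[j] * d_j; with weights=None every weight is 1 (unweighted case)."""
    d = exceedances(sample, lower, upper)
    if weights is None:
        weights = [1.0] * len(d)
    return sum(w * dj for w, dj in zip(weights, d))


# Hypothetical boundaries, usage frequencies and test sample for k = 3 double keys.
lower = [80.0, 170.0, 140.0]
upper = [120.0, 230.0, 160.0]
lam = [0.5, 0.3, 0.2]
sample = [130.0, 200.0, 135.0]   # exceeds the upper bound by 10, the lower bound by 5

q_weighted = weighted_difference(sample, lower, upper, lam)
q_internal = weighted_difference([100.0, 200.0, 150.0], lower, upper, lam)
```

With these values the first component contributes 0.5 × 10 and the third 0.2 × 5, so an excursion on a frequently used double key counts for more than an equal excursion on a rarely used one, and an internal sample scores zero.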
The test sample is judged according to the following inequality
In the formula, the two adjustable thresholds both take values not less than 0;
if inequality (15) and equation (16) are both true, the test sample is determined to belong to the user; otherwise, the test sample is deemed not to belong to the user.
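The decision rule of step 5 can be sketched as a pair of threshold comparisons. Conditions (15) and (16) are not reproduced in this text, so the sketch assumes that each one compares a difference degree of the test sample against a non-negative adjustable threshold; the function name `belongs_to_user` and all numeric values are hypothetical.

```python
def belongs_to_user(q_pp, q_st, threshold_pp, threshold_st):
    """Accept only when both difference degrees stay within their thresholds."""
    if threshold_pp < 0 or threshold_st < 0:
        raise ValueError("thresholds must be non-negative")
    return q_pp <= threshold_pp and q_st <= threshold_st


# Hypothetical difference degrees and thresholds.
accepted = belongs_to_user(q_pp=4.0, q_st=0.1, threshold_pp=6.0, threshold_st=0.2)
rejected = belongs_to_user(q_pp=9.0, q_st=0.1, threshold_pp=6.0, threshold_st=0.2)
```

Requiring both conditions means a test sample is rejected as soon as either its keystroke interval time curve or its half-time curve deviates too far from the user's model.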
Example 1
A specific example of user identification is introduced.
Step 1: collecting data, establishing a half-time feature data set and a keystroke interval time data set
The experimental data were collected mainly on PCs running the Windows operating system, with a conventional mechanical keyboard as the keystroke acquisition device. A user keystroke information collection program was written in the VC++ 6.0 development environment; it saves the keystroke information produced while the user types freely to a designated file. Before data collection began, the collection program was installed on the computer of each user participating in the experiment. During the collection period, the user was required to run the program each time the computer was turned on; the program's display interface is shown in FIG. 9. After the user clicks the [start] button, the program collects the user's free keystroke information in the background and stores it in a key_record file. The collection program does not interfere with the user's normal use of the computer. Each time before shutting down the computer, the user clicks the [end] button to exit the program.
After the raw data collection of all participants is completed, the double-key character sequences used by each participant and their numbers of uses (frequencies) are extracted; the statistical results are shown in Table 1.
TABLE 1
Table 1 lists the 15 most frequently used double-key character sequences (in descending order of usage frequency) and the number of times each participant used them during the experiment. Analysis shows that the double-key character sequences "in", "an", "ng", "zh", "wo", "en", "sh" and "ji" are shared by all participants and have high usage frequencies, which indicates a certain universality. Therefore, in the experiments in this section, these 8 double-key character sequences are selected to form the specific character sequence set SK, i.e., SK = {in, an, ng, zh, wo, en, sh, ji}.
After the specific character sequence set SK is selected, the usage frequency of each double key in set SK is calculated from the raw collected data and denoted $\lambda_j$, the usage frequency of the $j$-th double key in SK, $j = 1, 2, \ldots, 8$. During free input, one double-key keystroke interval time vector sample and one half-time feature vector sample are collected from each participant in each time period; combined with the data in Table 1, each participant has at least 200 double-key keystroke interval time vector samples and 200 half-time feature vector samples.
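One way the usage frequencies $\lambda_j$ might be obtained from a recorded character stream is sketched below. The patent does not state how the frequency is normalized, so taking each count relative to the total count of the selected double keys is an assumption; the function name `digraph_frequencies` and the typed string are hypothetical.

```python
from collections import Counter


def digraph_frequencies(text, selected):
    """Relative usage frequency of each selected double key among the selected ones."""
    # Count every pair of adjacent characters in the typed stream.
    counts = Counter(text[i:i + 2] for i in range(len(text) - 1))
    total = sum(counts[d] for d in selected)
    if total == 0:
        raise ValueError("none of the selected double keys occur in the text")
    return {d: counts[d] / total for d in selected}


SK = ["in", "an", "ng", "zh", "wo", "en", "sh", "ji"]
typed = "zhongwenshurujinanshangwo"  # hypothetical pinyin-like keystroke stream
lam = digraph_frequencies(typed, SK)
```

Normalizing over the selected set makes the weights sum to 1, which keeps difference degrees comparable across users who typed different amounts of text.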
The experimental scheme for identity authentication with the keystroke characteristic of the free text is basically similar to that of the fixed text, and only the keystroke characteristic information and the authentication algorithm used in the experiment are different.
In turn, the first 20%, 40%, 60% and 80% of the samples in each participant's keystroke data set are used to build that participant's keystroke feature model. For convenience in analysing the results, the variable TP denotes the percentage of the participant's samples used for modelling relative to the total number of samples.
Then the last 80%, 60%, 40% and 20% of the samples in each participant's keystroke data set are taken, respectively, as test samples, and the participant's false rejection rate (FRR) is calculated.
Next, all samples of the other 9 participants are used as test samples to attack the participant, and the participant's false acceptance rate (FAR) is calculated.
This process is repeated until the FRR and FAR of all 10 users have been calculated. Finally, the averages of all participants' FRRs and FARs are taken as the performance indices of the identity authentication algorithm.
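The evaluation loop described above can be sketched as follows, assuming the accept/reject decisions for each participant's genuine and impostor test samples have already been produced by the authentication algorithm. The function names and the decision lists are hypothetical and are not experimental data.

```python
def false_rejection_rate(genuine_decisions):
    """Fraction of genuine test samples that were rejected."""
    return sum(1 for ok in genuine_decisions if not ok) / len(genuine_decisions)


def false_acceptance_rate(impostor_decisions):
    """Fraction of impostor test samples that were accepted."""
    return sum(1 for ok in impostor_decisions if ok) / len(impostor_decisions)


def average_rates(per_user):
    """Average (FRR, FAR) over users; per_user maps user -> (genuine, impostor)."""
    frrs = [false_rejection_rate(g) for g, _ in per_user.values()]
    fars = [false_acceptance_rate(i) for _, i in per_user.values()]
    return sum(frrs) / len(frrs), sum(fars) / len(fars)


# True means "accepted as the user". Hypothetical decisions for two participants.
decisions = {
    "user_a": ([True, True, False, True], [False, False, True, False]),
    "user_b": ([True, False, False, True], [False, False, False, False]),
}
frr, far = average_rates(decisions)
```

Here user_a has FRR 0.25 and FAR 0.25, and user_b has FRR 0.5 and FAR 0, so the averaged indices are 0.375 and 0.125.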
The experiments give the false rejection rate (FRR), false acceptance rate (FAR) and equal error rate (EER) of the various algorithms when the percentage TP of the number of user samples relative to the total number of samples is 20%, 40%, 60% and 80%; the results are shown in Table 2. They show that, for the four TP values, the EERs of the authentication algorithm based on the weighted keystroke characteristic curve difference degree are 20.11%, 16.28%, 13.48% and 10.32%, respectively, clearly better than those of the other 2 comparison algorithms; the authentication accuracy is high and the authentication of user keystroke features is more satisfactory. The proposed authentication algorithm based on the weighted keystroke characteristic curve difference degree not only uses the traditional keystroke interval time but also brings in the rate of change of the interval time, the usage frequency of the double-key character sequences and other information when calculating the weighted keystroke characteristic curve difference degree. The proposed algorithm can therefore describe the user's keystroke features more accurately and thus improve the accuracy of identity authentication.
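The source does not describe how the EER is extracted from the FRR and FAR measurements. A common approximation, sketched here purely as an assumption, sweeps the adjustable threshold, picks the setting at which FRR and FAR are closest, and reports their average. The function name `approximate_eer` and the sweep values are hypothetical and unrelated to Table 2.

```python
def approximate_eer(points):
    """points: list of (threshold, frr, far); returns (threshold, EER estimate)."""
    # The operating point where the two error rates are closest approximates the EER.
    threshold, frr, far = min(points, key=lambda p: abs(p[1] - p[2]))
    return threshold, (frr + far) / 2.0


# Hypothetical sweep: FRR falls and FAR rises as the threshold grows.
sweep = [
    (0.5, 0.40, 0.05),
    (1.0, 0.25, 0.10),
    (1.5, 0.16, 0.14),
    (2.0, 0.08, 0.30),
]
best_threshold, eer = approximate_eer(sweep)
```

A finer sweep, or interpolation between the two neighbouring operating points, would give a closer estimate when FRR and FAR never coincide exactly.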
The performance indices of the free-text keystroke feature authentication algorithms are shown in Table 2, and the curve of the performance index EER as a function of TP is shown in FIG. 5.
TABLE 2 Performance indicators for free text keystroke feature authentication algorithms
The above experimental results show that, compared with conventional keystroke authentication algorithms that use only keystroke duration and keystroke time interval, the user identity recognition method using the weighted keystroke characteristic curve difference degree performs better: it reduces the false rejection rate (FRR), the false acceptance rate (FAR) and the equal error rate (EER), and improves recognition accuracy.
Claims (1)
1. The user identity recognition method for weighting the difference degree of the keystroke characteristic curve is characterized by comprising the following steps of:
step 1, collecting data, and establishing a half-time characteristic data set and a keystroke interval time data set;
the specific implementation steps are as follows:
1.1, screening k representative specific double-key character sequences from original keystroke information of a free text to form a specific character sequence set SK;
1.2, calculating the usage frequency $\lambda_j$ ($j = 1, 2, \dots, k$) of each double-key character sequence, and constructing the user's keystroke interval time data set $S_{pp}$ and half-time feature data set $S_{st}$, which are expressed as follows:
$S_{st} = \{ V_i^{st} = [WPM_i, P_{i,N\_UD}, P_{i,error}, P_{i,CapsLock}, P_{i,Shift}] \mid i = 1, 2, \dots, n \}$ (2)
wherein: $k$ is the number of selected specific double-key character sequences; $V_i^{pp} \in R^k$ is the $i$-th keystroke interval time vector sample, whose $j$-th component ($j = 1, \dots, k$) is the keystroke interval time of the $j$-th specific double-key character sequence in the $i$-th sample, the last component being that of the $k$-th sequence; $m$ is the number of collected keystroke interval time vector samples; $V_i^{st} \in R^5$ is the $i$-th half-time feature vector sample; $WPM_i$, $P_{i,N\_UD}$, $P_{i,error}$, $P_{i,CapsLock}$ and $P_{i,Shift}$ are, respectively, the average keystroke speed, the occurrence frequency of negative interval time RP, the input error rate, the CapsLock key usage frequency and the Shift key usage frequency of the $i$-th sample; $P_{N\_UD}$, $P_{error}$, $P_{Shift}$ and $P_{CapsLock}$ vary in the range $[0, 1]$, whereas the average keystroke speed WPM varies in the range $[0, +\infty)$ and is typically on the order of $10^2$, a magnitude clearly different from that of the other half-time features; $n$ is the number of collected half-time feature vector samples;
1.3, normalizing the average keystroke speed WPM in the half-time feature data set $S_{st}$ according to the following normalization formula:
in the formula: $\max\{WPM_i \mid i = 1, \dots, n\}$ is the maximum average keystroke speed among the samples, denoted $WPM_{max}$; after normalization, the half-time feature data set $S_{st}$ is abbreviated as
$S_{st} = \{ V_i^{st} = [v_{i,1}, v_{i,2}, v_{i,3}, v_{i,4}, v_{i,5}] \mid i = 1, 2, \dots, n \}$ (4)
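Step 1 can be illustrated with a short Python sketch. It is not the patented implementation: the formula images for the usage frequency and for the normalization are not reproduced in this text, so the sketch assumes that $\lambda_j$ is the count of the $j$-th selected double-key sequence divided by the total count of all selected sequences, and that WPM is normalized as $v_{i,1} = WPM_i / WPM_{max}$. The function names and the sample data are hypothetical.

```python
from collections import Counter


def digraph_frequencies(text, selected):
    """Relative usage frequency of each selected double-key sequence.

    Assumption: frequency = count of the sequence / total count of all
    selected sequences in the text.
    """
    counts = Counter(text[i:i + 2] for i in range(len(text) - 1))
    total = sum(counts[d] for d in selected)
    if total == 0:
        return [0.0 for _ in selected]
    return [counts[d] / total for d in selected]


def normalize_wpm(samples):
    """Scale the first component (WPM) of each half-time vector by WPM_max.

    Assumption: v_{i,1} = WPM_i / WPM_max. The other four components
    already lie in [0, 1] and are left unchanged.
    """
    wpm_max = max(s[0] for s in samples)
    if wpm_max == 0:
        return [list(s) for s in samples]
    return [[s[0] / wpm_max] + list(s[1:]) for s in samples]


# Hypothetical half-time samples: [WPM, P_N_UD, P_error, P_CapsLock, P_Shift]
s_st = [[120.0, 0.10, 0.05, 0.00, 0.02],
        [90.0, 0.20, 0.03, 0.01, 0.04]]
normalized = normalize_wpm(s_st)

# Hypothetical free text and selected double-key sequences.
lambdas = digraph_frequencies("the then there", ["th", "he", "er"])
```

After normalization every component of a half-time vector lies in $[0, 1]$, so the WPM no longer dominates the other four features by two orders of magnitude.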
step 2, respectively calculating the mean value and standard deviation of the keystroke interval time data set and the mean value and standard deviation of the half-time characteristic data set;
the specific calculation method comprises the following steps:
the mean of all elements in data set $S_{pp}$ and the mean of all elements in data set $S_{st}$ are given by:
the standard deviation of all elements in data set $S_{pp}$ and the standard deviation of all elements in data set $S_{st}$ are given by:
Step 3, calculating the upper/lower boundary of the keystroke interval time characteristic curve according to the mean value and the standard deviation of the keystroke interval time data set, and calculating the upper/lower boundary of the half-time characteristic curve according to the mean value and the standard deviation of the half-time characteristic data set;
the specific calculation method comprises the following steps:
upper and lower boundary vectors are defined for the elements in data set $S_{pp}$ and for the elements in data set $S_{st}$; the upper and lower boundaries of the keystroke interval time characteristic curve are calculated by equation (9), and the upper boundary $v_{u,l}$ and the lower boundary $v_{d,l}$ of the half-time characteristic curve are calculated by equation (10):
in the formula: the two thresholds are adjustable, and the value range of each threshold is 0 to 3;
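Steps 2 and 3 can be sketched together in Python. The equation images for the means, standard deviations and boundaries are not reproduced in this text, so the sketch rests on two labeled assumptions: the standard deviation is the population standard deviation computed per feature, and each boundary is the mean plus or minus an adjustable threshold times the standard deviation. The name `feature_band`, the parameter `theta` and the sample data are hypothetical.

```python
from statistics import mean, pstdev


def feature_band(samples, theta):
    """Per-feature mean, standard deviation and upper/lower boundaries.

    Assumptions: population standard deviation; boundaries are
    mean +/- theta * std, with theta restricted to [0, 3] as in the claim.
    """
    if not 0 <= theta <= 3:
        raise ValueError("theta must lie in [0, 3]")
    columns = list(zip(*samples))  # transpose: one tuple per feature
    mu = [mean(c) for c in columns]
    sigma = [pstdev(c) for c in columns]
    upper = [m + theta * s for m, s in zip(mu, sigma)]
    lower = [m - theta * s for m, s in zip(mu, sigma)]
    return mu, sigma, upper, lower


# Hypothetical interval times (ms): k = 2 double-key sequences, m = 3 samples.
s_pp = [[100.0, 200.0], [110.0, 220.0], [120.0, 240.0]]
mu, sigma, upper, lower = feature_band(s_pp, theta=2.0)
```

The same function can be applied to the normalized half-time data set with its own threshold, which gives the second pair of boundary vectors.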
step 4, calculating the difference degree of the keystroke interval time weighting characteristic curve according to the upper/lower boundary of the keystroke interval time characteristic curve, and calculating the difference degree of the half-time characteristic curve according to the upper/lower boundary of the half-time characteristic curve;
the specific calculation method comprises the following steps:
for any keystroke interval time vector sample, the weighted characteristic curve difference degree of the sample in data set $S_{pp}$ is calculated as follows:
in the formula:
wherein: $\lambda_j$ is the usage frequency of each specific double-key character sequence, $j = 1, 2, \dots, k$;
for any half-time feature vector sample, the characteristic curve difference degree of the sample in data set $S_{st}$ is
in the formula:
the weighted characteristic curve difference degree of each element in the keystroke interval time data set $S_{pp}$ is calculated from the usage frequency of each double-key character sequence in set SK and equation (11), forming the keystroke interval time characteristic curve difference degree set $Q_{pp}$; the characteristic curve difference degree of each element in the half-time feature data set $S_{st}$ is calculated from equation (12), forming the half-time characteristic curve difference degree set $Q_{st}$; the above sets are defined as
in the formula: the elements of $Q_{pp}$ are the weighted characteristic curve difference degrees of the elements $V_i^{pp} \in R^k$ of data set $S_{pp}$, and the elements of $Q_{st}$ are the characteristic curve difference degrees of the elements $V_i^{st} \in R^5$ of data set $S_{st}$;
Step 5, identifying the user identity by utilizing the difference degree of the keystroke interval time weighting characteristic curve and the difference degree of the half-time characteristic curve;
the specific method for identifying the identity comprises the following steps:
the test sample is judged according to the following inequalities:
if inequality (15) and formula (16) both hold, the test sample is determined to belong to the user; otherwise, the test sample is determined not to belong to the user.
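Steps 4 and 5 can be sketched in Python, with the caveat that equations (11) to (16) are not reproduced in this text. The sketch is therefore a hypothetical stand-in, not the claimed formulas. It assumes that a feature inside its boundaries contributes nothing to the difference degree, that a feature outside contributes its distance to the nearest boundary multiplied by its weight, and that a test sample is accepted when both of its difference degrees do not exceed the largest values in $Q_{pp}$ and $Q_{st}$. All names and numbers are illustrative.

```python
def difference_degree(sample, upper, lower, weights=None):
    """Weighted out-of-band deviation of one sample (assumed form).

    Each feature inside [lower, upper] contributes 0; a feature outside
    contributes weight * distance to the nearest boundary.
    """
    if weights is None:
        weights = [1.0] * len(sample)  # unweighted case for half-time features
    total = 0.0
    for x, u, d, w in zip(sample, upper, lower, weights):
        if x > u:
            total += w * (x - u)
        elif x < d:
            total += w * (d - x)
    return total


def belongs_to_user(d_pp, d_st, q_pp, q_st):
    """Accept only if both difference degrees stay within the enrolled range.

    Assumption: the acceptance limits are max(Q_pp) and max(Q_st).
    """
    return d_pp <= max(q_pp) and d_st <= max(q_st)


# Hypothetical boundaries, weights and enrolled difference-degree sets.
upper_pp, lower_pp = [120.0, 240.0], [100.0, 200.0]
lambdas = [0.75, 0.25]
q_pp = [0.0, 3.0]
q_st = [0.0, 0.1]

d_pp = difference_degree([124.0, 196.0], upper_pp, lower_pp, lambdas)
d_st = difference_degree([0.5, 0.1], [0.6, 0.2], [0.4, 0.0])
accepted = belongs_to_user(d_pp, d_st, q_pp, q_st)
```

In this made-up case the half-time features lie inside their boundaries, but the interval times deviate more than any enrolled sample did, so the test sample is rejected. Requiring both conditions mirrors the claim, in which the sample is attributed to the user only when both criteria hold.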
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810644782.0A CN109063431B (en) | 2018-06-21 | 2018-06-21 | User identity recognition method for weighting keystroke characteristic curve difference degree |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109063431A (en) | 2018-12-21 |
CN109063431B (en) | 2021-10-22 |
Family
ID=64821322
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810644782.0A Expired - Fee Related CN109063431B (en) | 2018-06-21 | 2018-06-21 | User identity recognition method for weighting keystroke characteristic curve difference degree |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109063431B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111988294B (en) * | 2020-08-10 | 2022-04-12 | 中国平安人寿保险股份有限公司 | User identity recognition method, device, terminal and medium based on artificial intelligence |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101478401A (en) * | 2009-01-21 | 2009-07-08 | 东北大学 | Authentication method and system based on key stroke characteristic recognition |
US7649478B1 (en) * | 2005-11-03 | 2010-01-19 | Hyoungsoo Yoon | Data entry using sequential keystrokes |
CN103703433A (en) * | 2011-05-16 | 2014-04-02 | 触摸式有限公司 | User input prediction |
CN104809377A (en) * | 2015-04-29 | 2015-07-29 | 西安交通大学 | Method for monitoring network user identity based on webpage input behavior characteristics |
CN105429937A (en) * | 2015-10-22 | 2016-03-23 | 同济大学 | Identity authentication method and system based on keystroke behaviors |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9245017B2 (en) * | 2009-04-06 | 2016-01-26 | Caption Colorado L.L.C. | Metatagging of captions |
Non-Patent Citations (4)
Title |
---|
A New Distance Measure for Free Text Keystroke Authentication;H. Davoudi 等;《2009 14th International CSI Computer Conference》;20091021;第570-575页 * |
Non-conventional keystroke dynamics for user authentication;Arwa Alsultan 等;《Pattern Recognition Letters》;20170401;第89卷;第53-59页 * |
基于加权相对距离的自由文本击键特征认证识别方法;宋梦玲 等;《现代计算机》;20160205;第7-11页 * |
采用击键特征曲线差异度的用户身份认证方法;王林 等;《计算机工程与应用》;20180313;第54卷(第22期);第160-166,196页 * |
Also Published As
Publication number | Publication date |
---|---|
CN109063431A (en) | 2018-12-21 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
Granted publication date: 20211022 |