CN111124860A - Method for identifying user by using keyboard and mouse data in uncontrollable environment - Google Patents

Info

- Publication number: CN111124860A (granted as CN111124860B)
- Application number: CN201911291751.2A
- Authority: CN (China)
- Original language: Chinese (zh)
- Inventors: 廖永建, 王栋, 梁艺宽, 吴宇, 王勇
- Applicant/Assignee: University of Electronic Science and Technology of China
- Legal status: Granted; Active
- Prior art keywords: keyboard, mouse, action, user, key

Classifications

    • G06F11/3438 — Recording or statistical evaluation of computer activity, e.g. of down time or of input/output operation; recording or statistical evaluation of user activity, e.g. usability assessment; monitoring of user actions
    • G06F11/3041 — Monitoring arrangements specially adapted to the computing system or computing system component being monitored, where the component is an input/output interface
    • G06F11/3051 — Monitoring arrangements for monitoring the configuration of the computing system or of the computing system component, e.g. monitoring the presence of processing resources, peripherals, I/O links, software programs
    • G06F11/3452 — Performance evaluation by statistical analysis

Abstract

The invention discloses a method for identifying a user from keyboard and mouse data in an uncontrolled environment, comprising the following steps. Step 1, data acquisition: deploy a keyboard-action acquisition program and a mouse-action acquisition program on a computer and collect keyboard metadata and mouse metadata during the computer's daily operation. Step 2, feature extraction: extract keyboard action features and mouse action features from the collected keyboard and mouse metadata. Step 3, model training: use the extracted keyboard and mouse action features to train an evolutionary neural network with augmenting topology (NEAT) between each pair of users, obtaining a user identification model. Step 4, user identification: identify the user to be identified with the user identification model. By combining keyboard action features and mouse action features, the invention is more effective than methods using only a single type of feature, and after training with the NEAT algorithm it achieves a higher recognition rate than traditional SVM and neural-network algorithms.

Description

Method for identifying user by using keyboard and mouse data in uncontrollable environment
Technical Field
The invention relates to the technical field of network space behavior identification, in particular to a method for identifying a user by using keyboard and mouse data in an uncontrollable environment.
Background
In cyberspace, user identification has wide application, for example in personalized recommendation and system security. Biometric systems are a common application; they rely on measuring physiological or behavioral characteristics to determine or verify an individual's identity. In cyberspace, behavioral biometric systems rely primarily on input devices such as keyboards and mice, which are already present on most computers, so the cost is low and no additional hardware is required.
Analysis of keyboard and mouse rhythms, known as keystroke dynamics and mouse dynamics, has received increasing attention in recent years. Keystroke dynamics is the measurement and evaluation of human typing rhythms on digital devices; owing to unique neurophysiological factors, these rhythms are quite distinctive for each person. The earliest use of keystroke timing for identity verification was proposed in 1980 by Gaines et al. in the paper "Authentication by Keystroke Timing: Some Preliminary Results". In 2003, Gamboa and Fred collected mouse-movement and mouse-click data from volunteers playing a memory game on a web page for 10-15 minutes and used this behavioral information to verify individual identities.
In recent years, most studies of keystroke dynamics and mouse dynamics have collected feature sets from a handful of volunteers in a controlled environment in order to improve user recognition accuracy. Many statistical and machine-learning algorithms are widely used for user identification based on keystroke and mouse dynamics. For example, in 2012, Traore et al. introduced a risk-based authentication system for an experimental social networking site that achieved an EER of 8.21% using a Bayesian network (BN) model. In 2015, Wu et al. proposed an active user-behavior-recognition data-loss-prevention model that combines user keystroke and mouse behavior. However, because acquisition environments and data sets differ, the results of these user identification methods vary significantly, making the experimental results difficult to reproduce in practical application environments.
In actual human-computer interaction, mouse and keyboard operation are intertwined: a user completes a series of clicking and typing actions through successive keyboard and mouse operations. Keystroke dynamics and mouse dynamics have each been studied extensively in isolation. However, few studies have considered real-environment human-computer interaction (HCI) behavior, which fuses keyboard and mouse behavior data. Existing research on HCI behavior mainly applies traditional keyboard and mouse features to shallow machine-learning algorithms such as the Support Vector Machine (SVM) and Decision Tree (DT); little work considers the differences in users' keyboard and mouse operating behavior or effective ways of integrating the two. A further problem is that most of these studies focus on controlled environments and achieve only fairly low user recognition accuracy on uncontrolled-environment data sets.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: to address the problems above, a method for identifying a user using keyboard and mouse data in an uncontrolled environment is provided. The method extracts user features by monitoring the user's everyday mouse and keyboard behavior in an uncontrolled environment, thereby achieving user identification.
The technical scheme adopted by the invention is as follows:
a method for identifying a user using keyboard and mouse data in an uncontrolled environment, comprising the steps of:
step 1, data acquisition: deploying a keyboard action acquisition program and a mouse action acquisition program in a computer, and acquiring keyboard metadata and mouse metadata in daily operation of the computer;
step 2, feature extraction: extracting keyboard action characteristics and mouse action characteristics from the collected keyboard metadata and mouse metadata;
step 3, training a model: training an evolutionary neural network with augmenting topology (NEAT) between each pair of users by using the extracted keyboard action features and mouse action features to obtain a user identification model;
step 4, user identification: and identifying the user to be identified by utilizing the user identification model.
Further, the method of step 1 is: a keyboard action acquisition program and a mouse action acquisition program deployed on the computer acquire keyboard metadata and mouse metadata during the computer's daily operation through the system hook chain; the keyboard metadata comprises a key name, a key-press timestamp and a key-release timestamp; the mouse metadata comprises an operation type, a timestamp, and the x and y coordinates of the mouse pointer position.
Further, the method for extracting the keyboard action features in the step 2 comprises the following steps:
step 211, dividing the keyboard metadata into keyboard data sets of a time window t 1;
step 212, calculating the duration of each key and the delay time of the previous key and the next key by using the keyboard data set; the duration of each key is the average value of each pressing time of each key in a time window t1, and the pressing time is the difference between the key release time stamp of each key and the key pressing time stamp; the delay time is the difference value of the key pressing time stamp of the next key and the key releasing time stamp of the previous key;
step 213, taking the duration of each key in the keyboard data set as a duration feature and the delay time of each previous-key/next-key pair as a delay-time feature, the keyboard action features comprise k duration features and k² delay-time features, where k is the number of keys on the keyboard used by the computer.
Further, before the keyboard action features are extracted, a delay threshold T1 is set; when the time for which the user stops using the keyboard exceeds T1, the data stream is divided into a new keyboard data sequence at the point of the delay, so as to remove data gaps from the keyboard action features.
Further, the method for extracting the mouse action features in the step 2 comprises the following steps:
step 221, dividing the mouse metadata into mouse data sets of a time window t 2;
step 222, calculating the direction, curvature angle and curvature distance of the mouse action by using the mouse data set:
(1) direction: for any two consecutive points B and C, the pointer travels from point B to point C along a straight line; the angle between this line and the horizontal is the direction of the mouse action;
(2) curvature angle: for any three consecutive points A, B and C, the angle between the line from point A to point B and the line from point B to point C is the curvature angle;
(3) curvature distance: for any three consecutive points A, B and C, the curvature distance of point B is the ratio of the perpendicular distance from point B to the line AC to the straight-line distance from point A to point C;
step 223, respectively calculating the cumulative distribution function of the mouse motion direction, curvature angle and curvature distance in the mouse data set as the mouse motion characteristics.
Further, before the mouse action features are extracted, a delay threshold T2 is set; when the time for which the user stops using the mouse exceeds T2, the data stream is divided into a new mouse data sequence at the point of the delay, so as to remove data gaps from the mouse action features.
Further, the method of step 3 is:
step 31, performing z-score normalization processing on the extracted keyboard action features and mouse action features to enable the average value of the input keyboard action features and mouse action features to be 0 and the variance to be 1; after normalization processing, dividing all keyboard action characteristics and mouse action characteristics into a training set and a verification set;
step 32, using the one-vs-one method, dividing the multi-class problem of N users into N × (N-1)/2 binary classification problems between pairs of users, where each binary classification problem corresponds to one evolutionary neural network;
step 33, setting the number of input nodes of each evolutionary neural network to the number of keyboard action features and mouse action features in the training set, the number of hidden nodes to 0, and the number of output nodes to 2; the two output nodes correspond to the degree of similarity to each of the two users being classified, with 1 the highest similarity and 0 the lowest; setting the Fitness function of each evolutionary neural network as:
Fitness=fitness-(output[0]-xo[0])**2-(output[1]-xo[1])**2
where fitness is the target value to be reached, equal to twice the number of training-set samples (each sample contributes at most 2 to the squared error); output is the output of the current evolutionary neural network for each training-set sample; xo is the expected output; and the squared-error terms are accumulated over all training samples. The larger Fitness is, the higher the recognition success rate of the evolutionary neural network; Fitness reaches its maximum when the output for every sample in the training set matches the expected output;
step 34, training the N × (N-1)/2 evolutionary neural networks one by one to obtain the N × (N-1)/2 evolutionary neural network models with the highest pairwise classification success rate, and taking these models as the user identification model.
Preferably, after the normalization process in step 31, 70% of all the keyboard motion features and mouse motion features are used as a training set, and 30% are used as a verification set.
Further, the method of step 4 is:
step 41, collecting keyboard metadata and mouse metadata of a user to be identified by using the method in step 1;
step 42, extracting the keyboard action characteristics and the mouse action characteristics of the user to be identified by using the method in the step 2;
step 43, inputting the keyboard action features and mouse action features of the user to be identified into the N × (N-1)/2 evolutionary neural network models obtained in step 3, taking the similarity output by each model as a voting score for the corresponding user, and summing the voting scores per user; the user with the highest score is the user to be identified.
In summary, due to the adoption of the technical scheme, the invention has the beneficial effects that:
1. the invention has a high recognition rate and more comprehensive data sources: by combining keyboard action features and mouse action features, its data sources are more comprehensive than those of methods using only a single type of feature, and after training with the NEAT algorithm its recognition rate is higher than that of traditional SVM and neural-network algorithms.
2. The invention is not affected by the user's operating environment: because the direction, curvature angle and curvature distance of mouse actions are used as features, and direction and curvature angle do not depend on screen size or other elements of the user's environment, the method is relatively platform-independent. These angle-based metrics are stable across platforms and unaffected by the user's operating environment.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
FIG. 1 is a schematic diagram of a method of identifying a user using keyboard and mouse data in an uncontrolled environment according to the present invention.
FIG. 2 is a diagram illustrating a mouse action feature extraction method according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the detailed description and specific examples, while indicating the preferred embodiment of the invention, are intended for purposes of illustration only and are not intended to limit the scope of the invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
As shown in FIG. 1, a method for identifying a user using keyboard and mouse data in an uncontrolled environment according to the present invention comprises the steps of:
step 1, data acquisition: deploying a keyboard action acquisition program and a mouse action acquisition program in a computer, and acquiring keyboard metadata and mouse metadata in daily operation of the computer;
step 2, feature extraction: extracting keyboard action characteristics and mouse action characteristics from the collected keyboard metadata and mouse metadata;
step 3, training a model: training an evolutionary neural network with augmenting topology (NEAT) between each pair of users by using the extracted keyboard action features and mouse action features to obtain a user identification model;
step 4, user identification: identifying the user to be identified by using the user identification model.
The features and properties of the present invention are described in further detail below with reference to examples.
Example 1
(1) Data acquisition:
A keyboard action acquisition program and a mouse action acquisition program deployed on the computer acquire keyboard metadata and mouse metadata during the computer's daily operation through the system hook chain. To meet the requirements of data acquisition in a real environment, the two acquisition programs can be developed for the common Windows operating system; if required, corresponding acquisition programs can likewise be developed for other operating systems. Both programs run automatically in the background of the operating system and collect as much keyboard and mouse metadata from the user's daily keyboard and mouse operation as possible.
In this embodiment, the keyboard metadata includes a key name, a key press timestamp, and a key release timestamp; the mouse metadata includes an operation type, a timestamp, and x and y coordinates of a mouse pointer position. Generally, the KEYBOARD metadata is obtained through a global KEYBOARD hook (WH _ KEYBOARD), and the MOUSE metadata is obtained through a global MOUSE hook (WH _ MOUSE). All keyboard metadata and mouse metadata are stored in multiple files for later processing.
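As a minimal illustration of the metadata records described above, the collected events can be modelled as two small record types; the field names are assumptions for illustration, not taken from the patent:

```python
from dataclasses import dataclass

@dataclass
class KeyEvent:
    key_name: str      # name of the key, e.g. "A"
    press_ts: float    # key-press timestamp (seconds)
    release_ts: float  # key-release timestamp (seconds)

@dataclass
class MouseEvent:
    op_type: str       # operation type, e.g. "move" or "left_down"
    ts: float          # event timestamp (seconds)
    x: int             # pointer x coordinate
    y: int             # pointer y coordinate

# Example records of the kind the two collectors would write to file:
key_log = [KeyEvent("A", 0.00, 0.08), KeyEvent("B", 0.21, 0.30)]
mouse_log = [MouseEvent("move", 0.05, 100, 200), MouseEvent("left_down", 0.40, 105, 210)]
```

These records carry exactly the fields the feature-extraction step consumes: hold time is `release_ts - press_ts`, and mouse geometry is computed from consecutive `(x, y)` samples.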
(2) Feature extraction:
(2.1) keyboard action characteristics:
Before extracting the keyboard action features, a delay threshold T1 is set, typically T1 = 1 minute. When the time for which the user stops using the keyboard exceeds T1, a new keystroke data sequence is started at the point of the delay, removing data gaps from the keyboard action features. The keyboard action features are then extracted:
step 211, dividing the keyboard metadata into keyboard data sets of a time window t1; t1 is typically 5 minutes;
step 212, calculating the duration of each key and the delay time of the previous key and the next key by using the keyboard data set; the duration of each key is the average value of each pressing time of each key in a time window t1, and the pressing time is the difference between the key release time stamp of each key and the key pressing time stamp; the delay time is the difference value of the key pressing time stamp of the next key and the key releasing time stamp of the previous key;
step 213, taking the duration of each key in the keyboard data set as a duration feature and the delay time of each previous-key/next-key pair as a delay-time feature, the keyboard action features comprise k duration features and k² delay-time features, where k is the number of keys on the keyboard used by the computer. For example, for a typical keyboard with 110 keys, the keyboard action features include 110 duration features and 12100 delay-time features. However, since many key pairs occur only rarely during normal use of the keyboard, their influence on the overall data set is small; delay-time features with little influence can therefore be deleted to reduce the amount of computation.
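Steps 211-213 can be sketched in a few lines, assuming events within one window t1 are given as chronological `(key_name, press_ts, release_ts)` tuples (the representation and function name are illustrative assumptions):

```python
from collections import defaultdict

def keyboard_features(events):
    """events: chronological (key_name, press_ts, release_ts) tuples in one window."""
    hold = defaultdict(list)
    gap = defaultdict(list)
    for key, press, release in events:
        hold[key].append(release - press)          # per-press hold time
    for (k1, _, rel1), (k2, press2, _) in zip(events, events[1:]):
        gap[(k1, k2)].append(press2 - rel1)        # previous-release to next-press
    mean = lambda xs: sum(xs) / len(xs)
    durations = {k: mean(v) for k, v in hold.items()}   # up to k duration features
    latencies = {p: mean(v) for p, v in gap.items()}    # up to k*k delay-time features
    return durations, latencies

d, l = keyboard_features([("a", 0.0, 0.1), ("b", 0.2, 0.35), ("a", 0.5, 0.6)])
```

In practice, as noted above, rarely occurring key pairs would be dropped from the latency dictionary to keep the feature vector manageable.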
(2.2) mouse action characteristics
Before extracting the mouse action features, a delay threshold T2 is set, typically T2 = 30 seconds. When the time for which the user stops using the mouse exceeds T2, a new mouse data sequence is started at the point of the delay, removing data gaps from the mouse action features. The mouse action features are then extracted:
step 221, dividing the mouse metadata into mouse data sets of a time window t2; t2 is typically 5 minutes;
step 222, as shown in fig. 2, calculating the direction, curvature angle and curvature distance of the mouse action by using the mouse data set:
(1) direction: for any two consecutive points B and C, the pointer travels from point B to point C along a straight line; the angle between this line and the horizontal is the direction of the mouse action, such as angle X in FIG. 2;
(2) curvature angle: for any three consecutive points A, B and C, the angle between the line from point A to point B and the line from point B to point C is the curvature angle, such as angle Y in FIG. 2;
(3) curvature distance: for any three consecutive points A, B and C, the curvature distance of point B is the ratio of the perpendicular distance from point B to the line AC to the straight-line distance from point A to point C; being the ratio of two distances, the curvature distance is unitless.
Step 223, calculate the cumulative distribution functions of the mouse action direction (range 0-360°), curvature angle (range 0-180°) and curvature distance (range 0-200) in the mouse data set as the mouse action features. Any direction, curvature-angle or curvature-distance value outside its range is clamped to the corresponding upper or lower limit.
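The three geometric quantities of step 222 and the empirical-CDF features of step 223 can be sketched as follows; the helper names are illustrative, and points are `(x, y)` tuples:

```python
import math

def direction(b, c):
    """Angle of segment B->C against the horizontal, in [0, 360)."""
    return math.degrees(math.atan2(c[1] - b[1], c[0] - b[0])) % 360

def curvature_angle(a, b, c):
    """Angle between segments A->B and B->C, folded into [0, 180]."""
    diff = abs(direction(a, b) - direction(b, c)) % 360
    return min(diff, 360 - diff)

def curvature_distance(a, b, c):
    """Perpendicular distance from B to line AC, divided by |AC| (unitless)."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    ac = math.hypot(cx - ax, cy - ay)
    perp = abs((cx - ax) * (by - ay) - (cy - ay) * (bx - ax)) / ac
    return perp / ac

def ecdf_feature(values, grid):
    """Empirical CDF sampled at fixed grid points -> fixed-length feature vector."""
    return [sum(v <= g for v in values) / len(values) for g in grid]
```

For a right-angle turn through (0,0) → (1,1) → (2,0), the curvature angle is 90° and the curvature distance of the middle point is 0.5, independent of screen resolution, which is what makes these features platform-independent.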
(3) Model training:
step 31, performing z-score normalization processing on the extracted keyboard action features and mouse action features to enable the average value of the input keyboard action features and mouse action features to be 0 and the variance to be 1; after normalization processing, dividing all keyboard action characteristics and mouse action characteristics into a training set and a verification set; generally, 70% of all the keyboard action features and mouse action features are taken as a training set, and 30% are taken as a verification set. The training set is used as input of training the evolutionary neural network, and the verification set is used for verifying the recognition success rate of the evolutionary neural network model obtained through training.
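A minimal sketch of step 31 — z-score normalization of one feature column followed by the 70/30 split; population variance is assumed, matching the stated "mean 0, variance 1":

```python
import math

def zscore(values):
    """Shift/scale one feature column so it has mean 0 and variance 1."""
    n = len(values)
    mu = sum(values) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / n)
    return [(v - mu) / sigma for v in values]

def split(samples, train_frac=0.7):
    """Leading 70% as training set, trailing 30% as verification set."""
    cut = int(len(samples) * train_frac)
    return samples[:cut], samples[cut:]
```

In practice each keyboard and mouse feature column would be normalized independently before the split.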
Step 32, using the one-vs-one method, divide the multi-class problem of N users into N × (N-1)/2 binary classification problems between pairs of users, where each binary classification problem corresponds to one evolutionary neural network with augmenting topology (NEAT), e.g. 1 vs 2, ..., 1 vs N, 2 vs 3, ..., N-1 vs N, as shown in fig. 2;
The one-vs-one method works as follows: given n classes, a binary classifier is built for every pair of classes, yielding k = n(n-1)/2 classifiers. When a new sample is classified, all k classifiers are applied in turn, each classification being equivalent to one vote for the predicted class. After all k classifiers have been applied, the class with the most votes is taken as the final classification result. Therefore, in this embodiment the one-vs-one method divides the multi-class problem of N users into N × (N-1)/2 binary classification problems between pairs of users, corresponding to N × (N-1)/2 evolutionary neural networks, each of which is one classifier. The training set input to each classifier consists of the keyboard action features and mouse action features of two users, and its output is the similarity to each of the two users;
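The one-vs-one decomposition can be enumerated directly; for example, 5 users yield 5 × 4/2 = 10 pairwise classifiers (the function name is illustrative):

```python
from itertools import combinations

def user_pairs(n_users):
    """All unordered user pairs: one binary pairwise classifier per pair."""
    return list(combinations(range(n_users), 2))

pairs = user_pairs(5)  # N*(N-1)/2 = 10 pairs for N = 5
```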
Step 33, setting the number of input nodes of each evolutionary neural network to the number of keyboard action features and mouse action features in the training set, the number of hidden nodes to 0, and the number of output nodes to 2; the two output nodes correspond to the degree of similarity to each of the two users being classified, with 1 the highest similarity and 0 the lowest; setting the Fitness function of each evolutionary neural network as:
Fitness=fitness-(output[0]-xo[0])**2-(output[1]-xo[1])**2
where fitness is the target value to be reached, equal to twice the number of training-set samples (each sample contributes at most 2 to the squared error); output is the output of the current evolutionary neural network for each training-set sample; xo is the expected output; and the squared-error terms are accumulated over all training samples. The larger Fitness is, the higher the recognition success rate of the evolutionary neural network; Fitness reaches its maximum when the output for every sample in the training set matches the expected output;
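Assuming the fitness target equals twice the number of training samples (each of the two output nodes contributes at most a squared error of 1, so a perfect network loses nothing), the evaluation can be sketched as:

```python
def evaluate_fitness(outputs, expected):
    """outputs/expected: lists of [sim_user1, sim_user2] pairs, one per sample."""
    fitness = 2.0 * len(expected)  # assumed target value
    for output, xo in zip(outputs, expected):
        fitness -= (output[0] - xo[0]) ** 2 + (output[1] - xo[1]) ** 2
    return fitness
```

In a NEAT framework such as neat-python, a function of this shape would serve as the per-genome fitness evaluation that drives the evolution.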
step 34, training the N × (N-1)/2 evolutionary neural networks one by one to obtain the N × (N-1)/2 evolutionary neural network models with the highest pairwise classification success rate, and taking these models as the user identification model.
(4) User identification:
step 41, collecting keyboard metadata and mouse metadata of a user to be identified by using the method in step 1;
step 42, extracting the keyboard action characteristics and the mouse action characteristics of the user to be identified by using the method in the step 2;
step 43, inputting the keyboard action features and mouse action features of the user to be identified into the N × (N-1)/2 evolutionary neural network models obtained in step 3, taking the similarity output by each model as a voting score for the corresponding user, and summing the voting scores per user; the user with the highest score is the user to be identified.
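The voting step can be sketched as follows: each pairwise model reports similarity scores for its two users, the scores are accumulated per user, and the highest total wins (names and input format are illustrative):

```python
from collections import defaultdict

def identify(pair_scores):
    """pair_scores: iterable of ((user_i, user_j), (score_i, score_j))
    from the pairwise models; returns the user with the highest total."""
    totals = defaultdict(float)
    for (i, j), (si, sj) in pair_scores:
        totals[i] += si
        totals[j] += sj
    return max(totals, key=totals.get)
```

Summing the raw similarity scores, rather than counting hard votes, lets confident models contribute more to the decision.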
As can be seen from the above, the present invention has the following beneficial effects:
1. the invention has a high recognition rate and more comprehensive data sources: by combining keyboard action features and mouse action features, its data sources are more comprehensive than those of methods using only a single type of feature, and after training with the NEAT algorithm its recognition rate is higher than that of traditional SVM and neural-network algorithms.
2. The invention is not affected by the user's operating environment: because the direction, curvature angle and curvature distance of mouse actions are used as features, and direction and curvature angle do not depend on screen size or other elements of the user's environment, the method is relatively platform-independent. These angle-based metrics are stable across platforms and unaffected by the user's operating environment.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (9)

1. A method for identifying a user using keyboard and mouse data in an uncontrolled environment, comprising the steps of:
step 1, data acquisition: deploying a keyboard action acquisition program and a mouse action acquisition program in a computer, and acquiring keyboard metadata and mouse metadata in daily operation of the computer;
step 2, feature extraction: extracting keyboard action characteristics and mouse action characteristics from the collected keyboard metadata and mouse metadata;
step 3, training a model: training an evolutionary neural network with augmenting topology (NEAT) between each pair of users by using the extracted keyboard action features and mouse action features to obtain a user identification model;
step 4, user identification: and identifying the user to be identified by utilizing the user identification model.
2. The method for identifying a user using keyboard and mouse data in an uncontrolled environment as recited in claim 1, wherein the method of step 1 is: a keyboard action acquisition program and a mouse action acquisition program deployed on the computer acquire keyboard metadata and mouse metadata during the computer's daily operation through the system hook chain; the keyboard metadata comprises a key name, a key-press timestamp and a key-release timestamp; the mouse metadata comprises an operation type, a timestamp, and the x and y coordinates of the mouse pointer position.
3. The method for identifying a user using keyboard and mouse data in an uncontrolled environment as claimed in claim 2, wherein the method for extracting the keyboard action features in step 2 is:
step 211, dividing the keyboard metadata into keyboard data sets over a time window t1;
step 212, calculating, from the keyboard data set, the duration of each key and the delay time between each previous key and next key; the duration of each key is the average of that key's press times within the time window t1, a press time being the difference between the key's release timestamp and its press timestamp; the delay time is the difference between the press timestamp of the next key and the release timestamp of the previous key;
step 213, taking the duration of each key in the keyboard data set as a duration feature and the delay time between a previous key and a next key as a delay time feature, so that the keyboard action features comprise k duration features and k² delay time features, where k is the number of keys on the keyboard used by the computer.
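Not part of the claims: a Python sketch of the duration and delay-time computation of steps 211–213, under the assumption that the events arrive as (key, press_ts, release_ts) triples already cut to one time window.

```python
from collections import defaultdict
from statistics import mean

def keyboard_features(events):
    """events: list of (key_name, press_ts, release_ts), ordered by press time.
    Returns per-key mean hold durations and per-key-pair delay times."""
    durations = defaultdict(list)   # key -> hold times in this window
    latencies = defaultdict(list)   # (prev_key, next_key) -> delays

    for key, press, release in events:
        durations[key].append(release - press)

    for (k1, p1, r1), (k2, p2, r2) in zip(events, events[1:]):
        # delay = next key's press timestamp minus previous key's release
        latencies[(k1, k2)].append(p2 - r1)

    dur_feat = {k: mean(v) for k, v in durations.items()}
    lat_feat = {pair: mean(v) for pair, v in latencies.items()}
    return dur_feat, lat_feat

events = [("t", 0.00, 0.09), ("h", 0.15, 0.22), ("e", 0.30, 0.38),
          ("t", 0.50, 0.61)]
dur, lat = keyboard_features(events)
print(round(dur["t"], 3))         # mean of 0.09 and 0.11 -> 0.1
print(round(lat[("t", "h")], 2))  # 0.15 - 0.09 -> 0.06
```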
4. The method of claim 3, wherein a delay threshold T1 is set before the keyboard action features are extracted, and when the time during which the user stops using the keyboard exceeds the delay threshold T1, a new keyboard data sequence is started, so as to remove from the keyboard action features the data gap introduced by the pause.
5. The method for identifying a user using keyboard and mouse data in an uncontrolled environment as claimed in claim 1, wherein the method for extracting the mouse action feature in step 2 is:
step 221, dividing the mouse metadata into mouse data sets over a time window t2;
step 222, calculating, from the mouse data set, the direction, curvature angle, and curvature distance of each mouse action:
(1) direction: for any two consecutive points B and C, the mouse moves from point B to point C along a straight line; the angle between this line and the horizontal is the direction of the mouse action;
(2) curvature angle: for any three consecutive points A, B, and C, the angle between the line from point A to point B and the line from point B to point C is the curvature angle;
(3) curvature distance: for any three consecutive points A, B, and C, the curvature distance at point B is the ratio of the perpendicular distance from point B to the line through points A and C, to the straight-line distance from point A to point C;
step 223, calculating the cumulative distribution functions of the direction, curvature angle, and curvature distance of the mouse actions in the mouse data set, respectively, as the mouse action features.
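Not part of the claims: a sketch of the three geometric features of step 222 and a simple empirical CDF for step 223. The `ecdf` evaluation grid is an illustrative assumption; the claim does not specify how the distribution is discretized.

```python
import math

def direction(b, c):
    """Step 222(1): angle between segment B -> C and the horizontal."""
    return math.atan2(c[1] - b[1], c[0] - b[0])

def curvature_angle(a, b, c):
    """Step 222(2): angle between segments A -> B and B -> C."""
    return direction(b, c) - direction(a, b)

def curvature_distance(a, b, c):
    """Step 222(3): perpendicular distance from B to line AC, over |AC|."""
    ax, ay = a; bx, by = b; cx, cy = c
    ac = math.hypot(cx - ax, cy - ay)
    # |cross(AC, AB)| / |AC| is the perpendicular distance from B to line AC
    perp = abs((cx - ax) * (by - ay) - (cy - ay) * (bx - ax)) / ac
    return perp / ac

def ecdf(values, grid):
    """Step 223: empirical CDF of `values` evaluated at the `grid` points."""
    n = len(values)
    return [sum(v <= g for v in values) / n for g in grid]

# B=(1,1) sits 1 unit above the 2-unit segment A=(0,0) -> C=(2,0)
print(curvature_distance((0, 0), (1, 1), (2, 0)))   # 0.5
print(ecdf([1, 2, 3, 4], [0, 2.5, 5]))              # [0.0, 0.5, 1.0]
```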
6. The method of claim 3, wherein a delay threshold T2 is set before the mouse action features are extracted, and when the time during which the user stops using the mouse exceeds the delay threshold T2, a new mouse data sequence is started, so as to remove from the mouse action features the data gap introduced by the pause.
7. The method for identifying a user using keyboard and mouse data in an uncontrolled environment as recited in claim 3, wherein the method of step 3 is:
step 31, performing z-score normalization on the extracted keyboard action features and mouse action features so that the input features have mean 0 and variance 1; after normalization, dividing all keyboard action features and mouse action features into a training set and a verification set;
step 32, using the one-vs-one method to decompose the N-user multi-classification problem into N×(N-1)/2 binary classification problems between pairs of users, each binary classification problem corresponding to one evolutionary neural network;
step 33, setting the number of input nodes of each evolutionary neural network to the number of keyboard action features and mouse action features in the training set, the number of hidden nodes to 0, and the number of output nodes to 2, the two output nodes respectively corresponding to the degree of similarity to the two users being classified, with 1 the highest similarity and 0 the lowest; and setting the Fitness function of each evolutionary neural network as:
Fitness=fitness-(output[0]-xo[0])**2-(output[1]-xo[1])**2
wherein fitness is the target value to be reached, i.e., the number of samples in the training set multiplied by 2; output is the output produced by the current evolutionary neural network for each input training-set sample, and xo is the expected output, the squared-error terms being accumulated over all samples; the larger Fitness is, the higher the recognition success rate of the evolutionary neural network, and Fitness reaches its maximum when the output of every sample in the training set matches the expected output;
step 34, training the N×(N-1)/2 evolutionary neural networks to obtain, pair by pair, the N×(N-1)/2 evolutionary neural network models with the highest user classification success rate, and taking these models as the user identification model.
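Not part of the claims: a sketch of the Fitness computation of step 33, accumulated over training samples. The translated claim is ambiguous about the constant `fitness`; here it is assumed to equal 2 × the number of training samples (the maximum possible deduction when both outputs lie in [0, 1]), and `predict` stands in for a trained NEAT network.

```python
def fitness_of(samples, predict):
    """Fitness = fitness - sum over samples of the two squared output errors."""
    target = 2 * len(samples)       # assumed value of the `fitness` constant
    f = target
    for x, xo in samples:
        output = predict(x)         # two similarity scores in [0, 1]
        f -= (output[0] - xo[0]) ** 2 + (output[1] - xo[1]) ** 2
    return f

# Two hypothetical training samples for one user pair; a perfect
# predictor reaches the maximum Fitness, here 2 * 2 = 4.
samples = [([0.1], (1.0, 0.0)), ([0.9], (0.0, 1.0))]
perfect = lambda x: (1.0, 0.0) if x[0] < 0.5 else (0.0, 1.0)
print(fitness_of(samples, perfect))   # 4.0
```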
8. The method for identifying a user using keyboard and mouse data in an uncontrolled environment as claimed in claim 7, wherein 70% of all the keyboard action features and mouse action features are used as a training set and 30% are used as a verification set after normalization in step 31.
9. The method for identifying a user using keyboard and mouse data in an uncontrolled environment as recited in claim 7, wherein the method of step 4 is:
step 41, collecting keyboard metadata and mouse metadata of a user to be identified by using the method in step 1;
step 42, extracting the keyboard action characteristics and the mouse action characteristics of the user to be identified by using the method in the step 2;
step 43, inputting the keyboard action features and mouse action features of the user to be identified into the N×(N-1)/2 evolutionary neural network models obtained in step 3, taking the similarity output by each evolutionary neural network model as a voting score for the corresponding user, and summing the voting scores per user; the user with the highest total score is the user to be identified.
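Not part of the claims: a sketch of the one-vs-one voting of step 43. The user names and pairwise similarity scores are made up for illustration.

```python
from collections import defaultdict

def identify(users, pairwise_scores):
    """One-vs-one voting: each pairwise model emits two similarity scores;
    each score is added to its user's total, and the highest total wins."""
    totals = defaultdict(float)
    for (u, v), (s_u, s_v) in pairwise_scores.items():
        totals[u] += s_u
        totals[v] += s_v
    return max(users, key=lambda u: totals[u])

# Hypothetical outputs of the 3 * (3 - 1) / 2 = 3 pairwise models
scores = {("alice", "bob"):   (0.9, 0.1),
          ("alice", "carol"): (0.7, 0.3),
          ("bob", "carol"):   (0.4, 0.6)}
print(identify(["alice", "bob", "carol"], scores))   # alice
```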
CN201911291751.2A 2019-12-16 2019-12-16 Method for identifying user by using keyboard and mouse data in uncontrollable environment Active CN111124860B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911291751.2A CN111124860B (en) 2019-12-16 2019-12-16 Method for identifying user by using keyboard and mouse data in uncontrollable environment

Publications (2)

Publication Number Publication Date
CN111124860A true CN111124860A (en) 2020-05-08
CN111124860B CN111124860B (en) 2021-04-27

Family

ID=70499002

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911291751.2A Active CN111124860B (en) 2019-12-16 2019-12-16 Method for identifying user by using keyboard and mouse data in uncontrollable environment

Country Status (1)

Country Link
CN (1) CN111124860B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040095384A1 (en) * 2001-12-04 2004-05-20 Applied Neural Computing Ltd. System for and method of web signature recognition system based on object map
US20070060114A1 (en) * 2005-09-14 2007-03-15 Jorey Ramer Predictive text completion for a mobile communication facility
CN104809377A (en) * 2015-04-29 2015-07-29 西安交通大学 Method for monitoring network user identity based on webpage input behavior characteristics
CN106445101A (en) * 2015-08-07 2017-02-22 飞比特公司 Method and system for identifying user
CN107423549A (en) * 2016-04-21 2017-12-01 唯亚威解决方案股份有限公司 Healthy tracking equipment
CN109871673A (en) * 2019-03-11 2019-06-11 重庆邮电大学 Based on the lasting identity identifying method and system in different context environmentals
CN110443012A (en) * 2019-06-10 2019-11-12 中国刑事警察学院 Personal identification method based on keystroke characteristic

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
PURVASHI BAYNATH: "Pattern representation using Neuroevolution of the augmenting topology (NEAT) on Keystroke dynamics features in Biometrics", IEEE *
WANG YONG: "A traffic identification method based on user behavior state features", Application Research of Computers *
WANG ZHENHUI: "User identity authentication based on combined mouse and keyboard behavior features", Computer Applications and Software *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112269937A (en) * 2020-11-16 2021-01-26 加和(北京)信息科技有限公司 Method, system and device for calculating user similarity
CN112269937B (en) * 2020-11-16 2024-02-02 加和(北京)信息科技有限公司 Method, system and device for calculating user similarity
CN116633586A (en) * 2023-04-07 2023-08-22 北京胜博雅义网络科技有限公司 Identification authentication analysis system based on Internet of things


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant