CN113742669A

CN113742669A - User authentication method based on twin network

Info

Publication number: CN113742669A
Application number: CN202110948622.7A
Authority: CN
Inventors: 朱添田; 应杰
Original assignee: Zhejiang University of Technology ZJUT
Current assignee: Zhejiang University of Technology ZJUT
Priority date: 2021-08-18
Filing date: 2021-08-18
Publication date: 2021-12-03
Anticipated expiration: 2041-08-18

Abstract

The invention discloses a twin network-based user authentication method, which comprises: acquiring motion sensor data on a user's mobile device; filtering machine abnormal noise from the acquired motion sensor data; filtering artificial noise from the motion sensor data that has passed machine abnormal noise filtering; using LSTM neural networks as the two sub-networks of the twin network and replacing the output layer of each LSTM neural network with a stretching layer to obtain the constructed twin network; training the twin network with the motion sensor data that has passed artificial noise filtering to obtain an optimal twin network; and, based on the optimal twin network, performing user authentication on motion sensor data that is collected in real time on the user's mobile device and filtered for machine abnormal noise. The method achieves high accuracy and protects user privacy to the greatest extent.

Description

User authentication method based on twin network

Technical Field

The application belongs to the technical field of mobile equipment user authentication, and particularly relates to a twin network-based user authentication method.

Background

The rapid growth in storage space and computing power has made mobile devices a key tool for Internet interaction. Today's mobile devices, especially smartphones, have become an important platform through which users interact with various data and media.

Meanwhile, to prevent user privacy from being leaked, many techniques have been proposed to authenticate device users. Authentication methods fall mainly into two types: knowledge-based authentication and biometric-based authentication. Knowledge-based methods require the user to provide specific information (e.g., a password, PIN, or gesture) before proceeding with subsequent actions. This approach is economical but has several disadvantages, such as the need to repeatedly enter input in a small dialog box and vulnerability to a range of well-known attacks (brute-force attacks, shoulder-surfing attacks, touch-screen smudge attacks, and sensor-based inference). Accordingly, biometric-based methods (e.g., static biometric indicators such as fingerprints and faces) are popular among users because of their efficiency and accuracy. However, these static biometric techniques require the user to participate explicitly in the authentication process: for example, the user must look straight at the camera or press a finger against the fingerprint sensor. Frequent human-computer interaction inevitably degrades the user experience. In addition, users may worry about privacy leakage caused by the collection of biometric information.

As requirements on the security, usability and privacy of authentication systems continue to rise, there is an urgent need for an authentication system that is user-friendly, suitable for varied scenarios, highly accurate and privacy-preserving. In recent years there have been many studies on dynamic user authentication based on motion sensors. These methods typically rely on phone data from several motion sensors, such as acceleration sensors, gravity sensors and gyroscope sensors. By applying different machine learning or deep learning algorithms, the unique gait or posture of the user can be identified, thereby achieving user authentication. Among these studies, notable work has been done by ESPIALCOG (General, Efficient and Robust Mobile User Authentication in Noisy Environment), which implements a general, efficient and robust implicit user authentication system. However, problems remain, including low coverage of human-computer interaction patterns, weak denoising capability, insufficient transferability, privacy leakage and limited accuracy, so motion-sensor-based dynamic user authentication is still difficult to deploy in real life.

Disclosure of Invention

The application aims to provide a twin network-based user authentication method which is high in accuracy and capable of protecting user privacy to the greatest extent.

In order to achieve the purpose, the technical scheme adopted by the application is as follows:

a twin network based user authentication method, the twin network based user authentication method comprising:

step 1, data acquisition and filtering stage:

step 1.1, acquiring motion sensor data on a mobile device of a user;

step 1.2, performing machine abnormal noise filtration on the acquired motion sensor data;

step 1.3, filtering artificial noise of the motion sensor data subjected to abnormal noise filtering of the machine;

step 2, model construction and training stage:

step 2.1, twin network construction: taking the LSTM neural network as two sub-networks of the twin network, and replacing an output layer of the LSTM neural network with a stretching layer to obtain the constructed twin network;

step 2.2, twin network training: training the twin network by utilizing the motion sensor data after artificial noise filtration to obtain an optimal twin network;

step 3, real-time authentication stage: based on the optimal twin network, performing user authentication on motion sensor data that is collected in real time on the user's mobile device and filtered for machine abnormal noise.

Several alternatives are provided below, not as additional limitations on the general solution above but merely as further additions or preferences; absent technical or logical contradiction, each alternative may be combined with the general solution individually, or several alternatives may be combined with one another.

Preferably, the motion sensor includes an acceleration sensor, a gravity sensor, and a gyro sensor.

Preferably, the filtering the artificial noise of the motion sensor data filtered by the abnormal noise of the machine includes:

step 1.3.1, dividing the user-related data filtered for machine abnormal noise into motion sensor data of a user U and motion sensor data of other users, sorting all the motion sensor data of the other users by ARSSA, dividing the sorted data consecutively into 5 parts, randomly extracting the same number of data segments from each part and setting their labels to 0, setting the label of the motion sensor data of the user U to 1, and merging the data labeled 0 and 1 to obtain a data set TD;

step 1.3.2, random downsampling is carried out on the basis of a data set TD to generate a data set SD;

step 1.3.3, training a deep learning model based on the data set TD to obtain a TD model, training the deep learning model based on the data set SD to obtain an SD model, and calculating a loss value under each epoch for each data segment in the data set TD and the data set SD in model training;

step 1.3.4, for each data segment in the data set TD and the data set SD, concatenating its loss values under the different epochs to form a loss sequence, and for each data segment in the data set SD, concatenating its loss sequence with the loss sequence of the corresponding data segment in the data set TD to form a loss vector;

step 1.3.5, performing anomaly detection on the loss vector of each data segment in the data set SD to obtain outliers; if the data segment corresponding to an outlier does not belong to the user U, deleting that data segment from the data segments of the user U with the preset probability dropout, and returning to step 1.3.2 to continue iterating until the preset number of iterations is reached; and if the data segment corresponding to the outlier belongs to the user U, returning to step 1.3.2 to continue iterating until the preset number of iterations is reached.

Preferably, the training the twin network by using the motion sensor data after being filtered by the artificial noise includes:

step 2.2.1, the filtered motion sensor data is taken, three data segments belonging to the same user in the motion sensor data and one data segment of other users are combined to be used as a training group, the training group is divided into two data pairs, the data pairs of the two data segments belonging to the same user are set as positive samples, and the data pairs of the two data segments not belonging to the same user are set as negative samples;

step 2.2.2, taking a training group, inputting the two data segments of each data pair into the two sub-networks of the twin network respectively, and calculating the distance between the stretching vectors output by the two sub-networks;

step 2.2.3, passing the calculated distance successively through a fully connected layer with a ReLU activation function and an output layer with a Sigmoid activation function to obtain a confidence value;

step 2.2.4, comparing the confidence value with the authentication threshold value, and if the confidence value is greater than or equal to the authentication threshold value, indicating that the model authentication result is a legal user; otherwise, the model authentication result is an illegal user;

step 2.2.5, adjusting the model parameters of the twin network according to the model authentication result and whether the data pair is a positive sample or a negative sample, and repeatedly taking training groups for training until the set training end condition is reached.

The twin network-based user authentication method provided by the application proposes a differential training method that eliminates the influence of noisy labels by means of two noise detection models, and presents an LSTM-based twin network model for processing time series data. This maximizes the utilization of the available data, fully covers human-computer interaction patterns and makes the authentication model transferable. The trained model is deployed on the device side, which protects user privacy to the greatest extent, minimizes the supplier's time and cost, and improves the usability of the system in the real world.

Drawings

FIG. 1 is a flow chart of a twin network based user authentication method of the present application;

FIG. 2 is a flow chart of the generation of the data set TD of the present application;

FIG. 3 is a flow chart of artificial noise filtering based on differential training in the present application;

FIG. 4 is a flow chart of the present application for outputting confidence based on twin networks.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used in the description of the present application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application.

To address the problems in current research on mobile device user authentication, such as scarce training samples, low data utilization, weak denoising capability, poor transferability, privacy leakage and low accuracy, this embodiment provides a user authentication method based on a twin network.

As shown in fig. 1, the twin network-based user authentication method of the present embodiment includes the following steps:

step 1, data collection and filtering stage (i.e. data collection stage in the figure).

Step 1.1, motion sensor data (i.e. sensor characteristic data) on the user's mobile device is acquired.

In the model training stage, users participating in model training can be regarded as volunteers; in the real-time authentication phase, the user authenticated with the model may be considered as a new user.

For data collection, reading sensor data that is not privacy-sensitive does not require any privacy-related permission. In this embodiment, an acceleration sensor (accelerometer), a gravity sensor and a gyroscope sensor (gyroscope) are used as the data sources.

This embodiment has two corresponding collection states: an idle state (mobile device screen off or foreground with no new applications launched), and an active state (mobile device screen on and foreground with new applications launched). In particular, the data collection process is only performed in the active state, since only in the active state the sensor data may effectively represent the user's human-machine interaction pattern. In order to reduce the battery overhead, the sampling frequency is set to be 10Hz in the idle state and 50Hz in the active state.

In addition, the duration of each data collection is 3 s, which has been discussed as optimal in RISKCOG (Unobtrusive Real-Time User Authentication on Mobile Devices in the Wild; see https://ieeexplore.ieee.org/abstract/document/8611185) and ESPIALCOG (see https://ieeexplore.ieee.org/abstract/document/9151330). Finally, the data from each user is pre-processed and converted into a uniform format (i.e., 0.75 s segments with 0.375 s overlap).
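The segmentation into overlapping windows can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name `segment_windows` and the 9-column layout (3 sensors with 3 axes each) are assumptions, and because 0.75 s at 50 Hz is 37.5 samples, the sketch simply rounds sample counts down, which the source does not specify.

```python
import numpy as np

def segment_windows(samples, rate_hz=50, window_s=0.75, overlap_s=0.375):
    """Split a (num_samples, num_axes) recording into overlapping windows."""
    window = int(window_s * rate_hz)              # 37 samples at 50 Hz (37.5 rounded down)
    hop = int((window_s - overlap_s) * rate_hz)   # 18 samples between window starts
    starts = range(0, samples.shape[0] - window + 1, hop)
    return np.stack([samples[s:s + window] for s in starts])

# 3 s of data at 50 Hz from 3 sensors x 3 axes (random placeholder values).
recording = np.random.default_rng(0).normal(size=(150, 9))
segments = segment_windows(recording)
print(segments.shape)  # (7, 37, 9)
```

Under these assumptions a 3 s recording yields 7 segments, each of shape [R, C] = [37, 9].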

Noisy data is unavoidable in real-world environments. This embodiment divides noise into two types: machine anomalies and artificial noise (mislabels). Noise in the data set strongly affects the performance of the authentication model, so an effective method is needed to eliminate it and improve the final accuracy.

Step 1.2, machine abnormal noise filtering is performed on the acquired motion sensor data.

Machine abnormal noise is caused by abnormal behavior of the mobile device's sensors. At the hardware level, the built-in sensors of a mobile device are influenced by internal factors (e.g., manufacturing process, materials, actual) as well as external factors (e.g., environment, humidity). The resulting difference between the direct reading of a sensor and the true value is what this embodiment calls a machine anomaly. Specifically, there are three types of anomaly: an equivalence anomaly, a skip anomaly and a zero-value anomaly. For the definitions of machine abnormal noise and the method for filtering it, refer to the disclosure in the ESPIALCOG work; the details are omitted here.

Step 1.3, artificial noise filtering is performed on the motion sensor data that has passed machine abnormal noise filtering.

Artificial noise arises because human behavior is not restricted during data collection. The supplier places no requirements on how volunteers use their phones in daily life, so that the collected data contains as many complex and diverse human-computer interaction patterns as possible and a robust model can be trained. However, consider the following scenario: A is a volunteer taking part in data collection. One day he temporarily lends his mobile phone to his friend B. During this period the sensor data is still uploaded to the cloud and labeled A. The data produced by B but labeled A is therefore artificial noise. Such noisy labels undoubtedly affect the training process and the final performance of the model.

Therefore, this embodiment uses differential training to eliminate noisy labels. Specifically, differential training iteratively trains two deep learning models of the same architecture (the noise detection models, which may be any deep learning model). In the denoising stage, this embodiment uses binary classification to detect the noisy labels of each user: the data of that user is treated as class 1 and the data of other users as class 0. The first model is trained on a data set that contains all of the data of the selected user and a portion of the data of other users. For the second model, a second data set is generated from the first by random downsampling. For convenience, the first data set is called TD (Total Data) and the downsampled data set SD (Sub-sampled Data). Accordingly, the corresponding noise detection models are the TD model (first model) and the SD model (second model).

The filtering of the artificial noise in this embodiment specifically includes the following steps:

Step 1.3.1, users are selected iteratively; assume the currently selected user is U. The binary classification task must deal with two problems: 1) if only the data of user U is used to train the model, the noise detection model converges too quickly (because all labels are 1), which makes the loss vector too short for the anomaly detection algorithm to find noisy labels in it; 2) if the data of user U is combined with the data of all other users as model input, the resulting unbalanced data set severely slows model training. To solve these problems, this embodiment uses stratified sampling to meet the practical requirements of differential training.

As shown in fig. 2, the user-related data filtered for machine abnormal noise is divided into the motion sensor data of user U and the motion sensor data of the other users (user 1 to user N). All data of the other users is sorted by ARSSA (the average of the root of the sum of squared acceleration values), and the sorted data is divided consecutively into 5 parts. The same number of data segments is then randomly extracted from each part and labeled 0, while the motion sensor data of user U is labeled 1. Merging the data labeled 0 and 1 yields the data set TD, in which the ratio of positive to negative labels is 1:1.
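The stratified construction of the data set TD might look like the following sketch. Several points are assumptions made only for illustration: the helper names `arssa` and `build_td`, the reading of ARSSA as the time-average of the acceleration magnitude, the placement of the acceleration axes in columns 0 to 2, and the per-part sample count chosen to give a roughly 1:1 label ratio.

```python
import numpy as np

def arssa(segment):
    """Assumed ARSSA: mean over time of sqrt(ax^2 + ay^2 + az^2).
    Columns 0-2 are assumed to hold the acceleration axes."""
    return np.sqrt((segment[:, :3] ** 2).sum(axis=1)).mean()

def build_td(user_segments, other_segments, parts=5, seed=0):
    """Return (data, labels): user U labeled 1, sampled other users labeled 0."""
    rng = np.random.default_rng(seed)
    # Sort the other users' segments by ARSSA, then cut the ordering into consecutive parts.
    order = np.argsort([arssa(s) for s in other_segments])
    per_part = len(user_segments) // parts   # equal draw from every part, about 1:1 overall
    picked = []
    for chunk in np.array_split(order, parts):
        picked.extend(rng.choice(chunk, size=per_part, replace=False))
    negatives = other_segments[np.array(picked)]
    data = np.concatenate([user_segments, negatives])
    labels = np.concatenate([np.ones(len(user_segments)), np.zeros(len(negatives))])
    return data, labels

rng = np.random.default_rng(1)
user_u = rng.normal(size=(20, 37, 9))    # placeholder segments of user U
others = rng.normal(size=(200, 37, 9))   # placeholder segments of other users
td_data, td_labels = build_td(user_u, others)
print(td_data.shape, int(td_labels.sum()))
```

Drawing the same number of segments from every ARSSA stratum keeps the negative class spread across low-motion and high-motion behavior instead of being dominated by whichever is most common.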

For each iteration, the noise label will be detected and deleted. The iteration comprises four steps, namely data set down sampling, TD and SD model training, loss vector generation and abnormal value detection, and specifically comprises the following steps as shown in FIG. 3:

Step 1.3.2, the data set is first downsampled. Differential training randomly downsamples the data set TD to generate a smaller data set SD. This embodiment samples from the data set TD at a ratio of 0.2, which satisfies two requirements: 1) the data set SD is as small as possible, and 2) the model can still converge when trained on the data set SD. This embodiment therefore uses 5 (1/0.2) data sets SD to cover the entire data set TD, i.e., 5 loop iterations are required.
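One way to obtain 5 data sets SD that together cover TD is to shuffle the TD indices once and cut them into disjoint parts. The source states only that the downsampling is random and that 5 subsets cover TD, so the disjoint partition and the name `sd_index_splits` are assumptions of this sketch.

```python
import numpy as np

def sd_index_splits(td_size, ratio=0.2, seed=0):
    """Shuffle TD indices and cut them into 1/ratio disjoint subsets, one SD per iteration."""
    rng = np.random.default_rng(seed)
    n_splits = round(1 / ratio)   # 5 when ratio = 0.2
    return np.array_split(rng.permutation(td_size), n_splits)

splits = sd_index_splits(td_size=40)
print([len(s) for s in splits])  # [8, 8, 8, 8, 8]
```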

Step 1.3.3, a deep learning model is trained on the data set TD to obtain the TD model, and a deep learning model is trained on the data set SD to obtain the SD model. During training, the loss value of each data segment in the data set TD and the data set SD is calculated at every epoch (one full pass over the training data).

The two deep learning models of the same architecture that are trained iteratively in differential training may be any deep learning model (e.g., RNN, CNN). Considering the characteristics of time series data, this embodiment selects LSTM to detect noisy labels, with 2 layers, 32 neurons per layer and a sigmoid layer as the output layer. The TensorFlow toolkit was used to train both models, with all other parameters left at the toolkit defaults.

Step 1.3.4, for each data segment in the data set TD and the data set SD, its loss values under the different epochs are concatenated to form a loss sequence; for each data segment in the data set SD, its loss sequence is concatenated with the loss sequence of the corresponding data segment in the data set TD to form a loss vector.
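The loss vector construction can be sketched as below, assuming the per-epoch losses of both models have already been recorded as arrays. The array layout, the concatenation order (TD sequence first) and the name `loss_vectors` are illustrative assumptions; the random numbers stand in for real training losses.

```python
import numpy as np

def loss_vectors(td_losses, sd_losses, sd_indices):
    """td_losses: (epochs_td, len(TD)); sd_losses: (epochs_sd, len(SD));
    sd_indices[k] is the row in TD of the k-th SD segment."""
    td_seq = td_losses[:, sd_indices].T   # (len(SD), epochs_td): loss sequence under the TD model
    sd_seq = sd_losses.T                  # (len(SD), epochs_sd): loss sequence under the SD model
    return np.concatenate([td_seq, sd_seq], axis=1)

rng = np.random.default_rng(0)
td_losses = rng.random((10, 40))          # placeholder per-epoch losses from the TD model
sd_indices = np.array([3, 7, 11, 15, 19, 23, 27, 31])
sd_losses = rng.random((10, 8))           # placeholder per-epoch losses from the SD model
vectors = loss_vectors(td_losses, sd_losses, sd_indices)
print(vectors.shape)  # (8, 20)
```

Each row is the input that the anomaly detection step receives for one data segment of SD.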

Since the data set SD is generated by selecting data from the data set TD at a fixed ratio, every data segment in the data set SD has a corresponding loss vector. A data set SD is generated in each iteration and outlier detection is performed on its data segments; if a detected outlier does not belong to user U, it is deleted with probability equal to dropout. After multiple iterations (here, 5) the data set TD is completely covered, so that all data of user U has undergone artificial noise anomaly detection once, which completes the data deletion.

In outlier detection, the loss vectors generated for the data segments in the data set SD are collected and fed to the anomaly detection algorithm to obtain outliers. When outliers are deleted, a deletion probability dropout is set, mimicking dropout in machine learning; this alleviates overfitting to some extent and reduces the chance of deleting falsely detected points. In this embodiment dropout is set to 0.5.
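The probabilistic deletion can be illustrated with a boolean keep-mask over the data segments of user U. The mask representation and the name `drop_outliers` are assumptions of this sketch.

```python
import numpy as np

def drop_outliers(keep_mask, outlier_indices, dropout=0.5, seed=0):
    """Mark each detected outlier for deletion with probability `dropout`."""
    rng = np.random.default_rng(seed)
    keep_mask = keep_mask.copy()
    for idx in outlier_indices:
        if rng.random() < dropout:   # delete only with the preset probability
            keep_mask[idx] = False
    return keep_mask

mask = np.ones(10, dtype=bool)
new_mask = drop_outliers(mask, outlier_indices=[2, 5, 8])
print(int(new_mask.sum()), "segments kept")
```

With dropout = 0.5, roughly half of the flagged segments survive each round, so a segment flagged by mistake has a chance of being kept.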

Most anomaly detection algorithms require a contamination rate as an input parameter, which serves as the threshold for detecting anomalies. In this embodiment the contamination rate is set to 1 - α_TD, where α_TD is the average accuracy of 5-fold cross validation on the data set TD. To avoid the instability of any single anomaly detection algorithm, this embodiment uses 7 anomaly detection algorithms (Angle-Based Outlier Detector (ABOD), Clustering-Based Local Outlier Factor (CBLOF), Histogram-Based Outlier Detector (HBOS), Isolation Forest Outlier Detector (I-Forest), k-Nearest Neighbors Detector (kNN), Local Outlier Factor (LOF) and Principal Component Analysis Outlier Detector (PCA)) and decides the final detection result by a voting mechanism. In addition, this embodiment applies an early-stopping mechanism in the training phase of the noise detection models (training stops when the change in the loss value falls below a set threshold of 0.001), which greatly reduces the time needed for differential training (by 54% in the experiments of this embodiment). In the experiments, early stopping was implemented with the API early_stop_callback() from the TensorFlow toolkit, with all parameters at their default values. The supplier can adjust the threshold dynamically as required.
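The contamination rate and the vote across detectors can be sketched without committing to a particular detector library. The source does not say how votes are counted, so the simple majority rule here is an assumption, as are the helper names and the score-quantile rule used to turn anomaly scores into flags.

```python
import numpy as np

def contamination_from_accuracy(fold_accuracies):
    """Contamination rate 1 - alpha_TD, with alpha_TD the mean cross-validation accuracy."""
    return 1.0 - float(np.mean(fold_accuracies))

def flag_top_fraction(scores, contamination):
    """Flag the highest-scoring fraction of samples as outliers (higher score = more anomalous)."""
    threshold = np.quantile(scores, 1.0 - contamination)
    return scores > threshold

def majority_vote(flags_per_detector):
    """A sample is an outlier when more than half of the detectors flag it (assumed rule)."""
    votes = np.sum(flags_per_detector, axis=0)
    return votes > len(flags_per_detector) / 2

rate = contamination_from_accuracy([0.90, 0.92, 0.88, 0.90, 0.90])
flags = [
    np.array([True, False, True]),    # placeholder output of detector 1
    np.array([True, False, False]),   # placeholder output of detector 2
    np.array([False, False, True]),   # placeholder output of detector 3
]
print(round(rate, 3), majority_vote(flags))
```

In practice each row of `flags` would come from one of the seven detectors run on the loss vectors with the same contamination rate.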

Step 2, model construction and training stage:

step 2.1, twin network construction:

The twin network is a framework built on two coupled neural networks. Its input is two samples: each of the two sub-networks accepts one input and outputs the corresponding representation mapped into a high-dimensional space. The distance between the two representations, for example the Euclidean distance, is calculated to express the similarity of the two inputs.

Twin networks are very good at judging how similar two inputs are. A well-trained twin neural network will maximize the representation differences of different label data and minimize the representation differences of the same label data.

Selecting LSTM as the sub-network of the twin network allows the model to learn context-dependent time series data (the format of the data used here) to the greatest extent. Meanwhile, the twin network enables the model to learn whether two input data segments belong to the same user. This is the core capability of implicit user authentication, i.e., determining whether the current user is the legitimate user.

As shown in fig. 4, this embodiment proposes a new model architecture. Because network structures suited to the same task share common features, the LSTM model here again uses 2 layers with 32 neurons per layer, and the two sub-networks of the twin network share weights. The output layer of the LSTM neural network is then adjusted so that it can be combined into the twin network: the original output layer (softmax) is replaced with a stretching layer (warped layer), which outputs a stretching vector for the subsequent distance calculation.
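To make the weight sharing concrete, the following NumPy sketch runs a forward pass of one sub-network (two stacked LSTM layers of 32 units) and applies it to both inputs with the same parameters. It is an untrained illustration with random weights, not the TensorFlow model of the embodiment; it assumes that the stretching layer flattens the per-time-step LSTM outputs into one vector, and the input shape (37, 9) is a placeholder.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_layer(x, params):
    """Run one LSTM layer over x of shape (T, C); return all hidden states, shape (T, H)."""
    W, U, b = params                       # W: (C, 4H), U: (H, 4H), b: (4H,)
    H = U.shape[0]
    h, c = np.zeros(H), np.zeros(H)
    outputs = []
    for x_t in x:
        i, f, g, o = np.split(x_t @ W + h @ U + b, 4)   # input, forget, candidate, output
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        outputs.append(h)
    return np.stack(outputs)

def init_lstm(in_dim, hidden, rng):
    return (rng.normal(0, 0.1, (in_dim, 4 * hidden)),
            rng.normal(0, 0.1, (hidden, 4 * hidden)),
            np.zeros(4 * hidden))

def subnetwork(x, layers):
    """Two stacked LSTM layers followed by the (assumed) flattening stretching step."""
    for params in layers:
        x = lstm_layer(x, params)
    return x.reshape(-1)

rng = np.random.default_rng(0)
shared = [init_lstm(9, 32, rng), init_lstm(32, 32, rng)]   # one weight set for both branches
x1, x2 = rng.normal(size=(37, 9)), rng.normal(size=(37, 9))
f1, f2 = subnetwork(x1, shared), subnetwork(x2, shared)
print(f1.shape, f2.shape)  # (1184,) (1184,)
```

Because both branches use the `shared` parameters, identical inputs always map to identical stretching vectors, which is what makes the later distance meaningful.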

Step 2.2, twin network training: the twin network is trained with the motion sensor data that has passed artificial noise filtering to obtain an optimal twin network.

Step 2.2.1, the filtered motion sensor data is taken; three data segments belonging to the same user and one data segment from another user are combined into a training group; the training group is divided into two data pairs; the data pair whose two data segments belong to the same user is set as a positive sample, and the data pair whose two data segments do not belong to the same user is set as a negative sample.

For example, from a total data set of N people, T users are randomly selected as the training set and M users as the test set. The data of each user is time-slice data in a uniform format ready to be fed into the model (i.e., the twin network). The final format of each data segment is a two-dimensional array [R, C], where R is the length of the formatted data segment and C is the total number of axes of the selected sensors (each sensor has 3 axes: x, y, z). In each iteration, 3 data segments are selected from the data of one user and 1 data segment from another user to feed the twin network. The 4 data segments are divided into two data pairs p1 and p2 (i.e., time series data X1 and X2), where the label of p1 is 1 and the label of p2 is 0. Label 1 indicates that the two data segments belong to the same user, and label 0 the opposite. This way of dividing the data has two advantages: 1) it produces a training set with a 1:1 ratio of positive to negative labels, and 2) it makes full use of all data (high data utilization).
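A training group could be assembled as in the sketch below. The dictionary layout `data_by_user`, the function name and the particular way the four segments are split into pairs (two same-user segments for p1, the third same-user segment with the other user's segment for p2) are assumptions consistent with the description above.

```python
import numpy as np

def make_training_group(data_by_user, user, rng):
    """Return [(seg_a, seg_b, label), ...]: one positive pair p1 and one negative pair p2."""
    own = data_by_user[user]
    a1, a2, a3 = own[rng.choice(len(own), size=3, replace=False)]
    others = [u for u in data_by_user if u != user]
    other = data_by_user[others[rng.integers(len(others))]]
    b = other[rng.integers(len(other))]
    return [(a1, a2, 1), (a3, b, 0)]   # p1: same user, p2: different users

rng = np.random.default_rng(0)
# Placeholder data: 3 users, 10 segments each, every segment of shape [R, C] = [37, 9].
data_by_user = {u: rng.normal(size=(10, 37, 9)) for u in ["u1", "u2", "u3"]}
group = make_training_group(data_by_user, "u1", rng)
print([label for _, _, label in group])  # [1, 0]
```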

Step 2.2.2, a training group is taken, the two data segments of each data pair are input into the two sub-networks of the twin network respectively, and the distance between the stretching vectors output by the two sub-networks is calculated; the calculation formula is as follows:

where $F_1^{j}$ is the j-th element of the stretching vector F1 output by one sub-network, $F_2^{j}$ is the j-th element of the stretching vector F2 output by the other sub-network, J is the length of the stretching vector, and j = 1, 2, 3, …, J.
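Assuming the distance is the Euclidean distance mentioned in step 2.1, i.e. d(F1, F2) = sqrt(Σ_j (F1_j − F2_j)²) summed over j = 1, …, J, it can be computed as in this short sketch. The function name is illustrative, and the choice of Euclidean distance is an assumption rather than a confirmed detail of the embodiment.

```python
import numpy as np

def euclidean_distance(f1, f2):
    """Distance between two stretching vectors of equal length J."""
    f1, f2 = np.asarray(f1, dtype=float), np.asarray(f2, dtype=float)
    return float(np.sqrt(np.sum((f1 - f2) ** 2)))

print(euclidean_distance([0.0, 3.0], [4.0, 0.0]))  # 5.0
```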

Step 2.2.3, the calculated distance is passed successively through a fully connected layer with a ReLU activation function and an output layer with a Sigmoid activation function to obtain a confidence value:

$$c = p\left(X_1^{(i)}, X_2^{(i)}\right)$$

where $X_1^{(i)}$ is the time series data X1 input to the twin network at the i-th iteration, $X_2^{(i)}$ is the time series data X2 input to the twin network at the i-th iteration, p is the similarity that the model assigns to the input time series data X1 and X2 at the i-th iteration, and c is a decimal between 0 and 1.
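The two layers that turn the distance into a confidence value can be sketched in NumPy. The hidden width of 8 and the random weights are placeholders, since the source gives neither the layer size nor trained parameters; the function name `confidence` is also illustrative.

```python
import numpy as np

def confidence(distance, w_hidden, b_hidden, w_out, b_out):
    """Map a scalar distance to a confidence value in (0, 1)."""
    hidden = np.maximum(0.0, distance * w_hidden + b_hidden)   # fully connected layer, ReLU
    logit = hidden @ w_out + b_out                              # output layer pre-activation
    return float(1.0 / (1.0 + np.exp(-logit)))                  # Sigmoid

rng = np.random.default_rng(0)
w_hidden, b_hidden = rng.normal(size=8), np.zeros(8)   # hidden width 8 is a placeholder
w_out, b_out = rng.normal(size=8), 0.0
c = confidence(1.7, w_hidden, b_hidden, w_out, b_out)
print(0.0 < c < 1.0)  # True
```

The Sigmoid output guarantees that c lies strictly between 0 and 1, so it can be compared directly with the authentication threshold.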

Step 2.2.4, comparing the confidence value with the authentication threshold value, and if the confidence value is greater than or equal to the authentication threshold value, indicating that the model authentication result is a legal user; otherwise, the model authentication result is an illegal user.

This embodiment provides an authentication threshold t (between 0 and 1, preferably 0.5) for authentication. When c is greater than or equal to t, the user is accepted; otherwise, the user is rejected. Note that t is adjustable: t can be increased to reject illegal users more strictly, or reduced so that legitimate users are recognized more reliably.
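The threshold decision is a single comparison; the sketch below shows how raising t makes the same confidence value insufficient. The function name and the example confidence 0.62 are illustrative.

```python
def authenticate(confidence, threshold=0.5):
    """Accept the user when the confidence reaches the authentication threshold t."""
    return confidence >= threshold

print(authenticate(0.62))        # True
print(authenticate(0.62, 0.7))   # False
```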

Step 2.2.5, the model parameters of the twin network are adjusted according to the model authentication result and whether the data pair is a positive or a negative sample, and training groups are taken repeatedly for training until the set training end condition is reached. The training end condition may be a maximum number of iterations or a target model accuracy.

As in conventional model training, this embodiment adjusts the model parameters using the predicted label (the model authentication result) and the actual label (positive or negative sample), with binary cross-entropy as the loss function. The twin network also addresses the problem of sample imbalance: data pairs can be input over many iterations, and the ratio of positive to negative samples among the data pairs input in each iteration is 1:1. The twin network finally obtained is used as a transferable model in the subsequent real-time authentication stage.
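For reference, binary cross-entropy over one training group (one positive and one negative pair) can be computed as follows. The clipping constant `eps` is a numerical safeguard added for this sketch and is not taken from the source.

```python
import numpy as np

def binary_cross_entropy(labels, confidences, eps=1e-7):
    """Mean binary cross-entropy between pair labels (0/1) and confidence values."""
    labels = np.asarray(labels, dtype=float)
    p = np.clip(np.asarray(confidences, dtype=float), eps, 1.0 - eps)
    return float(-np.mean(labels * np.log(p) + (1.0 - labels) * np.log(1.0 - p)))

good = binary_cross_entropy([1, 0], [0.9, 0.1])   # confident and correct
poor = binary_cross_entropy([1, 0], [0.6, 0.4])   # hesitant
print(good < poor)  # True
```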

In the real-time authentication stage, artificial noise filtering is not needed. That filtering takes place in the model training stage to remove noise from the collected data and thereby ensure that training yields an effective and reliable model. In the client stage after model deployment (i.e., the real-time authentication stage), the model only needs to detect in real time whether the current user is the legitimate user.

In the real-time authentication stage, the motion sensor data collected in real time is not paired with itself before being input into the twin network; instead, it is paired with data segments previously determined to belong to the legitimate user. To obtain such data segments, heuristic data collection can be performed (that is, data is collected at 50 Hz for the first 3 seconds after the user opens a new application and sliced into 3 data segments). The data segments collected in real time are paired multiple times with the data segments previously stored on the mobile device and input into the network, and whether the user is the legitimate user is determined from the results.
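The pairing of live segments with stored legitimate segments might be organized as in the sketch below. The source does not state how the individual results are combined, so the majority rule is an assumption, and `toy_confidence` is only a stand-in for the trained twin network.

```python
import numpy as np

def authenticate_live(live_segments, enrolled_segments, confidence_fn, threshold=0.5):
    """Pair every live segment with every stored legitimate segment and take a majority vote."""
    decisions = [confidence_fn(live, ref) >= threshold
                 for live in live_segments for ref in enrolled_segments]
    return bool(np.mean(decisions) >= 0.5)   # assumed aggregation rule

def toy_confidence(a, b):
    """Hypothetical stand-in for the twin network: closer segments score higher."""
    return float(np.exp(-np.abs(a - b).mean()))

enrolled = [np.zeros((37, 9)) for _ in range(3)]   # placeholder stored segments
same_user = [np.full((37, 9), 0.1)]                # close to the stored segments
stranger = [np.full((37, 9), 5.0)]                 # far from the stored segments
print(authenticate_live(same_user, enrolled, toy_confidence))   # True
print(authenticate_live(stranger, enrolled, toy_confidence))    # False
```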

Experiment:

to evaluate the user authentication method (referred to as TRAPCOG for short) of the present embodiment, the following indexes are set: TP: the legal user is identified as a legal user; FP: the illegal user is identified as a legal user; FN: a legitimate user is identified as an illegitimate user; TN: the illegal user is identified as an illegal user.

The effectiveness measures used in model evaluation are therefore the true positive rate TPR = TP/(TP + FN), the true negative rate TNR = TN/(TN + FP), and the accuracy = (TP + TN)/(TP + FP + FN + TN).
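These three measures follow directly from the four counts. The counts in the example call are made-up illustration values, not experimental results.

```python
def evaluate(tp, fp, fn, tn):
    """Compute TPR, TNR and accuracy from the confusion-matrix counts."""
    return {
        "TPR": tp / (tp + fn),
        "TNR": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }

metrics = evaluate(tp=90, fp=5, fn=10, tn=95)   # illustrative counts only
print(metrics)
```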

In this experiment, the performance of TRAPCOG is compared with that of two existing works, RISKCOG and ESPIALCOG. TRAPCOG was trained on a denoised set of 500 users and evaluated on a denoised test set of 500 users. RISKCOG and ESPIALCOG were trained on 6 users and tested on 1513 users. The training and test sets of RISKCOG and ESPIALCOG overlap, whereas those of TRAPCOG do not. The results of the experiment are shown in Table 1:

Table 1 Experimental results of each model

The user state in Table 1 refers to the motion state of the user when the data were collected. The RISKCOG method performs model training separately for the static state and the dynamic state, whereas the ESPIALCOG method and the TRAPCOG method do not distinguish between the two and use data from all motion states together for model training.

As can be seen from Table 1, the TPR and TNR of the TRAPCOG of the present application are far higher than those of RISKCOG and ESPIALCOG, showing that TRAPCOG is highly effective and accurate. The main reason is that the application uses a twin network to cover human-computer interaction patterns more fully. In detail, TRAPCOG combines the ability of the LSTM to learn from time-series data with the ability of the twin network to judge whether its inputs are similar or different, and thereby achieves high accuracy.

A traditional method of data collection in user authentication is to invite participants to provide sensor data in certain specific situations, such as walking, going up and down stairs, picking up a cell phone, and touching the screen of a mobile device. The number and complexity of human-computer interaction patterns in the real world far exceed what data collected in a laboratory can capture. Therefore, a model obtained from such limited data cannot meet the practical needs of implicit real-time authentication.

Therefore, user behavior is not restricted when data are collected in the experiments. Large data sets covering thousands of people were collected in the experiments of RISKCOG and ESPIALCOG. However, both can only use the data of 6 persons in the data set, and thus clearly suffer from narrow model coverage and low data utilization. The TRAPCOG of this application, by contrast, can efficiently use all of the collected data; that is, TRAPCOG can cover complex and variable human-computer interaction patterns, so that it can be deployed in the real world.

The user authentication method provided by the application uses a twin network based on an LSTM neural network to perform implicit real-time identity verification of mobile users. The problem of false labeling is addressed through differential training based on downsampling, and the twin network with the LSTM as its sub-network makes full use of the data set to cover real-world human-computer interaction patterns to the greatest extent. Meanwhile, because the twin network judges the degree of similarity between its inputs, the twin network trained by the application has strong transferability. As a result, the user can carry out identity verification simply by running the user authentication method locally; private data do not need to be uploaded, and the privacy of the user is protected to the greatest extent. The user authentication method provided by the application meets the security, privacy and usability requirements of mobile user authentication, and real deployment is feasible.

The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.

The above-mentioned embodiments express only several implementations of the present application, and although their description is relatively specific and detailed, it should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims

1. A twin network based user authentication method, comprising:

step 1, data acquisition and filtering stage:

step 1.1, acquiring motion sensor data on a mobile device of a user;

step 2, model construction and training stage:

2. The twin network based user authentication method of claim 1, wherein the motion sensor includes an acceleration sensor, a gravity sensor, and a gyro sensor.

3. The twin network based user authentication method of claim 1, wherein performing artificial noise filtering on the motion sensor data that have undergone machine abnormal noise filtering comprises:

4. The twin network based user authentication method of claim 1, wherein training the twin network with the motion sensor data after artificial noise filtering comprises: