CN113742669B

CN113742669B - User authentication method based on twin network

Info

Publication number: CN113742669B
Application number: CN202110948622.7A
Authority: CN
Inventors: 朱添田; 应杰
Original assignee: Zhejiang University of Technology ZJUT
Current assignee: Zhejiang University of Technology ZJUT
Priority date: 2021-08-18
Filing date: 2021-08-18
Publication date: 2024-05-14
Anticipated expiration: 2041-08-18
Also published as: CN113742669A

Abstract

The application discloses a user authentication method based on a twin network, comprising: acquiring motion sensor data from a user's mobile device; filtering machine-anomaly noise from the acquired motion sensor data; filtering artificial noise (mislabeled data) from the motion sensor data after the machine-anomaly noise has been filtered; using an LSTM neural network as the two sub-networks of the twin network and replacing the output layer of the LSTM neural network with a flatten layer to obtain the constructed twin network; training the twin network with the motion sensor data from which artificial noise has been filtered to obtain an optimal twin network; and, based on the optimal twin network, performing user authentication on motion sensor data that is acquired in real time from the user's mobile device and filtered for machine-anomaly noise. The method achieves high accuracy and protects user privacy to the greatest extent.

Description

User authentication method based on twin network

Technical Field

The application belongs to the technical field of mobile equipment user authentication, and particularly relates to a user authentication method based on a twin network.

Background

The rapid increase in memory and computing power has made mobile devices a key tool for internet interaction. Mobile devices, especially smartphones, have become an important platform through which users interact with data and media.

Meanwhile, to prevent user privacy from being compromised, a variety of techniques have been proposed to authenticate the device user. Authentication methods can be broadly divided into two types: knowledge-based authentication and biometric-based authentication. Knowledge-based methods require the user to provide specific information (such as passwords, PINs and gestures) before proceeding with subsequent actions. This approach is economical but suffers from several drawbacks, such as the need to repeatedly enter information in a small dialog box and exposure to various well-known attacks (brute force, shoulder surfing, touch-screen smudge attacks and sensor-based inference). In contrast, authentication based on static biometrics (such as fingerprints and faces) is widely appreciated by users for its high efficiency and accuracy. However, these static biometric techniques require explicit participation by the user in the authentication process. For example, the user must either hold the camera straight or place a finger on the fingerprint sensor. Frequent human-machine interaction undoubtedly degrades the user experience. In addition, users are concerned about the privacy implications of collecting biometric information.

With the increasing demands on the security, availability and privacy of authentication systems, there is an urgent need for an authentication system that is user-friendly, suitable for various scenarios, highly accurate and privacy-preserving. In recent years, there have been many studies on dynamic user authentication based on motion sensors. These methods typically collect data from the phone's motion sensors, such as acceleration sensors, gravity sensors and gyroscope sensors. By applying different machine learning or deep learning algorithms, the unique gait or gestures of the user can be identified, thereby achieving user authentication. Among these studies, ESPIALCOG (General, Efficient and Robust Mobile User Implicit Authentication in Noisy Environment) is the most representative work; it implements a generic, efficient and robust implicit user authentication system. However, problems remain: low coverage of human-machine interaction patterns, weak denoising capability, insufficient transferability, privacy leakage and limited accuracy. As a result, motion-sensor-based dynamic user authentication is difficult to deploy in real life.

Disclosure of Invention

The application aims to provide a user authentication method based on a twin network, which has high accuracy and can protect user privacy to the greatest extent.

In order to achieve the above purpose, the technical scheme adopted by the application is as follows:

A user authentication method based on a twin network, comprising:

Step 1, data acquisition and filtering:

Step 1.1, acquiring motion sensor data from a user's mobile device;

Step 1.2, filtering machine-anomaly noise from the acquired motion sensor data;

Step 1.3, filtering artificial noise from the motion sensor data after the machine-anomaly noise has been filtered;

Step 2, constructing and training a model:

Step 2.1, constructing a twin network: using an LSTM neural network as the two sub-networks of the twin network, and replacing the output layer of the LSTM neural network with a flatten layer to obtain the constructed twin network;

Step 2.2, twin network training: training the twin network with the motion sensor data from which artificial noise has been filtered to obtain an optimal twin network;

Step 3, real-time authentication stage: based on the optimal twin network, performing user authentication on motion sensor data that is acquired in real time from the user's mobile device and filtered for machine-anomaly noise.

Several optional features are provided below. They do not further limit the overall scheme described above; they are only further additions or preferences. Each may be combined with the overall scheme individually, or several may be combined with one another, provided there is no technical or logical contradiction.

Preferably, the motion sensor includes an acceleration sensor, a gravity sensor, and a gyro sensor.

Preferably, filtering the artificial noise from the motion sensor data after the machine-anomaly noise has been filtered comprises:

Step 1.3.1, dividing the user-related data filtered for machine-anomaly noise into motion sensor data of a user U and motion sensor data of other users, sorting all motion sensor data of the other users by ARSSA, dividing the sorted data into 5 contiguous parts, randomly extracting the same number of data segments from each part and setting their labels to 0, setting the labels of the motion sensor data of the user U to 1, and combining the data labeled 0 and 1 to obtain a data set TD;

Step 1.3.2, randomly downsampling the data set TD to generate a data set SD;

Step 1.3.3, training a deep learning model on the data set TD to obtain a TD model, training a deep learning model on the data set SD to obtain an SD model, and, during model training, calculating a loss value at each epoch for each data segment in the data set TD and the data set SD;

Step 1.3.4, for each data segment in the data set TD and the data set SD, concatenating its loss values at the different epochs to form a loss sequence, and, for each data segment in the data set SD, concatenating its loss sequence with the loss sequence of the corresponding data segment in the data set TD to form a loss vector;

Step 1.3.5, performing anomaly detection on the loss vector of each data segment in the data set SD to obtain outliers; if the data segment corresponding to an outlier does not belong to the user U, deleting that data segment from the data segments of the user U with a preset dropout probability and returning to step 1.3.2 to continue iterating until a preset number of iterations is reached; if the data segment corresponding to the outlier belongs to the user U, returning to step 1.3.2 to continue iterating until the preset number of iterations is reached.

Preferably, training the twin network with the motion sensor data from which artificial noise has been filtered comprises:

Step 2.2.1, taking the filtered motion sensor data, combining three data segments belonging to the same user and one data segment of another user into a training group, and dividing the training group into two data pairs, wherein a data pair whose two data segments belong to the same user is a positive sample and a data pair whose two data segments do not belong to the same user is a negative sample;

Step 2.2.2, taking a training group, inputting the two data segments of each data pair into the two sub-networks of the twin network respectively, and calculating the distance between the flattened vectors output by the two sub-networks;

Step 2.2.3, inputting the calculated distance in sequence into a fully connected layer with a ReLU activation function and an output layer with a Sigmoid activation function to obtain a confidence value;

Step 2.2.4, comparing the confidence value with an authentication threshold; if the confidence value is greater than or equal to the authentication threshold, the model authentication result is a legitimate user; otherwise, the model authentication result is an illegitimate user;

Step 2.2.5, adjusting the model parameters of the twin network according to the model authentication result and whether the data pair is a positive or a negative sample, and repeatedly taking training groups for training until a set training end condition is reached.

The user authentication method based on a twin network provided by the application introduces a differential training method in which the influence of noisy labels is removed by means of two noise detection models, and an LSTM-based twin network model for processing time-series data. This maximizes the utilization of the available data, fully covers the human-machine interaction patterns and makes the authentication model transferable. The trained model is deployed on the device side, which protects user privacy to the greatest extent, minimizes the time and cost incurred by the supplier, and improves the usability of the system in the real world.

Drawings

FIG. 1 is a flow chart of a twin network-based user authentication method of the present application;

FIG. 2 is a flow chart of the data set TD generation of the present application;

FIG. 3 is a flow chart of filtering artificial noise based on differential training in accordance with the present application;

FIG. 4 is a flow chart of outputting a confidence value based on the twin network according to the present application.

Detailed Description

The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein in the description of the application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application.

To address the problems of sparse training samples, low data utilization, weak denoising capability, poor transferability, privacy leakage and low accuracy in existing research on mobile device user authentication, this embodiment provides a user authentication method based on a twin network.

As shown in fig. 1, the user authentication method based on the twin network of the present embodiment includes the following steps:

Step 1, data collection and filtering stage (i.e. data collection stage in the figure).

Step 1.1, motion sensor data (i.e. sensor feature data) on a mobile device of a user is acquired.

In the model training phase, users participating in model training can be regarded as volunteers; in the real-time authentication phase, a user authenticated with the model may be considered a new user.

For data collection, reading sensor data that is unrelated to privacy does not require any privacy-related permissions. In this embodiment, an acceleration sensor (accelerometer), a gravity sensor and a gyroscope sensor (gyroscope) are used as the data sources.

The present embodiment has two collection states: an idle state (the screen is off, or no new application has been opened in the foreground) and an active state (the screen is on and a new application has been opened in the foreground). The data collection process is performed only in the active state, since only then can the sensor data effectively represent the user's human-machine interaction pattern. To reduce battery overhead, the sampling frequency is set to 10 Hz in the idle state and 50 Hz in the active state.

In addition, the duration of the data collection phase is 3 s, which has been discussed as optimal in RISKCOG (Unobtrusive Real-Time User Authentication on Mobile Devices in the Wild; see https://ieeexplore.ieee.org/abstract/document/8611185/) and ESPIALCOG (see https://ieeexplore.ieee.org/abstract/document/9151330/). Finally, the data from each user is preprocessed into a uniform format (i.e., 0.75 s segments with 0.375 s overlap).
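The uniform formatting described above can be illustrated with a minimal sliding-window sketch. The function names, the 9-value reading layout (three axes for each of the three sensors) and the way fractional sample counts are rounded are assumptions made for illustration; the embodiment does not specify them.

```python
def to_samples(seconds, rate_hz):
    """Convert a duration to a whole number of samples (the rounding rule is an assumption)."""
    return max(1, round(seconds * rate_hz))


def segment(samples, window, step):
    """Split a sequence into fixed-length windows that start every `step` samples."""
    return [samples[i:i + window] for i in range(0, len(samples) - window + 1, step)]


# 3 s of placeholder readings at 50 Hz; each reading is a hypothetical 9-value tuple
# (accelerometer, gravity and gyroscope, with x, y, z for each).
RATE_HZ = 50
readings = [(0.0,) * 9 for _ in range(3 * RATE_HZ)]

window = to_samples(0.75, RATE_HZ)   # segment length in samples
step = to_samples(0.375, RATE_HZ)    # a 0.375 s overlap means a 0.375 s stride
segments = segment(readings, window, step)
```

Each resulting segment is a two-dimensional block of shape [window, number of axes], which matches the [R, C] format used later for the model input.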

Noise data is unavoidable in real-world environments. The present embodiment classifies noise data into two types: machine anomalies and artificial noise (false labels). Noise in the data set affects the performance of the authentication model to a large extent. Therefore, an efficient method is needed to eliminate the noise and improve the final accuracy.

Step 1.2, filtering machine-anomaly noise from the acquired motion sensor data.

Machine-anomaly noise is caused by abnormal mobile device sensors. At the hardware level, the built-in sensors of a mobile device are affected by internal factors (e.g., manufacturing process, materials, reality) as well as external factors (e.g., environment, humidity). The difference between the direct reading of a sensor and the true value is referred to as a machine anomaly in this embodiment. Specifically, there are three kinds of anomaly: constant-value anomalies, jump-value anomalies and zero-value anomalies. For the definition of machine-anomaly data and the filtering method, reference is made to the disclosure in ESPIALCOG; no further description is given here.

Step 1.3, filtering artificial noise from the motion sensor data after the machine-anomaly noise has been filtered.

Artificial noise arises because no restrictions are placed on a person's behavior during data collection. The supplier imposes no requirement on how volunteers use their phones day to day, so that the collected data contains as many human-machine interaction patterns as possible, with the aim of training a robust model. However, consider the following scenario: A is a volunteer taking part in data collection. One day he temporarily lends his phone to his friend B. During this period the sensor data is still uploaded to the cloud and labeled A. The data of B that is labeled A is therefore referred to as artificial noise. Such noisy labels undoubtedly affect the training process as well as the final performance of the model.

Thus, the present embodiment uses differential training to eliminate noisy labels. Specifically, differential training iteratively trains two deep learning models of the same architecture (i.e., noise detection models, which may be any deep learning model). In the denoising stage, this embodiment uses binary classification to detect the noisy labels of each user. For each user, the user's own data is treated as class 1 and the data of other users as class 0. The first model is trained on a data set that contains all data of the selected user and a portion of the data of other users. For the second model, a second data set is generated from the first by random downsampling and used for training. For convenience, the first data set is referred to as TD (Total Data) and the second, downsampled data set as SD (Sub-sampled Data). Accordingly, the corresponding noise detection models are the TD model (first model) and the SD model (second model).

The filtering of the artificial noise in this embodiment specifically includes the following steps:

Step 1.3.1, users are selected iteratively; assume that the currently selected user is U. The classification task must address two problems: 1) if only the data of user U is used to train the model, the noise detection model converges too quickly (because all labels are 1), so the loss vector is too short for the anomaly detection algorithm to find noisy labels; 2) if the data of user U is combined with the data of all other users as model input, the resulting imbalanced data set seriously slows model training. To solve these problems, the present embodiment uses stratified sampling to meet the practical needs of differential training.

As shown in fig. 2, the user-related data filtered for machine-anomaly noise is divided into motion sensor data of the user U and motion sensor data of other users (user 1 to user N). All data of the other users is sorted by ARSSA (average root sum square of acceleration), the sorted data is divided into 5 contiguous parts, the same number of data segments is randomly extracted from each part, and the labels of these segments are set to 0. The labels of the motion sensor data of the user U are set to 1. Combining the data labeled 0 and 1 yields a data set TD whose positive and negative labels are in a 1:1 ratio.
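The stratified sampling in fig. 2 might be sketched as follows. The ARSSA formula used here (the mean, over the samples of a segment, of the root sum of squares of the three acceleration axes) is an interpretation of the name rather than a definition given in the text, and `arssa`, `build_td` and the segment layout are hypothetical.

```python
import math
import random


def arssa(segment):
    """Assumed ARSSA: mean over samples of sqrt(ax^2 + ay^2 + az^2)."""
    return sum(math.sqrt(ax * ax + ay * ay + az * az) for ax, ay, az in segment) / len(segment)


def build_td(user_segments, other_segments, parts=5, seed=0):
    """Label user U's segments 1 and an ARSSA-stratified sample of other users' segments 0."""
    rng = random.Random(seed)
    ranked = sorted(other_segments, key=arssa)
    per_part = len(user_segments) // parts      # equal draw from every part, about 1:1 overall
    size = len(ranked) // parts
    negatives = []
    for k in range(parts):
        # The last part absorbs any remainder so that every segment falls in some part.
        end = (k + 1) * size if k < parts - 1 else len(ranked)
        chunk = ranked[k * size:end]
        negatives.extend(rng.sample(chunk, min(per_part, len(chunk))))
    return [(s, 1) for s in user_segments] + [(s, 0) for s in negatives]
```

Sorting by ARSSA before sampling means the negatives span low-motion and high-motion behavior instead of being dominated by whichever activity level is most common among the other users.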

In each iteration, noisy labels are detected and deleted. An iteration comprises four steps, namely data set downsampling, training of the TD and SD models, generation of loss vectors and outlier detection, as shown in fig. 3 and described below:

Step 1.3.2, data set downsampling is performed first. Differential training randomly downsamples the data set TD to generate a smaller data set SD. The sampling ratio from the data set TD in this embodiment is 0.2, which meets two requirements: 1) the data set SD is as small as possible; 2) a model trained on the data set SD can still converge. Thus, in this embodiment there are 5 (1/0.2) data sets SD covering the entire TD data, i.e. 5 loop iterations are required.
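One way to obtain 5 data sets SD that together cover TD is to shuffle TD once and deal it into 5 disjoint subsets, one per iteration. That the subsets are disjoint is an assumption made here; the text only states that the 5 sets cover the whole of TD. The function name is hypothetical.

```python
import random


def split_into_sd(td, ratio=0.2, seed=0):
    """Shuffle TD and deal it into round(1 / ratio) disjoint SD subsets."""
    rng = random.Random(seed)
    order = list(range(len(td)))
    rng.shuffle(order)
    rounds = round(1 / ratio)
    # A strided slice of the shuffled order gives each subset about ratio * len(td) items.
    return [[td[i] for i in order[k::rounds]] for k in range(rounds)]
```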

Step 1.3.3, training a deep learning model on the data set TD to obtain a TD model, training a deep learning model on the data set SD to obtain an SD model, and, during model training, calculating a loss value at each epoch (one full pass over the training data) for each data segment in the data set TD and the data set SD.

Differential training in this embodiment iteratively trains two deep learning models of the same architecture, which may be any deep learning model (e.g., RNN or CNN). Considering the characteristics of time-series data, this embodiment selects LSTM to detect noisy labels, with 2 layers, 32 neurons per layer and a sigmoid layer as the output layer. The TensorFlow toolkit was used to train both models, with all other parameters left at the toolkit's defaults.

Step 1.3.4, for each data segment in the data set TD and the data set SD, concatenating its loss values at the different epochs to form a loss sequence, and, for each data segment in the data set SD, concatenating its loss sequence with the loss sequence of the corresponding data segment in the data set TD to form a loss vector.
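As an illustration of step 1.3.4, the sketch below builds loss vectors from per-epoch losses recorded for each segment. The data layout (one dictionary per epoch mapping a segment id to its loss) and the order of concatenation (SD sequence first, then TD sequence) are assumptions; the text does not fix either.

```python
def loss_sequences(per_epoch_losses):
    """Turn a list of {segment_id: loss} dicts (one per epoch) into one sequence per segment."""
    return {seg_id: [epoch[seg_id] for epoch in per_epoch_losses]
            for seg_id in per_epoch_losses[0]}


def loss_vectors(sd_epochs, td_epochs):
    """Concatenate each SD segment's loss sequence with its TD loss sequence."""
    sd_seq = loss_sequences(sd_epochs)
    td_seq = loss_sequences(td_epochs)
    return {seg_id: sd_seq[seg_id] + td_seq[seg_id] for seg_id in sd_seq}
```

Only segments present in SD receive a loss vector, which is why every segment of TD must appear in some SD over the iterations.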

Because the data set SD is generated by downsampling the data set TD at a fixed ratio, each data segment in the data set SD has a corresponding loss vector. Each iteration generates a data set SD and performs outlier detection on its data segments; if a detected outlier does not belong to the user U, it is deleted with a probability equal to dropout. After several iterations (5 rounds here) the data set TD is completely covered, at which point all data of the user U has been checked for artificial noise and the detected noise has been deleted.

For outlier detection, the loss vectors generated for the data segments in the data set SD are collected and input into an anomaly detection algorithm to obtain the outliers. When outliers are deleted, the probability dropout is introduced to imitate dropout in machine learning: it mitigates over-fitting to some extent and reduces the deletion of points that may have been falsely detected. Dropout in this embodiment is set to 0.5.

Most anomaly detection algorithms require a contamination rate as an input parameter, which serves as the threshold for detecting anomalies. The contamination rate in this embodiment is set to 1 - a_TD, where a_TD is the average accuracy of 5-fold cross-validation on the data set TD. To avoid the instability of a single anomaly detection algorithm, the present embodiment uses 7 anomaly detection algorithms, namely Angle-based Outlier Detector (ABOD), Clustering Based Local Outlier Factor (CBLOF), Histogram-based Outlier Detection (HBOS), Isolation Forest Outlier Detector (I-forest), k-Nearest Neighbors Detector (kNN), Local Outlier Factor (LOF) and Principal Component Analysis Outlier Detector (PCA), together with a voting mechanism that determines the final detection result. In addition, the present embodiment provides an early-stop mechanism in the training phase of the noise detection model (training stops when the change in the loss value falls below a set threshold of 0.001), which greatly reduces the time of differential training (a 54% reduction in the experiments of this embodiment). In the experiments, the API earlystop_callback() from the TensorFlow toolkit was used to implement the early-stop mechanism, with all parameters left at their default values. The supplier can adjust the threshold dynamically as needed.
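The voting and deletion logic can be sketched independently of any particular detector library. The sketch assumes that each of the 7 detectors has already produced a 0/1 outlier flag per data segment, and it uses a simple majority as the voting rule, which the text does not specify. The check of whether an outlier belongs to user U is left out for brevity, and both function names are hypothetical.

```python
import random


def majority_vote(flags_per_detector):
    """Mark a point as an outlier when more than half of the detectors flag it."""
    n_detectors = len(flags_per_detector)
    n_points = len(flags_per_detector[0])
    return [sum(flags[i] for flags in flags_per_detector) * 2 > n_detectors
            for i in range(n_points)]


def delete_with_dropout(segment_ids, is_outlier, dropout=0.5, seed=0):
    """Keep every inlier; delete each outlier only with probability `dropout`."""
    rng = random.Random(seed)
    return [sid for sid, out in zip(segment_ids, is_outlier)
            if not (out and rng.random() < dropout)]
```

With dropout set to 0.5, roughly half of the voted outliers survive each round, so a segment that was flagged by mistake still has a chance of remaining in the data set.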

Step 2, constructing a model and training the model:

Step 2.1, constructing a twin network:

A twin network (also known as a Siamese network) is a framework built by coupling two neural networks. The input to the twin network is two samples. Each of its two sub-networks accepts one input and outputs a corresponding representation mapped into a higher-level feature space. The distance between the two representations, for example the Euclidean distance, is calculated to indicate the similarity of the two inputs.

Twin networks are very good at judging the degree of similarity of two inputs. A trained twin neural network maximizes the distance between representations of data with different labels and minimizes the distance between representations of data with the same label.

Selecting LSTM as the sub-network of the twin network maximizes the ability to learn from context-dependent time-series data. The twin network enables the model to learn whether two input data segments belong to the same user. This is the core capability of implicit user authentication, i.e., determining whether the current user is a legitimate user.

As shown in fig. 4, this embodiment proposes a new model architecture. Because the network structure generalizes within the same task, the LSTM model here again uses 2 layers with 32 neurons per layer, and the two sub-networks of the twin network share weights. To fit the twin neural network, the output layer of the LSTM neural network is modified: the original output layer (softmax) is replaced with a flatten layer, which outputs a flattened vector for the subsequent distance calculation.

Step 2.2, twin network training: training the twin network with the motion sensor data from which artificial noise has been filtered to obtain the optimal twin network.

Step 2.2.1, taking the filtered motion sensor data, combining three data segments belonging to the same user and one data segment of another user into a training group, and dividing the training group into two data pairs, wherein a data pair whose two data segments belong to the same user is a positive sample and a data pair whose two data segments do not belong to the same user is a negative sample.

For example, from a total data set of N people, T users are randomly selected as the training set and M users as the test set. The data of each user consists of time-series segments uniformly formatted for input to the model (i.e., the twin network). The final format of each data segment is a two-dimensional array [R, C], where R is the size of the formatted data segment and C is the total number of axes of the selected sensors (3 axes per sensor: x, y, z). In each iteration, 3 data segments are selected from the user's own data and one data segment is selected from another user to feed the twin network. The 4 data segments are divided into two data pairs p1 and p2 (i.e., time series data X1 and X2), where p1 has label 1 and p2 has label 0. Label 1 indicates that the two data segments belong to the same user; label 0 indicates the opposite. This data division has the following advantages: 1) it generates positive and negative labels in a 1:1 ratio; 2) it makes full use of all data (high data utilization).
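A minimal sketch of forming one training group follows. How the three same-user segments are split between the two pairs is not spelled out in the text; here two of them form the positive pair and the third is paired with the other user's segment. The function and argument names are hypothetical.

```python
import random


def make_training_group(own_segments, other_segments, rng):
    """Draw 3 segments of one user and 1 of another, and split them into two labeled pairs."""
    a, b, c = rng.sample(own_segments, 3)   # three distinct segments of the same user
    d = rng.choice(other_segments)          # one segment from a different user
    p1 = ((a, b), 1)                        # same user: positive sample
    p2 = ((c, d), 0)                        # different users: negative sample
    return p1, p2
```

Because every group yields exactly one positive and one negative pair, the labels seen during training stay at a 1:1 ratio regardless of how many users are in the data set.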

Step 2.2.2, taking a training group, inputting the two data segments of each data pair into the two sub-networks of the twin network respectively, and calculating the distance between the flattened vectors output by the two sub-networks. In the calculation, $F_1^{j}$ is the j-th element of the flattened vector $F_1$ output by one sub-network, $F_2^{j}$ is the j-th element of the flattened vector $F_2$ output by the other sub-network, j = 1, 2, 3, …, J, and J is the length of the flattened vector.
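As a concrete instance of the distance calculation, the sketch below uses the Euclidean distance, which the description names as an example of a distance between the two representations. Whether the embodiment uses exactly this distance is an assumption.

```python
import math


def euclidean_distance(f1, f2):
    """Euclidean distance between two flattened vectors of equal length J."""
    if len(f1) != len(f2):
        raise ValueError("flattened vectors must have the same length J")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))
```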

Step 2.2.3, inputting the calculated distance in sequence into a fully connected layer with a ReLU activation function and an output layer with a Sigmoid activation function to obtain a confidence value $c = p(X_1^{(i)}, X_2^{(i)})$, where $X_1^{(i)}$ represents the time series data X1 input to the twin network at the i-th iteration, $X_2^{(i)}$ represents the time series data X2 input to the twin network at the i-th iteration, p represents the degree of similarity that the model assigns to the input time series data X1 and X2 at the i-th iteration, and c is a fraction between 0 and 1.
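The mapping from distance to confidence can be illustrated in plain Python. The sketch treats the distance as a single scalar and uses made-up weights; in practice the weights are learned during training, and the width of the fully connected layer is not stated in the text.

```python
import math


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def confidence(distance, hidden_w, hidden_b, out_w, out_b):
    """Fully connected ReLU layer followed by a single sigmoid output unit."""
    hidden = [relu(w * distance + b) for w, b in zip(hidden_w, hidden_b)]
    logit = sum(w * h for w, h in zip(out_w, hidden)) + out_b
    return sigmoid(logit)


# Hypothetical weights for a 2-unit hidden layer, for illustration only.
c = confidence(0.5, hidden_w=[-1.0, 2.0], hidden_b=[1.0, 0.0], out_w=[1.5, -0.5], out_b=0.1)
```

The sigmoid output guarantees that c lies strictly between 0 and 1, so it can be compared directly with the authentication threshold.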

Step 2.2.4, comparing the confidence value with an authentication threshold; if the confidence value is greater than or equal to the authentication threshold, the model authentication result is a legitimate user; otherwise, the model authentication result is an illegitimate user.

This embodiment sets an authentication threshold t (t between 0 and 1, preferably 0.5) to aid authentication. When c is greater than or equal to t, the user is accepted; otherwise, the user is rejected. Note that t is adjustable. If stricter rejection of illegitimate users is desired, t may be increased. If more reliable acceptance of legitimate users is desired, t may be reduced.

Step 2.2.5, adjusting the model parameters of the twin network according to the model authentication result and whether the data pair is a positive or a negative sample, and repeatedly taking training groups for training until a set training end condition is reached. The training end condition may be a maximum number of iterations or a target model accuracy.

As in conventional model training, this embodiment adjusts the model parameters using the predicted label (the model authentication result) and the actual label (positive or negative sample), with binary cross-entropy as the loss function. To address sample imbalance, this embodiment uses a twin network fed with data pairs over multiple iterations, where the positive and negative samples input in each iteration are in a 1:1 ratio. The resulting twin network serves as a transferable model in the subsequent real-time authentication stage.
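For reference, binary cross-entropy over a batch of pair labels and confidence values can be written as below. The clipping constant `eps` is an implementation detail added here to avoid taking the logarithm of zero; it is not taken from the text.

```python
import math


def binary_cross_entropy(labels, confidences, eps=1e-12):
    """Mean binary cross-entropy between 0/1 pair labels and confidence values."""
    total = 0.0
    for y, c in zip(labels, confidences):
        c = min(max(c, eps), 1.0 - eps)   # keep log() finite at exactly 0 or 1
        total += -(y * math.log(c) + (1 - y) * math.log(1.0 - c))
    return total / len(labels)
```

The loss is small when positive pairs receive confidence near 1 and negative pairs near 0, and grows quickly when the model is confidently wrong.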

Artificial noise filtering is not needed in the real-time authentication stage. It is performed in the model training stage to remove noise from the collected data, which ensures that training yields an effective and reliable recognition model. In the client stage after model deployment (i.e., the real-time authentication stage), the model only needs to detect in real time whether the current user is the legitimate user.

In the real-time authentication stage, the motion sensor data segments acquired in real time are not paired with each other for input into the twin network; instead, each is paired with a data segment previously determined to belong to the legitimate user. To obtain such data segments, heuristic data collection can be performed (i.e., data is collected at 50 Hz during the first 3 seconds after the user opens a new application, and 3 data segments are obtained after slicing). The data segments collected in real time are paired multiple times with the data segments stored on the mobile terminal and input into the network for judgment, and whether the user is the legitimate user is determined from the judgment results.
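The pairing of live segments with stored segments might be sketched as follows. The `score` callable stands in for the trained twin network and is hypothetical, and combining the individual judgments by majority vote is an assumption, since the text only says that the decision is based on the judgment results.

```python
def authenticate(new_segments, enrolled_segments, score, threshold=0.5):
    """Pair each live segment with each stored segment and accept on a majority of passes."""
    votes = [score(new, ref) >= threshold
             for new in new_segments
             for ref in enrolled_segments]
    return sum(votes) * 2 > len(votes)


def toy_score(a, b):
    """Toy stand-in for the twin network: similarity of two scalar 'segments'."""
    return 1.0 - min(1.0, abs(a - b))


enrolled = [1.0, 1.1, 0.9]   # segments previously attributed to the legitimate user
accepted = authenticate([1.0, 1.05], enrolled, toy_score)
rejected = authenticate([5.0], enrolled, toy_score)
```

Because only the stored segments and the model are needed, this decision can run entirely on the device without uploading sensor data.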

Experiment:

To evaluate the user authentication method of the present embodiment (abbreviated as TRAPCOG), the following indexes are defined: TP: a legitimate user is identified as a legitimate user; FP: an illegitimate user is identified as a legitimate user; FN: a legitimate user is identified as an illegitimate user; TN: an illegitimate user is identified as an illegitimate user.

The effectiveness measures used in model evaluation are therefore the true positive rate TPR = TP/(TP+FN), the true negative rate TNR = TN/(TN+FP), and the accuracy Accuracy = (TP+TN)/(TP+FP+FN+TN).
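These measures can be computed directly from the four counts. The counts in the example are made up for illustration and are not experimental results.

```python
def evaluate(tp, fp, fn, tn):
    """Compute TPR, TNR and accuracy from confusion-matrix counts."""
    return {
        "TPR": tp / (tp + fn),
        "TNR": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical counts: 100 legitimate attempts and 100 illegitimate attempts.
metrics = evaluate(tp=90, fp=5, fn=10, tn=95)
```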

In this experiment, the performance of TRAPCOG was compared with the two existing works RISKCOG and ESPIALCOG. TRAPCOG was trained on a denoised set of 500 users and evaluated on a denoised test set of 500 users. RISKCOG and ESPIALCOG were trained on 6 users and tested on 1513 users. The training and test sets of RISKCOG and ESPIALCOG overlap, whereas those of TRAPCOG do not. The experimental results are shown in Table 1:

Table 1. Experimental results of the models

The user states in table 1 refer to the motion states of the user when collecting data, the RISKCOG method performs model training under static and dynamic conditions, respectively, whereas the ESPIALCOG method and the TRAPCOG method do not distinguish between static and dynamic conditions, and the data of various motion states are used together for model training.

It can be seen from Table 1 that both the TPR and the TNR of TRAPCOG are much higher than those of RISKCOG and ESPIALCOG, showing extremely high effectiveness and accuracy. The main reason is that the application uses the twin network to cover human-machine interaction patterns more comprehensively. Specifically, TRAPCOG combines the ability of the LSTM to learn from time-series data with the ability of the twin network to measure the dissimilarity of its inputs, and thereby achieves high accuracy.

Traditional data collection methods in user authentication invite participants to provide sensor data in certain specific situations, such as walking, going up and down stairs, picking up a cell phone, and touching the screen of a mobile device. However, the number and complexity of human-machine interaction patterns in the real world far exceed what is captured in data collected in the laboratory. A model obtained from such limited data therefore cannot meet the practical requirements of implicit real-time identity verification.

The present application therefore no longer restricts user behavior when collecting data. From the experiments it can be seen that large datasets covering thousands of people were collected for RISKCOG and ESPIALCOG. However, both can use the data of only 6 persons in the dataset, which leads to obvious problems of narrow model coverage and low data utilization. In contrast, the TRAPCOG of the present application can efficiently utilize all of the collected data; that is, TRAPCOG can cover complex and varied human-machine interaction patterns, so that it can be deployed in the real world.

The user authentication method provided by the application uses a twin network based on an LSTM neural network to perform implicit real-time mobile user authentication. The problem of mislabeled data is solved by differential training based on downsampling, and, through a twin network with LSTMs as sub-networks, the dataset is fully utilized to cover real-world human-machine interaction patterns to the greatest extent. In addition, because the twin network can distinguish the degree of similarity between its inputs, the twin network trained by this method has strong transferability. As a result, the user can perform identity verification simply by running the user authentication method of the application locally; private data need not be uploaded, and user privacy is protected to the greatest extent. The user authentication method provided by the application meets the security, privacy, and availability requirements of mobile user authentication, and is feasible to deploy in practice.

The technical features of the above-described embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered to fall within the scope of this description.

The above examples illustrate only a few embodiments of the application. Although they are described in specific detail, they are not to be construed as limiting the scope of the application. It should be noted that those skilled in the art can make several variations and modifications without departing from the spirit of the application, all of which fall within the scope of protection of the application. Accordingly, the scope of protection of the present application is to be determined by the appended claims.

Claims

1. A user authentication method based on a twin network, characterized by comprising the following steps:

Step 1, data acquisition and filtering:

Step 1.1, acquiring motion sensor data on mobile equipment of a user;

Step 1.3, filtering artificial noise of the motion sensor data after filtering the abnormal noise of the machine, including:

Step 1.3.5, carrying out anomaly detection on the loss vector of each data segment in the data set SD to obtain an outlier, if the data segment corresponding to the outlier does not belong to the user U, deleting the data segment corresponding to the outlier from all the data segments of the user U with the preset dropout probability, and returning to the step 1.3.2 to continue iteration until the preset iteration times are reached; if the data segment corresponding to the outlier belongs to the user U, returning to the step 1.3.2 to continue iteration until the preset iteration times are reached;

Step 2, constructing a model and training the model:

Step 2.2, twin network training: training the twin network by utilizing the motion sensor data filtered by artificial noise to obtain an optimal twin network, wherein the training comprises the following steps:

Step 2.2.5, adjusting model parameters of the twin network according to the model authentication result and according to whether the data set is a positive sample or a negative sample, and repeatedly taking the training set for training until the set training ending condition is reached;

2. The twin network-based user authentication method of claim 1, wherein the motion sensor comprises an acceleration sensor, a gravity sensor, and a gyro sensor.