CN111626769B - Man-machine recognition method and device and storage medium - Google Patents

Man-machine recognition method and device and storage medium

Info

Publication number
CN111626769B
Authority
CN
China
Prior art keywords
data
operation data
axis
coordinate
click
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010360732.7A
Other languages
Chinese (zh)
Other versions
CN111626769A (en)
Inventor
郭翊麟
郭豪
蔡准
孙悦
郭晓鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Trusfort Technology Co ltd
Original Assignee
Beijing Trusfort Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Trusfort Technology Co ltd filed Critical Beijing Trusfort Technology Co ltd
Priority to CN202010360732.7A priority Critical patent/CN111626769B/en
Publication of CN111626769A publication Critical patent/CN111626769A/en
Application granted granted Critical
Publication of CN111626769B publication Critical patent/CN111626769B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 - Commerce
    • G06Q30/02 - Marketing; Price estimation or determination; Fundraising
    • G06Q30/0207 - Discounts or incentives, e.g. coupons or rebates
    • G06Q30/0225 - Avoiding frauds
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 - Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2135 - Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on approximation criteria, e.g. principal component analysis
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/243 - Classification techniques relating to the number of classes
    • G06F18/2433 - Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/0463 - Neocognitrons
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Game Theory and Decision Science (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Image Analysis (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention discloses a man-machine identification method, a device, and a computer-readable storage medium. The method comprises the following steps: receiving a login instruction of an object to be logged in; in response to the login instruction, acquiring operation data of the object to be logged in within a set time period before the moment the login instruction is received; and identifying the operation data with a pre-constructed identification model to obtain an identification result. According to the embodiment of the invention, based on the operation data in the set time period before login, the identification model extracts the statistical and nonlinear features of the series of operations performed before login and thereby determines the executing subject of the login operation. This effectively improves the man-machine detection rate and can identify illegal script-driven behaviors that require neither root access nor information tampering.

Description

Man-machine recognition method and device and storage medium
Technical Field
The invention relates to the technical field of internet applications, and in particular to a man-machine identification method, a man-machine identification device, and a computer-readable storage medium.
Background
With the popularization of the internet and the rise of smartphones, manufacturers and service providers distribute information materials through smart terminals, to be obtained and used by users or clients. Some lawbreakers use scripting tools such as Auto.js (a JavaScript IDE on the Android platform that supports accessibility services and can run various automation scripts), Key Wizard, or one-touch forwarding tools, and grab these materials by group-controlling large numbers of mobile phones, so that the users or clients who genuinely need the materials cannot obtain them, disrupting their normal distribution.
At present, traditional means of detecting user behavior mainly check whether the user's phone is rooted, whether data information has been tampered with, and so on. However, scripts such as Auto.js and one-touch forwarding tools require neither rooting the user's phone nor tampering with data information, so conventional detection methods are no longer adequate against this new type of illegal means.
Disclosure of Invention
To solve the above problems in user behavior detection, embodiments of the present invention provide a man-machine recognition method, device, and computer-readable storage medium.
According to a first aspect of the present invention, there is provided a man-machine recognition method, the method comprising: receiving a login instruction of an object to be logged in; in response to the login instruction, acquiring operation data of the object to be logged in within a set time period before the moment the login instruction is received; and identifying the operation data using a pre-constructed identification model to obtain an identification result.
According to an embodiment of the present invention, the identifying the operation data by using a pre-constructed identification model to obtain an identification result includes: performing feature extraction on the operation data to obtain statistical features and nonlinear features corresponding to the operation data; and identifying the operation data by utilizing a pre-constructed neural network model according to the statistical characteristic and the nonlinear characteristic to obtain an identification result, wherein the identification result is used for indicating whether the operation data is data obtained by real user operation.
According to an embodiment of the present invention, before performing feature extraction on the operation data, the method further includes: and performing data cleaning pretreatment on the operation data to supplement missing data.
According to an embodiment of the present invention, the extracting the feature of the operation data includes: analyzing the spatial features and the motion direction features of the operation data to obtain statistical feature vectors of the operation data; extracting nonlinear characteristic vectors of the operation data by adopting a Principal Component Analysis (PCA) dimension reduction method; and carrying out vector splicing on the statistical feature vector and the nonlinear feature vector.
According to an embodiment of the invention, the method further comprises: and when the identification result shows that the operation data is data obtained by real user operation, self-learning is carried out by utilizing a neural network algorithm according to the operation data of the object to be logged in within a set time period before the time when the login instruction is received, so as to update the identification model.
According to a second aspect of the embodiments of the present invention, there is also provided a human-machine recognition apparatus, the apparatus including: the device comprises a receiving unit, a judging unit and a judging unit, wherein the receiving unit is used for receiving a login instruction of an object to be logged in; the acquisition unit is used for responding to the login instruction and acquiring the operation data of the object to be logged in a set time period before the time when the login instruction is received; and the identification unit is used for identifying the operation data by utilizing a pre-constructed identification model to obtain an identification result.
According to an embodiment of the present invention, the identification unit includes: the characteristic extraction module is used for extracting the characteristics of the operation data to obtain statistical characteristics and nonlinear characteristics corresponding to the operation data; and the identification module is used for identifying the operation data by utilizing a pre-constructed neural network model according to the statistical characteristic and the nonlinear characteristic to obtain an identification result, and the identification result is used for indicating whether the operation data is data obtained by real user operation.
According to an embodiment of the invention, the apparatus further comprises: and the preprocessing unit is used for performing data cleaning preprocessing on the operation data before performing feature extraction on the operation data so as to supplement missing data.
According to an embodiment of the present invention, the feature extraction module includes: the statistical analysis submodule is used for analyzing the spatial characteristics and the motion direction characteristics of the operation data to obtain statistical characteristic vectors of the operation data; the dimensionality reduction submodule is used for extracting the nonlinear characteristic vector of the operation data by adopting a Principal Component Analysis (PCA) dimensionality reduction method; and the splicing submodule is used for carrying out vector splicing on the statistical characteristic vector and the nonlinear characteristic vector.
According to a third aspect of the present invention, there is also provided a computer-readable storage medium comprising a set of computer-executable instructions which, when executed, perform the above-described man-machine recognition method.
With the man-machine identification method, device, and computer-readable storage medium described above, the identification model extracts, from the operation data in the set time period before the object to be logged in logs in, the statistical and nonlinear features of the series of operations performed before login, and thereby determines the executing subject of the login operation. This effectively improves the man-machine detection rate and can identify illegal script-driven behaviors that require neither root access nor information tampering.
It is to be understood that the teachings of the present invention need not achieve all of the above-described benefits, but rather that specific embodiments may achieve specific technical results, and that other embodiments of the present invention may achieve benefits not mentioned above.
Drawings
The above and other objects, features and advantages of exemplary embodiments of the present invention will become readily apparent from the following detailed description read in conjunction with the accompanying drawings. Several embodiments of the invention are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which:
in the drawings, the same or corresponding reference numerals indicate the same or corresponding parts.
FIG. 1 is a schematic diagram illustrating a flow chart of implementing a human-machine identification method according to an embodiment of the present invention;
FIG. 2 is a flow chart illustrating an implementation of identifying operational data according to an embodiment of the present invention;
FIG. 3 shows a neural network model diagram according to an embodiment of the present invention;
FIG. 4 is a schematic diagram illustrating a flow chart of feature extraction according to an embodiment of the present invention;
fig. 5 is a schematic diagram illustrating a composition structure of a human-machine recognition device according to an embodiment of the present invention.
Detailed Description
The principles and spirit of the present invention will be described with reference to a number of exemplary embodiments. It is understood that these embodiments are given only to enable those skilled in the art to better understand and to implement the present invention, and do not limit the scope of the present invention in any way. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
The technical solution of the present invention is further elaborated below with reference to the drawings and the specific embodiments.
Fig. 1 shows a schematic flow chart of an implementation of the human-computer recognition method according to the embodiment of the present invention.
Referring to fig. 1, a man-machine recognition method according to an embodiment of the present invention at least includes the following operation flows: operation 101, receiving a login instruction of an object to be logged in; an operation 102, responding to the login instruction, and acquiring operation data of an object to be logged in within a set time period before the time when the login instruction is received; and an operation 103 of recognizing the operation data by using the pre-constructed recognition model to obtain a recognition result.
In operation 101, a login instruction of an object to be logged in is received.
The login instruction of the object to be logged in may be produced when a user, through a smart device, clicks the login button of a website or of an APP.
For example, a user shops online on a smartphone: after entering an account and password in the JD.com (Jingdong) APP, the user clicks the login button. Or a user shops online on a computer, logging in to Alibaba's Taobao shopping website or the AliWangwang client by entering an account and password and then clicking the login button or pressing the Enter key on the keyboard. This click on the login button, or press of the Enter key, is the login instruction of the object to be logged in.
In operation 102, in response to the login instruction, operation data of the object to be logged in within a set time period before the time when the login instruction is received is acquired.
In an embodiment of the present invention, when a user uses a smart device, operation actions such as registration and login occur. During these operations the operation data changes with the user's actions, and all of the generated operation data is stored in a user database. When the user clicks login on the login page, the user's operation data within the set time period before the click can be retrieved from the user database, for example the operation data from the 3 seconds before the user clicked login.
In an embodiment of the present invention, the main sources of the operation data are acceleration data (which may include acceleration values on the 3 coordinate axes x, y, and z), click data (which may include the x and y coordinates of each click), and screen-sliding data. The acceleration sensor, also called a G-sensor, reports values on the three coordinate axes x, y, and z; the values include the influence of gravity and are measured in m/s². When a user logs in to a website or APP account on a phone running the Android system, the phone's physical state can be inferred from the acceleration sensor data. For example: with the phone lying face-up on a desktop, the x-axis and y-axis values default to 0 and the z-axis value to 9.81; placed face-down on the desktop, the z-axis value defaults to −9.81; tilted to the left, the x-axis value is positive; tilted to the right, negative; tilted upward, the y-axis value is negative; tilted downward, positive. Collecting click data and screen-sliding data is a standard capability of smart devices such as mobile phones and is not described further here.
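To make the time-window acquisition concrete, here is a minimal sketch of buffering accelerometer samples and extracting the set time period before a login instant. All names (SensorBuffer, window_before) are hypothetical and not from the patent; the 50 Hz rate and 3-second window follow the examples in this description.

```python
from collections import deque

class SensorBuffer:
    """Rolling buffer of (timestamp, ax, ay, az) samples; a hypothetical sketch."""
    def __init__(self, window_seconds=3.0):
        self.window = window_seconds
        self.samples = deque()

    def add(self, t, ax, ay, az):
        self.samples.append((t, ax, ay, az))

    def window_before(self, login_time):
        """Return the samples within `window` seconds before the login instant."""
        lo = login_time - self.window
        return [s for s in self.samples if lo <= s[0] <= login_time]

# A phone lying face-up on a desk reports roughly (0, 0, 9.81) m/s^2.
buf = SensorBuffer(window_seconds=3.0)
for i in range(100):                 # 50 Hz sampling for 2 seconds
    buf.add(i * 0.02, 0.0, 0.0, 9.81)
recent = buf.window_before(login_time=2.0)
```

On a real device the same windowing would be fed by the platform's sensor callback rather than a loop.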
In operation 103, the operation data is recognized using the pre-constructed recognition model, and a recognition result is obtained.
In one embodiment of the present invention, feature extraction is first performed on the operation data. For example, the PCA (Principal Component Analysis) method may be used to reduce the dimensionality of the data, yielding operation data in a low-dimensional space that can be represented by a nonlinear feature vector. In another embodiment of the present invention, the spatial features and motion-direction features of the operation data are extracted to obtain a statistical feature vector of the operation data.
It should be noted that, in the embodiment of the present invention, one of the nonlinear feature extraction and the statistical feature extraction may be performed on the operation data, or the nonlinear feature extraction and the statistical feature extraction may be performed on the operation data at the same time, which is not specifically limited in the present invention.
In an embodiment of the present invention, after feature extraction is performed on operation data, a pre-trained neural network model is called to identify the operation data in a set time period before a user logs in, so as to determine whether the login operation is a machine control operation or a real user operation. If the result obtained by analyzing the neural network model is that the login operation is the real user operation, displaying that the login is successful; otherwise, the login fails.
In an embodiment of the present invention, identifying the operation data with the neural network model first yields two probability values: a first probability that the login operation is a real user operation, and a second probability that it is a machine-controlled operation. The two probabilities are then compared: if the first probability is significantly greater than the second, the login operation is judged to be a real user operation; otherwise, it is judged to be a machine-controlled operation.
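The probability comparison above can be sketched as follows. The margin threshold and function names are illustrative assumptions, since the patent only requires the first probability to be clearly greater than the second.

```python
import math

def softmax(logits):
    """Convert two model output logits into probabilities."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def classify(logits, margin=0.2):
    """Hypothetical decision rule: judge 'real user' only if its probability
    exceeds the 'machine' probability by a clear margin."""
    p_user, p_machine = softmax(logits)
    return "real_user" if p_user - p_machine > margin else "machine"

decision = classify([2.5, 0.1])   # strongly user-like logits
```

The margin makes the "significantly greater" requirement explicit; a deployment could instead calibrate a single probability threshold.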
The specific implementation process of identifying the operation data will be described in more detail with reference to fig. 2 to 4.
FIG. 2 is a flow chart illustrating an implementation of identifying operation data according to an embodiment of the present invention.
referring to fig. 2, an implementation process of identifying operation data in an embodiment of the present invention at least includes the following operation processes: operation 201, performing feature extraction on the operation data to obtain statistical features and nonlinear features corresponding to the operation data; and operation 202, recognizing the statistical features and the nonlinear features by using a pre-constructed neural network model to obtain a recognition result, wherein the recognition result is used for indicating whether the operation data is data obtained by real user operation.
In operation 201, feature extraction is performed on the operation data to obtain statistical features and nonlinear features corresponding to the operation data.
In one embodiment of the present invention, the feature extraction of the operation data includes: statistical feature extraction and nonlinear feature extraction. Specifically, reference may be made to the implementation flow of feature extraction in the embodiment of the present invention described below with reference to fig. 4.
In one embodiment of the present invention, the statistical features of the operation data are extracted by calculating the maximum, minimum, mean, variance, range, skewness, kurtosis, median, and mode of each of the five data sequences: the acceleration x-axis, y-axis, and z-axis sequences and the x-coordinate and y-coordinate sequences of the click data. Next, the length of the click sequence and the click frequency are extracted (for example, the click frequency may be obtained by dividing the length of the click sequence by the time difference between its last and first data points). The acceleration x-axis, y-axis, and z-axis time series are then converted to the frequency domain by the discrete Fourier transform (DFT). Finally, a sliding-gesture direction-change sequence is obtained from the x and y coordinates of the click data using the arctangent function atan(y/x), and its maximum, minimum, mean, variance, range, skewness, kurtosis, median, and mode are calculated.
The DFT is a Fourier transform that is discrete in both the time and frequency domains; it transforms the time-domain samples of a signal into frequency-domain samples of its DTFT (discrete-time Fourier transform). Statistical feature extraction thus captures both the spatial features and the motion-direction features of the operation data, allowing the operation data to be analyzed more accurately and yielding a more accurate man-machine recognition result.
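As an illustration of the statistical feature extraction described above, the following sketch computes the nine statistics per sequence plus the click-sequence length and click frequency. It is an interpretation, not the patent's implementation; in particular, the mode of a continuous sequence (here the most frequent rounded value) and the exact form of the direction sequence (here arctan over coordinate differences) are assumptions.

```python
import numpy as np

def nine_stats(seq):
    """Max, min, mean, variance, range, skewness, kurtosis, median, mode."""
    a = np.asarray(seq, dtype=float)
    mu, sd = a.mean(), a.std()
    skew = ((a - mu) ** 3).mean() / sd ** 3 if sd > 0 else 0.0
    kurt = ((a - mu) ** 4).mean() / sd ** 4 - 3.0 if sd > 0 else 0.0
    vals, counts = np.unique(np.round(a, 2), return_counts=True)
    mode = float(vals[counts.argmax()])      # assumed tie-breaking: smallest value
    return [a.max(), a.min(), mu, a.var(), a.max() - a.min(),
            skew, kurt, float(np.median(a)), mode]

def click_features(x, y, t):
    """Click-sequence length, click frequency, and direction-change statistics."""
    freq = len(t) / (t[-1] - t[0])                  # clicks per second
    direction = np.arctan2(np.diff(y), np.diff(x))  # the patent's atan(y/x) step
    return len(t), freq, nine_stats(direction)

s = nine_stats([1.0, 2.0, 3.0])
n_clicks, freq, dir_stats = click_features(
    np.array([0.0, 1.0, 2.0]), np.array([0.0, 0.0, 0.0]),
    np.array([0.0, 0.5, 1.0]))
```

Applying nine_stats to the five sequences yields the 45 time-domain values the description mentions.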
In an embodiment of the present invention, nonlinear feature extraction mainly concatenates the acceleration x-axis, y-axis, and z-axis sequences and the x-coordinate and y-coordinate sequences of the click data into an n-dimensional feature (n a positive integer), and then uses the principal component analysis (PCA) method to reconstruct a k-dimensional feature (k a positive integer less than n) from the original n-dimensional feature. This reasonably eliminates redundant data, reduces the dimensionality of the operation data, and retains as much of the effective information in the original operation data as possible.
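The PCA step can be sketched in plain NumPy as follows; this is a generic eigendecomposition-based PCA under the standard definition, not code from the patent.

```python
import numpy as np

def pca_reduce(features, k):
    """Project n-dimensional feature rows onto their top-k principal components."""
    X = np.asarray(features, dtype=float)
    Xc = X - X.mean(axis=0)                   # center each feature dimension
    cov = np.cov(Xc, rowvar=False)            # n-by-n covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
    top = eigvecs[:, np.argsort(eigvals)[::-1][:k]]
    return Xc @ top                           # shape (samples, k)

# Toy example: 3-D points that actually vary along one line reduce to 1-D
# with no information loss.
pts = np.array([[1.0, 1.0, 0.0], [2.0, 2.0, 0.0], [3.0, 3.0, 0.0]])
reduced = pca_reduce(pts, k=1)
```

In practice a library implementation (e.g. scikit-learn's PCA) would be used and fitted on training data only, then applied to new samples.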
In an embodiment of the present invention, before performing feature extraction on the operation data, data cleaning preprocessing is further performed on the operation data to supplement missing data.
For example, the acquired operation data include the x-axis, y-axis, and z-axis sequences collected by the acceleration sensor and the x-coordinate and y-coordinate sequences of the click data. The purpose of the data-cleaning preprocessing is to supplement missing data: for each attribute, the mean and mode of its data are calculated and used to fill in the missing values, so that the operation data of every attribute are complete and data integrity is ensured.
For example: the acquired acceleration data acquired by the acceleration sensor are subjected to cleaning preprocessing, and a complete data sequence X of an acceleration X axis is obtained and is shown as the following formula (1):
X = [x₀, x₁, x₂, …, xₙ] (1)
where n represents the sequence length of the data sequence X.
For other data sequences such as the y axis and the z axis of the acceleration sensor, the x coordinate and the y coordinate of click data and the like in the acquired operation data, the data cleaning pretreatment can be performed by adopting operations similar to the above operations, and other reasonable data cleaning pretreatment methods can also be adopted to pretreat the operation data.
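A minimal sketch of the mean-fill cleaning step described above; the function name is hypothetical, and the patent also mentions the mode as a possible fill value.

```python
import numpy as np

def fill_missing(seq):
    """Replace missing (NaN) entries of one attribute's sequence with its mean."""
    a = np.asarray(seq, dtype=float)
    mask = np.isnan(a)
    if mask.any():
        a[mask] = a[~mask].mean()   # fill gaps so the sequence is complete
    return a

# One dropped accelerometer sample, filled from the surviving values.
cleaned = fill_missing([0.0, 9.81, np.nan, 9.81])
```

The same call would be applied independently to each of the five attribute sequences before feature extraction.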
In operation 202, the pre-constructed neural network model is used to identify the statistical features and the non-linear features to obtain an identification result, where the identification result is used to indicate whether the operation data is data obtained by a real user operation.
In one embodiment of the present invention, a large amount of operation data of a real user performing a login operation is collected in advance, including acceleration data, click data, sliding data, and the like within a set time period (e.g., 3s before login or 5s before login) before the real user performs the login operation, and data training is performed by using a neural network algorithm to obtain a neural network model for identifying the operation data. When the neural network algorithm is used for training the operation data, the characteristic extraction can be performed on the operation data to obtain a more accurate neural network model, so that the human-computer recognition efficiency and the accuracy are improved.
FIG. 3 shows a neural network model diagram according to an embodiment of the present invention. Referring to fig. 3, the neural network model consists mainly of coding blocks, pooling layers, and a fully-connected layer. The coding blocks come in 2 sizes, 3 × 1 × 128 and 3 × 1 × 64, where the coding block size denotes the size of the convolution kernels of the convolutional layers inside the block. Each coding block consists of 2 convolutional layers and 1 residual connection (skip-connect), and a ReLU (Rectified Linear Unit) activation function may be applied after each convolution as the nonlinear operation. Likewise, the pooling layers come in 2 types: a maximum pooling layer (max-pooling) and an average pooling layer (mean-pooling). Finally, the model outputs the result of identifying the operation data through the fully-connected layer, for example a first probability that the operation is a real user operation and a second probability that it is a machine-controlled operation; whether the operation is a real user operation is then determined from these two probabilities, and the judgment result is output.
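As an illustration of the coding block's structure (two convolutions plus a skip connection and ReLU), here is a toy single-channel NumPy sketch. Real coding blocks have 128 or 64 learned kernels per layer and operate on multi-channel feature maps; the identity kernel below is only for demonstration.

```python
import numpy as np

def conv1d(x, w):
    """'Same'-padded single-channel 1-D convolution (kernel length 3)."""
    pad = len(w) // 2
    xp = np.pad(x, pad)
    return np.array([np.dot(xp[i:i + len(w)], w) for i in range(len(x))])

def coding_block(x, w1, w2):
    """Two conv layers with a residual (skip) connection and ReLU,
    mirroring the patent's coding block; weights here are toy values."""
    h = np.maximum(conv1d(x, w1), 0.0)   # conv + ReLU
    h = conv1d(h, w2)                    # second conv
    return np.maximum(h + x, 0.0)        # skip-connect, then ReLU

x = np.array([1.0, 2.0, 3.0, 4.0])
identity = np.array([0.0, 1.0, 0.0])     # kernel that passes the input through
out = coding_block(x, identity, identity)
```

With identity kernels the block reduces to ReLU(x + x), which makes the residual path easy to verify by hand.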
Fig. 4 is a schematic diagram illustrating an implementation flow of feature extraction according to an embodiment of the present invention, and performing feature extraction on operation data may include: operation 401, analyzing the spatial features and the motion direction features of the operation data to obtain statistical feature vectors of the operation data; operation 402, extracting a nonlinear feature vector of the operation data by adopting a Principal Component Analysis (PCA) dimensionality reduction method; in operation 403, vector splicing is performed on the statistical feature vector and the nonlinear feature vector.
First, it should be noted that, in the embodiment of the present invention, the acquisition frequency of the operation data may be determined according to actual requirements, for example, the acquisition frequency of the data may be determined to be 50Hz, and the operation data is acquired.
In operation 401, the spatial feature and the motion direction feature of the operation data are analyzed to obtain a statistical feature vector of the operation data.
In an embodiment of the present invention, the extracting statistical characteristics of the operation data may include the following specific extracting processes:
the method comprises the steps of firstly, calculating the maximum value, the minimum value, the average value, the variance, the range, the skewness, the kurtosis, the median and the mode of a data sequence (five data in total) consisting of acceleration data (acquired by an acceleration sensor and comprising three data of an x axis, a y axis and a z axis) and click data (comprising two data of an x coordinate and a y coordinate) according to operation data, and forming 45 time domain feature vectors.
In an embodiment of the present invention, in order to eliminate differences between acceleration sensor data of different devices (different smartphones, tablet computers, computer devices, and the like) during the process of extracting statistical features from operation data, difference operations are performed on each data sequence to calculate differences between data points in the data sequence, and then feature extraction is performed according to the difference sequences.
For example, performing the difference operation on the acceleration x-axis data sequence X = [x0, x1, x2, ..., xn] yields the difference sequence X1 shown in the following formula (2):

X1 = [x0, x1 - x0, x2 - x1, ..., xn - x(n-1)]  (2)
Then, according to the difference sequence X1, the maximum value, minimum value, average value, variance, range, skewness, kurtosis, median and mode of the values x0, x1 - x0, x2 - x1, ..., xn - x(n-1) in X1 are calculated.
For the other two acceleration sequences (the y-axis data and z-axis data) and the click data (the x-coordinate and y-coordinate sequences), processing similar to that of the acceleration x-axis data described above may be applied, finally forming the 45 time-domain feature vectors.
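This first step can be sketched as follows (a minimal illustration with toy data; the 50-sample sequences and the plain-NumPy formulas for skewness, kurtosis and mode are assumptions, not the patent's implementation):

```python
import numpy as np

def difference_sequence(seq):
    """X1 = [x0, x1 - x0, x2 - x1, ..., xn - x(n-1)], as in formula (2)."""
    seq = np.asarray(seq, dtype=float)
    return np.concatenate(([seq[0]], np.diff(seq)))

def time_domain_features(seq):
    """The nine statistics named above: max, min, mean, variance,
    range, skewness, kurtosis, median, mode."""
    seq = np.asarray(seq, dtype=float)
    m, s = seq.mean(), seq.std()
    skew = ((seq - m) ** 3).mean() / s ** 3 if s else 0.0
    kurt = ((seq - m) ** 4).mean() / s ** 4 - 3.0 if s else 0.0
    vals, counts = np.unique(seq, return_counts=True)
    mode = vals[counts.argmax()]  # most frequent value
    return [seq.max(), seq.min(), m, seq.var(),
            seq.max() - seq.min(), skew, kurt, np.median(seq), mode]

# Five sequences: acceleration x/y/z and click x/y coordinates (toy data)
rng = np.random.default_rng(0)
sequences = [rng.random(50) for _ in range(5)]
features = []
for seq in sequences:
    features.extend(time_domain_features(difference_sequence(seq)))
assert len(features) == 45  # 5 sequences x 9 statistics
```

Differencing first, then computing the statistics on the difference sequence, is what removes per-device sensor offsets before feature extraction.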
Secondly, the time series data of the acceleration data x axis, y axis and z axis are converted into frequency domain data by using Discrete Fourier Transform (DFT).
The DFT conversion process will be described below by taking a data sequence X of an acceleration data X axis in operation data acquired by the acceleration sensor after the data cleaning preprocessing as an example.
In an embodiment of the present invention, in the process of converting the time-series data into frequency-domain data, the operation data sequence may likewise first be differenced to obtain a difference sequence as shown in formula (2) above, so as to eliminate the differences between the acceleration sensor data of different devices (different smartphones, tablet computers, computer devices, etc.). The frequency-domain sequence of the acceleration x-axis data sequence X is then calculated using the following formula (3):
X2(k) = Σ_{n=0}^{N-1} x1(n) · e^(-i·2πkn/N)  (3)
wherein X2(k) represents the kth data in the frequency domain sequence after the data sequence X is transformed by DFT;
x1(n) represents the nth data in the difference sequence X1;
N represents the length of the frequency-domain sequence obtained after the data sequence X is transformed by the DFT;
n, k and N are positive integers, with k < N and n < N.
Finally, 3 frequency-domain sequences, for the acceleration x-axis, y-axis and z-axis, are obtained through the DFT. For each of the 3 frequency-domain sequences, the amplitude at the first center frequency (e.g., k = 1) and the amplitude at the second center frequency (e.g., k = 2) are extracted, giving the 6 frequency-domain feature vectors of the operation data.
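A minimal sketch of the frequency-domain step, assuming NumPy's FFT as the DFT and toy sine-wave axes (both assumptions, not the patent's implementation):

```python
import numpy as np

def frequency_domain_features(seq):
    """Amplitudes at the first two center frequencies (k = 1 and k = 2)
    of the DFT of the difference sequence, per formulas (2) and (3)."""
    seq = np.asarray(seq, dtype=float)
    diff = np.concatenate(([seq[0]], np.diff(seq)))  # difference sequence X1
    spectrum = np.fft.fft(diff)                      # DFT, formula (3)
    return [abs(spectrum[1]), abs(spectrum[2])]      # |X2(1)|, |X2(2)|

# Three acceleration axes (toy samples standing in for 50 Hz sensor data)
t = np.linspace(0.0, 1.0, 50, endpoint=False)
axes = [np.sin(2.0 * np.pi * f * t) for f in (1.0, 2.0, 3.0)]
freq_features = []
for axis in axes:
    freq_features.extend(frequency_domain_features(axis))
assert len(freq_features) == 6  # 3 axes x 2 amplitudes
```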
Thirdly, a sliding-gesture direction change sequence is obtained using the arctangent atan(y/x) from the x-coordinates and y-coordinates of the click data. Further, the maximum value, minimum value, average value, variance, range, skewness, kurtosis, median and mode of the sliding-gesture direction change sequence are calculated, giving 9 time-domain feature vectors. In addition, the length of the click data sequence is extracted (the click data sequence comprises an x-coordinate sequence and a y-coordinate sequence of equal length, so the length of either one may be taken as the length of the click data sequence), and the length of the click sequence is divided by the time difference between the last data and the first data of the click sequence; the length and this quotient give 2 further time-domain feature vectors.
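A minimal sketch of this third step; the toy trajectory and the timestamp array `ts` are illustrative assumptions:

```python
import numpy as np

# Toy click trajectory; the timestamps ts are an assumed extra input
xs = np.array([1.0, 2.0, 3.0, 4.0])
ys = np.array([0.0, 1.0, 3.0, 6.0])
ts = np.array([0.00, 0.02, 0.04, 0.06])  # 50 Hz-style sampling

# Sliding-gesture direction change sequence via atan(y/x)
directions = np.arctan(ys / xs)  # np.arctan2(ys, xs) is a more robust variant

# Sequence-length feature and length / elapsed-time feature
length = float(len(xs))            # x- and y-sequences have equal length
rate = length / (ts[-1] - ts[0])   # length divided by last-minus-first time
assert directions.shape == (4,)
```

The 9 statistics of `directions` would then be computed exactly as for the difference sequences above.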
Combining the above three processes gives a statistical feature vector comprising 62 feature vectors (6 frequency-domain feature vectors and 56 time-domain feature vectors).
In operation 402, nonlinear feature vectors of the operational data are extracted using principal component analysis, PCA, dimension reduction.
In one embodiment of the invention, an n-dimensional feature (n a positive integer) is first formed, mainly by vector-splicing the data sequences of the acceleration x-axis, y-axis and z-axis and the x-coordinate and y-coordinate of the click data. A k-dimensional feature of the operation data (k a positive integer less than n) is then reconstructed from the n-dimensional feature using the PCA dimension-reduction method. The PCA dimension-reduction method comprises the following steps:
1. Assume the input of the PCA dimension-reduction method is Pin = [x0, x1, x2, ..., xn], where n is the original data dimension. For example, the acquired operation data includes: the acceleration x-axis one-dimensional data sequence a = [x1, x2, x3, ..., xk], the y-axis one-dimensional data sequence b = [y1, y2, y3, ..., yk], the z-axis one-dimensional data sequence c = [z1, z2, z3, ..., zk], the click-data x-coordinate data sequence d = [dm1, dm2, dm3, ..., dmi] and the click-data y-coordinate sequence e = [dn1, dn2, dn3, ..., dni]. These vectors are spliced end to end to obtain the input of the PCA dimension-reduction method: Pin = [x1, x2, x3, ..., xk, y1, y2, y3, ..., yk, z1, z2, z3, ..., zk, dm1, dm2, dm3, ..., dmi, dn1, dn2, dn3, ..., dni].
2. Pin is centered using the following formula (4):

xi' = xi - (1/n)·Σ_{j=1}^{n} xj  (4)
wherein xi is an original value (e.g., x1, x2) of the input Pin, and xi' is the value obtained by subtracting the average of all original values from xi, i.e., the de-centered value. In this way, the data sequence Pin' obtained by de-centering the input Pin is obtained. De-centering Pin eliminates errors in the acquisition of the operation data, so that a more accurate nonlinear feature vector can be obtained after the PCA dimension reduction.
3. From the de-centered data sequence Pin', the covariance matrix of Pin' is calculated: Pin'·Pin'^T.
4. The covariance matrix Pin'·Pin'^T is subjected to eigendecomposition.
The eigendecomposition yields eigenvalues and their corresponding eigenvectors. Since the input Pin is equivalent to the raw acquired operation data, it contains both effective information and redundant information, and the eigendecomposition is used to extract the effective information from the raw data. Specifically, the eigenvectors obtained by the decomposition carry the effective information, and each eigenvalue represents the degree of importance of the corresponding effective information.
5. To reduce the input Pin to a low-dimensional space of dimension k (k < n), the k eigenvectors w1, w2, w3, ..., wk with the largest eigenvalues are taken.
6. The reduced k-dimensional data sequence W = [w1, w2, w3, ..., wk] is output. The k-dimensional data sequence W is the nonlinear feature vector obtained by applying PCA dimension reduction to the operation data.
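Steps 1 to 6 can be sketched with NumPy as below; this uses the standard sample-matrix formulation of PCA (each row one spliced operation-data vector), which is an assumption about how the patent's single-sequence description would be applied in practice:

```python
import numpy as np

def pca_reduce(p_in, k):
    """Steps 1-6 above: center the input (formula (4)), eigendecompose the
    covariance matrix Pin'·Pin'^T, keep the k eigenvectors with the largest
    eigenvalues, and project onto them."""
    p_in = np.asarray(p_in, dtype=float)
    centered = p_in - p_in.mean(axis=0)                  # de-centering
    cov = centered.T @ centered                          # covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)               # eigendecomposition
    top_k = eigvecs[:, np.argsort(eigvals)[::-1][:k]]    # k largest eigenvalues
    return centered @ top_k                              # k-dimensional output

samples = np.random.default_rng(1).random((20, 10))  # 20 spliced vectors, n = 10
reduced = pca_reduce(samples, k=3)
assert reduced.shape == (20, 3)
```

`np.linalg.eigh` returns eigenvalues in ascending order for symmetric matrices, hence the descending sort before slicing off the top k.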
In operation 403, the statistical feature vector and the nonlinear feature vector are vector-spliced. Specifically, the 62 statistical feature vectors obtained in operation 401 and the nonlinear feature vector W obtained in operation 402 are simply spliced to obtain the feature vector that is finally input to the recognition model (e.g., the neural network model in operation 202). The splicing process is similar to that for the input Pin in step 1 of the PCA dimension-reduction flow of operation 402: each of the 62 statistical feature vectors and the nonlinear feature vector W is first represented as a one-dimensional data sequence, and all the feature vectors are then arranged in a set order (for example, time-domain feature vectors, then frequency-domain feature vectors, then the nonlinear feature vector) to form a new one-dimensional data sequence, which may be denoted Z. Z may be used as the input to a recognition model, such as the neural network model shown in fig. 3.
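As a minimal sketch of operation 403 (the zero-valued features and k = 3 for W are illustrative assumptions), the splicing is a simple concatenation in the set order:

```python
import numpy as np

# Hypothetical feature groups from operations 401 and 402
# (group sizes from the text; k = 3 for W is an assumed value)
time_domain = np.zeros(56)   # 56 time-domain features
freq_domain = np.zeros(6)    # 6 frequency-domain features
nonlinear_w = np.zeros(3)    # nonlinear feature vector W

# Operation 403: arrange in the set order and splice into one sequence Z
z = np.concatenate([time_domain, freq_domain, nonlinear_w])
assert z.shape == (65,)  # 56 + 6 + 3
```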
In an embodiment of the present invention, when the recognition result indicates that the operation data is data obtained by a real user operation, self-learning is performed by using a neural network algorithm according to the operation data of the object to be logged in within a set time period before the time when the login instruction is received, so as to update the recognition model.
As can be seen from the description of operation 202 above, the recognition model is obtained by training a model with an intelligent algorithm (e.g., a neural network algorithm) on a large amount of real user operation data. To improve the accuracy of the recognition model, when the recognition result indicates that the operation data was obtained from a real user operation, the acquired operation data can be used as additional training data and the recognition model retrained. In actual operation, the self-learning and updating of the recognition model with the neural network algorithm can be carried out in whatever manner is simplest and fastest; the embodiment of the invention places no limit on this.
The man-machine identification method provided by the embodiment of the invention can be applied to the process in which a product provider or service provider uses intelligent terminal equipment to distribute information materials, which are then obtained and used by users or clients.
With the man-machine identification method provided by the embodiment of the invention, it is determined whether the acquirer of an information material is a user or client with a real demand, and according to the identification result the information material is distributed normally to such users or clients. For example: the object to be logged in that obtains information materials through a product merchant platform or service provider platform (such as an e-commerce website, an e-commerce intelligent-terminal APP, an e-commerce platform applet, a bank client APP, a bank website, and the like) is identified, so as to judge whether the object to be logged in is a real user. When the object to be logged in is judged to be a real user, an operation instruction for issuing the information material is generated, so that the information material is issued normally.
In this way, the process of issuing information materials by the product provider or service provider can be effectively monitored, solving the problem that, for lack of effective detection, information materials are intercepted by lawbreakers using scripts, and thereby realizing the normal issuing of the information materials. For example, some lawbreakers (so-called black-grey-industry operators) use scripts such as autojs, key sprites and "one touch instant" to group-control large numbers of mobile phones, complete coupon-grabbing activities, and then resell the coupons through exchange channels for high profits. When such lawbreakers obtain information materials by group-controlling large numbers of mobile phones, the man-machine identification method provided by the embodiment of the invention can judge that the object to be logged in (a user account operated by the black-grey industry via group-controlled phones) is not a real user, and can generate an instruction not to issue the information materials.
It can thus be seen that, based on the operation data in the set time period before the object to be logged in logs in, the embodiment of the invention uses the identification model to extract the statistical features and nonlinear features corresponding to the series of operations executed before login, so as to judge the executing subject of the login operation. This effectively improves the man-machine detection recognition rate, and illegal behaviors performed with scripts can be identified effectively even when they do not involve root access or information tampering.
Similarly, based on the above human-computer recognition method, an embodiment of the present invention further provides a computer-readable storage medium, where a program is stored, and when the program is executed by a processor, the processor is caused to perform at least the following operation steps: operation 101, receiving a login instruction of an object to be logged in; an operation 102, responding to the login instruction, and acquiring operation data of an object to be logged in within a set time period before the time when the login instruction is received; and an operation 103 of recognizing the operation data by using the pre-constructed recognition model to obtain a recognition result.
Further, based on the above human-computer recognition method, an embodiment of the present invention further provides a human-computer recognition apparatus, and fig. 5 shows a schematic structural diagram of the human-computer recognition apparatus according to the embodiment of the present invention. As shown in fig. 5, the apparatus 50 includes: a receiving unit 501, configured to receive a login instruction of an object to be logged in; an obtaining unit 502, configured to, in response to a login instruction, obtain operation data of an object to be logged in within a set time period before a time when the login instruction is received; the identifying unit 503 is configured to identify the operation data by using a pre-constructed identification model, and obtain an identification result.
According to an embodiment of the present invention, the identifying unit 503 includes: the characteristic extraction module is used for extracting the characteristics of the operation data to obtain statistical characteristics and nonlinear characteristics corresponding to the operation data; and the identification module is used for identifying the statistical characteristics and the nonlinear characteristics by utilizing a pre-constructed neural network model to obtain an identification result, and the identification result is used for indicating whether the operation data is data obtained by real user operation.
According to an embodiment of the invention, the apparatus 50 further comprises: and the preprocessing unit is used for performing data cleaning preprocessing on the operation data before performing feature extraction on the operation data so as to supplement missing data.
According to an embodiment of the present invention, the feature extraction module includes: the statistical analysis submodule is used for analyzing the spatial characteristics and the motion direction characteristics of the operation data to obtain statistical characteristic vectors of the operation data; the dimensionality reduction submodule is used for extracting the nonlinear feature vector of the operation data by adopting a Principal Component Analysis (PCA) dimensionality reduction method; and the splicing submodule is used for carrying out vector splicing on the statistical characteristic vector and the nonlinear characteristic vector.
Here, it should be noted that: the above description of the embodiment of the human-machine recognition device is similar to the description of the embodiment of the method shown in fig. 1 to 4, and has similar beneficial effects to the embodiment of the method shown in fig. 1 to 4, and therefore, the description thereof is omitted. For technical details not disclosed in the embodiment of the man-machine recognition device of the present invention, please refer to the description of the method embodiment shown in fig. 1 to 4 of the present invention for understanding, and therefore, for brevity, will not be described again.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described device embodiments are merely illustrative, for example, the division of a unit is only one logical function division, and there may be other division ways in actual implementation, such as: multiple units or components may be combined, or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or units may be electrical, mechanical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; can be located in one place or distributed on a plurality of network units; some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, all the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may be separately regarded as one unit, or two or more units may be integrated into one unit; the integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
Those of ordinary skill in the art will understand that: all or part of the steps for realizing the method embodiments can be completed by hardware related to program instructions, the program can be stored in a computer readable storage medium, and the program executes the steps comprising the method embodiments when executed; and the aforementioned storage medium includes: various media that can store program codes, such as a removable Memory device, a Read Only Memory (ROM), a magnetic disk, or an optical disk.
Alternatively, the integrated unit of the present invention may be stored in a computer-readable storage medium if it is implemented in the form of a software functional module and sold or used as a separate product. Based on such understanding, the technical solutions of the embodiments of the present invention may be essentially implemented or a part contributing to the prior art may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the methods of the embodiments of the present invention. And the aforementioned storage medium includes: a removable storage device, a ROM, a magnetic or optical disk, or other various media that can store program code.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (6)

1. A human-machine identification method, the method comprising:
receiving a login instruction of an object to be logged in;
responding to the login instruction, and acquiring operation data of the object to be logged in within a set time period before the time when the login instruction is received, wherein the operation data comprises: the system comprises acceleration data, click data and screen sliding data, wherein the acceleration data comprise acceleration data on 3 coordinate axes of an x axis, a y axis and a z axis, and the click data comprise click data on 2 coordinate axes of the x coordinate and the y coordinate; and
identifying the operation data by utilizing a pre-constructed identification model to obtain an identification result, wherein the identification result comprises the following steps:
performing feature extraction on the operation data to obtain statistical features and nonlinear features corresponding to the operation data;
identifying the operation data by utilizing a pre-constructed neural network model according to the statistical characteristic and the nonlinear characteristic to obtain an identification result, wherein the identification result is used for indicating whether the operation data is data obtained by real user operation;
the feature extraction of the operation data comprises:
analyzing the spatial features and the motion direction features of the operation data to obtain statistical feature vectors of the operation data;
extracting nonlinear characteristic vectors of the operation data by adopting a Principal Component Analysis (PCA) dimension reduction method;
performing vector splicing on the statistical feature vector and the nonlinear feature vector;
the analyzing the spatial features and the motion direction features of the operation data to obtain the statistical feature vector of the operation data comprises:
calculating the maximum value, the minimum value, the average value, the variance, the range, the skewness, the kurtosis, the median and the mode of each data sequence consisting of acceleration data and click data according to the operation data to form 45 time domain feature vectors, wherein the acceleration data comprises three data of an x axis, a y axis and a z axis, the click data comprises two data of an x coordinate and a y coordinate, and the data sequences of the x axis, the y axis, the z axis, the x coordinate and the y coordinate comprise five data in total;
obtaining 3 frequency domain sequences of an acceleration x axis, a y axis and a z axis through DFT transformation, and respectively extracting the amplitude of a first central frequency and the amplitude of a second central frequency for the 3 frequency domain sequences to obtain 6 frequency domain feature vectors of operation data;
obtaining a sliding gesture direction change sequence by utilizing an arctangent atan(y/x) according to an x coordinate and a y coordinate of the click data, calculating the maximum value, the minimum value, the average value, the variance, the range, the skewness, the kurtosis, the median and the mode of the sliding gesture direction change sequence to obtain 9 time domain feature vectors, and extracting the length of the click data sequence and dividing the length of the click sequence by the time difference between the last data and the first data of the click sequence to obtain 2 time domain feature vectors;
the statistical feature vectors include 6 frequency domain feature vectors and 56 time domain feature vectors.
2. The method of claim 1, wherein prior to feature extracting the operational data, the method further comprises:
and performing data cleaning pretreatment on the operation data to supplement missing data.
3. The method according to claim 1 or 2, characterized in that the method further comprises:
when the identification result shows that the operation data is data obtained by real user operation,
and self-learning by utilizing a neural network algorithm according to the operation data of the object to be logged in within a set time period before the time of receiving the login instruction so as to update the identification model.
4. A human-machine identification apparatus, the apparatus comprising:
the device comprises a receiving unit, a judging unit and a judging unit, wherein the receiving unit is used for receiving a login instruction of an object to be logged in;
the acquisition unit is used for responding to the login instruction and acquiring the operation data of the object to be logged in a set time period before the time when the login instruction is received; the operational data includes: the system comprises acceleration data, click data and screen sliding data, wherein the acceleration data comprise acceleration data on 3 coordinate axes of an x axis, a y axis and a z axis, and the click data comprise click data on 2 coordinate axes of the x coordinate and the y coordinate;
the identification unit is used for identifying the operation data by utilizing a pre-constructed identification model to obtain an identification result;
the identification unit includes:
the characteristic extraction module is used for extracting the characteristics of the operation data to obtain statistical characteristics and nonlinear characteristics corresponding to the operation data;
the identification module is used for identifying the operation data by utilizing a pre-constructed neural network model according to the statistical characteristic and the nonlinear characteristic to obtain an identification result, and the identification result is used for indicating whether the operation data is data obtained by real user operation;
the feature extraction module includes:
the statistical analysis submodule is used for analyzing the spatial characteristics and the motion direction characteristics of the operation data to obtain statistical characteristic vectors of the operation data;
the dimensionality reduction submodule is used for extracting the nonlinear characteristic vector of the operation data by adopting a Principal Component Analysis (PCA) dimensionality reduction method;
the splicing submodule is used for carrying out vector splicing on the statistical characteristic vector and the nonlinear characteristic vector;
the statistical analysis submodule is configured to analyze the spatial features and the motion direction features of the operation data to obtain statistical feature vectors of the operation data, and includes:
calculating the maximum value, the minimum value, the average value, the variance, the range, the skewness, the kurtosis, the median and the mode of each data sequence consisting of acceleration data and click data according to the operation data to form 45 time domain feature vectors, wherein the acceleration data comprises three data of an x axis, a y axis and a z axis, the click data comprises two data of an x coordinate and a y coordinate, and the data sequences of the x axis, the y axis, the z axis, the x coordinate and the y coordinate comprise five data in total;
obtaining 3 frequency domain sequences of an acceleration x axis, a y axis and a z axis through DFT transformation, and respectively extracting the amplitude of a first central frequency and the amplitude of a second central frequency for the 3 frequency domain sequences to obtain 6 frequency domain feature vectors of operation data;
obtaining a sliding gesture direction change sequence by utilizing an arctangent atan(y/x) according to an x coordinate and a y coordinate of the click data, calculating the maximum value, the minimum value, the average value, the variance, the range, the skewness, the kurtosis, the median and the mode of the sliding gesture direction change sequence to obtain 9 time domain feature vectors, and extracting the length of the click data sequence and dividing the length of the click sequence by the time difference between the last data and the first data of the click sequence to obtain 2 time domain feature vectors;
the statistical feature vectors include 6 frequency domain feature vectors and 56 time domain feature vectors.
5. The apparatus of claim 4, further comprising:
and the preprocessing unit is used for performing data cleaning preprocessing on the operation data before performing feature extraction on the operation data so as to supplement missing data.
6. A computer-readable storage medium comprising a set of computer-executable instructions that, when executed, perform the human recognition method of any one of claims 1-5.
CN202010360732.7A 2020-04-30 2020-04-30 Man-machine recognition method and device and storage medium Active CN111626769B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010360732.7A CN111626769B (en) 2020-04-30 2020-04-30 Man-machine recognition method and device and storage medium


Publications (2)

Publication Number Publication Date
CN111626769A CN111626769A (en) 2020-09-04
CN111626769B true CN111626769B (en) 2021-04-06

Family

ID=72272982

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010360732.7A Active CN111626769B (en) 2020-04-30 2020-04-30 Man-machine recognition method and device and storage medium

Country Status (1)

Country Link
CN (1) CN111626769B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103970271A (en) * 2014-04-04 2014-08-06 浙江大学 Daily activity identifying method with exercising and physiology sensing data fused
CN107122641A (en) * 2017-04-25 2017-09-01 杭州安石信息技术有限公司 Smart machine owner recognition methods and owner's identifying device based on use habit
CN107819945A (en) * 2017-10-30 2018-03-20 同济大学 The handheld device navigation patterns authentication method and system of comprehensive many factors
CN109977651A (en) * 2019-03-14 2019-07-05 广州多益网络股份有限公司 Man-machine recognition methods, device and electronic equipment based on sliding trace

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9691025B2 (en) * 2014-09-16 2017-06-27 Caterpillar Inc. Machine operation classifier


Also Published As

Publication number Publication date
CN111626769A (en) 2020-09-04

Similar Documents

Publication Publication Date Title
JP6681342B2 (en) Behavioral event measurement system and related method
CN106155298B (en) The acquisition method and device of man-machine recognition methods and device, behavioural characteristic data
CN106951765A (en) A kind of zero authority mobile device recognition methods based on browser fingerprint similarity
CN108229324A (en) Gesture method for tracing and device, electronic equipment, computer storage media
CN107194213B (en) Identity recognition method and device
CN109086834B (en) Character recognition method, character recognition device, electronic equipment and storage medium
Jain et al. Gender recognition in smartphones using touchscreen gestures
CN107093164A (en) Method and apparatus for generating image
Ahmad et al. Analysis of interaction trace maps for active authentication on smart devices
CN111767982A (en) Training method and device for user conversion prediction model, storage medium and electronic equipment
Maiorana et al. Mobile keystroke dynamics for biometric recognition: An overview
Jia et al. Real‐time hand gestures system based on leap motion
CN117058723B (en) Palmprint recognition method, palmprint recognition device and storage medium
CN111626769B (en) Man-machine recognition method and device and storage medium
CN105809488B (en) Information processing method and electronic equipment
CN107729844A (en) Face character recognition methods and device
Zhang et al. From electromyogram to password: exploring the privacy impact of wearables in augmented reality
Zhang et al. Research and development of palmprint authentication system based on android smartphones
Hu et al. Deceive mouse-dynamics-based authentication model via movement simulation
US20210390177A1 (en) Emulator Detection Through User Interactions
CN115081334A (en) Method, system, apparatus and medium for predicting age bracket or gender of user
WO2021151947A1 (en) Method to generate training data for a bot detector module, bot detector module trained from training data generated by the method and bot detection system
CN113868516A (en) Object recommendation method and device, electronic equipment and storage medium
CN113807920A (en) Artificial intelligence based product recommendation method, device, equipment and storage medium
CN113496015A (en) Identity authentication method and device and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant