CN110162939B - Man-machine identification method, equipment and medium - Google Patents


Info

Publication number
CN110162939B
Authority
CN
China
Prior art keywords
machine
man
terminal device
human
prediction probability
Prior art date
Legal status
Active
Application number
CN201811248586.8A
Other languages
Chinese (zh)
Other versions
CN110162939A (en)
Inventor
范小龙
张西文
陈良文
曾键
钟子檀
张谋辉
杨正朋
沈维杰
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201811248586.8A
Publication of CN110162939A
Application granted
Publication of CN110162939B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F 21/31 User authentication

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Security & Cryptography (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A man-machine identification method, apparatus and medium are disclosed. The man-machine identification method comprises the following steps: receiving raw data collected by a terminal device in response to a man-machine identification request from the terminal device; extracting a first number of multi-dimensional features from the raw data; inputting the first number of multi-dimensional features into a second number of user behavior models, respectively, and outputting a second number of human-machine predictor probabilities from the second number of user behavior models; determining a human-machine prediction probability based on the second number of human-machine predictor probabilities; and obtaining a man-machine recognition result concerning the operation at the terminal device based on the man-machine prediction probability.

Description

Man-machine identification method, equipment and medium
Technical Field
The present disclosure relates to the field of man-machine identification, and more particularly to a man-machine identification method, apparatus and medium.
Background
Human-machine identification is a security measure for performing authentication. Human-machine recognition systems typically require the user to perform a simple test to prove that the operation is being performed by a normal user rather than by a computer attempting to attack a password-protected account.
The current man-machine recognition approaches mainly rely on verification codes, including character input, question answering, image clicking, window sliding and the like. These are perceptible man-machine recognition approaches: the user is aware of them and must provide additional input for verification. This creates a problem: if the verification code is made simple in order to reduce the complexity of the user operation, it becomes easy to break through; if it is made complex so that it is harder to break through, for example by asking the user to solve arithmetic problems or to pick out pictures in a specified state, the complexity of the user operation increases greatly, which is unfavorable for practical application.
In addition, traditional malicious machine identification schemes are mainly built on policies such as IP resource limits and operation-frequency limits. However, the black industry now commands increasingly abundant resources, and such policy limits are very easy to break through.
Disclosure of Invention
In view of the above, it is desirable to provide a non-perceptual, behavior-based man-machine identification method, apparatus and medium that can achieve high-accuracy man-machine identification and effectively prevent the black industry from breaking through it.
According to one aspect of the present disclosure, there is provided a man-machine identification method including: receiving raw data collected by a terminal device in response to a man-machine identification request from the terminal device; extracting a first number of multi-dimensional features from the raw data; inputting the first number of multi-dimensional features into a second number of user behavior models, respectively, and outputting a second number of human-machine predictor probabilities from the second number of user behavior models; determining a human-machine prediction probability based on the human-machine predictor probabilities; and obtaining a man-machine recognition result concerning the operation at the terminal device based on the man-machine prediction probability.
In addition, in the man-machine identification method according to the embodiment of the disclosure, the second number of user behavior models are trained on the same sample library based on different supervised classification algorithms, respectively.
In addition, the man-machine identification method according to the embodiment of the present disclosure further includes, after the step of determining the man-machine prediction probability: storing the man-machine prediction probability in a database in association with the terminal device.
In addition, the man-machine identification method according to the embodiment of the present disclosure further includes, after the step of determining the man-machine prediction probability: searching for and acquiring a plurality of historical man-machine prediction probabilities of the terminal device from the database; and inputting the current man-machine prediction probability and the plurality of historical man-machine prediction probabilities into a weighted model, and updating the man-machine prediction probability with the output of the weighted model.
In addition, the man-machine identification method according to the embodiment of the present disclosure further includes: updating the sample library based on the database; and retraining the plurality of user behavior models with the updated sample library.
In addition, the man-machine identification method according to an embodiment of the present disclosure further includes, after the step of extracting the first number of multi-dimensional features from the raw data: performing enhancement processing on the extracted features to obtain a third number of multi-dimensional features.
In addition, the man-machine identification method according to the embodiment of the present disclosure further includes: in response to a man-machine identification request from a terminal device, a token is sent to the terminal device, wherein the token is associated with a man-machine prediction probability of the terminal device.
According to another aspect of the present disclosure, there is provided a man-machine identification apparatus including: a communication unit for receiving raw data collected by a terminal device in response to a man-machine identification request from the terminal device; an extraction unit for extracting a first number of multi-dimensional features from the raw data; and a processing unit for inputting the first number of multi-dimensional features into a second number of user behavior models, respectively, outputting a second number of human-machine predictor probabilities from the second number of user behavior models, determining a human-machine prediction probability based on the human-machine predictor probabilities, and obtaining a man-machine recognition result regarding the operation at the terminal device based on the man-machine prediction probability.
In addition, in the man-machine identification apparatus according to the embodiment of the present disclosure, the processing unit further includes: a modeling unit for training on the same sample library based on different supervised classification algorithms, respectively, to obtain the second number of user behavior models.
In addition, the man-machine identification apparatus according to the embodiment of the present disclosure further includes: and a storage unit configured to store a database, and wherein the man-machine prediction probability is stored in the database in association with the terminal device.
In addition, the man-machine identification apparatus according to the embodiment of the present disclosure further includes: a history querying unit for searching for and acquiring a plurality of historical man-machine prediction probabilities of the terminal device from the database, and wherein the processing unit is further configured to: input the current man-machine prediction probability and the plurality of historical man-machine prediction probabilities into a weighted model, and update the man-machine prediction probability with the output of the weighted model.
In addition, the man-machine identification apparatus according to the embodiment of the present disclosure further includes: an updating unit configured to update the sample library based on the database; and wherein the modeling unit is further configured to: retraining the plurality of user behavior models with the updated sample library.
In addition, the man-machine identification apparatus according to the embodiment of the present disclosure further includes: a feature enhancement unit for performing enhancement processing on the extracted features to obtain a third number of multi-dimensional features.
In addition, in the man-machine identification apparatus according to the embodiment of the present disclosure, the communication unit is further configured to: send a token to the terminal device in response to a man-machine identification request from the terminal device, wherein the token is associated with the man-machine prediction probability of the terminal device.
According to another aspect of the present disclosure, there is provided a man-machine identification apparatus including: a communication unit for receiving raw data collected by a terminal device in response to a man-machine identification request from the terminal device; a storage unit having a computer program stored thereon; and a processing unit for implementing the following steps when executing the computer program: extracting a first number of multi-dimensional features from the raw data; inputting the first number of multi-dimensional features into a second number of user behavior models, respectively, and outputting a second number of human-machine predictor probabilities from the second number of user behavior models; determining a human-machine prediction probability based on the second number of human-machine predictor probabilities; and obtaining a man-machine recognition result concerning the operation at the terminal device based on the man-machine prediction probability.
According to another aspect of the present disclosure, there is provided a computer-readable recording medium having a computer program stored thereon which, when executed by a processing unit, implements the following steps: receiving raw data collected by a terminal device in response to a man-machine identification request from the terminal device; extracting a first number of multi-dimensional features from the raw data; inputting the first number of multi-dimensional features into a second number of user behavior models, respectively, and outputting a second number of human-machine predictor probabilities from the second number of user behavior models; determining a human-machine prediction probability based on the second number of human-machine predictor probabilities; and obtaining a man-machine recognition result concerning the operation at the terminal device based on the man-machine prediction probability.
In the man-machine identification method and apparatus according to the embodiments of the present disclosure, man-machine identification is performed in a non-perceptual manner. That is, without the user being aware of it, whether the operation at the terminal device is that of a normal user is judged from the features collected at the terminal device. Therefore, compared with prior-art approaches in which the user must solve a verification code, no additional operation is required of the user, so the complexity of the user operation is reduced to the greatest extent. Further, in the man-machine recognition method and apparatus according to the embodiments of the present disclosure, features of multiple dimensions are extracted from multiple categories of raw data, and these multi-dimensional features are input into the user behavior models. In other words, the user behavior models in the present disclosure are built on multi-dimensional features derived from multiple categories of raw data. Compared with prior-art schemes that predict from only a single category of behavior data (e.g., keyboard and mouse operation data), the man-machine recognition method according to the embodiments of the present disclosure is more accurate because more categories of data and more feature dimensions are considered. In addition, in the man-machine recognition method according to the embodiments of the present disclosure, a plurality of user behavior models based on different supervised classification algorithms perform prediction separately, and the results of the different models are integrated to obtain the final man-machine prediction result. Compared with prior-art schemes that rely on a single model for prediction, the accuracy of the prediction can be further improved. Also, in the man-machine recognition method and apparatus according to the embodiments of the present disclosure, the user behavior models can be continuously updated and iterated based on the database, so that they can cope more effectively with rapid changes in the black industry and accurate man-machine recognition results can be obtained even as the black industry changes rapidly. In addition, in the man-machine recognition method and apparatus according to the embodiments of the present disclosure, the current man-machine prediction probability can be further combined with historical man-machine prediction probabilities to obtain the final man-machine prediction probability, so that the accuracy of the prediction can be further improved.
Drawings
FIG. 1 is a schematic diagram illustrating an application environment of an embodiment of the present disclosure;
FIG. 2 is a flow chart illustrating a method of man-machine identification according to an embodiment of the present disclosure;
FIG. 3 is a flow chart illustrating a method of man-machine identification according to another embodiment of the present disclosure;
FIG. 4 is a flow chart illustrating a method of man-machine identification according to yet another embodiment of the present disclosure;
FIG. 5 is a functional block diagram illustrating a human-machine identification device according to an embodiment of the present disclosure;
FIG. 6 is a functional block diagram illustrating a human-machine identification device according to another embodiment of the present disclosure;
FIG. 7 is a functional block diagram illustrating a human-machine identification device according to yet another embodiment of the present disclosure;
FIG. 8 is a schematic diagram showing the data flow between a server and a terminal device according to the present disclosure;
FIG. 9 illustrates the hardware configuration of a computing device as one example of a hardware entity in accordance with the present disclosure; and
FIG. 10 shows a schematic diagram of a computer-readable recording medium according to an embodiment of the present disclosure.
Detailed Description
Various preferred embodiments of the present disclosure will be described below with reference to the accompanying drawings. The following description is provided with reference to the accompanying drawings to assist in the understanding of the exemplary embodiments of the present disclosure as defined by the claims and their equivalents. It includes various specific details that aid in understanding, but they are to be considered exemplary only. Accordingly, those skilled in the art will recognize that various changes and modifications may be made to the embodiments described herein without departing from the scope and spirit of the disclosure. Moreover, a detailed description of functions and configurations well known in the art will be omitted for the sake of clarity and conciseness of the present specification.
First, an application environment of the embodiments of the present disclosure is briefly described. As shown in fig. 1, servers 10 and 20 are connected to a plurality of terminal devices 30 through a network 40. The terminal devices 30 may be devices that actually perform various services. Although the terminal devices 30 are all shown in fig. 1 as mobile phones, the present disclosure is not limited thereto. Those skilled in the art will appreciate that a terminal device 30 may also be any other type of device, such as a PDA (personal digital assistant), a tablet computer, a desktop computer, etc. The server 10 may be a device for man-machine identification as described below. The server 20 may be another server that interacts with the server 10. For example, the server 20 may be a business risk-control backend server that queries the server 10 for man-machine recognition results, as described below. The network 40 may be any type of wired or wireless network, such as the Internet. It should be appreciated that the numbers of servers 10, servers 20, and terminal devices 30 shown in fig. 1 are illustrative and not limiting.
Next, a man-machine identification method according to an embodiment of the present disclosure will be described with reference to fig. 2. The man-machine identification method may be applied to the server 10 shown in fig. 1. As shown in fig. 2, the man-machine recognition method includes the following steps.
First, in step S201, raw data collected by a terminal device is received in response to a man-machine identification request from the terminal device. The raw data is data collected in real time on the terminal device. Specifically, when an operation requiring man-machine identification is performed on the terminal device side, such as logging in to a bank account, it is necessary to determine whether the operation is performed by a normal user or by a malicious machine (i.e., an abnormal user). At this time, the terminal device sends a man-machine identification request to the server and reports the data it has collected within a predetermined period before and after sending the request. The raw data includes multiple categories of data. For example, the raw data may include not only various behavior data collected at the front end of the terminal device, but also local attribute data and basic environment data about the terminal device. The behavior data may include keyboard data, mouse data, and the like; the local attribute data may include attribute data about the terminal device itself, such as the device model, system version, basic hardware and software data, and the number of days the device has been in use; and the basic environment data may include the IP address through which the terminal device accesses the Internet, the Wi-Fi network it is connected to, and so on.
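For illustration only, the reported raw data might be organized on the server side as in the following sketch; the field names and values are assumptions and are not prescribed by the present disclosure.

```python
# Purely illustrative structure for the raw data reported with a man-machine
# identification request; field names are assumptions, not defined by the patent.
raw_data = {
    "behavior": {                                   # front-end behavior data
        "mouse_trace": [(12, 340), (15, 338), (21, 330)],  # sampled (x, y) cursor positions
        "key_intervals_ms": [120, 95, 210],                # intervals between key presses
    },
    "local_attributes": {                           # attributes of the terminal device itself
        "device_model": "PhoneModelX",
        "system_version": "10.0",
        "device_usage_days": 412,
    },
    "environment": {                                # basic environment data
        "ip_address": "203.0.113.7",
        "wifi_ssid": "OfficeAP",
    },
}
```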
Next, at step S202, a first number of features of multiple dimensions are extracted from the raw data. As described above, the raw data may include a variety of categories of data. The number of the plurality of categories may be a fourth number. The first number and the fourth number are natural numbers independent of each other. For example, the first number is a natural number of 1 or more. The specific value of the first number will depend on the different application scenarios and the operation at the terminal device. The fourth number is a natural number of 2 or more. The specific value of the fourth number will be preset depending on the different application scenarios. For each category of data, features of multiple dimensions may be extracted from it.
The raw data is first preprocessed. Specifically, the preprocessing may include equal-length padding of the data, cleaning of abnormal data, and the like. Feature extraction is then performed on the preprocessed data. For example, the feature extraction may include operations such as digitizing non-numeric features and normalizing the various data. The normalization may, for example, be performed by the following equation (1) or (2).
F(x) = (a + x) / (b + x)    (1)
(Equation (2), an alternative normalization formula, is given only as an image in the original publication.)
where F(x) denotes the feature obtained after normalization and x denotes the raw data value; a and b are normalization parameters that can be adjusted according to the discrimination level of the different data.
For example, inputting the raw value x, the number of days the device has been in use, into the above formula yields a value between 0 and 1 that is used as a feature input to the subsequent user behavior models. Of course, different normalization functions may be used for different raw data, and the normalization method is not limited to the two listed above.
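As a minimal sketch, equation (1) can be applied per feature as below; the parameter values a and b are arbitrary illustrative choices.

```python
def normalize(x: float, a: float = 1.0, b: float = 30.0) -> float:
    """Apply F(x) = (a + x) / (b + x) from equation (1); maps x >= 0 into [a/b, 1)."""
    return (a + x) / (b + x)

# Example: number of days the device has been in use
print(normalize(412))   # close to 1 for a long-used device
print(normalize(0))     # a / b for a brand-new device
```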
In addition, there are cases where little data is collected during the predetermined period and therefore few features can be extracted. A small number of features, however, is disadvantageous for the subsequent man-machine recognition processing. Thus, in this case, after the step of extracting the first number of multi-dimensional features from the raw data, the method may further comprise: performing enhancement processing on the extracted features to obtain a third number of multi-dimensional features (not shown). That is, a third number of multi-dimensional features are obtained in addition to the extracted first number of multi-dimensional features. The third number is a natural number of 1 or more. Thus, by the enhancement processing, a total of the first number plus the third number of multi-dimensional features can be obtained, expanding the number of feature dimensions. Note that the enhancement processing is performed separately on the features extracted from each category of raw data.
For example, for behavior data in the raw data, the enhancement may include computing statistics of the behavior data to obtain various statistical features. The enhancement may also include differential operations on the behavior data, as well as spatio-temporal multi-dimensional augmentation. For example, the behavior data collected at the terminal device may include mouse data, namely the x and y coordinates of the cursor position on the display screen of the terminal device obtained through repeated sampling. Features of other dimensions, such as the speed and angular velocity of the mouse movement, can be further derived by analyzing the sequence of x, y coordinates. For example, an original 20-dimensional base feature set may be expanded to a 70-dimensional feature set by the feature enhancement processing.
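The following sketch illustrates the kind of enhancement described above, deriving speed and turning-rate statistics from sampled cursor coordinates; the sampling interval and the feature names are assumptions made only for illustration.

```python
import math

def enhance_mouse_features(trace, dt=0.05):
    """Derive speed and turning-rate statistics from sampled (x, y) cursor positions.

    trace: list of (x, y) tuples sampled every dt seconds (dt is an assumed interval).
    """
    speeds, angles = [], []
    for (x0, y0), (x1, y1) in zip(trace, trace[1:]):
        dx, dy = x1 - x0, y1 - y0
        speeds.append(math.hypot(dx, dy) / dt)   # movement speed per segment
        angles.append(math.atan2(dy, dx))        # movement direction per segment
    turn_rates = [abs(a1 - a0) / dt for a0, a1 in zip(angles, angles[1:])]
    return {
        "speed_mean": sum(speeds) / len(speeds) if speeds else 0.0,
        "speed_max": max(speeds, default=0.0),
        "turn_rate_mean": sum(turn_rates) / len(turn_rates) if turn_rates else 0.0,
    }

print(enhance_mouse_features([(12, 340), (15, 338), (21, 330), (30, 318)]))
```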
Next, at step S203, the first number of multi-dimensional features are respectively input into a second number of user behavior models, and a second number of human-machine predictor probabilities are output from the second number of user behavior models. The second number and the first number are independent of each other. The second number is a natural number of 2 or more, and its specific value is preset depending on the application scenario.
Here, it should be noted that the second number of user behavior models are trained on the same sample library based on different supervised classification algorithms, respectively. A supervised classification algorithm involves a target variable (the dependent variable, here the human-machine predictor probability) and predictor variables (the independent variables, here the extracted multi-dimensional features) used to predict the target variable. The human-machine predictor probability is the prediction result output by each user behavior model. From these variables a model can be built so that, for known predictor variable values, the corresponding target variable value can be obtained. The model is trained repeatedly until it reaches a predetermined accuracy on the training data set. Specifically, in embodiments of the present disclosure, the sample library contains known multi-dimensional features as input independent variables and known prediction probabilities as output dependent variables. A user behavior model includes, for example, functions and parameters corresponding to the multi-dimensional features, and training means fitting those functions and parameters to the known independent and dependent variables. Through the training and learning process of the algorithm, the functions and parameters are continuously adjusted until a set of functions and parameters producing correct prediction results is found. Once the functions and parameters are determined, the model is determined, and it can then be applied to feature variables outside the sample library to obtain the corresponding prediction results. Specific examples of supervised classification algorithms include gradient boosted decision trees (Gradient Boosting Decision Tree, GBDT), convolutional neural networks (Convolutional Neural Network, CNN), logistic regression (Logistic Regression, LR), random forest (Random Forest, RF), and the like.
For example, in the man-machine identification method according to the present disclosure, the second number may be set to 3. That is, three user behavior models may be used to perform human-machine behavior prediction, trained on the sample library using the gradient boosted decision tree, convolutional neural network and random forest methods, respectively. Since the three user behavior models are trained with different methods, they will output different human-machine predictor probabilities even when given the same or approximately the same feature variables.
Of course, the number of user behavior models, i.e. the second number, is not limited to three. Those skilled in the art will appreciate that any other number of user behavior models may be similarly applied to the present disclosure and should be included within the scope of the present disclosure.
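A sketch of training several user behavior models on the same sample library with different supervised classification algorithms, here using scikit-learn with GBDT, random forest and logistic regression as stand-ins (a CNN, as mentioned above, would require a separate deep-learning framework); the data are synthetic placeholders, not the patent's sample library.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression

# Sample library: X holds the multi-dimensional features, y holds known
# human (0) / machine (1) labels. Synthetic placeholders stand in for real data.
rng = np.random.default_rng(0)
X = rng.random((1000, 70))           # e.g. 70-dimensional enhanced features
y = rng.integers(0, 2, size=1000)    # known human/machine labels

# A "second number" (here three) of user behavior models, each trained on the
# same sample library with a different supervised classification algorithm.
models = [
    GradientBoostingClassifier().fit(X, y),        # GBDT
    RandomForestClassifier().fit(X, y),            # random forest
    LogisticRegression(max_iter=1000).fit(X, y),   # logistic regression
]
```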
Next, in step S204, a human-machine prediction probability is determined based on the second number of human-machine predictor probabilities. That is, the second number of human-machine predictor probabilities respectively output by the second number of user behavior models need to be integrated to obtain the final recognition result. For example, as one possible implementation, the second number of human-machine predictor probabilities may be averaged, and the calculated average used as the human-machine prediction probability from which the man-machine recognition result is determined.
Then, in step S205, a man-machine recognition result concerning the operation at the terminal device is obtained based on the man-machine prediction probability. For example, whether the operation at the terminal device is the operation of a normal user may be determined by checking the value of the man-machine prediction probability. Specifically, when the man-machine prediction probability exceeds a certain threshold, the operation at the terminal device is considered the operation of an abnormal user; when it is below the threshold, the operation is considered that of a normal user.
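Continuing the training sketch above, each model outputs a human-machine predictor probability for the current feature vector, the probabilities are averaged, and the average is compared against an assumed threshold value; the threshold 0.5 is an illustrative choice, not one fixed by the disclosure.

```python
import numpy as np

def human_machine_probability(models, features):
    """Average the per-model probability of the 'machine' class (label 1)."""
    x = np.asarray(features).reshape(1, -1)
    probs = [m.predict_proba(x)[0, 1] for m in models]
    return sum(probs) / len(probs)

THRESHOLD = 0.5   # assumed decision threshold

features = np.random.random(70)                   # features for the current request
p = human_machine_probability(models, features)   # 'models' from the previous sketch
print("abnormal user (machine)" if p > THRESHOLD else "normal user (human)", p)
```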
It can be seen that in the man-machine identification method according to the embodiments of the present disclosure, man-machine identification is performed in a non-perceptual manner. That is, without the user being aware of it, whether the operation at the terminal device is that of a normal user is judged from the features collected at the terminal device. Therefore, compared with prior-art approaches in which the user must solve a verification code, no additional operation is required of the user, so the complexity of the user operation is reduced to the greatest extent. Further, in the man-machine recognition method and apparatus according to the embodiments of the present disclosure, features of multiple dimensions are extracted from multiple categories of raw data, and these multi-dimensional features are input into the user behavior models. In other words, the user behavior models in the present disclosure are built on multi-dimensional features derived from multiple categories of raw data. Compared with prior-art schemes that predict from only a single category of behavior data (e.g., keyboard and mouse operations), the man-machine recognition method according to the embodiments of the present disclosure is more accurate because more categories of data and more feature dimensions are considered. In addition, in the man-machine recognition method according to the embodiments of the present disclosure, a plurality of user behavior models based on different supervised classification algorithms perform prediction separately, and the results of the different models are integrated to obtain the final man-machine prediction result. Compared with prior-art schemes that rely on a single model for prediction, the accuracy of the prediction can be further improved.
In addition, as another possible embodiment, the man-machine prediction probability calculated each time for a terminal device may also be stored as man-machine prediction history data of that terminal device. Specifically, fig. 3 illustrates a man-machine identification method according to another embodiment of the present disclosure. The man-machine identification method comprises steps S201 to S205 described above with reference to fig. 2. In addition, as shown in fig. 3, the man-machine identification method according to another embodiment of the disclosure further includes step S301. In step S301, the man-machine prediction probability is stored in a database in association with the terminal device. For example, the database may be a man-machine behavior and abnormal environment database. To reduce the amount of storage, only man-machine prediction probabilities indicating an abnormal user may be stored in the man-machine behavior and abnormal environment black library. That is, the process of step S301 is performed only when the man-machine prediction probability indicates that the operation at the terminal device is not the operation of a normal user.
By storing each calculated man-machine recognition result in the database, all historical recognition data can be accumulated in the database. In addition, the database may also receive black (malicious) data records from other sources, including abnormal behavior, abnormal devices, abnormal IP addresses, and so on. That is, the database aggregates historical man-machine recognition results and various black-industry resources, so that more effective behavior features and a richer black library of malicious machine resources can be provided, preventing the black industry from breaking through the behavior model at a single point.
Thus, if the user behavior models can be continuously updated and iterated based on the database, they will be able to cope more effectively with rapid changes in the black industry, so that accurate man-machine recognition results can be obtained even as the black industry changes rapidly.
In this regard, after step S301, the man-machine identification method may further include the following steps.
In step S302, the sample library is updated based on the database. A portion of the data in the database, such as the most recently updated data, may be synchronized into the sample library.
Then, at step S303, the second number of user behavior models are retrained with the updated sample library.
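A minimal sketch of the loop formed by steps S301 to S303: persisting each prediction, pulling recent records back into the sample library, and retraining; the SQLite storage and the table schema are assumptions made only for illustration.

```python
import sqlite3, time

conn = sqlite3.connect("human_machine.db")
conn.execute("""CREATE TABLE IF NOT EXISTS predictions
                (device_id TEXT, probability REAL, ts REAL)""")

def store_prediction(device_id: str, probability: float) -> None:
    """Step S301: store the prediction probability in association with the device."""
    conn.execute("INSERT INTO predictions VALUES (?, ?, ?)",
                 (device_id, probability, time.time()))
    conn.commit()

def recent_records(days: float = 7.0):
    """Step S302: fetch recently updated records to synchronize into the sample library."""
    cutoff = time.time() - days * 86400
    return conn.execute("SELECT device_id, probability FROM predictions WHERE ts >= ?",
                        (cutoff,)).fetchall()

# Step S303: the fetched records would be labeled, appended to the sample library,
# and each user behavior model re-fitted on the enlarged library.
```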
In the man-machine recognition method described with reference to fig. 2, only a single man-machine prediction probability is used as the final man-machine prediction probability. However, the present disclosure is not limited thereto. For example, in yet another embodiment according to the disclosure, the current human machine prediction probability may be further combined with the historical human machine prediction probability to obtain the final human machine prediction probability.
Fig. 4 illustrates a man-machine identification method according to yet another embodiment of the present disclosure. Referring to fig. 4, a man-machine identification method according to still another embodiment of the present disclosure includes steps S201 to S205 described hereinabove with reference to fig. 2 and step S301 described hereinabove with reference to fig. 3. In addition, the man-machine identification method according to still another embodiment of the present disclosure further includes the following steps.
In step S401, a plurality of historical human-machine prediction probabilities of the terminal device are searched and acquired from the database.
Then, in step S402, the current man-machine prediction probability and the plurality of historical man-machine prediction probabilities are input to a weighted model, and the man-machine prediction probability is updated with the output of the weighted model.
Theoretically, the weight corresponding to the current man-machine prediction probability is the largest, and the weights corresponding to earlier historical man-machine prediction probabilities are smaller. The specific values of the weights in the weighting model may, for example, be learned by a supervised classification algorithm, and the number of historical results to use is likewise determined through the supervised classification algorithm.
Specifically, the final human-machine prediction probability f is calculated from the current human-machine prediction probability and the plurality of historical human-machine prediction probabilities as:
f = Σ_{i=1}^{N} k_i · t_i · P_i
where:
k_i is a weight to be learned;
t_i is a time attenuation coefficient, taken as the difference between the time of the i-th identification and the current time point, after normalization;
P_i is the human-machine prediction probability value of a single prediction; and
N is the number of predictions used, 10 by default, adjustable according to the scenario.
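A minimal sketch of the weighted combination, assuming the weighted-sum form f = Σ k_i · t_i · P_i reconstructed above; in practice the weights would be learned rather than hand-picked as they are here.

```python
def combined_probability(current_prob, history, weights, decays):
    """Weighted combination of the current and historical prediction probabilities.

    history : historical probabilities P_2 .. P_N (P_1 is the current prediction)
    weights : learned weights k_i, largest for the current prediction
    decays  : normalized time-attenuation coefficients t_i
    """
    probs = [current_prob] + list(history)
    return sum(k * t * p for k, t, p in zip(weights, decays, probs))

# Illustrative call with hand-picked (not learned) weights and decay coefficients:
print(combined_probability(0.82, [0.75, 0.40], weights=[0.5, 0.3, 0.2],
                           decays=[1.0, 0.8, 0.5]))
```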
Thus, in the man-machine identification method according to still another embodiment of the present disclosure, since the history prediction data is integrated, the accuracy of prediction can be further improved.
In addition, the man-machine identification method according to the present disclosure may further include: sending a token to the terminal device in response to the man-machine identification request from the terminal device, wherein the token is associated with the man-machine recognition result of the terminal device. Specifically, when an operation requiring man-machine identification is performed on the terminal device side, such as logging in to a bank account, it is necessary to determine whether the operation is performed by a normal user or by a malicious machine (i.e., an abnormal user). It should also be noted that it is the business risk-control backend server that wishes to obtain the man-machine recognition result concerning the terminal device. For example, when a bank account is logged in to, the bank's business risk-control backend server wants the man-machine recognition result for the terminal device in order to decide whether to allow the operation of the user at the terminal device. The terminal device therefore sends the token it obtained to the business risk-control backend server, and the business risk-control backend server queries the man-machine identification server for the result concerning the terminal device on the basis of the token.
Next, a man-machine identification apparatus according to an embodiment of the present disclosure will be described with reference to fig. 5. The man-machine identification device may be the server 10 described hereinabove with reference to fig. 1. As shown in fig. 5, the man-machine recognition apparatus 500 includes: a communication unit 501, an extraction unit 502, and a processing unit 503.
The communication unit 501 is configured to receive raw data acquired by a terminal device in response to a man-machine identification request from the terminal device.
The extraction unit 502 is configured to extract a first number of features of multiple dimensions from the raw data.
The processing unit 503 is configured to input the first number of multi-dimensional features into a second number of user behavior models, respectively, output a second number of human-machine predictor probabilities from the second number of user behavior models, and determine the human-machine prediction probability based on the second number of human-machine predictor probabilities. Then, based on the human-machine prediction probability, a man-machine recognition result concerning the operation at the terminal device is obtained.
The processing unit 503 further includes a modeling unit 5031 for training on the same sample library based on different supervised classification algorithms, respectively, to obtain the second number of user behavior models.
It can be seen that in the man-machine identification device according to the embodiments of the present disclosure, man-machine identification is performed in a non-perceptual manner. That is, without the user being aware of it, whether the operation at the terminal device is that of a normal user is judged from the features collected at the terminal device. Therefore, compared with prior-art approaches in which the user must solve a verification code, no additional operation is required of the user, so the complexity of the user operation is reduced to the greatest extent. Further, features of multiple dimensions are extracted from multiple categories of raw data, and these multi-dimensional features are input into the user behavior models. In other words, the user behavior models in the present disclosure are built on multi-dimensional features derived from multiple categories of raw data. Compared with prior-art schemes that predict from only a single category of behavior data (e.g., keyboard and mouse operations), the man-machine recognition device according to the embodiments of the present disclosure is more accurate because more categories of data and more feature dimensions are considered. In addition, in the man-machine recognition apparatus according to the embodiments of the present disclosure, a plurality of user behavior models based on different supervised classification algorithms perform prediction separately, and the results of the different models are integrated to obtain the final man-machine prediction result. Compared with prior-art schemes that rely on a single model for prediction, the accuracy of the prediction can be further improved.
In addition, as another possible embodiment, the man-machine prediction probability calculated each time for a terminal device may also be stored as man-machine prediction history data of that terminal device. In particular, fig. 6 illustrates a man-machine identification device according to another embodiment of the present disclosure. As shown in fig. 6, in addition to the communication unit 501, the extraction unit 502, and the processing unit 503, the man-machine recognition apparatus 600 further includes a storage unit 601 configured to store a database, in which the man-machine prediction probability determined by the processing unit is stored in association with the terminal device.
By storing each calculated man-machine recognition result in the database, all historical recognition data can be accumulated in the database. In addition, the database may also receive black (malicious) data records from other sources, including abnormal behavior, abnormal devices, abnormal IP addresses, and so on. That is, the database aggregates historical man-machine recognition results and various black-industry resources, so that more effective behavior features and a richer black library of malicious machine resources can be provided, preventing the black industry from breaking through the behavior model at a single point.
Thus, if the user behavior models can be continuously updated and iterated based on the database, they will be able to cope more effectively with rapid changes in the black industry, so that accurate man-machine recognition results can be obtained even as the black industry changes rapidly.
In this regard, the man-machine recognition apparatus 600 may further include: an updating unit 602, configured to update the sample library based on the database; and wherein the modeling unit 5031 is further configured to: retrain the second number of user behavior models with the updated sample library.
Further, in the man-machine recognition apparatus described with reference to fig. 5, the processing unit 503 adopts only a single man-machine prediction probability as the final man-machine prediction probability. However, the present disclosure is not limited thereto. For example, in yet another embodiment according to the disclosure, the current human machine prediction probability may be further combined with the historical human machine prediction probability to obtain the final human machine prediction probability.
Fig. 7 illustrates a man-machine identification device according to a further embodiment of the present disclosure. As shown in fig. 7, in addition to the communication unit 501, the extraction unit 502, the processing unit 503, and the storage unit 601, the man-machine recognition apparatus 700 further includes a history query unit 701 configured to search for and acquire a plurality of historical man-machine prediction probabilities of the terminal device from the database. The processing unit 503 is further configured to: input the current man-machine prediction probability and the plurality of historical man-machine prediction probabilities into a weighted model, and update the man-machine prediction probability with the output of the weighted model.
Furthermore, in the man-machine identification device according to the present disclosure, the communication unit 501 is further configured to: and sending a token to the terminal equipment in response to the man-machine identification request from the terminal equipment, wherein the token is associated with the man-machine identification result of the terminal equipment.
Fig. 8 shows the data flow between the server 10 that performs man-machine recognition, the terminal device 30, and the server 20 that queries the man-machine recognition result. Specifically, when an operation requiring man-machine identification, such as logging in to a bank account, is performed on the terminal device side, it is necessary to determine whether the operation is performed by a normal user or by a malicious machine (i.e., an abnormal user). As shown in fig. 8, the terminal device 30 then transmits a man-machine identification request to the server 10. In response to the man-machine identification request, the server 10 sends a token to the terminal device. As described above, it is the business risk-control backend server that wishes to obtain the man-machine recognition result concerning the terminal device. For example, when a bank account is logged in to, the bank's business risk-control backend server wants the man-machine recognition result for the terminal device in order to decide whether to allow the operation of the user at the terminal device. The terminal device 30 therefore transmits the obtained token to the business risk-control backend server 20, and the business risk-control backend server 20 queries the server 10 for the man-machine recognition result of the terminal device based on the token.
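The token flow of fig. 8 could be sketched as plain function calls as below; the token format, the in-memory store and the fixed probability are assumptions, not the protocol defined by the present disclosure.

```python
import uuid

token_store = {}   # token -> man-machine prediction result (assumed in-memory store)

def handle_identification_request(device_id: str, raw_data: dict) -> str:
    """Server 10: issue a token and associate it with the prediction for this device."""
    token = uuid.uuid4().hex
    probability = 0.12   # placeholder; in practice output by the user behavior models
    token_store[token] = {"device_id": device_id, "probability": probability}
    return token         # returned to the terminal device 30

def query_result(token: str) -> dict:
    """Server 20 (business risk-control backend): look up the result using the token."""
    return token_store.get(token, {"error": "unknown token"})

token = handle_identification_request("device-001", raw_data={})
print(query_result(token))
```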
The hardware configuration of a computing device according to the present disclosure is shown in fig. 9 as one example of a hardware entity. The device comprises a processor 901, a memory 902 and at least one external communication interface 903. The processor 901, the memory 902 and the external communication interface 903 are all connected by a bus 804.
For data processing, the processor 901 may be implemented with a microprocessor, a central processing unit (CPU), a digital signal processor (DSP), or a field-programmable gate array (FPGA). The memory 902 holds operational instructions, which may be computer-executable code, through which the steps of the method flows of the various embodiments of the present disclosure described above are implemented.
Fig. 10 shows a schematic diagram of a computer-readable recording medium according to an embodiment of the present disclosure. As shown in fig. 10, a computer-readable recording medium 1000 according to an embodiment of the present disclosure has computer program instructions 1001 stored thereon. The man-machine identification method according to the embodiments of the present disclosure described with reference to the above figures is performed when the computer program instructions 1001 are executed by a processor.
Thus far, the man-machine identification method and apparatus according to embodiments of the present disclosure have been described in detail with reference to fig. 1 to 10. In the man-machine identification method and apparatus according to the embodiments of the present disclosure, man-machine identification is performed in a non-perceptual manner. That is, without the user being aware of it, whether the operation at the terminal device is that of a normal user is judged from the features collected at the terminal device. Therefore, compared with prior-art approaches in which the user must solve a verification code, no additional operation is required of the user, so the complexity of the user operation is reduced to the greatest extent. Further, features of multiple dimensions are extracted from multiple categories of raw data, and these multi-dimensional features are input into the user behavior models; in other words, the user behavior models in the present disclosure are built on multi-dimensional features derived from multiple categories of raw data. Compared with prior-art schemes that predict from only a single category of behavior data (e.g., keyboard and mouse operations), the man-machine recognition method according to the embodiments of the present disclosure is more accurate because more categories of data and more feature dimensions are considered. In addition, a plurality of user behavior models based on different supervised classification algorithms perform prediction separately, and the results of the different models are integrated to obtain the final man-machine prediction result; compared with prior-art schemes that rely on a single model for prediction, the accuracy of the prediction can be further improved. Also, the user behavior models can be continuously updated and iterated based on the database, so that they can cope more effectively with rapid changes in the black industry and accurate man-machine recognition results can be obtained even as the black industry changes rapidly. In addition, the current man-machine prediction probability can be further combined with historical man-machine prediction probabilities to obtain the final man-machine prediction probability, so that the accuracy of the prediction can be further improved.
It should be noted that in this specification the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
Finally, it is also to be noted that the above-described series of processes includes not only processes performed in time series in the order described herein, but also processes performed in parallel or separately, not in time series.
From the above description of embodiments, it will be apparent to those skilled in the art that the present disclosure may be implemented by means of software plus necessary hardware platforms, but may of course also be implemented entirely in software. Based on such understanding, all or part of the technical solution of the present disclosure that contributes to the background art may be embodied in the form of a software product, which may be stored in a storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., comprising several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the various embodiments or parts of the embodiments of the present disclosure.
The foregoing has described the principles and embodiments of the present disclosure using specific examples, which are presented only to aid in understanding the methods of the present disclosure and their core ideas. Meanwhile, those of ordinary skill in the art may, in light of the ideas of the present disclosure, make changes to the specific embodiments and to the scope of application; in view of the above, the contents of this specification should not be construed as limiting the present disclosure.

Claims (11)

1. A non-perceptual man-machine identification method, comprising the following steps:
receiving raw data collected by a terminal device in response to a man-machine identification request from the terminal device, wherein the raw data are various types of data collected on the terminal device in real time and comprise various behavior data collected by the terminal device at a front end, and local attribute data and basic environment data about the terminal device, wherein the various behavior data comprise behavior data about operation behaviors at the terminal device, the local attribute data comprise attribute data about the terminal device, and the operation behaviors are different from user operation behaviors based on a perceived verification code mode;
extracting a first number of features of multiple dimensions from the raw data, wherein the first number and the number of categories of the raw data are independent of each other;
inputting the first number of features of multiple dimensions into a second number of user behavior models, respectively, and outputting a second number of man-machine predictor probabilities from the second number of user behavior models, wherein the first number and the second number are independent of each other, and the second number of user behavior models are respectively trained on the same sample library based on different supervised classification algorithms;
determining a man-machine prediction probability based on an average of the second number of man-machine predictor probabilities;
storing the man-machine prediction probability in a database in association with the terminal device;
searching for and acquiring a plurality of historical man-machine prediction probabilities of the terminal device from the database;
inputting the current man-machine prediction probability and the plurality of historical man-machine prediction probabilities into a weighted model, and updating the man-machine prediction probability with the output of the weighted model, wherein the weight corresponding to the current man-machine prediction probability is the largest, and the weight corresponding to an earlier historical man-machine prediction probability is smaller; and
obtaining a man-machine identification result about the operation behavior at the terminal device based on the updated man-machine prediction probability.
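The following is a minimal, non-authoritative sketch of the scoring steps described in claim 1 above: averaging the predictor probabilities of several user behavior models, then combining the current probability with historical ones in a weighted model whose weights decrease with age. The names (models, history, decay) and the exponential-decay weighting are illustrative assumptions, not the claimed implementation.

```python
# Illustrative sketch of claim 1's scoring steps (assumptions, not the claimed implementation):
# 1) average the man-machine predictor probabilities of several user behavior models;
# 2) combine the current probability with historical ones, newest weighted highest;
# 3) map the updated probability to a man-machine identification result.
from typing import List, Sequence


def ensemble_probability(features: Sequence[float], models: List) -> float:
    """Average the predictor probabilities of scikit-learn style classifiers."""
    probs = [m.predict_proba([list(features)])[0][1] for m in models]  # P(human) per model
    return sum(probs) / len(probs)


def weighted_update(current: float, history: Sequence[float], decay: float = 0.5) -> float:
    """Weighted combination: the current probability gets the largest weight,
    and each older historical probability (history[0] = most recent) gets less."""
    scores = [current] + list(history)
    weights = [decay ** i for i in range(len(scores))]  # strictly decreasing with age
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


def recognize(updated_probability: float, threshold: float = 0.5) -> str:
    """Map the updated man-machine prediction probability to a recognition result."""
    return "human" if updated_probability >= threshold else "machine"
```

With decay = 0.5 the current probability carries weight 1, the most recent historical probability 0.5, the one before it 0.25, and so on, which satisfies the ordering required by the claim; the 0.5 decision threshold is likewise only an assumed cut-off.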
2. The method of claim 1, wherein the second number of user behavior models are respectively trained on the same sample library based on different supervised classification algorithms.
3. The method of claim 1, further comprising:
updating the sample library based on the database; and
retraining the second number of user behavior models with the updated sample library.
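A hedged sketch of how the sample library could be updated from the database and the models retrained, as claims 2 and 3 describe; the two scikit-learn classifiers stand in for "different supervised classification algorithms" and, like the list-based sample library, are assumptions.

```python
# Illustrative sketch of claim 3 (and the training setup of claim 2):
# append newly labeled records from the database to the sample library,
# then retrain each user behavior model on the same refreshed library.
# The specific classifiers chosen here are assumptions.
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier


def retrain(library_features, library_labels, db_features, db_labels):
    """Update the sample library with database records and retrain the models."""
    features = list(library_features) + list(db_features)
    labels = list(library_labels) + list(db_labels)
    models = [GradientBoostingClassifier(), RandomForestClassifier()]  # different algorithms
    for model in models:  # same sample library for every model
        model.fit(features, labels)
    return models
```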
4. The method of claim 1, wherein, after the step of extracting a first number of features of multiple dimensions from the raw data, the method further comprises:
performing enhancement processing on the extracted features to obtain a third number of features of multiple dimensions.
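Claim 4 leaves "enhancement processing" unspecified; one plausible reading, shown purely as an assumption, is expanding the extracted features with interaction terms so that the third number of features exceeds the first.

```python
# Hypothetical form of the "enhancement processing" in claim 4:
# expand the extracted feature vectors with pairwise interaction terms,
# turning a first number of features into a larger third number of features.
# This particular transform is an assumption, not the patent's definition.
from sklearn.preprocessing import PolynomialFeatures


def enhance(feature_rows):
    """feature_rows: list of extracted feature vectors, one per collected sample."""
    expander = PolynomialFeatures(degree=2, interaction_only=True, include_bias=False)
    return expander.fit_transform(feature_rows)  # rows now carry the third number of features
```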
5. The method of claim 1, further comprising:
sending a token to the terminal device in response to the man-machine identification request from the terminal device, wherein the token is associated with the man-machine prediction probability of the terminal device.
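A small sketch of the token handling in claim 5, assuming an opaque random token is issued per identification request and later associated with the computed man-machine prediction probability; the in-memory dictionary merely stands in for the database of the other claims.

```python
# Illustrative sketch of claim 5: issue a token on a man-machine identification
# request and associate the prediction probability with it once computed.
# token_store is an in-memory stand-in for the claimed database.
import secrets

token_store = {}  # token -> {"device": terminal device id, "probability": float or None}


def issue_token(terminal_device_id: str) -> str:
    """Send back an opaque token in response to a man-machine identification request."""
    token = secrets.token_hex(16)  # unguessable 32-character hex string
    token_store[token] = {"device": terminal_device_id, "probability": None}
    return token


def bind_probability(token: str, probability: float) -> None:
    """Associate the man-machine prediction probability of the terminal device with its token."""
    token_store[token]["probability"] = probability
```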
6. A man-machine identification device in a non-perceptive manner, comprising:
a communication unit configured to receive raw data collected by a terminal device in response to a man-machine identification request from the terminal device, wherein the raw data are various types of data collected on the terminal device in real time and comprise various behavior data collected by the terminal device at a front end, as well as local attribute data and basic environment data about the terminal device, wherein the various behavior data comprise behavior data about an operation behavior at the terminal device, the local attribute data comprise attribute data about the terminal device itself, and the operation behavior is different from a user operation behavior under a perceptive captcha-based scheme;
an extraction unit configured to extract a first number of features of multiple dimensions from the raw data, wherein the first number and the number of categories of the raw data are independent of each other;
a processing unit configured to input the first number of features of multiple dimensions into a second number of user behavior models, respectively, to output a second number of man-machine predictor probabilities from the second number of user behavior models, and to determine a man-machine prediction probability based on an average of the second number of man-machine predictor probabilities, wherein the first number and the second number are independent of each other, and the second number of user behavior models are respectively trained on the same sample library based on different supervised classification algorithms;
a storage unit configured to store a database, wherein the man-machine prediction probability is stored in the database in association with the terminal device; and
a history inquiry unit configured to search for and acquire a plurality of historical man-machine prediction probabilities of the terminal device from the database,
wherein the processing unit is further configured to: input the current man-machine prediction probability and the plurality of historical man-machine prediction probabilities into a weighted model, update the man-machine prediction probability with the output of the weighted model, and obtain a man-machine identification result about the operation behavior at the terminal device based on the updated man-machine prediction probability, wherein the weight corresponding to the current man-machine prediction probability is the largest, and the weight corresponding to an earlier historical man-machine prediction probability is smaller.
7. The apparatus of claim 6, wherein the processing unit further comprises:
a modeling unit configured to respectively train on the same sample library based on different supervised classification algorithms to obtain the second number of user behavior models.
8. The apparatus of claim 7, further comprising:
an updating unit configured to update the sample library based on the database,
wherein the modeling unit is further configured to: retrain the second number of user behavior models with the updated sample library.
9. The apparatus of claim 6, further comprising:
a feature enhancement unit configured to perform enhancement processing on the extracted features to obtain a third number of features of multiple dimensions.
10. The apparatus of claim 6, wherein the communication unit is further configured to:
send a token to the terminal device in response to the man-machine identification request from the terminal device, wherein the token is associated with the man-machine prediction probability of the terminal device.
11. A computer-readable recording medium having stored thereon a computer program which, when executed by a processing unit, implements the following steps:
receiving raw data collected by a terminal device in response to a man-machine identification request from the terminal device, wherein the raw data are various types of data collected on the terminal device in real time and comprise various behavior data collected by the terminal device at a front end, as well as local attribute data and basic environment data about the terminal device, wherein the various behavior data comprise behavior data about an operation behavior at the terminal device, the local attribute data comprise attribute data about the terminal device itself, and the operation behavior is different from a user operation behavior under a perceptive captcha-based scheme;
extracting a first number of features of multiple dimensions from the raw data, wherein the first number and the number of categories of the raw data are independent of each other;
inputting the first number of features of multiple dimensions into a second number of user behavior models, respectively, and outputting a second number of man-machine predictor probabilities from the second number of user behavior models, wherein the first number and the second number are independent of each other, and the second number of user behavior models are respectively trained on the same sample library based on different supervised classification algorithms;
determining a man-machine prediction probability based on an average of the second number of man-machine predictor probabilities;
storing the man-machine prediction probability in a database in association with the terminal device;
searching for and acquiring a plurality of historical man-machine prediction probabilities of the terminal device from the database;
inputting the current man-machine prediction probability and the plurality of historical man-machine prediction probabilities into a weighted model, and updating the man-machine prediction probability with the output of the weighted model, wherein the weight corresponding to the current man-machine prediction probability is the largest, and the weight corresponding to an earlier historical man-machine prediction probability is smaller; and
obtaining a man-machine identification result about the operation behavior at the terminal device based on the updated man-machine prediction probability.
CN201811248586.8A 2018-10-25 2018-10-25 Man-machine identification method, equipment and medium Active CN110162939B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811248586.8A CN110162939B (en) 2018-10-25 2018-10-25 Man-machine identification method, equipment and medium

Publications (2)

Publication Number Publication Date
CN110162939A CN110162939A (en) 2019-08-23
CN110162939B true CN110162939B (en) 2023-05-02

Family

ID=67645259

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811248586.8A Active CN110162939B (en) 2018-10-25 2018-10-25 Man-machine identification method, equipment and medium

Country Status (1)

Country Link
CN (1) CN110162939B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109784015B (en) * 2018-12-27 2023-05-12 腾讯科技(深圳)有限公司 Identity authentication method and device
CN111177668A (en) * 2019-11-21 2020-05-19 武汉极意网络科技有限公司 Man-machine interaction verification method based on mobile device sensor
CN113124636B (en) * 2019-12-31 2022-05-24 海信集团有限公司 Refrigerator
CN111428881B (en) * 2020-03-20 2021-12-07 深圳前海微众银行股份有限公司 Recognition model training method, device, equipment and readable storage medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104980421A (en) * 2014-10-15 2015-10-14 腾讯科技(深圳)有限公司 Method and system for processing batch requests
CN106155298A (en) * 2015-04-21 2016-11-23 阿里巴巴集团控股有限公司 Man-machine recognition methods and device, the acquisition method of behavior characteristics data and device
CN106997493A (en) * 2017-02-14 2017-08-01 云数信息科技(深圳)有限公司 Lottery user attrition prediction method and its system based on multi-dimensional data
CN108416198A (en) * 2018-02-06 2018-08-17 平安科技(深圳)有限公司 Man-machine identification model establishes device, method and computer readable storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Mouse Trajectory Recognition Method and Research Based on Gradient Boosting Decision Tree; Zhang Zhiteng et al.; Information & Communications; 2018-09-15 (No. 09); pp. 22-24 *

Also Published As

Publication number Publication date
CN110162939A (en) 2019-08-23


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant