WO2020164274A1

WO2020164274A1 - Network verification data sending method and apparatus, and storage medium and server

Info

Publication number: WO2020164274A1
Application number: PCT/CN2019/117939
Authority: WO
Inventors: 黎立桂
Original assignee: 平安科技（深圳）有限公司
Priority date: 2019-02-13
Filing date: 2019-11-13
Publication date: 2020-08-20
Also published as: CN109981567A

Abstract

Provided are a network verification data sending method and apparatus, and a storage medium and a server. The network verification data sending method comprises: receiving a verification request of a target user; acquiring environmental data of a terminal device of the target user; obtaining a risk coefficient of the terminal device of the target user according to the environmental data; if the risk coefficient is lower than a preset risk threshold, sending perception-free verification data to the target user; and if the risk coefficient is not lower than the preset risk threshold, sending intelligence verification data to the target user. In the present application, relatively simple perception-free verification data can be sent to a terminal device with a low risk coefficient, such that difficulty in verification is reduced for a normal user, and intelligence verification data is sent to a terminal device with a high risk coefficient, such that the capability of a server resisting a malicious attack is improved.

Description

Method, device, storage medium and server for sending network verification data

This application claims priority to Chinese patent application No. 201910115365.1, filed with the Chinese Patent Office on February 13, 2019 and entitled "Network Verification Data Transmission Method, Device, Storage Medium, and Server", the entire contents of which are incorporated herein by reference.

Technical field

This application relates to the field of computer technology. Specifically, this application relates to a method, device, storage medium, and server for sending network verification data.

Background art

When a user logs in or browses certain web pages, the app or web page client requires the user to complete behavior verification, in order to prevent machine operations from occupying server resources and to protect the server from malicious attacks. The verification confirms that the user currently operating is not a machine and thereby prevents malicious logins and malicious attacks. Existing network verification methods generally apply a single verification method to all users, regardless of whether the account is a machine account and regardless of the software and hardware environment from which the account logs in. To preserve the experience of non-machine users, the verification method should not be too complicated, but a simple verification method is easily cracked by users with a high risk coefficient and therefore fails to protect the rights and interests of normal users. Conversely, an overly complex verification method imposes long verification times and complicated operations on users with a low risk coefficient, resulting in a very poor user experience.

Summary of the invention

Aiming at the shortcomings of the existing methods, this application proposes a method, device, storage medium and server for sending network verification data to solve the problems existing in the prior art.

The method for sending network verification data includes:

Receiving a verification request from a target user;

Obtaining environmental data of the terminal device of the target user;

Obtaining the risk coefficient of the terminal device of the target user according to the environmental data;

If the risk coefficient is lower than the preset risk threshold, sending perception-free verification data to the target user; if the risk coefficient is not lower than the preset risk threshold, sending intelligence verification data to the target user.

This application also proposes a device for sending network verification data, which includes:

The verification request receiving module is used to receive the verification request of the target user;

The environmental data acquisition module is used to acquire environmental data of the terminal device of the target user;

The risk coefficient calculation module is used to obtain the risk coefficient of the terminal device of the target user according to the environmental data;

The verification data sending module is used to send perception-free verification data to the target user when the risk coefficient is lower than the preset risk threshold, and to send intelligence verification data to the target user when the risk coefficient is not lower than the preset risk threshold.

The present application also proposes a non-volatile computer-readable storage medium on which a computer program is stored, wherein, when the program is executed by a processor, the method for sending network verification data described in any one of the foregoing is implemented.

This application also proposes a server, which includes:

One or more processors;

A storage device for storing one or more programs;

When the one or more programs are executed by the one or more processors, the one or more processors implement the method for sending network verification data according to any one of the foregoing.

This application has the following beneficial effects:

This application sends relatively simple perception-free verification data to terminal devices with a low risk coefficient, which reduces the difficulty of verification for normal users and improves the user experience. At the same time, this application sends intelligence verification data to terminal devices with a high risk coefficient, which prevents machine users from launching malicious attacks on the server by means of verification requests, thereby filtering abnormal verification requests and improving the server's ability to resist malicious attacks.

The additional aspects and advantages of this application will be partly given in the following description, which will become obvious from the following description, or be understood through the practice of this application.

Description of the drawings

The above and/or additional aspects and advantages of the present application will become obvious and easy to understand from the following description of the embodiments in conjunction with the accompanying drawings, in which:

Figure 1 is a schematic flow chart of an embodiment of establishing a risk coefficient prediction model in this application;

Figure 2 is a schematic flowchart of a second embodiment of a method for sending network verification data according to this application;

Figure 3 is a schematic diagram of the module structure of an embodiment of a device for sending network verification data according to this application;

Figure 4 is a schematic structural diagram of an embodiment of a server of this application.

Detailed description

The embodiments of the present application are described in detail below, and examples of the embodiments are shown in the accompanying drawings.

Those skilled in the art can understand that the server used here includes, but is not limited to, a computer, a network host, a single network server, a set of multiple network servers, or a cloud composed of multiple servers. Here, the cloud is composed of a large number of computers or network servers based on Cloud Computing. Cloud computing is a type of distributed computing, a super virtual computer composed of a group of loosely coupled computer sets.

The solutions provided in the embodiments of this application can be applied to a web server or an application server. With these solutions, the server can determine the risk coefficient of the terminal environment of the user accessing the web page or application and, based on that risk coefficient, determine the user's verification method, thereby reducing interference with or attacks on the server by abnormal users. The solutions provided in the embodiments of the present application can be applied to various user verification scenarios, which is not limited in this application.

The technical solution provided by the embodiments of this application consists of two parts. The first part trains a classification model on the distinguishing feature set of the sample users to establish the risk coefficient prediction model. The second part uses the trained risk coefficient prediction model to determine how the risk coefficient of the target user compares with the preset risk threshold, in order to send the corresponding verification data.

The following describes the embodiments of the present application in detail according to the sequence of establishing a risk coefficient prediction model and sending verification data according to the risk coefficient prediction model.

Before performing the method for sending network verification data described in this application, a risk coefficient prediction model may be established first. As shown in FIG. 1, the establishment of the risk coefficient prediction model includes the following steps:

Step S10: Obtain sample data of a sample user, where the sample data includes environmental data of the sample user;

Step S20: Annotate the sample data to obtain a sample data annotation set, and generate a distinguishing feature set according to the sample data annotation set;

Step S30: establishing a classification model according to the distinguishing feature set and the Naive Bayes algorithm based on Gaussian distribution;

Step S40: Obtain a risk coefficient prediction model according to the classification model.

Among them, each step is explained as follows:

Step S10: Obtain sample data of a sample user, where the sample data includes environmental data of the sample user.

When the user's risk coefficient is to be predicted from environmental factors in the sample data, the sample data needs to include a variety of environmental data together with the risk coefficients corresponding to that environmental data, covering both high and low risk coefficients. To obtain environmental data corresponding to a high risk coefficient, an automated device may be used to simulate the abnormal behavior of an abnormal user, so as to obtain the verification request data corresponding to the abnormal behavior and the corresponding environmental data. In practical applications, real attacks by abnormal users can also be used as sample data to continuously enrich the diversity of the sample data. Most visits come from normal users. Therefore, the verification request data of normal users in practical applications and the corresponding environmental data can be used as sample data, and verification requests of normal users can also be simulated by automated equipment to obtain a variety of normal sample data. When the environmental data in the sample data is difficult to obtain, the environmental data of the terminal device can be obtained through the crawler algorithm. Therefore, in some embodiments of the present application, obtaining sample data of sample users may include:

Obtain sample data of sample users through verification requests sent by sample users, automated verification equipment and crawler algorithms.

In some embodiments of the present application, abnormal sample data can be obtained according to crawler algorithms and automated verification equipment, and data corresponding to a normal verification request of the user can be regarded as normal sample data. The sample data includes environmental data and request data of the sample user, and the crawler algorithm can be run on the terminal device through a webpage plug-in or an application program to send the environmental data of the terminal to the server. The request data may include a user registration request and/or a verification request, so as to trigger the verification request when the user registers or verifies the identity. The environmental data can be obtained through crawler algorithms, automated verification equipment, and normal verification.

In another embodiment of the present application, the acquiring sample data of the sample user through the verification request sent by the sample user, the automated verification device, and the crawler algorithm includes:

Acquiring the normal request data of the sample user, obtaining the environmental data of the sample user according to the normal request data, and using the normal request data and the environmental data as normal sample data;

The automated verification device simulates the verification request of the abnormal sample user to obtain the sample request data corresponding to the abnormal sample user; the environmental data of the automated verification device is obtained through the crawler algorithm as the environmental data corresponding to the abnormal sample user; and the sample request data and environmental data of the abnormal sample user are used as abnormal sample data.

The environmental data mentioned in this application may include one or more of: device type, hardware model data, pixel ratio data, color depth data, screen resolution data, available screen resolution data, operating system data, time data, touch data, specific plug-in word count, software installation data, browser data, and stack fingerprint data. The device types include terminal types such as Android phones, tablet computers, and computers. The operating system data includes IOS type, version, resolution, endpoint IP address, and so on. The sample data obtained may include some sample data of uncertain origin, and such data may be excluded from the training sample data.

Step S20: Annotate the sample data to obtain a sample data annotation set, and generate a distinguishing feature set according to the sample data annotation set.

When the true values of the sample data are known, the sample data obtained through the multiple channels can be labeled according to the known information to obtain an accurate sample data label set. The sample data label set may include a normal sample data label set and an abnormal sample data label set. When labeling, whether the sample data is normal or abnormal can be determined according to whether the actual data of the known terminal device is consistent with the sample data obtained through the verification request or the like. For example, if parsing the user_agent of the terminal device indicates that the device is a mobile phone while the actual terminal device is a computer, the sample data is marked as abnormal sample data. Likewise, if the terminal device reports that it does not support multi-touch but a JavaScript script finds that the real terminal device supports multi-touch, the sample data is marked as abnormal sample data.
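The consistency check described above can be sketched as follows. This is a minimal illustration only: the `Sample` record and its field names are hypothetical and do not appear in this application, which names only the kinds of values being compared.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    # Hypothetical fields, introduced for illustration only.
    claimed_device: str        # device type parsed from the user_agent
    actual_device: str         # known device type of the terminal
    claimed_multi_touch: bool  # multi-touch support reported by the terminal
    actual_multi_touch: bool   # multi-touch support detected by a script


def label_by_consistency(sample: Sample) -> str:
    """Return 'abnormal' if a reported value contradicts the known value."""
    if sample.claimed_device != sample.actual_device:
        return "abnormal"
    if sample.claimed_multi_touch != sample.actual_multi_touch:
        return "abnormal"
    return "normal"
```

A sample whose user_agent claims a mobile phone while the known device is a computer would be labeled abnormal by this sketch, matching the first example in the text.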

In some labeling methods, the sample data can also be labeled according to its similarity to normal sample data obtained through normal verification: sample data with high similarity to the normal sample data is labeled as normal sample data, and sample data with low similarity is labeled as abnormal sample data. For example, the sample data can be labeled according to the access frequency of the terminal device within 1 minute or the access frequency of a fixed IP address within 1 minute: if either frequency exceeds 10 visits, the sample data is marked as abnormal sample data. As another example, labeling can be based on the number of operating system types, device types, IP address types, and other factors associated with one user: if a user is associated with many operating system types, device types, or IP address types, the sample data is marked as abnormal sample data.
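The frequency rule in the example above (more than 10 visits within 1 minute) can be sketched with a sliding window. The function name, the input shape, and the use of a sliding rather than a fixed calendar window are assumptions made for illustration; the application does not specify how the 1-minute interval is measured.

```python
from collections import defaultdict

WINDOW_SECONDS = 60  # "within 1 minute" in the text
MAX_VISITS = 10      # more than 10 visits marks the sample abnormal


def label_by_frequency(requests):
    """Label each key (a device id or a fixed IP address) by visit frequency.

    requests: iterable of (key, timestamp_in_seconds) pairs.
    Returns a dict mapping key -> 'normal' or 'abnormal'.
    """
    by_key = defaultdict(list)
    for key, ts in requests:
        by_key[key].append(ts)

    labels = {}
    for key, times in by_key.items():
        times.sort()
        start = 0
        abnormal = False
        for end, t in enumerate(times):
            # Shrink the window until it spans less than WINDOW_SECONDS.
            while t - times[start] >= WINDOW_SECONDS:
                start += 1
            if end - start + 1 > MAX_VISITS:
                abnormal = True
                break
        labels[key] = "abnormal" if abnormal else "normal"
    return labels
```

Both constants are taken from the example in the text and would normally be tuned to the traffic of the service being protected.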

After the sample data label set is obtained, a distinguishing feature set is generated according to the sample data label set. The features in the distinguishing feature set are all determined from the environmental data and may include one or more of the following: browser language, terminal device pixel ratio, terminal device color depth, X-direction screen resolution, Y-direction screen resolution, X-direction available screen resolution, Y-direction available screen resolution, the product of the X-direction and Y-direction screen resolutions (resolution_multi), the product of the X-direction and Y-direction available screen resolutions (available_resolution_multi), the difference between these two products, and so on. The difference between the two products can be used to generate a resolution-based nonlinear combination feature and to explore the normal range of the products. In particular, when the resolution of the terminal device is very low, the difference can be used to determine whether the terminal device is a low-end device, so that a low-end crawler algorithm can be configured for it. The distinguishing feature set may also include one or more of the following: hour-of-day features on a 24-hour basis (for example, sample data between 2 AM and 5 AM may be abnormal sample data), month features, holiday features (Spring Festival, National Day holidays, etc.), weekday features, week features, the number of touch points of the terminal device, whether the terminal device supports touch control, whether the touch points of the terminal device are consistent with the operating system, whether the number of touch points of the terminal device is consistent with whether it supports touch control, and so on.

The consistency between the number of touch points and touch support can be used to detect a change of operating system based on the number of touch points. In addition, terminal devices that do not support touch operation are often used for crawler operations, whereas the terminal devices used by normal users generally support touch, so whether the terminal device supports touch operation can serve as one of the features for judging the risk coefficient.
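Several of the features listed above are simple derivations from raw environmental data. The sketch below shows one possible way to compute them. The dictionary keys (`resolution`, `available_resolution`, `max_touch_points`, `supports_touch`) and the output names other than resolution_multi and available_resolution_multi are hypothetical.

```python
from datetime import datetime


def derive_features(env: dict, requested_at: datetime) -> dict:
    """Derive resolution, time, and touch features from environmental data."""
    res_x, res_y = env["resolution"]
    avail_x, avail_y = env["available_resolution"]
    resolution_multi = res_x * res_y
    available_resolution_multi = avail_x * avail_y
    return {
        "resolution_multi": resolution_multi,
        "available_resolution_multi": available_resolution_multi,
        # Difference between the two products (nonlinear combination feature).
        "resolution_multi_diff": resolution_multi - available_resolution_multi,
        "hour": requested_at.hour,
        "month": requested_at.month,
        "weekday": requested_at.weekday(),  # 0 = Monday
        # The text singles out 2 AM to 5 AM as a suspicious period.
        "is_small_hours": 2 <= requested_at.hour < 5,
        # True when the touch-point count agrees with reported touch support.
        "touch_consistent": (env["max_touch_points"] > 0) == env["supports_touch"],
    }
```

Holiday and week features are omitted here because they depend on a calendar source that the application does not specify.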

The distinguishing feature set may also include one or more of the following features: whether the type of CPU is known, whether the browser plug-in is missing, whether the font list detected using JS/CSS is missing, whether the operating system is unknown, whether WebGL supplier information is missing, whether the browser type is robot, whether the browser manufacturer is other, whether the operating system manufacturer is other, the total number of browser plug-ins, the total number of fonts detected using JS/CSS, whether the operating system and the system platform are consistent, whether the audio stack fingerprint is provided, the parameter information of the audio stack fingerprint, the total number of logical processors made available to the user agent by the system, whether AdBlock is installed, whether the user has tampered with the language, whether the user has tampered with the screen resolution, the user's operating system, browser manufacturer, operating system manufacturer, and access device type, and whether the operating system and system platform are the same. Whether WebGL supplier information is missing can be used to determine whether the user's use of the device is abnormal, and the user's validity can then be graded according to the degree to which supplier information is missing: for example, high, medium, and low degrees of missing information correspond to low, medium, and high user validity respectively. Whether the browser type is robot can be used to judge whether the terminal device is abnormal according to the degree of tampering of the browser, or to judge from the browser type whether the behavior is that of a crawler-type browser.

The total number of fonts detected using JS/CSS can be used to determine whether there are specific plug-ins that help crawlers parse web page data, and to quantitatively determine from the total number of fonts whether the system or browser has been tampered with by crawlers. Whether the operating system and the system platform are consistent can be used to detect whether the system of the terminal device has been tampered with, and to confirm the consistency of the operating system and the system platform when labeling sample data.

Therefore, this application also proposes an embodiment for establishing a risk coefficient prediction model in which the environmental data of the sample user includes one or more of: device type, hardware model data, pixel ratio data, color depth data, screen resolution data, available screen resolution data, operating system data, time data, touch data, specific plug-in word count, software installation data, browser data, and stack fingerprint data, so that the features in the distinguishing feature set are determined from the environmental data.

For the numerical features in the aforementioned distinguishing feature set, this application can also calculate the range, the quartiles, the interquartile range, and the five-number summary of each feature (in order: minimum, lower quartile, median, upper quartile, maximum) to obtain derived features for identifying outliers. If the outliers in the sample data are the abnormal sample data being sought, the sample data corresponding to the outliers can be regarded as abnormal sample data; if the sample data corresponding to the outliers needs to be filtered out, it is removed from the sample data. Therefore, in another specific embodiment of the present application, the operating system data includes the total number of logical processors available to the user, the font data includes the total number of fonts of the terminal device, the available screen resolution data includes the product of the available screen resolutions of the terminal device in the X direction and the Y direction, and the browser data includes the total number of plug-ins installed in the browser;

Before generating a distinguishing feature set based on the sample data label set, the method further includes:

Calculating the range, quartiles, interquartile range, and five-number summary of each of: the total number of logical processors available to the user, the total number of fonts of the terminal device, the product of the available screen resolutions, and the total number of plug-ins;

Obtaining derived features for identifying outliers based on the range, quartiles, interquartile range, and five-number summary;

The sample data annotation set is filtered according to the derived features.

This embodiment filters the sample data annotation set, thereby improving the accuracy of the risk coefficient prediction model. Features that do not have numeric values can be mapped to specific numeric features. For Boolean features, since there are only two values, the range, quartiles, interquartile range, and five-number summary are not calculated.
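The statistics named above can be computed with the Python standard library, as in the minimal sketch below. The application does not state which rule turns these statistics into an outlier decision. The sketch assumes the common 1.5 × interquartile-range fence, and the parameter `k` is introduced only for illustration.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


def outlier_flags(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed fence rule)."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1  # interquartile range
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v < lower or v > upper for v in values]


def value_range(values):
    """Range of a feature: maximum minus minimum."""
    return max(values) - min(values)
```

Applied to a numeric feature such as the total number of fonts, the flags could either mark samples as abnormal or remove them from the annotation set, depending on which of the two uses described in the text is intended.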

The Naive Bayes algorithm based on Gaussian distribution is selected as the supervised classification model for abnormal verification requests. The Bayesian classification algorithm can learn the characteristics of positive and negative samples at the same time to obtain a comprehensive estimate of the probability that each piece of sample data is abnormal. The Naive Bayes algorithm based on Gaussian distribution is an existing probability model. It assumes that the original data follows a Gaussian distribution and that the influence of one feature value on a given class is independent of the values of the other features, learns the distribution of each feature under these assumptions, and computes the posterior probability that an instance belongs to a specific class from the prior probability and the class-conditional likelihoods. The mathematical expressions of the algorithm are well-known technology and will not be repeated here.
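For readers unfamiliar with the algorithm, a minimal standard-library sketch of Gaussian Naive Bayes follows. It is a generic textbook formulation, not the implementation used in this application. The class name, the `var_floor` parameter, and the label strings in the usage note are illustrative assumptions.

```python
import math
from collections import defaultdict


class GaussianNaiveBayes:
    """Minimal Gaussian Naive Bayes for numeric feature vectors."""

    def fit(self, rows, labels, var_floor=1e-9):
        grouped = defaultdict(list)
        for row, label in zip(rows, labels):
            grouped[label].append(row)
        self.priors, self.stats = {}, {}
        for label, members in grouped.items():
            self.priors[label] = len(members) / len(rows)
            per_feature = []
            for column in zip(*members):
                mean = sum(column) / len(column)
                var = sum((x - mean) ** 2 for x in column) / len(column)
                # Floor the variance so constant features do not divide by zero.
                per_feature.append((mean, max(var, var_floor)))
            self.stats[label] = per_feature
        return self

    def predict_proba(self, row):
        """Return {label: posterior probability} for one feature vector."""
        log_scores = {}
        for label, per_feature in self.stats.items():
            score = math.log(self.priors[label])
            for x, (mean, var) in zip(row, per_feature):
                # Log of the Gaussian density; features are assumed independent.
                score += -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
            log_scores[label] = score
        peak = max(log_scores.values())
        weights = {k: math.exp(v - peak) for k, v in log_scores.items()}
        total = sum(weights.values())
        return {k: w / total for k, w in weights.items()}
```

After `fit` is called on labeled feature vectors, `predict_proba(row)["abnormal"]` gives the probability that a sample is abnormal, assuming "abnormal" was one of the training labels. Working in log space avoids numerical underflow when many features are multiplied together.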

According to the classification model, the probability that each piece of sample data is abnormal can be obtained, and a mapping function or correspondence between the probability and the risk coefficient can then be set to obtain the risk coefficient prediction model. For example, multiplying the probability by a preset threshold ε yields a risk coefficient that is positively or negatively correlated with the probability. In some embodiments, the selected threshold ε can be a fixed coefficient, or it can be a step coefficient, which yields stepped risk coefficients corresponding to different numerical ranges.
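The mapping from probability to risk coefficient can be sketched in both variants. The text does not define the step coefficient precisely, so the stepped version below is only one possible reading, in which ε depends on the range that the probability falls into. The values in `STEPS` and the default ε are hypothetical.

```python
def risk_fixed(p_abnormal: float, epsilon: float = 100.0) -> float:
    """Risk coefficient as probability times a fixed coefficient epsilon."""
    return p_abnormal * epsilon


# Hypothetical steps: (upper bound of probability range, epsilon for that range).
STEPS = [(0.4, 1.0), (0.6, 2.0), (1.0, 3.0)]


def risk_stepped(p_abnormal: float, steps=STEPS) -> float:
    """Risk coefficient using an epsilon chosen by probability range."""
    for upper, epsilon in steps:
        if p_abnormal <= upper:
            return p_abnormal * epsilon
    raise ValueError("probability must not exceed the last upper bound")
```

A negative ε in either variant would give the negatively correlated coefficient mentioned in the text.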

In this application, establishing the risk coefficient prediction model makes it possible to obtain the risk coefficients corresponding to different environmental data, so that the server can determine which verification picture data to send according to the risk coefficient. The verification pictures may include perception-free verification pictures suitable for low risk coefficients, such as static pictures with numbers or text, and may also include intelligence verification pictures suitable for high risk coefficients that require user participation, such as sliding puzzles, intelligence questions and answers, etc.

For example, the risk coefficient prediction model established in this application can detect attack-like verification requests that differ from normal verification in time of day and request frequency. As a further example, under normal circumstances users rarely send verification requests frequently between 2 AM and 5 AM, and users cannot dynamically switch IP addresses frequently within a short period of time. Therefore, the risk coefficient prediction model of this application can quickly mark the risk coefficient of the terminal device as high according to the time characteristics of the verification request, so that verification data suited to a high risk coefficient is sent to the user.

In an embodiment of the present application for establishing a risk coefficient prediction model, the acquisition of sample data of the sample user through the verification request sent by the sample user, automated verification equipment, and crawler algorithm includes:

The automated verification device simulates the verification request of the abnormal sample user to obtain the sample request data corresponding to the abnormal sample user; the environmental data of the automated verification device is obtained through the crawler algorithm as the environmental data corresponding to the abnormal sample user; and the sample request data and environmental data of the abnormal sample user are used as abnormal sample data;

The labeling of the sample data to obtain a sample data label set and the generating of a distinguishing feature set according to the sample data label set include:

Respectively label the normal sample data and abnormal sample data to obtain a normal sample data label set and an abnormal sample data label set;

A distinguishing feature set is generated according to the normal sample data label set and the abnormal sample data label set.

In this embodiment, an automated verification device is used to simulate the verification requests of abnormal sample users, so a large amount of sample data can be obtained quickly, which speeds up the establishment of the risk coefficient prediction model. Normal request data refers to data whose verification request is normal: if a normal sample user sends an abnormal verification request, that verification request is not normal request data. An abnormal sample user is a sample user with a high risk coefficient, whose verification requests may be normal or abnormal. The risk coefficient prediction model of this application can distinguish normal verification requests from abnormal verification requests, that is, determine whether the risk coefficient of the abnormal sample user is low or high.

This application proposes a method for sending network verification data, as shown in Figure 2, which balances the complexity of the user verification method against the user experience and includes the following steps:

Step S1: Receive a verification request from the target user;

Step S2: Obtain environmental data of the target user terminal device;

Step S3: Obtain the risk coefficient of the target user terminal device according to the environmental data;

Step S4: If the risk coefficient is lower than the preset risk threshold, send perception-free verification data to the target user; if the risk coefficient is not lower than the preset risk threshold, send intelligence verification data to the target user.

When receiving the verification request of the target user, the server obtains the environmental data of the target user's terminal device. The environmental data can be obtained through a plug-in program or a crawler algorithm, or can be sent to the server together with the verification request. The server calculates the risk coefficient of the target user's terminal device according to the environmental data, and then sends the corresponding verification data to the terminal device according to the level of the risk coefficient. This application sends relatively simple perception-free verification data to terminal devices with a low risk coefficient, which reduces the difficulty of verification for normal users and improves the user experience. At the same time, this application sends intelligence verification data to terminal devices with a high risk coefficient, which prevents machine users from launching malicious attacks on the server through verification requests, thereby filtering abnormal verification requests and improving the server's ability to resist malicious attacks.
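Steps S1 to S4 can be sketched as a single server-side function. The function name, the `risk_model` callable, and the returned string identifiers are hypothetical. How the environmental data reaches the server (plug-in, crawler algorithm, or bundled with the request) is left outside the sketch.

```python
from typing import Callable, Mapping


def select_verification(
    env: Mapping,
    risk_model: Callable[[Mapping], float],
    risk_threshold: float,
) -> str:
    """Choose the verification data type for one verification request.

    env: environmental data of the target user's terminal device (S2).
    risk_model: any callable that maps environmental data to a risk coefficient.
    """
    risk = risk_model(env)          # S3: obtain the risk coefficient
    if risk < risk_threshold:       # S4: lower than the threshold
        return "perception_free"
    return "intelligence"           # S4: not lower than the threshold
```

Note that a risk coefficient exactly equal to the threshold falls into the intelligence verification branch, which follows the "not lower than" wording of step S4.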

In another embodiment of the method for sending network verification data of the present application, if the risk coefficient is not lower than the preset risk threshold, sending intelligence verification data to the target user includes:

If the risk coefficient is not lower than the preset risk threshold and not higher than the preset high risk threshold, sending the first intelligence verification data to the target user; the preset high risk threshold is greater than the preset risk threshold;

If the risk coefficient is higher than the preset high-risk threshold, send the second intelligence verification data to the target user.

In this embodiment, the risk coefficient is divided into three levels, and the first intelligence verification data and the second intelligence verification data are sent for the two higher levels respectively. For example, the first intelligence verification data is sent to a terminal device with a medium risk coefficient, and the second intelligence verification data is sent to a terminal device with a high risk coefficient, thereby further refining the correspondence between the environmental data and the verification data and further optimizing the user experience.
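The three-level variant can be sketched as follows. The function name and the returned identifiers are hypothetical. The comparison operators follow the wording of this embodiment ("not lower than", "not higher than", "higher than").

```python
def select_verification_tiered(
    risk: float, risk_threshold: float, high_risk_threshold: float
) -> str:
    """Map a risk coefficient to one of three kinds of verification data."""
    if high_risk_threshold <= risk_threshold:
        # The embodiment requires the high-risk threshold to be the greater one.
        raise ValueError("high_risk_threshold must exceed risk_threshold")
    if risk < risk_threshold:
        return "perception_free"
    if risk <= high_risk_threshold:
        return "first_intelligence"
    return "second_intelligence"
```

With thresholds of 0.4 and 0.6, for instance, a risk coefficient of exactly 0.6 still receives the first intelligence verification data, and only values above 0.6 receive the second.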

In another embodiment of the method for sending network verification data of the present application, the obtaining the risk coefficient of the target user terminal device according to the environmental data includes:

Input the environmental data into the risk coefficient prediction model to obtain the risk coefficient of the target user terminal device;

The risk coefficient prediction model is established by the following methods:

Acquiring sample data of a sample user, where the sample data includes environmental data of the sample user;

Label the sample data to obtain a sample data label set;

Generating a distinguishing feature set according to the sample data label set;

Establishing a classification model according to the distinguishing feature set and the Naive Bayes algorithm based on Gaussian distribution;

According to the classification model, a risk coefficient prediction model is obtained.

The risk coefficient prediction model in this embodiment can be established on the server or imported from other devices or networks. The model is therefore portable, which widens the scope of application of the method for sending network verification data.
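To make the model-building steps above more concrete, the following is a minimal sketch of a Naive Bayes classifier with Gaussian likelihoods, written in plain Python. It is not the applicant's implementation. The class name `GaussianNaiveBayes`, the two toy features (plug-in count and font count), the invented training rows, the small variance-smoothing constant, and the choice to use the posterior probability of the "abnormal" class as the risk coefficient are all assumptions made for illustration.

```python
import math
from collections import defaultdict


class GaussianNaiveBayes:
    """Naive Bayes with one Gaussian per feature and per class."""

    def fit(self, X, y):
        rows_by_class = defaultdict(list)
        for row, label in zip(X, y):
            rows_by_class[label].append(row)
        self.priors = {}
        self.stats = {}
        for label, rows in rows_by_class.items():
            self.priors[label] = len(rows) / len(X)
            per_feature = []
            for column in zip(*rows):
                mean = sum(column) / len(column)
                # Population variance plus a small constant (an assumption)
                # so that a constant feature does not give zero variance.
                var = sum((v - mean) ** 2 for v in column) / len(column) + 1e-9
                per_feature.append((mean, var))
            self.stats[label] = per_feature
        return self

    def _log_joint(self, row, label):
        # log P(class) + sum of log Gaussian densities of each feature.
        total = math.log(self.priors[label])
        for value, (mean, var) in zip(row, self.stats[label]):
            total += -0.5 * math.log(2 * math.pi * var)
            total -= (value - mean) ** 2 / (2 * var)
        return total

    def predict_proba(self, row):
        logs = {label: self._log_joint(row, label) for label in self.priors}
        peak = max(logs.values())  # subtract the maximum for numerical stability
        weights = {label: math.exp(v - peak) for label, v in logs.items()}
        norm = sum(weights.values())
        return {label: w / norm for label, w in weights.items()}


# Toy labeled samples, invented for illustration: [plug-in count, font count].
X = [[5, 120], [3, 90], [4, 110], [0, 0], [0, 2], [1, 1]]
y = ["normal", "normal", "normal", "abnormal", "abnormal", "abnormal"]
model = GaussianNaiveBayes().fit(X, y)


def risk_coefficient(features):
    """Assumed mapping: risk coefficient = posterior probability of 'abnormal'."""
    return model.predict_proba(features)["abnormal"]
```

In practice the distinguishing feature set would be far larger than two columns, and a library implementation of Gaussian Naive Bayes could replace the hand-written class.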

Further, sending the perception-free verification data to the target user if the risk coefficient is lower than the preset risk threshold, and sending the intelligence verification data to the target user if the risk coefficient is not lower than the preset risk threshold, can specifically be implemented in the following ways:

Setting a risk threshold: for example, the risk threshold is set to 50%. When the probability value output by the classification model is greater than 50%, the request is treated as abnormal; when it is less than 50%, the request is treated as normal. The model can also directly output 1 or -1, where 1 means normal and -1 means abnormal. Different verification data is sent according to the normal or abnormal result.

Setting risk levels: for example, a probability value output by the classification model of 0-40% corresponds to the low-risk L1 level, 40%-60% corresponds to the medium-risk L2 level, and 60%-100% corresponds to the high-risk L3 level. Different verification data is sent according to the level. For example, the intelligent perception-free verification method can be loaded for the L1 level; for the L2 level, the sliding puzzle verification method or the verification method of recognizing text in a background picture can be loaded at random; and the voice verification method can be loaded for the L3 level.
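The two approaches above (a single threshold with a 1/-1 result, and L1/L2/L3 levels mapped to verification methods) can be sketched as follows. The function names, the method labels, and the handling of the exact boundary values are assumptions: the text does not say which side 40%, 50%, or 60% falls on, so this sketch treats a value equal to the 50% threshold as abnormal and puts 40% and 60% in L2, in line with the "not lower than" and "not higher than" wording used elsewhere in the application.

```python
import random

# Hypothetical labels for the verification methods named in the text.
METHODS_BY_LEVEL = {
    "L1": ["perception_free"],
    "L2": ["sliding_puzzle", "text_in_background_picture"],
    "L3": ["voice"],
}


def binary_label(probability: float, threshold: float = 0.5) -> int:
    """Return 1 for normal and -1 for abnormal."""
    return -1 if probability >= threshold else 1


def risk_level(probability: float) -> str:
    """Map the model's probability output to a risk level."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability < 0.40:
        return "L1"
    if probability <= 0.60:
        return "L2"
    return "L3"


def choose_method(level: str, rng=None) -> str:
    """Pick a verification method for the level; L2 is chosen at random."""
    rng = rng or random.Random()
    return rng.choice(METHODS_BY_LEVEL[level])


# Example: a probability of 0.75 is labeled abnormal and mapped to L3.
label = binary_label(0.75)
method = choose_method(risk_level(0.75))
```

Passing a seeded `random.Random` as `rng` makes the L2 choice reproducible, which can be convenient when testing.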

This application also proposes a device for sending network verification data. As shown in FIG. 3, the device includes:

The verification request receiving module 1 is used to receive the verification request of the target user;

The environmental data acquisition module 2 is used to acquire environmental data of the target user terminal device;

The risk coefficient calculation module 3 is used to obtain the risk coefficient of the target user terminal device according to the environmental data;

The verification data sending module 4 is configured to send perception-free verification data to the target user when the risk coefficient is lower than the preset risk threshold, and to send intelligence verification data to the target user when the risk coefficient is not lower than the preset risk threshold.
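One way to picture how the four modules cooperate is the short sketch below. It is an illustration under stated assumptions, not the structure shown in FIG. 3: the class name `VerificationSender`, the dictionary-shaped request, the `"env"` and `"plugins"` keys, and the stand-in risk function are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Mapping

EnvData = Mapping[str, object]


@dataclass
class VerificationSender:
    # Role of module 2: obtains environmental data for a request.
    acquire_environment: Callable[[dict], EnvData]
    # Role of module 3: turns environmental data into a risk coefficient.
    compute_risk: Callable[[EnvData], float]
    # Preset risk threshold used by module 4 (value is an assumption).
    risk_threshold: float = 0.5

    def handle(self, request: dict) -> str:
        """Role of module 1: entry point for a verification request."""
        env = self.acquire_environment(request)
        risk = self.compute_risk(env)
        # Role of module 4: choose which verification data to send.
        if risk < self.risk_threshold:
            return "perception_free"
        return "intelligence"


# Stand-in components, invented for illustration only.
sender = VerificationSender(
    acquire_environment=lambda request: request.get("env", {}),
    compute_risk=lambda env: 0.9 if env.get("plugins", 0) == 0 else 0.1,
)
result_normal = sender.handle({"env": {"plugins": 3}})
result_suspicious = sender.handle({})
```

Injecting the acquisition and risk functions keeps the sending logic independent of how the environmental data is collected or how the prediction model is built.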

The embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the method for sending network verification data described in any one of the above is implemented. The storage medium includes, but is not limited to, any type of disk (including floppy disk, hard disk, optical disk, CD-ROM, and magneto-optical disk), ROM (Read-Only Memory), RAM (Random Access Memory), EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory, magnetic card, or optical card. That is, the storage medium includes any medium in which information is stored or transmitted in a form readable by a device (for example, a computer). It can be a read-only memory, a magnetic disk, an optical disk, or the like.

The embodiment of the present application also provides a server, and the server includes:

One or more processors;

Storage device for storing one or more programs,

When the one or more programs are executed by the one or more processors, the one or more processors implement the method for sending network verification data according to any one of the foregoing.

FIG. 4 is a schematic diagram of the structure of the server of this application, including a processor 320, a storage device 330, an input unit 340, a display unit 350 and other devices. Those skilled in the art can understand that the structural components shown in FIG. 4 do not constitute a limitation on all servers, and a server may include more or fewer components than those shown in the figure, or combine certain components. The storage device 330 may be used to store the application program 310 and various functional modules. The processor 320 runs the application program 310 stored in the storage device 330 to execute various functional applications and data processing of the device. The storage device 330 may be an internal memory or an external memory, or include both internal memory and external memory. The internal memory may include read-only memory, programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), flash memory, or random access memory. External storage can include hard disks, floppy disks, ZIP disks, U disks, tapes, etc. The storage devices disclosed in this application include but are not limited to these types of storage devices. The storage device 330 disclosed in this application is merely an example and not a limitation.

The input unit 340 is used for receiving signal input, and receiving user attribute information of the target user on the first statistical date and access information to the specified target. The input unit 340 may include a touch panel and other input devices. The touch panel can collect the user's touch operations on or near it (for example, operations performed on or near the touch panel with any suitable object or accessory such as a finger or a stylus) and drive the corresponding connection device according to a preset program. Other input devices may include, but are not limited to, one or more of a physical keyboard, function keys (such as playback control buttons and switch buttons), a trackball, a mouse, and a joystick. The display unit 350 may be used to display information input by the user or information provided to the user and various menus of the computer device. The display unit 350 may take the form of a liquid crystal display, an organic light-emitting diode display, or the like. The processor 320 is the control center of the computer device. It uses various interfaces and lines to connect the various parts of the entire computer, runs or executes the software programs and/or modules stored in the storage device 330, and calls data stored in the storage device to perform various functions and process data.

In an embodiment, the server includes one or more processors 320, one or more storage devices 330, and one or more application programs 310, wherein the one or more application programs 310 are stored in the storage device 330 and are configured to be executed by the one or more processors 320, and the one or more application programs 310 are configured to execute the method for sending network verification data described in the above embodiments.

It should be understood that, although the steps in the flowcharts of the drawings are shown in sequence as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, the execution of these steps is not strictly limited in order, and they can be executed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages. These sub-steps or stages are not necessarily executed at the same time and can be executed at different times; nor are they necessarily executed sequentially, and they may be executed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

It should be understood that the functional units in the embodiments of the present application may be integrated into one processing module, or each unit may exist alone physically, or two or more units may be integrated into one module. The above-mentioned integrated modules can be implemented in the form of hardware or software functional modules.

The above are only some of the implementations of this application. It should be pointed out that those of ordinary skill in the art can make several improvements and modifications without departing from the principle of this application, and these improvements and modifications should also be regarded as falling within the scope of protection of this application.

Claims

A method for sending network verification data is characterized in that it comprises the steps:

Receive verification requests from target users;

Obtain environmental data of the target user terminal device;

According to the environmental data, obtain the risk coefficient of the target user terminal device;

If the risk coefficient is lower than the preset risk threshold, send perception-free verification data to the target user; if the risk coefficient is not lower than the preset risk threshold, send intelligence verification data to the target user.
The method according to claim 1, wherein if the risk coefficient is not lower than the preset risk threshold, sending intelligence verification data to the target user comprises:

If the risk coefficient is not lower than the preset risk threshold and not higher than the preset high risk threshold, sending the first intelligence verification data to the target user; the preset high risk threshold is greater than the preset risk threshold;

If the risk coefficient is higher than the preset high-risk threshold, send the second intelligence verification data to the target user.
The method according to claim 1, characterized in that, before obtaining the risk coefficient of the target user terminal device according to the environmental data, the method further comprises establishing a risk coefficient prediction model, and establishing the risk coefficient prediction model comprises the following steps:

Acquiring sample data of a sample user, where the sample data includes environmental data of the sample user;

Label the sample data to obtain a sample data label set, and generate a distinguishing feature set according to the sample data label set;

Establishing a classification model according to the distinguishing feature set and the Naive Bayes algorithm based on Gaussian distribution;

According to the classification model, a risk coefficient prediction model is obtained;

The obtaining the risk coefficient of the target user terminal device according to the environmental data includes:

The environmental data is input into the risk coefficient prediction model to obtain the risk coefficient of the target user terminal device.
The method according to claim 3, wherein said acquiring sample data of sample users comprises:

Acquiring the normal request data of the sample user, obtaining the environmental data of the sample user according to the normal request data, and using the normal request data and the environmental data as normal sample data;

An automated verification device simulates the verification request of an abnormal sample user to obtain the sample request data corresponding to the abnormal sample user; the environmental data of the automated verification device is obtained through a crawler algorithm as the environmental data corresponding to the abnormal sample user; and the sample request data and environmental data of the abnormal sample user are used as abnormal sample data;

Labeling the sample data to obtain a sample data label set and generating a distinguishing feature set according to the sample data label set includes:

Respectively label the normal sample data and abnormal sample data to obtain a normal sample data label set and an abnormal sample data label set;

A distinguishing feature set is generated according to the normal sample data label set and the abnormal sample data label set.
The method according to claim 3, wherein the environmental data of the sample user includes one or more of: device type, hardware model data, pixel ratio data, color depth data, screen resolution data, available screen resolution data, operating system data, time data, touch data, specific plug-in words, software installation data, browser data, and stack fingerprint data.
The method according to claim 5, wherein the operating system data includes the total number of logical processors available to the user, the font data includes the total number of fonts of the terminal device, the available screen resolution data includes the product of the available screen resolutions of the terminal device in the X direction and the Y direction, and the browser data includes the total number of plug-ins installed in the browser;

Before generating a distinguishing feature set based on the sample data label set, the method further includes:

Calculate the range, quartiles, interquartile range, and five-number summary of the total number of logical processors available to the user, the total number of fonts of the terminal device, the product of the available screen resolutions, and the total number of plug-ins;

Based on the range, quartiles, interquartile range, and five-number summary, obtain derived features for identifying outliers;

Filter the sample data label set according to the derived features.
The method according to claim 3, wherein the features in the distinguishing feature set include any one or more of the following: browser language, pixel ratio of the terminal device, color depth of the terminal device, X-direction screen resolution, Y-direction screen resolution, X-direction available screen resolution, Y-direction available screen resolution, the first product of the X-direction and Y-direction screen resolutions, the second product of the X-direction and Y-direction available screen resolutions, the difference between the first product and the second product, whether the CPU type is known, whether the browser plug-in is missing, whether the font list detected by JS/CSS is missing, whether the operating system is unknown, whether the WebGL vendor information is missing, whether the browser type is robot, whether the browser manufacturer is other, whether the operating system manufacturer is other, the total number of browser plug-ins, the browser plug-ins, the total number of fonts detected using JS/CSS, whether the operating system and system platform are consistent, whether the audio stack fingerprint is provided, the parameter information of the audio stack fingerprint, the total number of logical processors made available to the user agent by the system, whether AdBlock is installed, whether the user has tampered with the language, whether the user has tampered with the screen resolution, whether the user has tampered with the operating system, and whether the browser manufacturer, operating system manufacturer, access device type, operating system and system platform are consistent.
A device for sending network verification data, characterized in that it comprises:

The verification request receiving module is used to receive the verification request of the target user;

The environmental data acquisition module is used to acquire environmental data of the target user terminal device;

The risk coefficient calculation module is used to obtain the risk coefficient of the target user terminal device according to the environmental data;

The verification data sending module is used to send perception-free verification data to the target user when the risk coefficient is lower than the preset risk threshold, and to send intelligence verification data to the target user when the risk coefficient is not lower than the preset risk threshold.
The device according to claim 8, wherein, when sending intelligence verification data to the target user if the risk coefficient is not lower than the preset risk threshold, the verification data sending module is specifically configured to perform the following:

If the risk coefficient is not lower than the preset risk threshold and not higher than the preset high risk threshold, sending the first intelligence verification data to the target user; the preset high risk threshold is greater than the preset risk threshold;

If the risk coefficient is higher than the preset high-risk threshold, send the second intelligence verification data to the target user.
The device according to claim 8, wherein the risk coefficient calculation module is further used to establish a risk coefficient prediction model, and the risk coefficient calculation module is specifically used to:

Acquiring sample data of a sample user, where the sample data includes environmental data of the sample user;

Label the sample data to obtain a sample data label set, and generate a distinguishing feature set according to the sample data label set;

Establishing a classification model according to the distinguishing feature set and the Naive Bayes algorithm based on Gaussian distribution;

According to the classification model, a risk coefficient prediction model is obtained;

When the risk coefficient calculation module obtains the risk coefficient of the target user terminal device according to the environmental data, it is specifically used to:

The environmental data is input into the risk coefficient prediction model to obtain the risk coefficient of the target user terminal device.
The device according to claim 10, wherein, when acquiring sample data of sample users, the risk coefficient calculation module is specifically configured to perform the following:

Acquiring the normal request data of the sample user, obtaining the environmental data of the sample user according to the normal request data, and using the normal request data and the environmental data as normal sample data;

An automated verification device simulates the verification request of an abnormal sample user to obtain the sample request data corresponding to the abnormal sample user; the environmental data of the automated verification device is obtained through a crawler algorithm as the environmental data corresponding to the abnormal sample user; and the sample request data and environmental data of the abnormal sample user are used as abnormal sample data;

When labeling the sample data to obtain a sample data label set and generating a distinguishing feature set according to the sample data label set, the risk coefficient calculation module is specifically used to:

Respectively label the normal sample data and abnormal sample data to obtain a normal sample data label set and an abnormal sample data label set;

A distinguishing feature set is generated according to the normal sample data label set and the abnormal sample data label set.
The device according to claim 10, wherein the environmental data of the sample user includes one or more of: device type, hardware model data, pixel ratio data, color depth data, screen resolution data, available screen resolution data, operating system data, time data, touch data, specific plug-in words, software installation data, browser data, and stack fingerprint data.
The device according to claim 12, wherein the operating system data includes the total number of logical processors available to the user, the font data includes the total number of fonts of the terminal device, the available screen resolution data includes the product of the available screen resolutions of the terminal device in the X direction and the Y direction, and the browser data includes the total number of plug-ins installed in the browser;

The risk coefficient calculation module is also used for:

Calculate the range, quartiles, interquartile range, and five-number summary of the total number of logical processors available to the user, the total number of fonts of the terminal device, the product of the available screen resolutions, and the total number of plug-ins;

Based on the range, quartiles, interquartile range, and five-number summary, obtain derived features for identifying outliers;

Filter the sample data label set according to the derived features.
The device according to claim 10, wherein the features in the distinguishing feature set include any one or more of the following: browser language, pixel ratio of the terminal device, color depth of the terminal device, X-direction screen resolution, Y-direction screen resolution, X-direction available screen resolution, Y-direction available screen resolution, the first product of the X-direction and Y-direction screen resolutions, the second product of the X-direction and Y-direction available screen resolutions, the difference between the first product and the second product, whether the CPU type is known, whether the browser plug-in is missing, whether the font list detected by JS/CSS is missing, whether the operating system is unknown, whether the WebGL vendor information is missing, whether the browser type is robot, whether the browser manufacturer is other, whether the operating system manufacturer is other, the total number of browser plug-ins, the browser plug-ins, the total number of fonts detected using JS/CSS, whether the operating system and the system platform are consistent, whether the audio stack fingerprint is provided, the parameter information of the audio stack fingerprint, the total number of logical processors made available to the user agent by the system, whether AdBlock is installed, whether the user has tampered with the language, whether the user has tampered with the screen resolution, whether the user has tampered with the operating system, and whether the browser manufacturer, operating system manufacturer, access device type, operating system and system platform are consistent.
A non-volatile computer-readable storage medium having a computer program stored thereon, wherein, when executed by a processor, the program implements the method for sending network verification data according to any one of claims 1 to 7.
A server, characterized in that the server includes:

One or more processors;

Storage device for storing one or more programs,

When the one or more programs are executed by the one or more processors, the one or more processors implement the following steps:

Receive verification requests from target users;

Obtain environmental data of the target user terminal device;

According to the environmental data, obtain the risk coefficient of the target user terminal device;

If the risk coefficient is lower than the preset risk threshold, send perception-free verification data to the target user; if the risk coefficient is not lower than the preset risk threshold, send intelligence verification data to the target user.
The server according to claim 16, wherein, when sending intelligence verification data to the target user if the risk coefficient is not lower than the preset risk threshold, the processor specifically executes the following steps:

If the risk coefficient is not lower than the preset risk threshold and not higher than the preset high risk threshold, sending the first intelligence verification data to the target user; the preset high risk threshold is greater than the preset risk threshold;

If the risk coefficient is higher than the preset high-risk threshold, send the second intelligence verification data to the target user.
The server according to claim 16, wherein the processor is further configured to establish a risk coefficient prediction model before obtaining the risk coefficient of the target user terminal device according to the environmental data, and when establishing the risk coefficient prediction model, the processor specifically executes the following steps:

Acquiring sample data of a sample user, where the sample data includes environmental data of the sample user;

Label the sample data to obtain a sample data label set, and generate a distinguishing feature set according to the sample data label set;

Establishing a classification model according to the distinguishing feature set and the Naive Bayes algorithm based on Gaussian distribution;

According to the classification model, a risk coefficient prediction model is obtained;

When obtaining the risk coefficient of the target user terminal device according to the environmental data, the processor specifically executes the following steps:

The environmental data is input into the risk coefficient prediction model to obtain the risk coefficient of the target user terminal device.
The server according to claim 18, wherein, when acquiring sample data of the sample user, the processor specifically executes the following steps:

Acquiring the normal request data of the sample user, obtaining the environmental data of the sample user according to the normal request data, and using the normal request data and the environmental data as normal sample data;

An automated verification device simulates the verification request of an abnormal sample user to obtain the sample request data corresponding to the abnormal sample user; the environmental data of the automated verification device is obtained through a crawler algorithm as the environmental data corresponding to the abnormal sample user; and the sample request data and environmental data of the abnormal sample user are used as abnormal sample data;

When labeling the sample data to obtain a sample data label set and generating a distinguishing feature set according to the sample data label set, the processor specifically executes the following steps:

Respectively label the normal sample data and abnormal sample data to obtain a normal sample data label set and an abnormal sample data label set;

A distinguishing feature set is generated according to the normal sample data label set and the abnormal sample data label set.
The server according to claim 18, wherein the operating system data includes the total number of logical processors available to the user, the font data includes the total number of fonts of the terminal device, the available screen resolution data includes the product of the available screen resolutions of the terminal device in the X direction and the Y direction, and the browser data includes the total number of plug-ins installed in the browser;

The processor is further configured to implement the following steps before generating the distinguishing feature set according to the sample data label set:

Calculate the range, quartiles, interquartile range, and five-number summary of the total number of logical processors available to the user, the total number of fonts of the terminal device, the product of the available screen resolutions, and the total number of plug-ins;

Based on the range, quartiles, interquartile range, and five-number summary, obtain derived features for identifying outliers;

Filter the sample data label set according to the derived features.