WO2020164274A1 - Network verification data sending method and apparatus, and storage medium and server - Google Patents
Network verification data sending method and apparatus, and storage medium and server
- Publication number
- WO2020164274A1 (application PCT/CN2019/117939)
- Authority
- WO
- WIPO (PCT)
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/02—Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
- H04L63/0227—Filtering policies
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/12—Applying verification of the received information
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1441—Countermeasures against malicious traffic
Definitions
- This application relates to the field of computer technology. Specifically, this application relates to a method, device, storage medium, and server for sending network verification data.
- In general, a verification method should not be too complicated. However, simple verification methods are easily cracked by users with high risk coefficients, and so fail to protect the rights and interests of normal users. Overly complex verification methods, on the other hand, impose long verification times and complicated operations on users with low risk coefficients, which makes for a very poor user experience.
- This application proposes a method, device, storage medium and server for sending network verification data to solve these problems in the prior art.
- The method for sending network verification data includes:
- This application also proposes a device for sending network verification data, which includes:
- A verification request receiving module, used to receive the verification request of the target user;
- An environmental data acquisition module, used to acquire environmental data of the target user terminal device;
- A risk coefficient calculation module, used to obtain the risk coefficient of the target user terminal device according to the environmental data;
- A verification data sending module, used to send non-perception verification data to the target user when the risk coefficient is lower than the preset risk threshold, and to send intelligence verification data to the target user when the risk coefficient is not lower than the preset risk threshold.
- This application also proposes a computer non-volatile readable storage medium on which a computer program is stored. When the program is executed by a processor, the method for sending network verification data described in any one of the foregoing is implemented.
- This application also proposes a server, which includes:
- One or more processors;
- A storage device for storing one or more programs;
- When the one or more programs are executed by the one or more processors, the one or more processors implement the method for sending network verification data according to any one of the foregoing.
- This application can send relatively simple non-perceptual verification data to terminal devices with low risk coefficients, which reduces the difficulty of verification for normal users and improves the user experience. At the same time, it can send intelligence verification data to terminal devices with high risk coefficients, which prevents machine users from launching malicious attacks on the server by means of verification requests. Abnormal verification requests are thereby filtered out, and the server's ability to resist malicious attacks is improved.
- Figure 1 is a schematic flow chart of an embodiment of establishing a risk coefficient prediction model in this application
- FIG. 2 is a schematic flowchart of a second embodiment of a method for sending network verification data according to this application;
- FIG. 3 is a schematic diagram of the module structure of an embodiment of a device for sending network verification data according to the application;
- Figure 4 is a schematic structural diagram of an embodiment of a server of this application.
- The server used here includes, but is not limited to, a computer, a network host, a single network server, a set of multiple network servers, or a cloud composed of multiple servers.
- Here, the cloud is composed of a large number of computers or network servers based on cloud computing.
- Cloud computing is a type of distributed computing in which a group of loosely coupled computers forms a single super virtual computer.
- The solutions provided in the embodiments of this application can be applied to a web server or an application server.
- The server can determine the risk coefficient of the terminal application environment of the user who is accessing the web page or application, and determine the user's verification method based on that risk coefficient, thereby reducing interference with, or attacks on, the server by abnormal users.
- The solutions provided in the embodiments of this application can be applied to various user verification scenarios, which are not limited in this application.
- The technical solution provided by the embodiments of this application consists of two parts. The first part uses the distinguishing feature set of the sample users to train a classification model and establish the risk coefficient prediction model. The second part uses the trained risk coefficient prediction model to determine the risk coefficient of the target user and compares it with the preset risk threshold in order to send the corresponding verification data.
- Before performing the method for sending network verification data described in this application, a risk coefficient prediction model may be established first. As shown in FIG. 1, the establishment of the risk coefficient prediction model includes the following steps:
- Step S10 Obtain sample data of a sample user, where the sample data includes environmental data of the sample user;
- Step S20 Annotate the sample data to obtain a sample data annotation set, and generate a distinguishing feature set according to the sample data annotation set;
- Step S30 Establish a classification model according to the distinguishing feature set and the Naive Bayes algorithm based on Gaussian distribution;
- Step S40 Obtain a risk coefficient prediction model according to the classification model.
- Step S10 Obtain sample data of a sample user, where the sample data includes environmental data of the sample user.
- The sample data needs to include a variety of different environmental data together with the risk coefficients corresponding to that environmental data, so the sample data covers both high and low risk coefficients.
- An automated device may be used to simulate the abnormal behavior of abnormal users, so as to obtain the verification request data corresponding to the abnormal behavior and the corresponding environmental data.
- Real attacks by abnormal users can also be used as sample data to continuously enrich the diversity of the sample data. Most visits, however, come from normal users.
- The verification request data of normal users in practical applications, together with the corresponding environmental data, can be used as sample data; the verification requests of normal users can also be simulated by automated equipment.
- If the environmental data in the sample data is difficult to obtain, the environmental data of the terminal device can be obtained through a crawler algorithm. Therefore, in some embodiments of this application, obtaining the sample data of the sample users may include:
- Abnormal sample data can be obtained with crawler algorithms and automated verification equipment, and the data corresponding to a user's normal verification request can be regarded as normal sample data.
- The sample data includes the environmental data and request data of the sample user. The crawler algorithm can run on the terminal device through a web page plug-in or an application program and send the environmental data of the terminal to the server.
- The request data may include a user registration request and/or a verification request, so that the verification request is triggered when the user registers or verifies their identity.
- The environmental data can be obtained through crawler algorithms, automated verification equipment, and normal verification.
- Acquiring the sample data of the sample user through the verification request sent by the sample user, the automated verification device, and the crawler algorithm includes:
- The automated verification device simulates the verification request of an abnormal sample user to obtain the sample request data corresponding to the abnormal sample user. The environmental data of the automated verification device is obtained through the crawler algorithm and used as the environmental data corresponding to the abnormal sample user. The sample request data and environmental data of the abnormal sample user are then regarded as abnormal sample data.
- The environmental data mentioned in this application may include one or more of: device type, hardware model data, pixel ratio data, color depth data, screen resolution data, available screen resolution data, operating system data, time data, touch data, specific plug-in word count, software installation data, browser data, and stack fingerprint data.
- The device types include terminal types such as Android phones, tablet computers, and computers.
- The operating system data includes the iOS type, version, resolution, endpoint IP address, etc.
- The sample data obtained may include some sample data from uncertain sources; such data may be excluded from the training sample data.
- Step S20 Annotate the sample data to obtain a sample data annotation set, and generate a distinguishing feature set according to the sample data annotation set.
- The sample data obtained through these multiple channels can be labeled according to the known sample data information to obtain an accurate sample data label set.
- The sample data label set may include a normal sample data label set and an abnormal sample data label set.
- When labeling, whether a sample is normal or abnormal can be determined by checking whether the known actual data of the terminal device is consistent with the sample data obtained through the verification request or similar means. For example, if parsing the user_agent of the terminal device indicates a mobile phone but the actual terminal device is a computer, the sample data is marked as abnormal sample data. Likewise, if the terminal device reports that it does not support multi-touch but a JavaScript script finds that the real terminal device does support multi-touch, the sample data is marked as abnormal sample data.
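The consistency check described above can be sketched as follows. This is only an illustration: the `Sample` record and its field names are hypothetical and do not come from the application, and a real system would compare many more attributes.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    # All field names are illustrative assumptions.
    reported_device: str   # device type parsed from the user_agent string
    actual_device: str     # known real device type
    reported_touch: bool   # multi-touch support claimed by the terminal
    actual_touch: bool     # multi-touch support detected by a client-side script


def label_sample(sample: Sample) -> str:
    """Label a sample 'abnormal' when reported and actual data disagree."""
    if sample.reported_device != sample.actual_device:
        return "abnormal"
    if sample.reported_touch != sample.actual_touch:
        return "abnormal"
    return "normal"
```

For instance, a sample whose user_agent claims a mobile phone while the real device is a computer would be labeled `abnormal` by this sketch.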
- The sample data can also be labeled according to its similarity to the normal sample data obtained through normal verification: sample data with high similarity to the normal sample data is labeled as normal sample data, and sample data with low similarity is labeled as abnormal sample data. For example, the sample data can be labeled using the access frequency of the terminal device within 1 minute or the access frequency of a fixed IP address within 1 minute: if either access frequency exceeds 10 times, the sample data is marked as abnormal sample data.
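A minimal sketch of the frequency rule in the example above, assuming requests arrive as `(key, timestamp_in_seconds)` pairs where the key is a device identifier or an IP address. The function name and the sliding-window interpretation of "within 1 minute" are assumptions; only the limit of 10 comes from the text.

```python
from collections import defaultdict

MAX_REQUESTS_PER_MINUTE = 10  # limit taken from the example in the text


def label_by_frequency(requests, window_seconds=60, limit=MAX_REQUESTS_PER_MINUTE):
    """Return the set of keys that exceed `limit` requests in any window.

    `requests` is an iterable of (key, timestamp_seconds) pairs.
    """
    by_key = defaultdict(list)
    for key, ts in requests:
        by_key[key].append(ts)

    abnormal = set()
    for key, stamps in by_key.items():
        stamps.sort()
        start = 0
        for end, ts in enumerate(stamps):
            # Drop requests that fall outside the window ending at `ts`.
            while ts - stamps[start] >= window_seconds:
                start += 1
            if end - start + 1 > limit:
                abnormal.add(key)
                break
    return abnormal
```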
- A distinguishing feature set is generated according to the sample data label set.
- The features in the distinguishing feature set are all derived from the environmental data and may include one or more of the following: browser language, terminal device pixel ratio, terminal device color depth, X-direction screen resolution, Y-direction screen resolution, X-direction available screen resolution, Y-direction available screen resolution, the product of the X-direction and Y-direction screen resolutions (resolution_multi), the product of the X-direction and Y-direction available screen resolutions (available_resolution_multi), the difference between these two products, and so on.
- The difference between the two products serves as a resolution-based nonlinear combination feature and can be used to explore the normal range of the products. In particular, when the resolution of the terminal device is very low, the difference can be used to determine whether the terminal device is a low-end device, so that a low-end crawler algorithm can be configured for it.
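The resolution-based features can be computed as in the sketch below. The names `resolution_multi` and `available_resolution_multi` come from the text; the function name and the `resolution_diff` key are illustrative assumptions.

```python
def resolution_features(width, height, avail_width, avail_height):
    """Build the product and difference features from screen resolutions."""
    resolution_multi = width * height
    available_resolution_multi = avail_width * avail_height
    return {
        "resolution_multi": resolution_multi,
        "available_resolution_multi": available_resolution_multi,
        # Difference of the two products: a nonlinear combination feature.
        "resolution_diff": resolution_multi - available_resolution_multi,
    }


features = resolution_features(1920, 1080, 1920, 1040)
```

With a 1920x1080 screen of which 1920x1040 is available, the difference is 76800 pixels, which is the area taken up by system chrome such as a task bar.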
- The distinguishing feature set may also include one or more of the following features: hour-of-day (24-hour) features (for example, sample data between 2 AM and 5 AM may be abnormal sample data), month features, holiday features (Spring Festival, National Day holidays, etc.), weekday features, week features, the number of touch points of the terminal device, whether the terminal device supports touch control, whether the touch points of the terminal device are consistent with the operating system, whether the number of touch points of the terminal device is consistent with whether it supports touch control, and so on.
- The terminal device in use is generally one that can be touched, so whether the terminal device supports touch operation can be one of the features for judging the risk coefficient.
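The time and touch features listed above might be derived as in the following sketch. The function name and dictionary keys are hypothetical, and holiday features are left out because they would need a calendar that the text does not specify.

```python
from datetime import datetime


def time_and_touch_features(ts: datetime, touch_points: int, supports_touch: bool) -> dict:
    """Derive hour, month, weekday and touch-consistency features."""
    return {
        "hour": ts.hour,
        "month": ts.month,
        "weekday": ts.weekday(),              # 0 = Monday, 6 = Sunday
        "is_weekend": ts.weekday() >= 5,
        "is_small_hours": 2 <= ts.hour < 5,   # the 2 AM to 5 AM window from the text
        "touch_points": touch_points,
        "supports_touch": supports_touch,
        # A device claiming touch support should report at least one touch point.
        "touch_consistent": (touch_points > 0) == supports_touch,
    }


example = time_and_touch_features(datetime(2019, 2, 5, 3, 30), 0, True)
```

In this example the request arrives at 3:30 AM from a device that claims touch support but reports zero touch points, so both `is_small_hours` and the inconsistency flag point toward an abnormal sample.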
- The distinguishing feature set may also include one or more of the following features: whether the CPU type is known, whether the browser plug-ins are missing, whether the font list detected using JS/CSS is missing, whether the operating system is unknown, whether WebGL supplier information is missing, whether the browser type is robot, whether the browser manufacturer is "other", whether the operating system manufacturer is "other", the total number of browser plug-ins, the total number of fonts detected using JS/CSS, whether the operating system and the system platform are consistent, whether an audio stack fingerprint is provided, the parameter information of the audio stack fingerprint, the total number of logical processors the system makes available to the user agent, whether AdBlock is installed, whether the user has tampered with the language, whether the user has tampered with the screen resolution, and the user's operating system, browser manufacturer, operating system manufacturer, and access device type.
- Whether WebGL supplier information is missing can be used to judge whether the user's use of the device is abnormal. The user's effectiveness can then be graded by the degree to which supplier information is missing: for example, a high, medium, or low degree of missing information corresponds to low, medium, or high user effectiveness, respectively.
- Whether the browser type is robot can be used to judge whether the terminal device is abnormal according to the degree of browser tampering, or to judge from the browser type whether the behavior is that of a crawler-type browser.
- The total number of fonts detected using JS/CSS can be used to determine whether specific plug-ins that help crawlers analyze web page data are present, and to judge quantitatively, from the total number of fonts, whether the system or browser has been tampered with by a crawler.
- Whether the operating system and the system platform are consistent can be used to detect whether the system of the terminal device has been tampered with, and to confirm the consistency of the operating system and the system platform when labeling sample data.
- the environmental data of the sample user includes: device type, hardware model data, pixel ratio data, color depth data, screen resolution data, and available screen resolution
- This application can also calculate, for each numerical feature in the aforementioned distinguishing feature set, the range, the quartiles, the interquartile range, and the five-number summary (in order: minimum, lower quartile, median, upper quartile, maximum) to obtain derived features for identifying outliers. If an outlier in the sample data is the kind of abnormal sample data that is wanted, the sample data corresponding to the outlier can be regarded as abnormal sample data; if the sample data corresponding to the outlier needs to be filtered out, it is removed from the sample data.
- The operating system data includes the total number of logical processors available to the user, the font data includes the total number of fonts of the terminal device, the available screen resolution data includes the product of the available screen resolutions of the terminal device in the X direction and the Y direction, and the browser data includes the total number of plug-ins installed in the browser;
- Before generating a distinguishing feature set based on the sample data label set, the method further includes:
- The sample data annotation set is filtered according to the derived features.
- This embodiment filters the sample data annotation set, thereby improving the accuracy of the risk coefficient prediction model.
- Features that do not have numeric values can be mapped to specific numeric features. For Boolean features, which take only two values, the range, quartiles, interquartile range, and five-number summary are not calculated.
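A sketch of the derived statistics for one numerical feature follows. The text does not say which rule turns the quartiles into an outlier decision, so the common 1.5 x IQR fence used here is an assumption, as are the function names and the sample values.

```python
import statistics


def five_number_summary(values):
    """Return min, lower quartile, median, upper quartile, max."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {"min": data[0], "q1": q1, "median": median, "q3": q3, "max": data[-1]}


def find_outliers(values, k=1.5):
    """Flag values outside [q1 - k*IQR, q3 + k*IQR] (assumed fence rule)."""
    summary = five_number_summary(values)
    iqr = summary["q3"] - summary["q1"]
    low = summary["q1"] - k * iqr
    high = summary["q3"] + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical font counts reported by ten terminal devices.
font_counts = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
outliers = find_outliers(font_counts)
```

Depending on the purpose, the flagged samples would either be kept as abnormal sample data or dropped from the annotation set, as the text describes.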
- Step S30 Establish a classification model according to the distinguishing feature set and the Naive Bayes algorithm based on Gaussian distribution;
- The Naive Bayes algorithm based on Gaussian distribution is selected as the supervised classification model for abnormal verification requests.
- The Bayesian classification algorithm can learn the characteristics of positive and negative samples at the same time and produce an overall probability estimate of whether each sample is abnormal, that is, the probability that each piece of sample data is abnormal.
- The Naive Bayes algorithm based on Gaussian distribution is an existing probability model. It assumes that the original data obeys a Gaussian distribution and that the influence of a feature value on a given class is independent of the values of the other features. Under these assumptions it learns the distribution of each feature for each class and calculates the probability that an example belongs to a specific class from the prior probability and the posterior probability.
- The mathematical expressions of the algorithm model are well-known technology and will not be repeated here.
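For readers who want a concrete picture, a minimal Gaussian Naive Bayes classifier can be written with the standard library alone. This is a generic textbook sketch, not the application's implementation; the class name, the variance floor of 1e-9, and the two toy features (plug-in count, font count) are assumptions.

```python
import math
from collections import defaultdict


class GaussianNaiveBayes:
    """Per-class, per-feature Gaussian likelihoods with class priors."""

    def fit(self, X, y):
        grouped = defaultdict(list)
        for row, label in zip(X, y):
            grouped[label].append(row)
        self.priors = {}
        self.stats = {}
        for label, rows in grouped.items():
            self.priors[label] = len(rows) / len(X)
            feats = []
            for col in zip(*rows):
                mean = sum(col) / len(col)
                # Small floor keeps the variance strictly positive.
                var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-9
                feats.append((mean, var))
            self.stats[label] = feats
        return self

    def _log_joint(self, row, label):
        total = math.log(self.priors[label])
        for v, (mean, var) in zip(row, self.stats[label]):
            total += -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
        return total

    def predict_proba(self, row):
        logs = {label: self._log_joint(row, label) for label in self.priors}
        peak = max(logs.values())  # subtract the max for numerical stability
        exps = {label: math.exp(v - peak) for label, v in logs.items()}
        norm = sum(exps.values())
        return {label: e / norm for label, e in exps.items()}


# Toy data: [plug-in count, font count]; values are made up for illustration.
X = [[5, 40], [6, 42], [4, 38], [5, 41], [0, 2], [1, 3], [0, 1], [1, 2]]
y = ["normal"] * 4 + ["abnormal"] * 4
model = GaussianNaiveBayes().fit(X, y)
proba = model.predict_proba([0, 2])
```

The `abnormal` entry of `proba` plays the role of the per-sample abnormality probability that the later steps convert into a risk coefficient.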
- Step S40 Obtain a risk coefficient prediction model according to the classification model.
- From the classification model, the probability that each piece of sample data is abnormal can be obtained. A mapping function or correspondence between the probability and the risk coefficient can then be set to obtain the risk coefficient prediction model. For example, multiplying the probability by a preset threshold yields a risk coefficient that is positively or negatively correlated with the probability.
- The selected threshold can be used as a fixed coefficient, or it can be a step coefficient that yields stepped risk coefficients corresponding to different numerical ranges.
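One possible reading of the fixed and stepped mappings is sketched below. The function names, the fixed coefficient of 100, and the step boundaries and multipliers are all illustrative assumptions; the text does not give concrete values.

```python
def risk_coefficient(p_abnormal, coefficient=100.0):
    """Fixed coefficient: risk grows linearly with the abnormal probability."""
    if not 0.0 <= p_abnormal <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    return p_abnormal * coefficient


def stepped_risk_coefficient(p_abnormal, steps=((0.4, 1.0), (0.6, 2.0), (1.0, 3.0))):
    """Step coefficient: the multiplier depends on the probability range.

    `steps` holds (upper_bound, multiplier) pairs in ascending order.
    """
    if not 0.0 <= p_abnormal <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    for upper, multiplier in steps:
        if p_abnormal <= upper:
            return p_abnormal * multiplier
    raise ValueError("steps must cover the range up to 1.0")
```

A negative coefficient would give the negatively correlated variant mentioned in the text.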
- With the risk coefficient prediction model, the risk coefficients corresponding to different environmental data can be obtained, so that the server can decide which verification picture to send according to the risk coefficient.
- The verification pictures may include non-perceptual verification pictures suitable for low risk coefficients, such as static pictures with numbers or text, and intelligence verification pictures suitable for high risk coefficients that require user participation, such as sliding puzzles and intelligence questions and answers.
- The risk coefficient prediction model established in this application can detect aggressive verification requests that differ from normal verification in their time of day and request frequency. For example, under normal circumstances users rarely send verification requests frequently between 2 and 5 in the morning, and it is impossible for a user to switch IP addresses dynamically and frequently within a short period of time. The risk coefficient prediction model of this application can therefore quickly mark the risk coefficient of the terminal device as high according to the time characteristics of the verification request, so that verification data suitable for a high risk coefficient is sent to the user.
- Acquiring the sample data of the sample user through the verification request sent by the sample user, the automated verification device, and the crawler algorithm includes:
- The automated verification device simulates the verification request of an abnormal sample user to obtain the sample request data corresponding to the abnormal sample user. The environmental data of the automated verification device is obtained through the crawler algorithm and used as the environmental data corresponding to the abnormal sample user. The sample request data and environmental data of the abnormal sample user are then regarded as abnormal sample data;
- Labeling the sample data to obtain a sample data label set and generating a distinguishing feature set according to the sample data label set includes:
- A distinguishing feature set is generated according to the normal sample data label set and the abnormal sample data label set.
- Using an automated verification device to simulate the verification requests of abnormal sample users makes it possible to obtain a large amount of sample data quickly, which speeds up the establishment of the risk coefficient prediction model.
- Normal request data means data from a normal verification request. If a normal sample user sends an abnormal verification request, that verification request is not normal request data.
- An abnormal sample user is a sample user with a high risk coefficient, whose verification requests may be normal or abnormal. The risk coefficient prediction model of this application can distinguish a normal verification request from an abnormal one, that is, determine whether the risk coefficient of the abnormal sample user is low or high.
- This application proposes a method for sending network verification data, as shown in Figure 2, which balances the complexity of the user verification method against the user experience and includes the following steps:
- Step S1 Receive a verification request from the target user
- Step S2 Obtain environmental data of the target user terminal device
- Step S3 Obtain the risk coefficient of the target user terminal device according to the environmental data
- Step S4 If the risk coefficient is lower than the preset risk threshold, send non-perception verification data to the target user; if the risk coefficient is not lower than the preset risk threshold, send intelligence verification data to the target user.
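Steps S1 to S4 can be sketched as a small request handler. The function names, the string labels for the two kinds of verification data, and the default threshold of 0.5 are assumptions; `predict_risk` stands in for whatever risk coefficient prediction model is available.

```python
def choose_verification(risk, risk_threshold):
    """Step S4: pick the verification data type from the risk coefficient."""
    return "non_perception" if risk < risk_threshold else "intelligence"


def handle_verification_request(env_data, predict_risk, risk_threshold=0.5):
    """Steps S1-S4 for one request.

    `env_data` is the environmental data gathered for the requesting
    terminal (steps S1 and S2); `predict_risk` is any callable that maps
    it to a risk coefficient (step S3).
    """
    risk = predict_risk(env_data)
    return choose_verification(risk, risk_threshold)


# A stand-in model for illustration: treat a missing plug-in list as risky.
def toy_model(env):
    return 0.9 if env.get("plugins", 0) == 0 else 0.1


decision = handle_verification_request({"plugins": 0}, toy_model)
```

Note that a risk coefficient exactly equal to the threshold is "not lower than" it, so it falls on the intelligence verification side.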
- When receiving the verification request of the target user, the server obtains the environmental data of the target user's terminal device.
- The environmental data can be obtained through plug-in programs or crawling algorithms, or it can be sent to the server together with the verification request. The server calculates the risk coefficient of the target user terminal device according to the environmental data, and then sends the corresponding verification data to the terminal device according to the level of the risk coefficient.
- This application can send relatively simple non-perceptual verification data to terminal devices with low risk coefficients, which reduces the difficulty of verification for normal users and improves the user experience. At the same time, it can send intelligence verification data to terminal devices with high risk coefficients, which prevents machine users from launching malicious attacks on the server through verification requests. Abnormal verification requests are thereby filtered out, and the server's ability to resist malicious attacks is improved.
- Sending intelligence verification data to the target user includes:
- If the risk coefficient is not lower than the preset risk threshold and not higher than the preset high risk threshold, sending the first intelligence verification data to the target user, where the preset high risk threshold is greater than the preset risk threshold;
- In this embodiment the risk coefficient is divided into three levels, and the first and second intelligence verification data are sent to the two higher levels: for example, the first intelligence verification data is sent to a terminal device with a medium risk coefficient and the second intelligence verification data is sent to a terminal device with a high risk coefficient. This further clarifies the correspondence between the environmental data and the verification data and further optimizes the user experience.
- Obtaining the risk coefficient of the target user terminal device according to the environmental data includes:
- The risk coefficient prediction model is established by the following method:
- Acquiring sample data of a sample user, where the sample data includes environmental data of the sample user;
- A risk coefficient prediction model is obtained.
- The risk coefficient prediction model in this embodiment can be established in the server or obtained from other devices or networks. It is therefore portable, which widens the scope of application of the method for sending network verification data.
- If the risk coefficient is lower than the preset risk threshold, non-perceptual verification data is sent to the target user; if the risk coefficient is not lower than the preset risk threshold, intelligence verification data is sent to the target user. Specifically, either of the following approaches can be adopted:
- Setting a risk threshold. For example, set the risk threshold to 50%: when the probability value output by the classification model is greater than 50%, the request is abnormal; when it is less than 50%, the request is normal. The model can also output 1 and -1 directly, where 1 means normal and -1 means abnormal. Different verification data is sent depending on whether the result is normal or abnormal.
- Setting risk levels. For example, a probability value of 0-40% output by the classification model corresponds to the low-risk L1 level, 40%-60% corresponds to the medium-risk L2 level, and 60%-100% corresponds to the high-risk L3 level. Different verification data is sent according to the level. For example, the intelligent non-perceptual verification method can be loaded for the L1 level, the sliding puzzle verification method or the method of recognizing text in a background picture can be loaded at random for the L2 level, and the voice verification method can be loaded for the L3 level.
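The level-based approach can be sketched as a lookup over the probability ranges given above. The text leaves the boundary values 40% and 60% ambiguous; this sketch assigns them to the higher-risk level, which is an assumption, as are the function names and method labels.

```python
import random

# (exclusive upper bound, level, candidate verification methods)
LEVELS = (
    (0.40, "L1", ("non_perceptual",)),
    (0.60, "L2", ("sliding_puzzle", "text_in_picture")),
)
HIGHEST_LEVEL = ("L3", ("voice",))


def risk_level(p_abnormal):
    """Map the classifier's abnormal probability to a level and its methods."""
    if not 0.0 <= p_abnormal <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    for upper, level, methods in LEVELS:
        if p_abnormal < upper:
            return level, methods
    return HIGHEST_LEVEL


def pick_verification(p_abnormal, rng=random):
    """Choose one verification method, at random where a level offers several."""
    level, methods = risk_level(p_abnormal)
    return level, rng.choice(methods)
```

For the L2 level the random choice mirrors the text's "loaded at random" between the sliding puzzle and text recognition methods.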
- This application also proposes a device for sending network verification data. As shown in FIG. 3, the device includes:
- The verification request receiving module 1, used to receive the verification request of the target user;
- The environmental data acquisition module 2, used to acquire environmental data of the target user terminal device;
- The risk coefficient calculation module 3, used to obtain the risk coefficient of the target user terminal device according to the environmental data;
- The verification data sending module 4, configured to send non-perception verification data to the target user when the risk coefficient is lower than the preset risk threshold, and to send intelligence verification data to the target user when the risk coefficient is not lower than the preset risk threshold.
- the embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored, and when the program is executed by a processor, the method for sending network verification data described in any one of the above is implemented.
- The storage medium includes, but is not limited to, any type of disk (including floppy disk, hard disk, optical disk, CD-ROM, and magneto-optical disk), ROM (Read-Only Memory), RAM (Random Access Memory), EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory, magnetic card, or optical card. That is, the storage medium includes any medium that stores or transmits information in a form readable by a device (for example, a computer). It can be a read-only memory, a magnetic disk, an optical disk, etc.
- An embodiment of the present application also provides a server, and the server includes:
- one or more processors;
- a storage device for storing one or more programs;
- when the one or more programs are executed by the one or more processors, the one or more processors implement the method for sending network verification data according to any one of the foregoing.
- FIG. 4 is a schematic diagram of the structure of the server of this application, including a processor 320, a storage device 330, an input unit 340, a display unit 350 and other devices.
- the storage device 330 may be used to store the application program 310 and various functional modules.
- the processor 320 runs the application program 310 stored in the storage device 330 to execute various functional applications and data processing of the device.
- the storage device 330 may be an internal memory or an external memory, or include both internal memory and external memory.
- the internal memory may include read-only memory, programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), flash memory, or random access memory.
- External storage can include hard disks, floppy disks, ZIP disks, U disks, tapes, etc.
- the storage devices disclosed in this application include but are not limited to these types of storage devices.
- the storage device 330 disclosed in this application is merely an example and not a limitation.
- the input unit 340 is used for receiving signal input, and receiving user attribute information of the target user on the first statistical date and access information to the specified target.
- the input unit 340 may include a touch panel and other input devices.
- The touch panel can collect the user's touch operations on or near it (for example, operations performed on or near the touch panel with a finger, a stylus, or any other suitable object or accessory) and drive the corresponding connection device according to a preset program. Other input devices may include, but are not limited to, one or more of a physical keyboard, function keys (such as playback control buttons, switch buttons, etc.), a trackball, a mouse, and a joystick.
- the display unit 350 may be used to display information input by the user or information provided to the user and various menus of the computer device.
- the display unit 350 may take the form of a liquid crystal display, an organic light emitting diode, or the like.
- The processor 320 is the control center of the computer equipment. It uses various interfaces and lines to connect the parts of the entire computer, and it performs various functions and processes data by running or executing the software programs and/or modules stored in the storage device 330 and calling the data stored in the storage device.
- The server includes one or more processors 320, one or more storage devices 330, and one or more application programs 310, wherein the one or more application programs 310 are stored in the storage device 330 and are configured to be executed by the one or more processors 320, and the one or more application programs 310 are configured to execute the churn user early warning method described in the above embodiments.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Computer Hardware Design (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Information Transfer Between Computers (AREA)
- Computer And Data Communications (AREA)
Abstract
Provided are a network verification data sending method and apparatus, and a storage medium and a server. The network verification data sending method comprises: receiving a verification request of a target user; acquiring environmental data of a terminal device of the target user; obtaining a risk coefficient of the terminal device of the target user according to the environmental data; if the risk coefficient is lower than a preset risk threshold, sending perception-free verification data to the target user; and if the risk coefficient is not lower than the preset risk threshold, sending intelligence verification data to the target user. In the present application, relatively simple perception-free verification data can be sent to a terminal device with a low risk coefficient, such that difficulty in verification is reduced for a normal user, and intelligence verification data is sent to a terminal device with a high risk coefficient, such that the capability of a server resisting a malicious attack is improved.
Description
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on February 13, 2019, with application number 201910115365.1 and titled "Method, Device, Storage Medium, and Server for Sending Network Verification Data", the entire contents of which are incorporated into this application by reference.
This application relates to the field of computer technology. Specifically, it relates to a method, device, storage medium, and server for sending network verification data.
When a user logs in or browses certain web pages, the client of the app or web page requires the user to complete behavior verification, in order to prevent machine operations from occupying server resources and to guard against malicious attacks. The verification confirms that the current user is not a machine and helps prevent malicious logins or attacks. Existing network verification schemes generally apply a single verification method to all users, regardless of whether the account is a machine account or what software and hardware environment the account logs in from. To preserve the experience of non-machine users, the verification method should not be too complicated; however, a simple verification method is easily cracked by users with a high risk coefficient, so the rights and interests of normal users cannot be protected. Conversely, an overly complicated verification method imposes long verification times and complicated operations on users with a low risk coefficient, resulting in a poor user experience.
Summary of the invention
To address the shortcomings of existing approaches, this application proposes a method, device, storage medium, and server for sending network verification data, so as to solve the problems existing in the prior art.
The method for sending network verification data includes:
receiving a verification request from a target user;
acquiring environmental data of the target user's terminal device;
obtaining a risk coefficient of the target user's terminal device according to the environmental data;
if the risk coefficient is lower than a preset risk threshold, sending non-perceptual verification data to the target user; if the risk coefficient is not lower than the preset risk threshold, sending intelligence verification data to the target user.
This application also proposes a device for sending network verification data, which includes:
a verification request receiving module, used to receive the verification request of the target user;
an environmental data acquisition module, used to acquire environmental data of the target user's terminal device;
a risk coefficient calculation module, used to obtain the risk coefficient of the target user's terminal device according to the environmental data;
a verification data sending module, used to send non-perceptual verification data to the target user when the risk coefficient is lower than the preset risk threshold, and to send intelligence verification data to the target user when the risk coefficient is not lower than the preset risk threshold.
This application also proposes a non-volatile computer-readable storage medium on which a computer program is stored, characterized in that, when the program is executed by a processor, the method for sending network verification data described in any one of the foregoing is implemented.
This application also proposes a server, which includes:
one or more processors;
a storage device for storing one or more programs;
when the one or more programs are executed by the one or more processors, the one or more processors implement the method for sending network verification data according to any one of the foregoing.
This application has the following beneficial effects:
This application sends relatively simple non-perceptual verification data to terminal devices with a low risk coefficient, which reduces the difficulty of verification for normal users and improves the user experience. At the same time, it sends intelligence verification data to terminal devices with a high risk coefficient, which prevents machine users from launching malicious attacks on the server through verification requests. Abnormal verification requests are thereby filtered out, and the server's ability to resist malicious attacks is improved.
Additional aspects and advantages of this application will be given in part in the following description; they will become apparent from that description or be learned through practice of this application.
The above and/or additional aspects and advantages of this application will become apparent and easy to understand from the following description of the embodiments in conjunction with the accompanying drawings, in which:
Figure 1 is a schematic flowchart of an embodiment of establishing a risk coefficient prediction model according to this application;
Figure 2 is a schematic flowchart of a second embodiment of the method for sending network verification data according to this application;
Figure 3 is a schematic diagram of the module structure of an embodiment of the device for sending network verification data according to this application;
Figure 4 is a schematic structural diagram of an embodiment of the server of this application.
The embodiments of this application are described in detail below, and examples of the embodiments are shown in the accompanying drawings.
Those skilled in the art will understand that the server used here includes, but is not limited to, a computer, a network host, a single network server, a set of multiple network servers, or a cloud composed of multiple servers. Here, the cloud is composed of a large number of computers or network servers based on cloud computing, where cloud computing is a type of distributed computing: a super virtual computer composed of a group of loosely coupled computers.
The solutions provided in the embodiments of this application can be applied to a web server or an application server. With these solutions, the server can determine the risk coefficient of the terminal application environment of a user who accesses a web page or application, and determine the user's verification method according to that risk coefficient, thereby reducing interference with or attacks on the server by abnormal users. The solutions can be applied to a variety of user verification scenarios, and this application does not limit them in this respect.
The technical solution provided by the embodiments of this application consists of two parts. The first part trains a classification model on the distinguishing feature set of sample users to establish a risk coefficient prediction model. The second part uses the trained risk coefficient prediction model to determine whether the target user's risk coefficient is above or below the preset risk threshold, so that the corresponding verification data can be sent.
The embodiments of this application are introduced in detail below in the following order: establishing the risk coefficient prediction model, then sending verification data according to the risk coefficient prediction model.
Before the method for sending network verification data described in this application is performed, a risk coefficient prediction model may first be established. As shown in FIG. 1, establishing the risk coefficient prediction model includes the following steps:
Step S10: Obtain sample data of sample users, where the sample data includes environmental data of the sample users;
Step S20: Label the sample data to obtain a labeled sample data set, and generate a distinguishing feature set according to the labeled sample data set;
Step S30: Establish a classification model according to the distinguishing feature set and the Naive Bayes algorithm based on the Gaussian distribution;
Step S40: Obtain the risk coefficient prediction model according to the classification model.
Each step is explained in detail as follows:
Step S10: Obtain sample data of sample users, where the sample data includes environmental data of the sample users.
When a user's risk coefficient is to be predicted from the environmental factors in the sample data, the sample data needs to include a variety of environmental data together with the risk coefficients corresponding to them. The risk coefficients therefore include both high and low risk coefficients. To obtain environmental data corresponding to a high risk coefficient, an automated device can be used to simulate the abnormal behavior of abnormal users, yielding the verification request data and environmental data corresponding to that behavior. In practical applications, real attacks by abnormal users can also be used as sample data, continuously enriching the diversity of the sample data. Most visits come from normal users, so the verification request data and corresponding environmental data of normal users in practical applications can be used as sample data; verification requests of normal users can also be simulated by automated devices to obtain a variety of normal sample data. When the environmental data in the sample data is difficult to obtain, the environmental data of the terminal device can be obtained through a crawler algorithm. Therefore, in some embodiments of this application, obtaining the sample data of sample users may include:
obtaining the sample data of sample users through verification requests sent by the sample users, automated verification devices, and a crawler algorithm.
In some embodiments of this application, abnormal sample data can be obtained through the crawler algorithm and automated verification devices, and the data corresponding to users' normal verification requests can be used as normal sample data. The sample data includes the environmental data and request data of the sample users. The crawler algorithm can run on the terminal device through a web page plug-in or an application, so as to send the terminal's environmental data to the server. The request data may include a user registration request and/or a verification request, so that the verification request is triggered when the user registers or verifies their identity. The environmental data can be obtained through the crawler algorithm, automated verification devices, normal verification, and similar channels.
In another embodiment of this application, obtaining the sample data of sample users through verification requests sent by the sample users, automated verification devices, and a crawler algorithm includes:
acquiring normal request data of a sample user, obtaining the environmental data of the sample user according to the normal request data, and using the normal request data and environmental data as normal sample data;
simulating the verification requests of abnormal sample users with an automated verification device to obtain sample request data corresponding to the abnormal sample users; obtaining the environmental data of the automated verification device through the crawler algorithm to obtain environmental data corresponding to the abnormal sample users; and using the sample request data and environmental data of the abnormal sample users as abnormal sample data.
The environmental data described in this application may include one or more of: device type, hardware model data, pixel ratio data, color depth data, screen resolution data, available screen resolution data, operating system data, time data, touch data, specific plug-in word count, software installation data, browser data, and stack fingerprint data. The device type includes terminal types such as Android phones, tablet computers, and computers. The operating system data includes the IOS type, version, resolution, etc., and the endpoint IP address, etc. The acquired sample data may include some sample data of uncertain origin; such data need not be used as training sample data.
Step S20: Label the sample data to obtain a labeled sample data set, and generate a distinguishing feature set according to the labeled sample data set.
When the sample data has known values, the sample data obtained through the channels above can be labeled according to the known sample data information, so as to obtain an accurately labeled sample data set. The labeled sample data set may include a normal labeled set and an abnormal labeled set. When labeling, whether a sample is normal or abnormal can be determined by checking whether the known actual data of the terminal device is consistent with the sample data obtained through the verification request or similar means. For example, if parsing the terminal device's user_agent indicates that the device is a mobile phone while the actual terminal device is a computer, the sample data is labeled as abnormal; if a JavaScript script reports that the terminal device does not support multi-touch while the real terminal device does support it, the sample data is labeled as abnormal.
In some labeling methods, samples can also be labeled by their similarity to the normal sample data obtained through normal verification: samples with high similarity to the normal sample data are labeled as normal, and samples with low similarity are labeled as abnormal. For example, sample data can be labeled by the access frequency of a terminal device, or of a fixed IP address, within 1 minute: if that access frequency exceeds 10 times within 1 minute, the sample data is labeled as abnormal. As another example, samples can be labeled by the number of distinct operating systems, devices, or IP addresses associated with one user: if one user is associated with a large number of operating system types, device types, or IP addresses, the sample data is labeled as abnormal.
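The labeling rules described above can be sketched as a small rule function. This is only an illustration under stated assumptions: the `Sample` record and its field names are hypothetical, the multi-touch rule is generalized to any mismatch between reported and actual support, and the limit of 10 requests per minute is the example value given in the text.

```python
from dataclasses import dataclass

MAX_REQUESTS_PER_MINUTE = 10  # example limit taken from the text


@dataclass
class Sample:
    reported_device: str       # device type parsed from user_agent
    actual_device: str         # known real device type
    reported_multitouch: bool  # multi-touch support reported by a script
    actual_multitouch: bool    # multi-touch support of the real device
    requests_last_minute: int  # requests from this device or IP in 1 minute


def label_sample(sample: Sample) -> str:
    """Return "abnormal" if any consistency or frequency rule is violated."""
    if sample.reported_device != sample.actual_device:
        return "abnormal"
    if sample.reported_multitouch != sample.actual_multitouch:
        return "abnormal"
    if sample.requests_last_minute > MAX_REQUESTS_PER_MINUTE:
        return "abnormal"
    return "normal"
```

A sample that claims to be a phone while the known device is a computer is labeled abnormal regardless of its request frequency.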
After the labeled sample data set is obtained, a distinguishing feature set is generated from it. The features in the distinguishing feature set are all determined from the environmental data, and may include one or more of the following: browser language, pixel ratio of the terminal device, color depth of the terminal device, X-direction screen resolution, Y-direction screen resolution, X-direction available screen resolution, Y-direction available screen resolution, the product of the X-direction and Y-direction screen resolutions (resolution_multi), the product of the X-direction and Y-direction available screen resolutions (available_resolution_multi), the difference between these two products, and so on. The difference between the two products can be used to generate a resolution-based nonlinear combination feature and to discover the normal value range of the products. In particular, when the resolution of the terminal device is very low, the difference can be used to determine whether the terminal device is a low-end device, so that a low-end crawler algorithm can be configured for it.
The features in the distinguishing feature set may also include one or more of the following: a 24-hour feature (for example, sample data collected between 2 AM and 5 AM may be abnormal), a month feature, a holiday feature (Spring Festival, National Day holiday, etc.), a working-day feature, a day-of-week feature, the number of touch points supported by the terminal device, whether the terminal device supports touch, whether the number of touch points is consistent with the operating system, whether the number of touch points is consistent with whether touch is supported, and so on. The consistency between the number of touch points and touch support can be used to detect changes to the operating system based on the number of touch points. In addition, terminal devices that do not support touch are often used to run crawlers, while the terminal devices used by normal users are generally touch-capable, so whether the terminal device supports touch can be used as one of the features for judging the risk coefficient.
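A few of the derived features named above can be computed as in the sketch below. The input dictionary keys (`screen_x`, `avail_x`, `max_touch_points`, and so on) are hypothetical names for the collected environmental data; only `resolution_multi` and `available_resolution_multi` come from the text, and treating the 2 AM to 5 AM window as half-open is an assumption.

```python
from datetime import datetime


def derive_features(env: dict, ts: datetime) -> dict:
    """Build resolution, time, and touch-consistency features for one request."""
    resolution_multi = env["screen_x"] * env["screen_y"]
    available_resolution_multi = env["avail_x"] * env["avail_y"]
    has_touch_points = env["max_touch_points"] > 0
    return {
        "resolution_multi": resolution_multi,
        "available_resolution_multi": available_resolution_multi,
        # Nonlinear combination feature: difference of the two products.
        "resolution_diff": resolution_multi - available_resolution_multi,
        "hour": ts.hour,
        "is_small_hours": 2 <= ts.hour < 5,  # assumption: 5 AM excluded
        "weekday": ts.weekday(),             # Monday is 0
        # Touch point count should agree with the reported touch support.
        "touch_consistent": has_touch_points == env["touch_supported"],
    }
```

For a 1920x1080 screen with 1920x1040 available, `resolution_diff` is 76800; a device that reports zero touch points but claims touch support gets `touch_consistent` set to `False`.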
The features in the distinguishing feature set may also include one or more of the following: whether the CPU type is known, whether browser plug-ins are missing, whether the font list detected using JS/CSS is missing, whether the operating system is unknown, whether the WebGL vendor information is missing, whether the browser type is robot, whether the browser manufacturer is other, whether the operating system manufacturer is other, the browser plug-ins, the total number of browser plug-ins, the total number of fonts detected using JS/CSS, whether the operating system and the system platform are consistent, whether an audio stack fingerprint is provided, the parameter information of the audio stack fingerprint, the total number of logical processors the system makes available to the user agent, whether AdBlock is installed, whether the user has tampered with the language, whether the user has tampered with the screen resolution, whether the user has tampered with the operating system, the browser manufacturer, the operating system manufacturer, the type of the accessing device, and so on.
Whether the WebGL vendor information is missing can be used to detect abnormalities in the device the user is using, and the user's validity can then be judged by how much vendor information is missing, which allows user validity to be graded; for example, when the degree of missing information is high, medium, or low, the user's validity is correspondingly low, medium, or high. Whether the browser type is robot can be used to judge whether the terminal device is abnormal according to the degree of browser tampering, or to judge from the browser type whether the behavior is that of a crawler simulating a browser. The total number of fonts detected using JS/CSS can be used to determine whether specific plug-ins that help crawlers parse web page data are present, and to judge quantitatively whether the system or browser has been tampered with by a crawler program. Whether the operating system and the system platform are consistent can be used to detect whether the terminal device's system can be tampered with, and to confirm the consistency of the operating system and system platform when labeling sample data.
Therefore, this application also proposes an embodiment of establishing a risk coefficient prediction model in which the environmental data of the sample users includes one or more of: device type, hardware model data, pixel ratio data, color depth data, screen resolution data, available screen resolution data, operating system data, time data, touch data, specific plug-in word count, software installation data, browser data, and stack fingerprint data, so that the features in the distinguishing feature set are determined from the environmental data.
本申请还可根据前述区分特征集中的数值类型特征计算各数值的极差、四分位数、四分位数极差、五数概括(按次序最小值、上四分位、中位数、下四分位数、最大值),以得到识别离群点的衍生特征。若样本数据中的所述离群点为需要的异常样本数据,则可将该离群点对应的样本数据作为异常的样本数据;若所述离群点对应的样本数据是需要过滤掉的样本数据,则从全部样本数据中去处该离群点对应的样本数据。故在本申请的又一具体实施例中,所述操作系统数据包括用户可用的逻辑处理器总数,所述字体数据包括终端设备的字体总数,所述可用屏幕分辨率数据包括终端设备的X方向与Y方向的可用屏幕分辨率的乘积,所述浏览器数据包括浏览器安装的插件总数;This application can also calculate the range, quartile, quartile range, and quintile summary of each value based on the numerical type features in the aforementioned distinguishing feature set (in order minimum, upper quartile, median, Lower quartile, maximum value) to get the derived features for identifying outliers. If the outlier in the sample data is the required abnormal sample data, the sample data corresponding to the outlier can be regarded as the abnormal sample data; if the sample data corresponding to the outlier is the sample that needs to be filtered out Data, remove the sample data corresponding to the outlier from all sample data. Therefore, in another specific embodiment of the present application, the operating system data includes the total number of logical processors available to the user, the font data includes the total number of fonts of the terminal device, and the available screen resolution data includes the X direction of the terminal device. Multiplied by the available screen resolution in the Y direction, the browser data includes the total number of plug-ins installed by the browser;
Before generating the distinguishing feature set according to the sample data label set, the method further includes:
calculating the range, quartiles, interquartile range, and five-number summary of the total number of logical processors available to the user, the total number of fonts on the terminal device, the product of the available screen resolutions, and the total number of plug-ins;
obtaining derived features for identifying outliers from the range, quartiles, interquartile range, and five-number summary;
filtering the sample data label set according to the derived features.
This embodiment filters the sample data label set and thereby improves the accuracy of the risk coefficient prediction model. Features that are not numerical can be mapped to specific numerical features; for Boolean features, which take only two values, the range, quartiles, interquartile range, and five-number summary are not calculated.
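The filtering step above can be sketched in Python as follows. This is a minimal illustration, not the application's implementation: the text does not say how the derived features flag an outlier, so the sketch assumes the common 1.5 × IQR fence, and the feature name `logical_processors` and the helper names are hypothetical.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    ordered = sorted(values)
    q1, median, q3 = quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


def value_range(summary):
    """Range = maximum - minimum."""
    return summary[4] - summary[0]


def is_outlier(value, summary, k=1.5):
    """Assumed rule: a value is an outlier if it lies more than k * IQR
    below the lower quartile or above the upper quartile."""
    _, q1, _, q3, _ = summary
    iqr = q3 - q1  # interquartile range
    return value < q1 - k * iqr or value > q3 + k * iqr


def filter_samples(samples, feature, keep_outliers=False):
    """Drop outliers on one numerical feature, or keep only the outliers
    when they are the abnormal samples that are wanted."""
    summary = five_number_summary([s[feature] for s in samples])
    return [s for s in samples
            if is_outlier(s[feature], summary) == keep_outliers]


# Hypothetical sample set: logical processor counts reported by devices.
samples = [{"logical_processors": n} for n in (2, 4, 4, 8, 8, 8, 16, 128)]
cleaned = filter_samples(samples, "logical_processors")
```

Calling `filter_samples` with `keep_outliers=True` covers the other case in the text, where the outliers are themselves the abnormal samples to be retained.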
Step S30: Establish a classification model according to the distinguishing feature set and the naive Bayes algorithm based on the Gaussian distribution;
The naive Bayes algorithm based on the Gaussian distribution is selected as the supervised classification model for abnormal verification requests. The Bayesian classifier learns the features of positive and negative samples at the same time and yields a combined estimate of the probability that each sample is abnormal. Gaussian naive Bayes is an existing probabilistic model: it assumes that the data follow a Gaussian distribution and that the effect of a feature value on a given class is independent of the values of the other features, learns the distribution of each feature's values, and uses prior and posterior probabilities to calculate the probability that a sample belongs to a particular class. The mathematical expressions of the algorithm are well known and are not repeated here.
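For readers unfamiliar with the algorithm, the following is a minimal from-scratch sketch of Gaussian naive Bayes in Python. It is illustrative only: the class name, the two example features (plug-in total and font total), the training values, and the small variance floor are assumptions, not details taken from the application.

```python
import math
from collections import defaultdict


class GaussianNaiveBayes:
    """Per-class Gaussian likelihoods with independent features."""

    def fit(self, rows, labels):
        grouped = defaultdict(list)
        for row, label in zip(rows, labels):
            grouped[label].append(row)
        self.priors = {}
        self.stats = {}
        for label, members in grouped.items():
            # Prior probability: share of training samples in this class.
            self.priors[label] = len(members) / len(rows)
            self.stats[label] = []
            for column in zip(*members):
                mean = sum(column) / len(column)
                # Small floor (an assumption) avoids division by zero.
                var = sum((x - mean) ** 2 for x in column) / len(column) + 1e-9
                self.stats[label].append((mean, var))
        return self

    def predict_proba(self, row):
        """Posterior probability of each class for one sample."""
        log_scores = {}
        for label, prior in self.priors.items():
            score = math.log(prior)
            for x, (mean, var) in zip(row, self.stats[label]):
                # Log of the Gaussian density for this feature value.
                score += -0.5 * math.log(2 * math.pi * var)
                score -= (x - mean) ** 2 / (2 * var)
            log_scores[label] = score
        # Normalize in log space so tiny likelihoods do not underflow.
        top = max(log_scores.values())
        weights = {l: math.exp(s - top) for l, s in log_scores.items()}
        norm = sum(weights.values())
        return {l: w / norm for l, w in weights.items()}


# Hypothetical features: [plug-in total, font total].
rows = [[5, 120], [6, 130], [4, 110], [5, 125],
        [0, 10], [0, 12], [1, 8], [0, 11]]
labels = ["normal"] * 4 + ["abnormal"] * 4
model = GaussianNaiveBayes().fit(rows, labels)
proba = model.predict_proba([0, 9])
```

Here `proba["abnormal"]` plays the role of the probability that a sample is abnormal, which the next step maps to a risk coefficient.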
Step S40: Obtain a risk coefficient prediction model according to the classification model.
The classification model gives the probability that each piece of sample data is abnormal; setting a mapping function or correspondence between that probability and the risk coefficient then yields the risk coefficient prediction model. For example, multiplying the probability by a preset threshold gives a risk coefficient that is positively or negatively correlated with the probability. In some embodiments, the selected threshold ε may be a fixed coefficient; it may also be a step coefficient, which yields stepped risk coefficients corresponding to different numerical ranges.
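One possible reading of the fixed and stepped coefficient ε is sketched below in Python. The application gives no concrete values, so the default `epsilon`, the step boundaries, and the per-step coefficients are all assumed for illustration.

```python
def fixed_risk_coefficient(probability, epsilon=100.0):
    """Risk coefficient = probability * epsilon, with a fixed epsilon.
    A negative epsilon would give a negatively correlated coefficient."""
    return probability * epsilon


def stepped_risk_coefficient(probability,
                             steps=((0.4, 1.0), (0.6, 2.0), (1.0, 3.0))):
    """Risk coefficient with a step coefficient: epsilon depends on the
    range the probability falls in. Each step is (upper bound, epsilon);
    the bounds and values here are assumptions."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    for upper_bound, epsilon in steps:
        if probability <= upper_bound:
            return probability * epsilon
    raise ValueError("steps must cover probabilities up to 1.0")
```

With these assumed steps, a probability of 0.5 falls in the second range and is multiplied by 2.0, while a probability of 0.2 is multiplied by 1.0.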
By establishing the risk coefficient prediction model, this application obtains risk coefficients corresponding to different environmental data, so that the server can determine from the risk coefficient which verification picture data to send. The verification pictures may include non-perceptual verification pictures suited to a low risk coefficient, such as static pictures containing digits or text, and intelligence verification pictures suited to a high risk coefficient that require user participation, such as sliding puzzles and quiz questions.
For example, the risk coefficient prediction model established in this application can detect attack verification requests whose timing and request frequency differ from those of normal verification. As a further example, under normal circumstances users rarely send large numbers of verification requests between 2 a.m. and 5 a.m., and a user cannot switch IP addresses dynamically and frequently within a short period. The risk coefficient prediction model can therefore use the timing characteristics of verification requests to mark the terminal device's risk coefficient as high quickly, so that verification data suited to a high risk coefficient is sent to the user.
In an embodiment of the present application for establishing the risk coefficient prediction model, acquiring the sample data of the sample users through verification requests sent by sample users, an automated verification device, and a crawler algorithm includes:
acquiring the normal request data of a sample user, obtaining the environmental data of the sample user from the normal request data, and using the normal request data and the environmental data as normal sample data;
simulating the verification requests of abnormal sample users with an automated verification device to obtain sample request data corresponding to the abnormal sample users; acquiring the environmental data of the automated verification device with a crawler algorithm to obtain environmental data corresponding to the abnormal sample users; and using the sample request data and environmental data of the abnormal sample users as abnormal sample data;
Labeling the sample data to obtain the sample data label set, and generating the distinguishing feature set according to the sample data label set, includes:
labeling the normal sample data and the abnormal sample data separately to obtain a normal sample data label set and an abnormal sample data label set;
generating the distinguishing feature set according to the normal sample data label set and the abnormal sample data label set.
In this embodiment, simulating the verification requests of abnormal sample users with an automated verification device quickly yields a large amount of sample data, which speeds up establishing the risk coefficient prediction model. Normal request data means that the verification request itself is normal; if a normal sample user issues an abnormal verification request, that request is not normal request data. An abnormal sample user is a sample user whose risk coefficient may be high; such a user's verification requests may be normal or abnormal. The risk coefficient prediction model of this application distinguishes normal verification requests from abnormal ones, that is, determines whether the risk coefficient of the abnormal sample user is low or high.
This application proposes a method for sending network verification data, shown in FIG. 2, that balances the complexity of the user verification method against user experience. It includes the following steps:
Step S1: Receive a verification request from the target user;
Step S2: Acquire the environmental data of the target user's terminal device;
Step S3: Obtain the risk coefficient of the target user's terminal device according to the environmental data;
Step S4: If the risk coefficient is lower than a preset risk threshold, send non-perceptual verification data to the target user; if the risk coefficient is not lower than the preset risk threshold, send intelligence verification data to the target user.
When a verification request is received from the target user, the server acquires the environmental data of the target user's terminal device. The environmental data can be obtained through a plug-in program, a crawler algorithm, or similar means, or sent to the server together with the verification request. The server calculates the risk coefficient of the target user's terminal device from the environmental data and sends the terminal device the verification data that corresponds to the level of the risk coefficient. This application can thus send relatively simple non-perceptual verification data to terminal devices with a low risk coefficient, which reduces the verification difficulty for normal users and improves user experience. At the same time, it can send intelligence verification data to terminal devices with a high risk coefficient, which prevents machine users from attacking the server maliciously through verification requests, filters out abnormal verification requests, and improves the server's resistance to malicious attacks.
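Steps S1 to S4 can be sketched as a single server-side function. This Python sketch makes several assumptions: the risk model is any callable that maps environmental data to a number, the threshold default of 0.5 is arbitrary, and the `VerificationData` type and payload names are hypothetical placeholders rather than anything defined in the application.

```python
from dataclasses import dataclass
from typing import Callable, Mapping


@dataclass(frozen=True)
class VerificationData:
    kind: str     # "non_perceptual" or "intelligence"
    payload: str  # placeholder for the picture or challenge to send


def handle_verification_request(
    environment: Mapping[str, object],
    risk_model: Callable[[Mapping[str, object]], float],
    risk_threshold: float = 0.5,
) -> VerificationData:
    """S1/S2: the request has arrived and its environmental data is given.
    S3: compute the risk coefficient. S4: pick the verification data."""
    risk = risk_model(environment)
    if risk < risk_threshold:
        # Low risk: simple verification that needs no user effort.
        return VerificationData("non_perceptual", "static-image")
    # Not lower than the threshold: verification needing user participation.
    return VerificationData("intelligence", "sliding-puzzle")


# Stub risk model (an assumption): missing plug-ins are treated as risky.
def stub_model(env):
    return 0.9 if env.get("plugin_total", 0) == 0 else 0.1


low = handle_verification_request({"plugin_total": 5}, stub_model)
high = handle_verification_request({"plugin_total": 0}, stub_model)
```

A risk coefficient exactly equal to the threshold takes the intelligence branch, matching the "not lower than" wording of step S4.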
In another embodiment of the method for sending network verification data of the present application, sending intelligence verification data to the target user if the risk coefficient is not lower than the preset risk threshold includes:
if the risk coefficient is not lower than the preset risk threshold and not higher than a preset high-risk threshold, sending first intelligence verification data to the target user, where the preset high-risk threshold is greater than the preset risk threshold;
if the risk coefficient is higher than the preset high-risk threshold, sending second intelligence verification data to the target user.
This embodiment divides the risk coefficient into three levels and sends the first intelligence verification data and the second intelligence verification data to the two higher levels respectively: for example, the first intelligence verification data to terminal devices with a medium risk coefficient and the second intelligence verification data to terminal devices with a high risk coefficient. This refines the correspondence between environmental data and verification data and further improves user experience.
In another embodiment of the method for sending network verification data of the present application, obtaining the risk coefficient of the target user's terminal device according to the environmental data includes:
inputting the environmental data into the risk coefficient prediction model to obtain the risk coefficient of the target user's terminal device;
The risk coefficient prediction model is established as follows:
acquiring sample data of sample users, where the sample data includes the environmental data of the sample users;
labeling the sample data to obtain a sample data label set;
generating a distinguishing feature set according to the sample data label set;
establishing a classification model according to the distinguishing feature set and the naive Bayes algorithm based on the Gaussian distribution;
obtaining the risk coefficient prediction model according to the classification model.
The risk coefficient prediction model in this embodiment can be established on the server or obtained from other devices or networks; it is portable, which broadens the applicability of the method for sending network verification data.
Further, sending non-perceptual verification data to the target user if the risk coefficient is lower than the preset risk threshold, and sending intelligence verification data if it is not lower than the preset risk threshold, can be implemented as follows:
Setting a risk threshold: for example, set the risk threshold to 50%. When the probability output by the classification model is greater than 50%, the request is abnormal; when it is less than 50%, the request is normal. The model may also output 1 or -1 directly, where 1 means normal and -1 means abnormal. Different verification data are sent according to whether the result is normal or abnormal.
Setting risk levels: for example, a probability output by the classification model of 0-40% corresponds to low-risk level L1; 40%-60% corresponds to medium-risk level L2; and 60%-100% corresponds to high-risk level L3. Different verification data are sent according to the level. For example, level L1 can load the intelligent non-perceptual verification method, level L2 can randomly load either the sliding puzzle verification method or the method of recognizing text in a background picture, and level L3 can load the voice verification method.
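The risk-level variant can be sketched in Python as a lookup from probability to level to verification method. The level boundaries and method names come from the example above; the stated ranges share their endpoints, so the sketch assumes that a boundary value (exactly 40% or 60%) belongs to the higher level, and the string identifiers are hypothetical.

```python
import random

# Assumed identifiers for the verification methods named in the text.
METHODS_BY_LEVEL = {
    "L1": ("intelligent_non_perceptual",),
    "L2": ("sliding_puzzle", "text_in_background_image"),
    "L3": ("voice",),
}


def risk_level(probability):
    """Map the classifier's probability of 'abnormal' to L1, L2, or L3."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if probability < 0.4:
        return "L1"  # low risk
    if probability < 0.6:
        return "L2"  # medium risk
    return "L3"      # high risk


def choose_method(probability, rng=random):
    """Pick a verification method for the level; L2 chooses at random
    between its two candidate methods."""
    return rng.choice(METHODS_BY_LEVEL[risk_level(probability)])
```

Passing a seeded `random.Random` instance as `rng` makes the L2 choice reproducible, which is convenient when testing.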
This application also proposes a device for sending network verification data. As shown in FIG. 3, the device includes:
a verification request receiving module 1, configured to receive a verification request from the target user;
an environmental data acquisition module 2, configured to acquire the environmental data of the target user's terminal device;
a risk coefficient calculation module 3, configured to obtain the risk coefficient of the target user's terminal device according to the environmental data;
a verification data sending module 4, configured to send non-perceptual verification data to the target user when the risk coefficient is lower than the preset risk threshold, and to send intelligence verification data to the target user when the risk coefficient is not lower than the preset risk threshold.
An embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, it implements the method for sending network verification data described in any one of the above. The storage medium includes, but is not limited to, any type of disk (including floppy disks, hard disks, optical disks, CD-ROMs, and magneto-optical disks), ROM (Read-Only Memory), RAM (Random Access Memory), EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory, magnetic cards, or optical cards. That is, the storage medium includes any medium in which a device (for example, a computer) stores or transmits information in a readable form. It may be a read-only memory, a magnetic disk, an optical disk, or the like.
An embodiment of the present application also provides a server, which includes:
one or more processors;
a storage device, configured to store one or more programs,
where the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method for sending network verification data described in any one of the above.
FIG. 4 is a schematic structural diagram of the server of this application, which includes a processor 320, a storage device 330, an input unit 340, a display unit 350, and other components. Those skilled in the art will understand that the components shown in FIG. 4 do not limit all servers, which may include more or fewer components than shown or combine certain components. The storage device 330 may be used to store the application program 310 and the functional modules; the processor 320 runs the application program 310 stored in the storage device 330 and thereby performs the device's functional applications and data processing. The storage device 330 may be an internal memory, an external memory, or both. The internal memory may include read-only memory, programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), flash memory, or random access memory. The external memory may include hard disks, floppy disks, ZIP disks, USB flash drives, magnetic tapes, and the like. The storage devices disclosed in this application include but are not limited to these types. The storage device 330 disclosed in this application is an example only and not a limitation.
The input unit 340 is used to receive signal input, and to receive the user attribute information of the target user on the first statistical date and the access information for the specified target. The input unit 340 may include a touch panel and other input devices. The touch panel collects the user's touch operations on or near it (for example, operations performed on or near the touch panel with a finger, a stylus, or any other suitable object or accessory) and drives the corresponding connected device according to a preset program; the other input devices may include, but are not limited to, one or more of a physical keyboard, function keys (such as playback control keys and switch keys), a trackball, a mouse, and a joystick. The display unit 350 may be used to display information entered by the user, information provided to the user, and the various menus of the computer device. The display unit 350 may take the form of a liquid crystal display, an organic light-emitting diode display, or the like. The processor 320 is the control center of the computer device: it connects the parts of the whole computer through various interfaces and lines, and performs functions and processes data by running or executing the software programs and/or modules stored in the storage device 330 and calling the data stored in the storage device.
In one implementation, the server includes one or more processors 320, one or more storage devices 330, and one or more application programs 310, where the one or more application programs 310 are stored in the storage device 330 and configured to be executed by the one or more processors 320, and the one or more application programs 310 are configured to execute the method for sending network verification data described in the above embodiments.
It should be understood that although the steps in the flowcharts of the drawings are shown in the sequence indicated by the arrows, they are not necessarily executed in that sequence. Unless explicitly stated herein, the execution order of these steps is not strictly limited, and they may be executed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same time but may be executed at different times, and they are not necessarily executed one after another but may be executed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
It should be understood that the functional units in the embodiments of the present application may be integrated into one processing module, may each exist physically on their own, or may be integrated two or more to a module. The integrated modules may be implemented in hardware or as software functional modules.
The above are only some of the implementations of this application. It should be noted that those of ordinary skill in the art can make improvements and refinements without departing from the principle of this application, and these improvements and refinements should also be regarded as falling within the scope of protection of this application.
Claims (20)
- A method for sending network verification data, characterized in that it comprises the steps of:
  receiving a verification request from a target user;
  acquiring environmental data of the target user's terminal device;
  obtaining a risk coefficient of the target user's terminal device according to the environmental data;
  if the risk coefficient is lower than a preset risk threshold, sending non-perceptual verification data to the target user; if the risk coefficient is not lower than the preset risk threshold, sending intelligence verification data to the target user.
- The method according to claim 1, characterized in that sending intelligence verification data to the target user if the risk coefficient is not lower than the preset risk threshold comprises:
  if the risk coefficient is not lower than the preset risk threshold and not higher than a preset high-risk threshold, sending first intelligence verification data to the target user, the preset high-risk threshold being greater than the preset risk threshold;
  if the risk coefficient is higher than the preset high-risk threshold, sending second intelligence verification data to the target user.
- 根据权利要求1所述的方法,其特征在于,所述根据所述环境数据,得到目标用户终端设备的风险系数之前,还包括建立风险系数预测模型,所述建立风险系数预测模型包括如下步骤:The method according to claim 1, characterized in that, before obtaining the risk coefficient of the target user terminal device according to the environmental data, it further comprises establishing a risk coefficient prediction model, and the establishing the risk coefficient prediction model comprises the following steps:获取样本用户的样本数据,所述样本数据包括样本用户的环境数据;Acquiring sample data of a sample user, where the sample data includes environmental data of the sample user;标注所述样本数据,得到样本数据标注集,根据所述样本数据标注集生成区分特征集;Label the sample data to obtain a sample data label set, and generate a distinguishing feature set according to the sample data label set;根据所述区分特征集和基于高斯分布的朴素贝叶斯算法,建立分类模型;Establishing a classification model according to the distinguishing feature set and the Naive Bayes algorithm based on Gaussian distribution;根据所述分类模型,得到风险系数预测模型;According to the classification model, a risk coefficient prediction model is obtained;所述根据所述环境数据,得到目标用户终端设备的风险系数,包括:The obtaining the risk coefficient of the target user terminal device according to the environmental data includes:将所述环境数据输入所述风险系数预测模型,得到目标用户终端设备的风险系数。The environmental data is input into the risk coefficient prediction model to obtain the risk coefficient of the target user terminal device.
- The method according to claim 3, characterized in that acquiring the sample data of the sample users comprises:
  acquiring normal request data of a sample user, obtaining environmental data of the sample user from the normal request data, and using the normal request data and the environmental data as normal sample data;
  simulating the verification requests of abnormal sample users with an automated verification device to obtain sample request data corresponding to the abnormal sample users; acquiring the environmental data of the automated verification device with a crawler algorithm to obtain environmental data corresponding to the abnormal sample users; and using the sample request data and environmental data of the abnormal sample users as abnormal sample data;
  and labeling the sample data to obtain the sample data label set and generating the distinguishing feature set according to the sample data label set comprises:
  labeling the normal sample data and the abnormal sample data separately to obtain a normal sample data label set and an abnormal sample data label set;
  generating the distinguishing feature set according to the normal sample data label set and the abnormal sample data label set.
- The method according to claim 3, characterized in that the environmental data of the sample user includes one or more of: device type, hardware model data, pixel ratio data, color depth data, screen resolution data, available screen resolution data, operating system data, time data, touch data, specific plug-in word count, software installation data, browser data, and stack fingerprint data.
- The method according to claim 5, characterized in that the operating system data includes the total number of logical processors available to the user, the font data includes the total number of fonts on the terminal device, the available screen resolution data includes the product of the available screen resolutions of the terminal device in the X and Y directions, and the browser data includes the total number of plug-ins installed in the browser;
  before generating the distinguishing feature set according to the sample data label set, the method further comprises:
  calculating the range, quartiles, interquartile range, and five-number summary of the total number of logical processors available to the user, the total number of fonts on the terminal device, the product of the available screen resolutions, and the total number of plug-ins;
  obtaining derived features for identifying outliers from the range, quartiles, interquartile range, and five-number summary;
  filtering the sample data label set according to the derived features.
- The method according to claim 3, characterized in that the features in the distinguishing feature set include any one or more of the following: browser language; pixel ratio of the terminal device; color depth of the terminal device; X-direction screen resolution; Y-direction screen resolution; X-direction available screen resolution; Y-direction available screen resolution; a first product of the screen resolutions in the X and Y directions; a second product of the available screen resolutions in the X and Y directions; the difference between the first product and the second product; whether the CPU type is known; whether browser plug-ins are missing; whether the font list detected with JS/CSS is missing; whether the operating system is unknown; whether the WebGL vendor information is missing; whether the browser type is robot; whether the browser manufacturer is other; whether the operating system manufacturer is other; browser plug-ins; total number of browser plug-ins; total number of fonts detected with JS/CSS; whether the operating system and the system platform are consistent; whether an audio stack fingerprint is provided; parameter information of the audio stack fingerprint; total number of logical processors the system makes available to the user agent; whether AdBlock is installed; whether the user has tampered with the language; whether the user has tampered with the screen resolution; whether the user has tampered with the operating system; browser manufacturer; operating system manufacturer; access device type; whether the operating system and the system platform are consistent.
- A device for sending network verification data, comprising:
  a verification request receiving module, configured to receive a verification request from a target user;
  an environmental data acquisition module, configured to acquire environmental data of the target user's terminal device;
  a risk coefficient calculation module, configured to obtain a risk coefficient of the target user's terminal device according to the environmental data; and
  a verification data sending module, configured to send non-perception verification data to the target user when the risk coefficient is lower than a preset risk threshold, and to send intelligence verification data to the target user when the risk coefficient is not lower than the preset risk threshold.
- The device according to claim 8, wherein the verification data sending module, when sending intelligence verification data to the target user because the risk coefficient is not lower than the preset risk threshold, is specifically configured to:
  send first intelligence verification data to the target user if the risk coefficient is not lower than the preset risk threshold and not higher than a preset high-risk threshold, the preset high-risk threshold being greater than the preset risk threshold; and
  send second intelligence verification data to the target user if the risk coefficient is higher than the preset high-risk threshold.
- The device according to claim 8, wherein the risk coefficient calculation module is further configured to establish a risk coefficient prediction model, and, when establishing the risk coefficient prediction model, is specifically configured to:
  acquire sample data of sample users, the sample data including environmental data of the sample users;
  label the sample data to obtain a sample data label set, and generate a distinguishing feature set according to the sample data label set;
  establish a classification model according to the distinguishing feature set and a naive Bayes algorithm based on the Gaussian distribution; and
  obtain the risk coefficient prediction model according to the classification model;
  and wherein the risk coefficient calculation module, when obtaining the risk coefficient of the target user's terminal device according to the environmental data, is specifically configured to:
  input the environmental data into the risk coefficient prediction model to obtain the risk coefficient of the target user's terminal device.
- The device according to claim 10, wherein the risk coefficient calculation module, when acquiring the sample data of the sample users, is specifically configured to:
  acquire normal request data of a sample user, obtain the environmental data of the sample user from the normal request data, and use the normal request data and the environmental data as normal sample data; and
  simulate verification requests of an abnormal sample user with an automated verification device to obtain sample request data corresponding to the abnormal sample user; acquire the environmental data of the automated verification device through a crawler algorithm to obtain environmental data corresponding to the abnormal sample user; and use the sample request data and the environmental data of the abnormal sample user as abnormal sample data;
  and wherein the risk coefficient calculation module, when labeling the sample data to obtain the sample data label set and generating the distinguishing feature set according to the sample data label set, is specifically configured to:
  label the normal sample data and the abnormal sample data separately to obtain a normal sample data label set and an abnormal sample data label set; and
  generate the distinguishing feature set according to the normal sample data label set and the abnormal sample data label set.
- The device according to claim 10, wherein the environmental data of the sample users includes one or more of: device type, hardware model data, pixel ratio data, color depth data, screen resolution data, available screen resolution data, operating system data, time data, touch data, specific plug-in words, software installation data, browser data, and stack fingerprint data.
- The device according to claim 12, wherein the operating system data includes the total number of logical processors available to the user, the font data includes the total number of fonts on the terminal device, the available screen resolution data includes the product of the available screen resolutions of the terminal device in the X direction and the Y direction, and the browser data includes the total number of plug-ins installed in the browser;
  the risk coefficient calculation module is further configured to:
  calculate the range, quartiles, interquartile range, and five-number summary of the total number of logical processors available to the user, the total number of fonts on the terminal device, the product of the available screen resolutions, and the total number of plug-ins;
  obtain, from the range, quartiles, interquartile range, and five-number summary, derived features for identifying outliers; and
  filter the sample data label set according to the derived features.
- The device according to claim 10, wherein the features in the distinguishing feature set include any one or more of the following: the browser language, the pixel ratio of the terminal device, the color depth of the terminal device, the X-direction screen resolution, the Y-direction screen resolution, the X-direction available screen resolution, the Y-direction available screen resolution, a first product of the X-direction and Y-direction screen resolutions, a second product of the X-direction and Y-direction available screen resolutions, the difference between the first product and the second product, whether the CPU type is known, whether the browser plug-ins are missing, whether the font list detected using JS/CSS is missing, whether the operating system is unknown, whether the WebGL vendor information is missing, whether the browser type is robot, whether the browser manufacturer is other, whether the operating system manufacturer is other, the browser plug-ins, the total number of browser plug-ins, the total number of fonts detected using JS/CSS, whether the operating system and the system platform are consistent, whether an audio stack fingerprint is provided, the parameter information of the audio stack fingerprint, the total number of logical processors the system makes available to the user agent, whether AdBlock is installed, whether the user has tampered with the language, whether the user has tampered with the screen resolution, whether the user has tampered with the operating system, the browser manufacturer, the operating system manufacturer, the access device type, and whether the operating system and the system platform are consistent.
- A non-volatile computer-readable storage medium storing a computer program, wherein the program, when executed by a processor, implements the method for sending network verification data according to any one of claims 1 to 7.
- A server, comprising:
  one or more processors; and
  a storage device for storing one or more programs,
  wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the following steps:
  receiving a verification request from a target user;
  acquiring environmental data of the target user's terminal device;
  obtaining a risk coefficient of the target user's terminal device according to the environmental data; and
  sending non-perception verification data to the target user if the risk coefficient is lower than a preset risk threshold, and sending intelligence verification data to the target user if the risk coefficient is not lower than the preset risk threshold.
- The server according to claim 16, wherein the processor, when implementing the sending of intelligence verification data to the target user if the risk coefficient is not lower than the preset risk threshold, specifically executes the following steps:
  sending first intelligence verification data to the target user if the risk coefficient is not lower than the preset risk threshold and not higher than a preset high-risk threshold, the preset high-risk threshold being greater than the preset risk threshold; and
  sending second intelligence verification data to the target user if the risk coefficient is higher than the preset high-risk threshold.
- The server according to claim 16, wherein the processor is further configured to establish a risk coefficient prediction model before obtaining the risk coefficient of the target user's terminal device according to the environmental data, and, when establishing the risk coefficient prediction model, specifically executes the following steps:
  acquiring sample data of sample users, the sample data including environmental data of the sample users;
  labeling the sample data to obtain a sample data label set, and generating a distinguishing feature set according to the sample data label set;
  establishing a classification model according to the distinguishing feature set and a naive Bayes algorithm based on the Gaussian distribution; and
  obtaining the risk coefficient prediction model according to the classification model;
  and wherein the processor, when obtaining the risk coefficient of the target user's terminal device according to the environmental data, specifically executes the following step:
  inputting the environmental data into the risk coefficient prediction model to obtain the risk coefficient of the target user's terminal device.
- The server according to claim 18, wherein the processor, when implementing the acquiring of the sample data of the sample users, specifically executes the following steps:
  acquiring normal request data of a sample user, obtaining the environmental data of the sample user from the normal request data, and using the normal request data and the environmental data as normal sample data; and
  simulating verification requests of an abnormal sample user with an automated verification device to obtain sample request data corresponding to the abnormal sample user; acquiring the environmental data of the automated verification device through a crawler algorithm to obtain environmental data corresponding to the abnormal sample user; and using the sample request data and the environmental data of the abnormal sample user as abnormal sample data;
  and wherein the processor, when implementing the labeling of the sample data to obtain the sample data label set and the generating of the distinguishing feature set according to the sample data label set, specifically executes the following steps:
  labeling the normal sample data and the abnormal sample data separately to obtain a normal sample data label set and an abnormal sample data label set; and
  generating the distinguishing feature set according to the normal sample data label set and the abnormal sample data label set.
- The server according to claim 18, wherein the operating system data includes the total number of logical processors available to the user, the font data includes the total number of fonts on the terminal device, the available screen resolution data includes the product of the available screen resolutions of the terminal device in the X direction and the Y direction, and the browser data includes the total number of plug-ins installed in the browser;
  the processor, before generating the distinguishing feature set according to the sample data label set, is further configured to implement the following steps:
  calculating the range, quartiles, interquartile range, and five-number summary of the total number of logical processors available to the user, the total number of fonts on the terminal device, the product of the available screen resolutions, and the total number of plug-ins;
  obtaining, from the range, quartiles, interquartile range, and five-number summary, derived features for identifying outliers; and
  filtering the sample data label set according to the derived features.
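
The two-threshold dispatch recited in the device and server claims above can be illustrated with a short Python sketch. The names `Challenge` and `select_challenge` and the default threshold values 0.3 and 0.7 are hypothetical; the claims only require that the preset high-risk threshold be greater than the preset risk threshold.

```python
from enum import Enum


class Challenge(Enum):
    """Kinds of verification data named in the claims (enum itself is illustrative)."""
    NON_PERCEPTION = "non-perception verification data"
    FIRST_INTELLIGENCE = "first intelligence verification data"
    SECOND_INTELLIGENCE = "second intelligence verification data"


def select_challenge(risk: float,
                     risk_threshold: float = 0.3,       # assumed value
                     high_risk_threshold: float = 0.7   # assumed value
                     ) -> Challenge:
    """Map a risk coefficient to the verification data to send."""
    if not risk_threshold < high_risk_threshold:
        raise ValueError("high_risk_threshold must be greater than risk_threshold")
    if risk < risk_threshold:
        # Below the preset risk threshold: no visible challenge.
        return Challenge.NON_PERCEPTION
    if risk <= high_risk_threshold:
        # Not lower than the risk threshold and not higher than the high-risk one.
        return Challenge.FIRST_INTELLIGENCE
    # Strictly above the high-risk threshold.
    return Challenge.SECOND_INTELLIGENCE
```

Note the boundary handling: a coefficient exactly equal to either threshold falls into the first intelligence verification tier, which follows the "not lower than" and "not higher than" wording of the claims.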
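
The outlier-filtering step in the claims above (range, quartiles, interquartile range, and five-number summary) could look roughly like the following Python sketch. The claims do not say how the derived features flag an outlier, so the 1.5 × IQR fence used here is an assumption, as are the function names and the dictionary-based sample layout.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a numeric column."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


def spread_statistics(values):
    """Range and interquartile range, derived from the five-number summary."""
    low, q1, _, q3, high = five_number_summary(values)
    return {"range": high - low, "iqr": q3 - q1}


def iqr_outlier_flags(values, k=1.5):
    """Derived feature: True where a value lies outside the IQR fences.

    The fence multiplier k=1.5 is an assumed convention, not taken from the claims.
    """
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v < lower or v > upper for v in values]


def filter_samples(samples, field, k=1.5):
    """Drop labelled samples whose `field` value is flagged as an outlier."""
    flags = iqr_outlier_flags([s[field] for s in samples], k)
    return [s for s, flagged in zip(samples, flags) if not flagged]
```

The same filter would be applied in turn to each of the four counts named in the claims: logical processors, fonts, available screen resolution product, and plug-ins.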
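
The claims above build the classification model with a naive Bayes algorithm based on the Gaussian distribution, but do not state how the risk coefficient is read off the model. The following Python sketch assumes the risk coefficient is the posterior probability of the abnormal class; that choice, the function names, the `"abnormal"` label, and the small variance floor are all illustrative assumptions.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows, labels, var_floor=1e-9):
    """Fit per-class log prior, feature means, and feature variances."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {}
    for label, members in grouped.items():
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        # var_floor avoids division by zero for constant features (assumed value).
        variances = [
            sum((x - m) ** 2 for x in col) / n + var_floor
            for col, m in zip(columns, means)
        ]
        model[label] = (math.log(n / len(rows)), means, variances)
    return model


def risk_coefficient(model, row, abnormal_label="abnormal"):
    """Posterior probability of the abnormal class for one feature vector."""
    log_scores = {}
    for label, (log_prior, means, variances) in model.items():
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
            for x, m, v in zip(row, means, variances)
        )
        log_scores[label] = log_prior + log_likelihood
    # Subtract the peak before exponentiating for numerical stability.
    peak = max(log_scores.values())
    total = sum(math.exp(s - peak) for s in log_scores.values())
    return math.exp(log_scores[abnormal_label] - peak) / total
```

In use, each row would hold numeric features from the distinguishing feature set, for example the plug-in total and the font total, and the returned value would be compared against the preset risk thresholds.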
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910115365.1A CN109981567A (en) | 2019-02-13 | 2019-02-13 | Sending method, device, storage medium and the server of network authorization data |
CN201910115365.1 | 2019-02-13 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020164274A1 (en) | 2020-08-20 |
Family
ID=67076996
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/117939 WO2020164274A1 (en) | 2019-02-13 | 2019-11-13 | Network verification data sending method and apparatus, and storage medium and server |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN109981567A (en) |
WO (1) | WO2020164274A1 (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210240837A1 (en) * | 2020-02-04 | 2021-08-05 | Pindrop Security, Inc. | Dynamic account risk assessment from heterogeneous events |
CN115348486A (en) * | 2022-08-08 | 2022-11-15 | 青岛佳世特尔智创科技有限公司 | Wireless data management method and system applied to petrochemical industry system |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109981567A (en) * | 2019-02-13 | 2019-07-05 | 平安科技(深圳)有限公司 | Sending method, device, storage medium and the server of network authorization data |
CN112395589A (en) * | 2019-08-13 | 2021-02-23 | 北京默契破冰科技有限公司 | Method, apparatus, and computer storage medium for detecting abnormal user |
CN111104664B (en) * | 2019-11-29 | 2022-03-15 | 北京云测信息技术有限公司 | Risk identification method of electronic equipment and server |
CN113301033B (en) * | 2021-05-14 | 2023-05-02 | 杭州顶象科技有限公司 | Verification code display method and system for lightweight business intrusion |
CN114553541B (en) * | 2022-02-17 | 2024-02-06 | 苏州良医汇网络科技有限公司 | Method, device, equipment and storage medium for checking anti-crawlers in grading mode |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107256257A (en) * | 2017-06-12 | 2017-10-17 | 上海携程商务有限公司 | Abnormal user generation content identification method and system based on business datum |
CN107749844A (en) * | 2017-10-16 | 2018-03-02 | 维沃移动通信有限公司 | Auth method and mobile terminal |
US20180255097A1 (en) * | 2015-11-05 | 2018-09-06 | Alibaba Group Holding Limited | Method and device for application information risk management |
CN109120629A (en) * | 2018-08-31 | 2019-01-01 | 新华三信息安全技术有限公司 | A kind of abnormal user recognition methods and device |
CN109981567A (en) * | 2019-02-13 | 2019-07-05 | 平安科技(深圳)有限公司 | Sending method, device, storage medium and the server of network authorization data |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104580089A (en) * | 2013-10-18 | 2015-04-29 | 深圳市腾讯计算机系统有限公司 | User verification method and mobile terminal |
CN108449313B (en) * | 2018-02-01 | 2021-02-19 | 平安科技(深圳)有限公司 | Electronic device, Internet service system risk early warning method and storage medium |
CN109034584A (en) * | 2018-07-17 | 2018-12-18 | 国网浙江杭州市临安区供电有限公司 | Power supply station's honesty risk Analysis of Potential method based on big data |
CN108876600B (en) * | 2018-08-20 | 2023-09-05 | 平安科技(深圳)有限公司 | Early warning information pushing method, device, computer equipment and medium |
CN109327439B (en) * | 2018-09-29 | 2021-04-23 | 武汉极意网络科技有限公司 | Risk identification method and device for service request data, storage medium and equipment |
CN109255230A (en) * | 2018-09-29 | 2019-01-22 | 武汉极意网络科技有限公司 | Recognition methods, system, user equipment and the storage medium of abnormal verifying behavior |
2019
- 2019-02-13 CN CN201910115365.1A patent/CN109981567A/en active Pending
- 2019-11-13 WO PCT/CN2019/117939 patent/WO2020164274A1/en active Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN109981567A (en) | 2019-07-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2020164274A1 (en) | Network verification data sending method and apparatus, and storage medium and server | |
US11310257B2 (en) | Anomaly scoring using collaborative filtering | |
WO2020164268A1 (en) | Verification code generation method and apparatus, and storage medium and computer device | |
US10715570B1 (en) | Generic event stream processing for machine learning | |
WO2019153604A1 (en) | Device and method for creating human/machine identification model, and computer readable storage medium | |
US9727723B1 (en) | Recommendation system based approach in reducing false positives in anomaly detection | |
US10740411B2 (en) | Determining repeat website users via browser uniqueness tracking | |
EP2638452B1 (en) | Resolving merged touch contacts | |
CN109376078A (en) | Test method, terminal device and the medium of mobile application | |
KR102513334B1 (en) | Image verification method and apparatus, electronic device and computer-readable storage medium | |
JP2016503219A (en) | System and method for cognitive behavior recognition | |
US9563763B1 (en) | Enhanced captchas | |
US20230035104A1 (en) | Verification method, apparatus and device, and storage medium | |
US20240291847A1 (en) | Security risk remediation tool | |
CN108156127B (en) | Network attack mode judging device, judging method and computer readable storage medium thereof | |
US11347842B2 (en) | Systems and methods for protecting a remotely hosted application from malicious attacks | |
CN110572402A (en) | internet hosting website detection method and system based on network access behavior analysis and readable storage medium | |
US9507621B1 (en) | Signature-based detection of kernel data structure modification | |
CN115688112A (en) | Industrial control risk assessment method, device, equipment and storage medium | |
CN114531294A (en) | Network anomaly sensing method and device, terminal and storage medium | |
CN113765924A (en) | Safety monitoring method, terminal and equipment based on cross-server access of user | |
US20240248971A1 (en) | Same person detection of end users based on input device data | |
US11882143B1 (en) | Cybersecurity system and method for protecting against zero-day attacks | |
US20240338447A1 (en) | Automated attack chain following by a threat analysis platform | |
WO2023132061A1 (en) | Training method, information processing device, and training program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 19915374 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 19915374 Country of ref document: EP Kind code of ref document: A1 |