WO2020143322A1 - User request detection method, device, computer equipment, and storage medium - Google Patents

User request detection method, device, computer equipment, and storage medium

Info

Publication number
WO2020143322A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature
combined
user request
data
feature set
Prior art date
Application number
PCT/CN2019/118396
Other languages
English (en)
French (fr)
Inventor
黎立桂
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2020143322A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/40 Network security protocols

Definitions

  • the embodiments of the present application relate to the financial field, and in particular, to a user request detection method, device, computer equipment, and storage medium.
  • Dividend insurance refers to an insurance product in which an insurance company allocates the distributable surplus of this type of dividend insurance in the previous fiscal year to customers in the form of cash dividends or value-added dividends in a certain proportion after the end of each fiscal year.
  • the inventors of this application found in their research that, in the prior art, when insurance types are introduced manually, the quality of the introduction is uneven because it depends on the competence of the personnel and their mastery of the required skills. In particular, for the benefits at different stages and the calculation of dividends at different dividend levels, staff often fail to give quick answers based on the actual needs of users, resulting in the loss of customers.
  • Embodiments of the present application provide a user request detection method, device, computer equipment, and storage medium.
  • a technical solution adopted by the embodiments of this application is to provide a user request detection method, including the following steps:
  • inputting the feature set into a plurality of anomaly detection models according to the type of the feature set, obtaining a detection result for judging whether the user request is abnormal, and using a preset judgment method to judge the detection result to determine whether the user request is abnormal, wherein the anomaly detection model is a detection model trained in advance to a converged state using a positive sample or negative sample feature set, and is used to perform security classification on the terminal through the feature set;
  • using the preset score feature to construct a feature set from the combined features extracted from the device data includes:
  • the combined feature is added to the negative sample set.
  • embodiments of the present application also provide a user request detection device, including:
  • the acquisition module is used to acquire the device data of the terminal sending the user request;
  • a processing module configured to use a preset score feature to construct a feature set from the combined features extracted from the device data;
  • the execution module is configured to input the feature set into a plurality of anomaly detection models according to the type of the feature set, obtain a detection result for judging whether the user request is abnormal, and apply a preset judgment method to the detection result to determine whether the user request is abnormal, wherein the anomaly detection model is a detection model trained in advance to a converged state using a positive sample or negative sample feature set, and is used to perform security classification on the terminal through the feature set.
  • the processing module includes: a first acquisition sub-module for extracting combined features from the device data; a first processing sub-module for comparing the combined features with preset score features; a first execution sub-module for adding the combined feature to the positive sample set when the combined feature is consistent with the score feature; and a second execution sub-module for adding the combined feature to the negative sample set when the combined feature is not consistent with the score feature.
  • embodiments of the present application further provide a computer device, including a memory and a processor, and the memory stores computer-readable instructions.
  • the processor executes the steps of the user request detection method described above.
  • embodiments of the present application also provide a storage medium storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to execute the steps of the user request detection method described above.
  • the beneficial effects of the embodiments of the present application are: constructing a feature set by using score features on the combined features extracted from the device data, and inputting the feature set into the anomaly detection model according to the type of the feature set.
  • By constructing feature sets from text-based device data with nominal attributes, this method can mine an effective classification feature set and improve recognition accuracy.
  • the judgment method is used to judge the output results of multiple models to obtain more comprehensive detection results, which effectively avoids the one-sidedness of a single model, reduces the inaccuracy of a single anomaly detection model caused by sample imbalance, and improves the accuracy of anomaly detection.
  • FIG. 1 is a schematic flowchart of a user request detection method provided by an embodiment of the present application.
  • FIG. 2 is a block diagram of a basic structure of a user request detection device provided by an embodiment of the present application.
  • FIG. 3 is a block diagram of a basic structure of a computer device provided by an embodiment of the present application.
  • "terminal" and "terminal device" used here include both devices that have only a wireless signal receiver without transmitting capability, and devices that have both receiving and transmitting hardware.
  • Such equipment may include: cellular or other communication devices, with a single-line display, a multi-line display, or no multi-line display; PCS (Personal Communications Service) devices, which can combine voice, data processing, fax and/or data communication capabilities; PDAs (Personal Digital Assistants), which may include radio frequency receivers, pagers, internet/intranet access, web browsers, notepads, calendars and/or GPS (Global Positioning System) receivers; and conventional laptop and/or palmtop computers or other devices that have and/or include a radio frequency receiver.
  • "terminal" and "terminal equipment" may be portable, transportable, installed in a vehicle (aeronautical, maritime, and/or terrestrial), or adapted and/or configured to operate locally and/or in a distributed form at any other location on the earth and/or in space.
  • the "terminal" and "terminal device" used here may also be a communication terminal, an Internet terminal, or a music/video playback terminal, for example, a PDA, an MID (Mobile Internet Device) and/or a mobile phone with a music/video playback function, and may also be a smart TV, a set-top box, or a similar device.
  • the client terminal in this embodiment is the above-mentioned terminal.
  • FIG. 1 is a schematic flowchart of the user request detection method in this embodiment.
  • the user request detection method includes the following steps:
  • the user request is a request sent by the terminal to the server, where the user request may be a registration request or a verification request.
  • the registration request contains the device data of the terminal that sent the registration request.
  • the verification request includes the identification code of the terminal that sent the verification request, and the server queries the database for pre-stored device data through the identification code.
  • the device data of the terminal may also be obtained through JavaScript scripts. It should be noted that the device data of the terminal includes: device type, brand, system type, version, resolution, IP address, etc.
  • the combined feature is extracted from the device data, where a combined feature may be a combination of several of the device type, brand, system type, version, resolution, IP address, and so on. Each combined feature is assigned a value, and a distinguishing value, that is, the score feature, is set according to the value range of the combined features. The values of all combined features are then divided into positive sample features and negative sample features according to the positive and negative sample distribution, and all positive and negative sample features together are used as the feature set.
  • the first device type and the first system type are used as the first combined feature and assigned a value of 1, and the second device type and the second system type are used as the second combined feature and assigned a value of 2. With 1 determined as the value point of the score feature, the combined feature of the first device type and the first system type is placed in a sample set marked 0, the positive sample set, and the combined feature of the second device type and the second system type is placed in a sample set marked 1, the negative sample set.
  • the combined features of the device brand, system type, version, resolution, IP address, etc. are added to the positive sample set and the negative sample set in sequence according to the above method.
  • the value of the combined feature can be determined according to a preset value method, for example, a preset value can be selected for the actual device data.
  • the features can likewise be combined according to a preset method, and the number of features combined is not limited. In general, features can be combined according to the actual application, for example, a combined feature formed from the iOS system and the Apple brand.
  • the above method for constructing feature sets can effectively convert complex text-type device data into 0-1 binary features, generate distinctive feature sets, and mine effective classification feature sets.
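The construction described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the feature names, the `ASSIGNED_VALUES` table, and the rule that an unknown combination counts as negative are assumptions made for the example.

```python
# Hypothetical assigned values for combined features (device type, system type).
# The names and numbers are illustrative assumptions.
ASSIGNED_VALUES = {
    ("device_type_1", "system_type_1"): 1,
    ("device_type_2", "system_type_2"): 2,
}
SCORE_FEATURE = 1  # preset value that separates positive from negative


def build_feature_set(combined_features):
    """Split combined features into a positive set (label 0) and a negative set (label 1)."""
    positive, negative = [], []
    for feature in combined_features:
        value = ASSIGNED_VALUES.get(feature)  # None for unknown combinations
        if value == SCORE_FEATURE:
            positive.append((feature, 0))
        else:
            # Assumption: anything that does not match the score feature is negative.
            negative.append((feature, 1))
    return positive, negative


positive_set, negative_set = build_feature_set(list(ASSIGNED_VALUES))
```

The 0/1 labels give the binary features mentioned above; the two lists together form the feature set.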
  • the anomaly detection model is a detection model that is trained in advance using a positive sample or a negative sample feature set to a converged state, and is used to perform security classification on the terminal through the feature set.
  • a positive feature set or a negative feature set can be selected to be input into the anomaly detection model.
  • when the feature set is a positive feature set, the feature set is input to the anomaly detection model trained with positive sample features; when the feature set is a negative feature set, the feature set is input to the anomaly detection model trained with negative sample features.
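A minimal sketch of this routing step, assuming the models are plain callables grouped by the sample type they were trained on. The `models` dictionary and the stand-in lambdas are hypothetical; real models would be trained classifiers.

```python
def route_feature_set(feature_set, set_type, models):
    """Send a feature set to the models trained on the matching sample type."""
    if set_type not in ("positive", "negative"):
        raise ValueError(f"unknown feature set type: {set_type}")
    return [model(feature_set) for model in models[set_type]]


# Stand-in models (assumption): each returns "normal" or "abnormal".
models = {
    "positive": [lambda fs: "normal" if fs else "abnormal"],
    "negative": [lambda fs: "abnormal" if fs else "normal"],
}

results = route_feature_set([("iOS", "Apple")], "positive", models)
```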
  • the multiple models include: a first model obtained by training Gaussian Naive Bayes as a supervised classification model with the positive sample set and the negative sample set; a second model obtained by training the unsupervised Isolation Forest algorithm with the positive sample feature set; a third model obtained by training the unsupervised Isolation Forest algorithm with the negative sample feature set; a fourth model obtained by training the unsupervised OneClassSVM algorithm with the positive sample feature set; and a fifth model obtained by training the unsupervised OneClassSVM algorithm with the negative sample feature set.
  • the labeled sample feature set is used for training.
  • the server compares the obtained device data with reference device data obtained in advance, and the device data found consistent in the comparison is used as sample data.
  • the reference device data is obtained by using crawler algorithms, automated equipment, and normal verification. Using the consistent data as the sample data can ensure the accuracy of the sample feature set, thereby improving the accuracy of the abnormal detection model recognition.
  • the combined features are extracted from the sample data found consistent in the comparison, and these combined features are used as the positive sample feature set. The positive sample set is then used to train the above model so that it can distinguish positive sample features.
  • two classifications can be obtained: one is the same category as the positive sample, for which the detection result is considered normal, and the other is a category different from the positive sample, for which the detection result is considered abnormal.
  • the labeled negative sample feature set is obtained and used to train the above model, so that the resulting model can distinguish negative sample features.
  • two categories are obtained: one is the same category as the negative sample, for which the detection result is considered abnormal, and the other is a category different from the negative sample, for which the detection result is considered normal.
  • the above user request detection method constructs a feature set by applying score features to the combined features extracted from the device data, and inputs the feature set into the anomaly detection model according to the type of the feature set.
  • Constructing feature sets from text-based device data with nominal attributes can mine effective classification feature sets and improve recognition accuracy.
  • the judgment method is used to judge the output results of multiple models to obtain more comprehensive detection results, which effectively avoids the one-sidedness of a single model, reduces the inaccuracy of a single anomaly detection model caused by sample imbalance, and improves the accuracy of anomaly detection.
  • step S1100 includes the following steps:
  • the user request is a request sent by the terminal to the server, where the user request may be a registration request, a verification request, or another request for obtaining data.
  • the registration request contains an identification code, which is a character string that uniquely identifies the terminal, for example, IMEI.
  • Device data includes: device type, brand, system type, version, resolution, IP address, etc.
  • the user request carries device data such as IP address and version.
  • the device information of the terminal is pre-stored in the server, such as device type, device brand, and type of system used.
  • the verification server queries the database for pre-stored device data through the identification code.
  • the pre-stored device data is carried in the registration request when the terminal sends the registration request for the first time.
  • the device data of the terminal may also be obtained through JavaScript scripts.
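The two request paths described in this step can be sketched as below. The request layout (`type`, `id_code`, `device_data`) and the in-memory `DEVICE_DB` dictionary are assumptions standing in for the real request format and server database.

```python
# Hypothetical in-memory store standing in for the server's database.
DEVICE_DB = {}


def get_device_data(request):
    """Return device data for a registration or verification request."""
    kind = request["type"]
    if kind == "registration":
        # A registration request carries the device data; store it for later lookups.
        DEVICE_DB[request["id_code"]] = request["device_data"]
        return request["device_data"]
    if kind == "verification":
        # A verification request carries only the identification code (for example, an IMEI).
        return DEVICE_DB.get(request["id_code"])
    raise ValueError(f"unsupported request type: {kind}")


registered = get_device_data({
    "type": "registration",
    "id_code": "imei-001",
    "device_data": {"brand": "Apple", "system_type": "iOS"},
})
looked_up = get_device_data({"type": "verification", "id_code": "imei-001"})
```

A verification request with an identification code that was never registered returns `None` in this sketch; how the server should treat that case is not stated in the source.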
  • step S1200 includes the following steps:
  • the server extracts a single feature from the device data, and then combines the single feature according to a preset combination rule to obtain a combined feature.
  • the preset combination rule is a method for classifying multiple single features.
  • the device brand and the device system can be used as a combined feature, or the device model and resolution can be used as combined features.
  • with the combined feature, it is easy to identify whether the feature is abnormal. For example, when the iOS system and an Apple device are used as the combined feature, it is a positive sample, and when the Android system and an Apple device are used as the combined feature, it is a negative sample.
  • step S1210 includes the following steps:
  • the single feature of device data can be any of device type, brand, system type, version, resolution, IP address, etc.
  • the server presets the keywords or formats to extract, and extracts single features from the device data according to those keywords or formats.
  • the IP address has a fixed format, so the server presets the format of the IP address and selects from the device data the characters that match the preset format as the IP address.
  • the server presets two keywords, iOS and Android, and extracts from the device data the iOS or Android string that matches a keyword as the system type.
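A minimal sketch of format-based and keyword-based extraction, assuming the device data arrives as a single text string. The sample string and the field names `ip_address` and `system_type` are illustrative; the pattern only checks the dotted-quad shape of an IPv4 address, not the range of each octet.

```python
import re

# Preset format: four groups of 1-3 digits separated by dots (shape check only).
IP_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
# Preset keywords for the system type.
SYSTEM_KEYWORDS = ("iOS", "Android")


def extract_single_features(device_text):
    """Extract single features from raw device text by preset format and keywords."""
    features = {}
    ip_match = IP_PATTERN.search(device_text)
    if ip_match:
        features["ip_address"] = ip_match.group(0)
    for keyword in SYSTEM_KEYWORDS:
        if keyword in device_text:
            features["system_type"] = keyword
            break
    return features


sample = "brand=Apple; os=iOS 12.1; ip=192.168.0.10"
single_features = extract_single_features(sample)
```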
  • the degree of association is used to characterize the association between single features.
  • the degree of association is set so that, once single features are combined, the combination more readily indicates whether the feature is genuine or abnormal.
  • the degree of association is a preset value or level.
  • the acquired device data usually contains multiple types of data, for example the device type and the system type used by the device. Usually, when the device type and the system type are related to each other, the combination more readily indicates whether the data is genuine or abnormal.
  • for example, taken as single features, the Apple brand, the iOS system, and the Android system are all genuine data, but the Android system and the Apple brand taken together as a combined feature are abnormal data.
  • the degree of association may be one or a combination of characters, numbers, and letters.
  • single features with the same degree of association are extracted from the device data and combined to obtain a combined feature.
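Grouping single features by a shared degree of association can be sketched as below. The `ASSOCIATION_DEGREE` table and its letter labels are hypothetical, as is the choice to drop groups that contain only one single feature.

```python
from collections import defaultdict

# Hypothetical association degrees: single features sharing a degree are combined.
ASSOCIATION_DEGREE = {
    "brand": "A",
    "system_type": "A",
    "device_model": "B",
    "resolution": "B",
}


def combine_by_association(single_features):
    """Combine single features that share the same degree of association."""
    groups = defaultdict(dict)
    for name, value in single_features.items():
        degree = ASSOCIATION_DEGREE.get(name)
        if degree is not None:
            groups[degree][name] = value
    # Assumption: a combined feature needs at least two single features.
    return {
        degree: tuple(sorted(members.items()))
        for degree, members in groups.items()
        if len(members) >= 2
    }


combined = combine_by_association(
    {"brand": "Apple", "system_type": "Android", "resolution": "1125x2436"}
)
```

Here the brand and system type share degree "A" and form one combined feature, while the lone resolution value is left out.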
  • the score feature is a value that can distinguish whether a combined feature is positive or negative, that is, genuine or abnormal.
  • each of the extracted combined features is assigned a value.
  • the device type and the system type are used as examples.
  • the first device type and the first system type are used as the first combined feature and assigned a value of 1.
  • the second device type and the second system type are used as the second combined feature and assigned a value of 2. With 1 determined as the value point of the score feature, the combined feature of the first device type and the first system type is placed in a sample set marked 0, the positive sample set, and the combined feature of the second device type and the second system type is placed in a sample set marked 1, the negative sample set.
  • when the user request sent by the terminal includes multiple types of device data, each combination of several pieces of device data is assigned a value as a combined feature, and each assigned combined feature is provided with a score feature. The value of each combined feature is compared with the corresponding score feature.
  • An embodiment of the present application provides a method for training an anomaly detection model. Specifically, before step S1300, the method further includes:
  • multiple pieces of device data of the sample terminal are obtained through multiple channels; the pieces of device data are compared with one another; and the device data found consistent in the comparison is used as positive sample data.
  • various device data can be obtained through crawler algorithms, automated devices, and normal verification. For example, any one of the device type, brand, system type, version, resolution, and IP address can be obtained.
  • in the comparison process, device data of the same type is compared; for example, brand data obtained through multiple channels is compared, and version data obtained through multiple channels is compared. Data that is consistent is considered accurate, and using it as sample data can greatly improve the accuracy of the sample data.
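The cross-channel comparison can be sketched as follows. Each record is assumed to be a dictionary of fields from one channel; requiring at least two channels to report a field before it counts as consistent is an assumption of this example, not something the source specifies.

```python
def consistent_fields(channel_records):
    """Keep fields reported by at least two channels with identical values."""
    values_seen = {}
    for record in channel_records:
        for field, value in record.items():
            values_seen.setdefault(field, []).append(value)
    return {
        field: values[0]
        for field, values in values_seen.items()
        if len(values) >= 2 and len(set(values)) == 1
    }


records = [
    {"brand": "Apple", "version": "12.1"},  # e.g. from a crawler
    {"brand": "Apple", "version": "12.0"},  # e.g. from an automated device
    {"brand": "Apple"},                     # e.g. from normal verification
]
positive_sample = consistent_fields(records)
```

The brand agrees across all three channels and is kept; the version differs between channels and is discarded.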
  • a large number of pieces of device data are selected as sample data.
  • data that has determined that the user request is abnormal may be selected as negative sample data.
  • the above positive sample data are all accurate sample data.
  • the model is trained through positive sample data.
  • the classification results obtained are of two types: a normal result that matches the positive sample classification value, and an abnormal result that does not match the positive sample classification value.
  • the positive sample data obtained by the above method is used to train the OneClassSVM and Isolation Forest classification models.
  • negative sample data is obtained and used to train the OneClassSVM and Isolation Forest classification models.
  • the positive and negative sample data are used to train the Naive Bayes algorithm, giving five anomaly detection models in total.
  • model trained with positive sample data can identify data with the same characteristics as the positive sample data
  • model trained with negative sample data can identify data with the same characteristics as the negative sample data
  • the training method is as follows:
  • the above marked training data is input into the model to obtain the excitation classification value output by the model; the distance between the expected classification value and the excitation classification value is compared with a preset threshold; when the distance between the expected classification value and the excitation classification value is greater than the preset threshold, the weights in the detection model are updated repeatedly and iteratively through the back-propagation algorithm until the distance between the expected classification value and the excitation classification value is less than or equal to the preset threshold.
  • the excitation classification value is the excitation data output by the model for the input sample data. Before the model is trained to convergence, the excitation classification value fluctuates widely; once the model is trained to convergence, the excitation classification value is relatively stable.
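The stopping rule above can be illustrated with a deliberately small stand-in: a single logistic unit trained by plain gradient descent, not the models named in this document. The threshold, learning rate, and toy samples are assumptions chosen so that the loop terminates quickly.

```python
import math


def train_until_converged(samples, threshold=0.1, learning_rate=1.0, max_iters=10_000):
    """Update weights until every |expected - excitation| is within the threshold."""
    w, b = 0.0, 0.0
    worst = float("inf")
    for _ in range(max_iters):
        worst = 0.0
        grad_w = grad_b = 0.0
        for x, expected in samples:
            # Excitation classification value: the model's current output.
            excitation = 1.0 / (1.0 + math.exp(-(w * x + b)))
            error = expected - excitation
            worst = max(worst, abs(error))
            grad_w += error * x
            grad_b += error
        if worst <= threshold:
            break  # distance is within the preset threshold: converged
        w += learning_rate * grad_w
        b += learning_rate * grad_b
    return w, b, worst


# Toy labeled data (assumption): feature 0 is expected to give 0, feature 1 to give 1.
w, b, worst = train_until_converged([(0.0, 0.0), (1.0, 1.0)])
```

The returned `worst` is the largest remaining distance between expected and excitation values at the last evaluation.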
  • step S1300 includes the following steps:
  • Judgment categories include: normal detection and abnormal detection. Five abnormal detection models are used in the embodiments of the present application, and five detection results are obtained.
  • the preset weight of each model can be set according to the model's recognition accuracy: a model with higher accuracy is given a larger weight, and a model with lower accuracy a smaller one. Assuming each normal or abnormal detection result counts as 1, multiplying by the weights and summing gives a value for the normal detection result and a value for the abnormal detection result. The two are compared, and the category with the larger value is the final detection result.
  • for example, if the five models have the same accuracy and the same weight, and two of the detection results are normal and three are abnormal, the abnormal value is larger, and the final result is determined to be abnormal.
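The weighted judgment can be sketched as below, assuming each model reports either "normal" or "abnormal". Resolving a tie as "normal" is an assumption of this example; the source does not say how ties are handled.

```python
def weighted_decision(results, weights):
    """Combine per-model results into one decision using preset model weights."""
    if len(results) != len(weights):
        raise ValueError("one weight is required per detection result")
    totals = {"normal": 0.0, "abnormal": 0.0}
    for result, weight in zip(results, weights):
        totals[result] += weight * 1  # each detection result counts as 1
    # Assumption: a tie is resolved as "normal".
    return "abnormal" if totals["abnormal"] > totals["normal"] else "normal"


# Five models with equal weight: two normal results, three abnormal results.
final = weighted_decision(
    ["normal", "normal", "abnormal", "abnormal", "abnormal"],
    [0.2, 0.2, 0.2, 0.2, 0.2],
)
```

With unequal weights, a single high-accuracy model can outweigh several low-accuracy ones, which is the point of weighting by accuracy.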
  • FIG. 2 is a block diagram of a basic structure of the user request detection device in this embodiment.
  • a user request detection device includes an acquisition module 2100, a processing module 2200, and an execution module 2300.
  • the acquisition module 2100 is used to acquire the device data of the terminal that sends the user request;
  • the processing module 2200 is used to apply the preset score feature to construct a feature set from the combined features extracted from the device data;
  • the execution module 2300 is used to input the feature set into multiple anomaly detection models according to the type of the feature set, obtain a detection result for judging whether the user request is abnormal, and judge the detection result with a preset judgment method to determine whether the user request is abnormal, wherein the anomaly detection model is a detection model trained in advance to a converged state using a positive sample or negative sample feature set, and is used to perform security classification on the terminal through the feature set.
  • the user request detection device constructs a feature set by applying the score feature to the combined features extracted from the device data, and inputs the feature set into the anomaly detection model according to the type of the feature set.
  • Constructing feature sets from text-based device data with nominal attributes can mine effective classification feature sets and improve recognition accuracy.
  • the judgment method is used to judge the output results of multiple models to obtain more comprehensive detection results, which effectively avoids the one-sidedness of a single model, reduces the inaccuracy of a single anomaly detection model caused by sample imbalance, and improves the accuracy of anomaly detection.
  • the processing module includes: a first acquisition sub-module for extracting combined features from the device data; a first processing sub-module for comparing the combined features with preset score features; a first execution sub-module for adding the combined feature to the positive sample set when the combined feature is consistent with the score feature; and a second execution sub-module for adding the combined feature to the negative sample set when the combined feature is not consistent with the score feature.
  • the first acquisition sub-module includes: a second acquisition sub-module for extracting a plurality of single features from the device data; and a third acquisition sub-module for acquiring the degree of association of each single feature;
  • the second processing sub-module is used to combine a plurality of single features with the same degree of association into the combined feature.
  • it further includes: a fourth acquisition sub-module for acquiring sample data of the terminal; a fifth acquisition sub-module for extracting combined features from the sample data, wherein each combined feature is provided with a marker; and a third execution sub-module for training a preset detection model with the marked sample data to obtain the anomaly detection model, wherein the sample data includes positive sample feature data and negative sample feature data.
  • the execution module includes: a sixth acquisition sub-module for acquiring the judgment categories of the plurality of detection results; and a fourth execution sub-module for weighting the judgment categories obtained by the multiple models according to the preset weight of each model, to obtain the judgment result of whether the user request is abnormal.
  • the acquisition module includes: a seventh acquisition sub-module for receiving the user request sent by the terminal; and an eighth acquisition sub-module for extracting pre-stored device data from the server according to the identification code in the user request.
  • the execution module includes: a fifth execution sub-module for inputting the feature set into an anomaly detection model trained from positive sample features when the feature set is a positive feature set.
  • FIG. 3 is a block diagram of the basic structure of the computer device of this embodiment.
  • the computer device includes a processor, a non-volatile storage medium, a memory, and a network interface connected through a system bus.
  • the non-volatile storage medium of the computer device stores an operating system, a database, and computer-readable instructions.
  • the database may store a sequence of control information.
  • the processor may implement a user request detection method.
  • the processor of the computer device is used to provide calculation and control capabilities, and support the operation of the entire computer device.
  • the memory of the computer device may store computer-readable instructions. When the computer-readable instructions are executed by the processor, they may cause the processor to execute a user request detection method.
  • the network interface of the computer device is used to connect and communicate with the terminal.
  • FIG. 3 is only a block diagram of the part of the structure related to the solution of the present application, and does not constitute a limitation on the computer equipment to which the solution of the present application is applied.
  • the specific computer equipment may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
  • the processor is used to execute the specific content of the acquisition module 2100, the processing module 2200, and the execution module 2300 in FIG. 2, and the memory stores the program code and various types of data required to execute the above modules.
  • the network interface is used for data transmission to user terminals or servers.
  • the memory in this embodiment stores the program code and data required to execute all sub-modules of the user request detection method, and the server can call its program code and data to execute the functions of all sub-modules.
  • the present application also provides a storage medium storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to execute the steps of the user request detection method described in any of the foregoing embodiments.
  • the computer program may be stored in a computer-readable storage medium. When executed, it may include the processes of the foregoing method embodiments.
  • the aforementioned storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disk, or a read-only memory (ROM), or a random access memory (RAM), etc.
  • although the steps in the flowchart in the drawings are displayed in order according to the arrows, the steps are not necessarily executed in the order indicated by the arrows. Unless clearly stated in this document, the execution of these steps is not strictly limited in order, and they can be executed in other orders. Moreover, at least some of the steps in the flowchart may include multiple sub-steps or stages. These sub-steps or stages are not necessarily executed at the same time, but may be executed at different times, and they are not necessarily executed sequentially, but may be executed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Debugging And Monitoring (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

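The embodiments of the present application disclose a user request detection method, apparatus, computer device, and storage medium. The method includes the following steps: acquiring device data of the terminal that sends a user request; constructing a feature set from combined features extracted from the device data by using a preset score feature; inputting the feature set, according to its type, into multiple anomaly detection models to obtain detection results indicating whether the user request is abnormal; and judging the detection results with a preset judgment method to determine whether the user request is abnormal. Each anomaly detection model is a detection model that has been trained in advance to convergence on a positive-sample or negative-sample feature set and is used to classify the security of a terminal from its feature set. Because the method constructs feature sets from text-type device data with nominal attributes, it can mine an effective set of classification features and improve recognition accuracy.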

Description

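User request detection method, apparatus, computer device, and storage medium
This application claims priority to Chinese patent application No. 201910015151.7, entitled "User request detection method, apparatus, computer device, and storage medium" and filed with the China Patent Office on January 8, 2019, the entire contents of which are incorporated herein by reference.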
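Technical Field
The embodiments of the present application relate to the financial field, and in particular to a user request detection method, apparatus, computer device, and storage medium.
Background
Dividend insurance is an insurance product in which, after the end of each fiscal year, the insurance company distributes the distributable surplus of that class of dividend insurance for the previous fiscal year to customers in a certain proportion, in the form of cash dividends or value-added dividends.
In the prior art, dividend insurance is usually introduced to users through project teams: the sales staff assigned to the project are trained and then give visiting customers a basic introduction to the insurance product.
In the course of research, the inventors of the present application found that when insurance products are introduced manually in the prior art, the quality of the introduction is uneven because staff differ in competence and in their command of the required skills. In particular, staff often cannot quickly answer questions about the benefits at each stage or the dividend calculation at different dividend levels according to the user's actual needs, which leads to customer loss.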
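Summary
The embodiments of the present application provide a user request detection method, apparatus, computer device, and storage medium.
To solve the above technical problem, one technical solution adopted by the embodiments of the present application is to provide a user request detection method, including the following steps:
acquiring device data of the terminal that sends a user request;
constructing a feature set from combined features extracted from the device data by using a preset score feature;
inputting the feature set, according to its type, into multiple anomaly detection models to obtain detection results indicating whether the user request is abnormal, and judging the detection results with a preset judgment method to determine whether the user request is abnormal, where each anomaly detection model is a detection model that has been trained in advance to convergence on a positive-sample or negative-sample feature set and is used to classify the security of a terminal from its feature set;
where constructing the feature set from the combined features extracted from the device data by using the preset score feature includes:
extracting combined features from the device data;
comparing the combined features with the preset score feature;
when a combined feature is consistent with the score feature, adding the combined feature to a positive sample set;
when a combined feature is inconsistent with the score feature, adding the combined feature to a negative sample set.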
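To solve the above technical problem, the embodiments of the present application further provide a user request detection apparatus, including:
an acquisition module, configured to acquire device data of the terminal that sends a user request;
a processing module, configured to construct a feature set from combined features extracted from the device data by using a preset score feature;
an execution module, configured to input the feature set, according to its type, into multiple anomaly detection models to obtain detection results indicating whether the user request is abnormal, and to judge the detection results with a preset judgment method to determine whether the user request is abnormal, where each anomaly detection model is a detection model that has been trained in advance to convergence on a positive-sample or negative-sample feature set and is used to classify the security of a terminal from its feature set.
The processing module includes: a first acquisition submodule, configured to extract combined features from the device data; a first processing submodule, configured to compare the combined features with the preset score feature; a first execution submodule, configured to add a combined feature to a positive sample set when the combined feature is consistent with the score feature; and a second execution submodule, configured to add a combined feature to a negative sample set when the combined feature is inconsistent with the score feature.
To solve the above technical problem, the embodiments of the present application further provide a computer device including a memory and a processor. The memory stores computer-readable instructions that, when executed by the processor, cause the processor to execute the steps of the user request detection method described above.
To solve the above technical problem, the embodiments of the present application further provide a storage medium storing computer-readable instructions that, when executed by one or more processors, cause the one or more processors to execute the steps of the user request detection method described above.
The beneficial effects of the embodiments of the present application are as follows. A feature set is constructed from the combined features extracted from the device data by using the score feature, and the feature set is input into the anomaly detection models according to its type. Because the method constructs feature sets from text-type device data with nominal attributes, it can mine an effective set of classification features and improve recognition accuracy. In addition, applying a judgment method to the outputs of multiple models yields a more comprehensive detection result, avoids the one-sidedness of a single model, reduces the inaccuracy that sample imbalance causes in a single anomaly detection model, and improves the accuracy of anomaly detection.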
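Brief Description of the Drawings
To explain the technical solutions in the embodiments of the present application more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below are merely some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
Figure 1 is a schematic diagram of the basic flow of the user request detection method provided by an embodiment of the present application;
Figure 2 is a block diagram of the basic structure of the user request detection apparatus provided by an embodiment of the present application;
Figure 3 is a block diagram of the basic structure of the computer device provided by an embodiment of the present application.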
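Detailed Description
To help those skilled in the art better understand the solutions of the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings.
Some of the flows described in the specification, the claims, and the above drawings contain multiple operations that appear in a particular order. It should be clearly understood that these operations may be executed out of the order in which they appear herein, or in parallel. Operation numbers such as 101 and 102 are used only to distinguish the operations and do not themselves represent any execution order. In addition, these flows may include more or fewer operations, and the operations may be executed sequentially or in parallel. Note that terms such as "first" and "second" herein are used to distinguish different messages, devices, modules, and so on; they do not indicate a sequence, nor do they require that the "first" and "second" items be of different types.
The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art from the embodiments of the present application without creative effort fall within the scope of protection of the present application.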
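Embodiments
Those skilled in the art will understand that the terms "terminal" and "terminal device" as used herein include both devices that have only a wireless signal receiver without transmitting capability and devices that have receiving and transmitting hardware capable of two-way communication over a bidirectional communication link. Such devices may include: cellular or other communication devices, with or without a single-line or multi-line display; PCS (Personal Communications Service) devices, which may combine voice, data processing, fax, and/or data communication capabilities; PDAs (Personal Digital Assistants), which may include a radio frequency receiver, pager, Internet/intranet access, web browser, notepad, calendar, and/or GPS (Global Positioning System) receiver; and conventional laptop and/or palmtop computers or other devices that have and/or include a radio frequency receiver. A "terminal" or "terminal device" as used herein may be portable, transportable, installed in a vehicle (air, sea, and/or land), or adapted and/or configured to operate locally and/or in distributed form at any other location on Earth and/or in space. A "terminal" or "terminal device" as used herein may also be a communication terminal, an Internet access terminal, or a music/video playback terminal, for example a PDA, a MID (Mobile Internet Device), and/or a mobile phone with music/video playback functions, or a device such as a smart TV or set-top box.
The client terminal in this implementation is the terminal described above.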
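Specifically, refer to Figure 1, which is a schematic diagram of the basic flow of the user request detection method of this embodiment.
As shown in Figure 1, the user request detection method includes the following steps:
S1100: Acquire device data of the terminal that sends the user request.
A user request is a request sent by a terminal to the server, and may be a registration request or a verification request. Typically, a registration request contains the device data of the terminal that sends it. A verification request includes the identification code of the terminal that sends it, and the server uses the identification code to look up pre-stored device data in a database.
In some implementations, the device data of the terminal may also be obtained through a JavaScript script. The device data of the terminal includes the device type, brand, system type, version, resolution, IP address, and so on.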
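S1200: Construct a feature set from the combined features extracted from the device data by using a preset score feature.
In this embodiment, combined features are extracted from the device data. A combined feature may be a combination of several of the device type, brand, system type, version, resolution, IP address, and so on. Each combined feature is assigned a value, and a distinguishing value, namely the score feature, is set according to the value range of the combined features. The values of all combined features are then divided into positive sample features and negative sample features according to the distribution of positive and negative samples; all positive and negative sample features together form the feature set.
In practice, a discriminative value point is selected according to the distribution of positive and negative samples. With this value point as the reference, samples whose value equals the value point are marked 0 and all others are marked 1. Alternatively, among the different value subsets of the feature, a discriminative value subset is selected according to the distribution of positive and negative samples; with this subset as the reference, samples whose value belongs to the subset are marked 0 and all others are marked 1. The feature set is constructed in this way.
For example, take device type and system type. The first device type and the first system type form a first combined feature, which is assigned the value 1; the second device type and the second system type form a second combined feature, which is assigned the value 2. The value 1 is chosen as the value point of the score feature. The combined feature of the first device type and the first system type is placed in one sample set and marked 0, forming the positive sample set; the combined feature of the second device type and the second system type is placed in another sample set and marked 1, forming the negative sample set.
In this embodiment, the combined features formed from the device brand, system type, version, resolution, IP address, and so on are added in turn to the positive and negative sample sets in the same way. In practice, combined features may be assigned values according to a preset valuation method; for example, a preset value may be chosen for device data that genuinely exists. Combined features may likewise be formed according to a preset combination method. The number of features in a combination is not limited and is generally chosen to suit the application; for example, the iOS system and the Apple brand may be combined into one combined feature.
For complex text-type device data, this way of constructing feature sets effectively converts text data into 0-1 binary features, generates discriminative feature sets, and mines an effective set of classification features.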
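The 0-1 encoding described above can be sketched as follows. This is a minimal illustration only: the function name `binarize` and the contents of `plausible` are assumptions chosen for the example, not values given in the application.

```python
def binarize(value, reference_subset):
    """Mark a sample 0 if its value belongs to the distinguishing subset, else 1."""
    return 0 if value in reference_subset else 1


# Hypothetical distinguishing subset of (brand, system type) combinations.
plausible = {("Apple", "iOS"), ("Samsung", "Android")}

# Text-type combined features taken from three imaginary requests.
samples = [("Apple", "iOS"), ("Apple", "Android"), ("Samsung", "Android")]

# Each text-type combination becomes a single binary feature.
encoded = [binarize(s, plausible) for s in samples]
```

Using a set for the reference subset keeps the membership test constant-time, which matters when the subset is derived from a large body of historical samples.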
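S1300: Input the feature set, according to its type, into multiple anomaly detection models to obtain detection results indicating whether the user request is abnormal, and judge the detection results with a preset judgment method to determine whether the user request is abnormal. Each anomaly detection model is a detection model that has been trained in advance to convergence on a positive-sample or negative-sample feature set and is used to classify the security of a terminal from its feature set.
Specifically, either the positive feature set or the negative feature set may be input into the anomaly detection models. When the feature set is a positive feature set, it is input into the anomaly detection models trained on positive sample features; when the feature set is a negative feature set, it is input into the anomaly detection models trained on negative sample features.
There are two kinds of detection result: the user request is abnormal, or the user request is normal. In this embodiment, the multiple models include: a first model, obtained by training Gaussian Naive Bayes as a supervised classification model on the positive and negative sample sets; a second model, obtained by training the unsupervised Isolation Forest algorithm on the positive sample feature set; a third model, obtained by training the unsupervised Isolation Forest algorithm on the negative sample feature set; a fourth model, obtained by training the unsupervised OneClassSVM algorithm on the positive sample feature set; and a fifth model, obtained by training the unsupervised OneClassSVM algorithm on the negative sample feature set.
These models are trained on labeled sample feature sets. To ensure the accuracy of the sample data, after acquiring the device data of a terminal, the server compares it with reference device data obtained in advance and uses the device data that matches as sample data. The reference device data is, for example, data obtained through channels such as crawler algorithms, automated devices, and normal verification. Using only matching data as sample data ensures the accuracy of the sample feature sets and thereby improves the recognition accuracy of the anomaly detection models.
Combined features are extracted from the matching sample data and used as the positive sample feature set, and the above models are trained on the positive sample set so that they can recognize positive sample features. When a feature set is input into a model trained in this way, two classes can result: one is the same class as the positive samples, in which case the detection result is considered normal; the other is a different class from the positive samples, in which case the detection result is considered abnormal. Similarly, a labeled negative sample feature set is obtained and used to train the above models so that they can recognize negative sample features. When a feature set is input into a model trained in this way, two classes result: one is the same class as the negative samples, in which case the detection result is considered abnormal; the other is a different class from the negative samples, in which case the detection result is considered normal.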
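The mapping from a one-class model's output to a normal or abnormal verdict depends on which sample set the model was trained on. The sketch below makes that rule explicit. The function name `interpret` and its string arguments are assumptions for illustration. In a library such as scikit-learn, one-class detectors report +1 for inliers and -1 for outliers, and the boolean argument here would be derived from that value.

```python
def interpret(trained_on: str, same_class_as_training: bool) -> str:
    """Translate a one-class model output into a detection verdict.

    trained_on: "positive" or "negative", the sample set used for training.
    same_class_as_training: True if the model places the input in the
        same class as its training samples.
    """
    if trained_on == "positive":
        # Resembling known-good samples means the request looks normal.
        return "normal" if same_class_as_training else "abnormal"
    if trained_on == "negative":
        # Resembling known-bad samples means the request looks abnormal.
        return "abnormal" if same_class_as_training else "normal"
    raise ValueError(f"unknown training set type: {trained_on!r}")
```

Keeping this mapping in one place avoids accidentally inverting the verdict for the negative-trained models when their outputs are later combined.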
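In the user request detection method described above, a feature set is constructed from the combined features extracted from the device data by using the score feature, and the feature set is input into the anomaly detection models according to its type. Because the method constructs feature sets from text-type device data with nominal attributes, it can mine an effective set of classification features and improve recognition accuracy. In addition, applying a judgment method to the outputs of multiple models yields a more comprehensive detection result, avoids the one-sidedness of a single model, reduces the inaccuracy that sample imbalance causes in a single anomaly detection model, and improves the accuracy of anomaly detection.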
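An embodiment of the present application provides a method for acquiring the device data of the terminal that sends a user request. Specifically, step S1100 includes the following steps:
S1110: Receive the user request sent by the terminal.
A user request is a request sent by a terminal to the server, and may be a registration request, a verification request, or another request for data. In general, a registration request contains an identification code, which is a character string that uniquely identifies the terminal, for example an IMEI.
S1120: Extract pre-stored device data from the server according to the identification code in the user request.
The device data includes the device type, brand, system type, version, resolution, IP address, and so on. In some implementations, the user request itself carries device data such as the IP address and version. Typically, the server pre-stores device information for the terminal, such as the device type, device brand, and system type in use; when a user request is sent, the server uses the identification code to look up the pre-stored device data in a database. In some implementations, the pre-stored device data is the data carried in the registration request when the terminal first registered. In some implementations, the device data of the terminal may also be obtained through a JavaScript script.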
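Steps S1110 and S1120 can be sketched as a lookup keyed by the identification code. Everything named below is an assumption for illustration: the in-memory `DEVICE_DB` stands in for the server database, and the request fields `type`, `id`, `device_data`, and `live_fields` are a hypothetical request layout, not one defined in the application.

```python
from typing import Optional

# Hypothetical stand-in for the server-side database of pre-stored device data.
DEVICE_DB = {
    "356938035643809": {
        "device_type": "phone",
        "brand": "Apple",
        "os": "iOS",
        "version": "12.1",
        "resolution": "1125x2436",
    },
}


def get_device_data(request: dict) -> Optional[dict]:
    """Return the device data for a user request, or None if the terminal is unknown."""
    if request.get("type") == "register" and "device_data" in request:
        # A registration request carries the device data, which is stored for later use.
        DEVICE_DB[request["id"]] = dict(request["device_data"])
    stored = DEVICE_DB.get(request.get("id"))
    if stored is None:
        return None
    # Fields carried by the request itself (for example the IP address)
    # are merged over the pre-stored data.
    merged = dict(stored)
    merged.update(request.get("live_fields", {}))
    return merged
```

A verification request for a known identifier returns the stored record plus any live fields, while an unknown identifier returns `None` so the caller can decide how to treat it.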
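In practice, device data contains a large amount of text-type data, from which effective classification features cannot be mined directly. To solve this problem, an embodiment of the present application provides a method for constructing a feature set from the combined features extracted from the device data by using a preset score feature. Specifically, step S1200 includes the following steps:
S1210: Extract combined features from the device data.
In this embodiment, the server extracts single features from the device data and then combines them according to a preset combination rule to obtain combined features.
The preset combination rule is a way of grouping multiple single features. For example, the device brand and the device system may form one combined feature, and the device model and the resolution may form another. Combined features make it easier to recognize whether a feature is abnormal: the iOS system combined with an Apple device is a positive sample, whereas the Android system combined with an Apple device is a negative sample.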
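An embodiment of the present application provides a method for extracting combined features from the device data. Specifically, step S1210 includes the following steps:
S1211: Extract multiple single features from the device data.
A single feature of the device data may be any one of the device type, brand, system type, version, resolution, IP address, and so on. In this embodiment, the server is preconfigured with keywords or formats and extracts features from the device data according to them. For example, an IP address has a fixed format; the server is preconfigured with the IP address format and selects the characters in the device data that match it as the IP address. As another example, for the system type, the server is preconfigured with the two keywords iOS and Android and extracts whichever of them appears in the device data as the system type.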
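The format-based and keyword-based extraction described in S1211 can be sketched with a regular expression and a keyword list. The names `IPV4_PATTERN`, `OS_KEYWORDS`, and `extract_single_features`, and the raw string layout in the usage example, are assumptions for illustration. The pattern checks only the dotted-quad shape of an IPv4 address and does not validate that each octet is at most 255.

```python
import re

# Preset format for an IPv4 address (shape only, octet ranges are not validated).
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

# Preset keywords for the system type.
OS_KEYWORDS = ("iOS", "Android")


def extract_single_features(raw: str) -> dict:
    """Extract single features from raw device data by preset format or keyword."""
    features = {}
    match = IPV4_PATTERN.search(raw)
    if match:
        features["ip"] = match.group(0)
    for keyword in OS_KEYWORDS:
        if keyword in raw:
            features["os"] = keyword
            break
    return features


features = extract_single_features("brand=Apple; os=iOS 12.1; ip=203.0.113.7")
```

The version string `12.1` is not mistaken for an address because the pattern requires four dot-separated groups.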
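S1212: Obtain the association degree of each single feature.
The association degree characterizes the relationship between single features. In this embodiment, it is set so that single features, once combined, more readily indicate whether the combination is genuine or abnormal. The association degree is a preset value or level. For example, the acquired device data usually contains several kinds of data, such as the device type and the system type used by the device, and associating the device type with the system type makes it easier to tell whether the feature is genuine or abnormal. To illustrate, the Apple brand, the iOS system, and the Android system are each genuine data as single features, but the Android system combined with the Apple brand is abnormal data.
S1213: Use the combination of multiple single features that have the same association degree as a combined feature.
The association degree may be one or a combination of characters, digits, and letters. In this embodiment, single features with the same association degree are extracted from the device data and combined to obtain a combined feature.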
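Steps S1212 and S1213 amount to grouping single features by a preset association degree. The sketch below assumes a hypothetical `ASSOCIATION` table that maps feature names to a degree label. The table contents and the rule that a group needs at least two members to count as a combination are illustrative choices, not details from the application.

```python
from collections import defaultdict

# Hypothetical preset association degrees: features sharing a label are combined.
ASSOCIATION = {"brand": "A", "os": "A", "device_type": "B", "resolution": "B"}


def build_combined_features(single: dict) -> dict:
    """Group single features that share an association degree into combined features."""
    groups = defaultdict(list)
    for name in sorted(single):  # sorted for a deterministic member order
        degree = ASSOCIATION.get(name)
        if degree is not None:
            groups[degree].append((name, single[name]))
    # Assumption: a lone feature is not a combination, so it is dropped.
    return {d: tuple(members) for d, members in groups.items() if len(members) >= 2}


combined = build_combined_features(
    {"brand": "Apple", "os": "Android", "resolution": "1125x2436", "ip": "203.0.113.7"}
)
```

Here the brand and system type share degree "A" and form one combined feature, while the resolution has no partner in this request and the IP address has no preset degree.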
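S1220: Compare the combined features with the preset score feature.
The score feature is a value that can separate combined features into positive and negative, that is, genuine and abnormal. In this embodiment, the extracted combined features are assigned values. For example, take device type and system type. The first device type and the first system type form a first combined feature, which is assigned the value 1; the second device type and the second system type form a second combined feature, which is assigned the value 2. The value 1 is chosen as the value point of the score feature. The combined feature of the first device type and the first system type is placed in one sample set and marked 0, forming the positive sample set; the combined feature of the second device type and the second system type is placed in another sample set and marked 1, forming the negative sample set.
In this embodiment, the user request sent by the terminal includes several kinds of device data. Each combination of device data forms a combined feature that is assigned a value, and each valued combined feature has a corresponding score feature. The value of each combined feature is compared with its corresponding score feature.
S1230: When a combined feature is consistent with the score feature, add the combined feature to the positive sample set.
S1240: When a combined feature is inconsistent with the score feature, add the combined feature to the negative sample set.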
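Steps S1220 to S1240 can be sketched as a comparison against the score feature followed by a split into two labeled sets. The `VALUE_TABLE` contents and the default value 2 for unknown combinations are assumptions that follow the worked example in the text (value 1 versus value 2, with 1 as the value point). The application does not prescribe a concrete table.

```python
# Hypothetical preset valuation: plausible combinations are assigned 1.
VALUE_TABLE = {("Apple", "iOS"): 1, ("Samsung", "Android"): 1}

# The score feature (value point) that separates positive from negative.
SCORE_FEATURE = 1


def split_by_score(combos):
    """Compare each combined feature's value with the score feature and split the samples."""
    positive, negative = [], []
    for combo in combos:
        value = VALUE_TABLE.get(combo, 2)  # assumption: unknown combinations get 2
        if value == SCORE_FEATURE:
            positive.append((combo, 0))  # consistent: positive sample, marked 0
        else:
            negative.append((combo, 1))  # inconsistent: negative sample, marked 1
    return positive, negative


pos, neg = split_by_score([("Apple", "iOS"), ("Apple", "Android")])
```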
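An embodiment of the present application provides a method for training the anomaly detection models. Specifically, before step S1300, the method further includes:
S1310: Obtain sample data of terminals.
To ensure the accuracy of the positive sample data, the device data of a sample terminal is obtained through multiple channels, the data from the different channels is compared, and the device data that matches is used as positive sample data.
For example, device data may be obtained through channels such as crawler algorithms, automated devices, and normal verification, yielding any of the device type, brand, system type, version, resolution, IP address, and so on. During comparison, device data of the same kind is compared: brand data obtained through the different channels is compared with brand data, and version data with version data. Data that matches is considered accurate and is used as sample data, which greatly improves the accuracy of the sample data.
In some implementations, when there are multiple values for the same kind of device data, some identical and one or more different, the value that occurs most often is selected as the sample data.
Negative sample data may be taken from data for which the user request has already been determined to be abnormal.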
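The cross-channel comparison used to build positive sample data can be sketched as a per-field majority selection. The helper names and the sample values are assumptions for illustration. Treating a tie between the two most frequent values as "no consensus" is also an assumption, since the text only says that the most frequent value is selected.

```python
from collections import Counter


def consensus_value(observations):
    """Return the most frequent value, or None if the input is empty or tied."""
    if not observations:
        return None
    top = Counter(observations).most_common(2)
    if len(top) == 2 and top[0][1] == top[1][1]:
        return None  # assumption: a tie gives no trustworthy value
    return top[0][0]


def build_positive_sample(per_source: dict) -> dict:
    """Keep, for each kind of device data, the value the channels agree on."""
    sample = {}
    for field, values in per_source.items():
        value = consensus_value(values)
        if value is not None:
            sample[field] = value
    return sample


sample = build_positive_sample({
    "brand": ["Apple", "Apple", "Apple"],   # all channels agree
    "version": ["12.1", "12.1", "11.0"],    # two of three agree
    "resolution": ["1125x2436", "750x1334"],  # tie, so the field is dropped
})
```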
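S1320: Extract combined features from the sample data, where each combined feature is labeled.
In this embodiment, combined features are extracted as described in the embodiments above, which is not repeated here. The positive sample data is accurate sample data. When a model trained on positive sample data evaluates the device data of a user request, the classification result takes one of two forms: a normal result that matches the positive sample class, or an abnormal result that does not.
S1330: Train preset detection models with the labeled sample data to obtain the anomaly detection models, where the sample data includes positive sample feature data or negative sample feature data.
The OneClassSVM and Isolation Forest models are trained on the positive sample data obtained as described above. In this embodiment, negative sample data is also obtained and used to train OneClassSVM and Isolation Forest models, and Naive Bayes is trained on both the positive and the negative sample data, giving five anomaly detection models.
A model trained on positive sample data can recognize data that has the same features as the positive sample data, and a model trained on negative sample data can recognize data that has the same features as the negative sample data.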
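The training method is as follows:
The labeled training data is input into the model and the excitation classification value output by the model is obtained. The distance between the expected classification value and the excitation classification value is compared with a preset threshold. When the distance is greater than the threshold, the weights of the detection model are updated iteratively by the back-propagation algorithm until the distance is less than or equal to the threshold.
The excitation classification value is the output that the model produces for the input sample data. Before the model has been trained to convergence, the excitation classification values are widely dispersed; after the model has been trained to convergence, they are relatively stable.
When the excitation classification value is inconsistent with the expected classification value, the weights of the model are corrected with the stochastic gradient descent algorithm so that the model's output matches the expected result of the classification judgment information. Training proceeds through repeated training and correction over a number of training sample sets (in some implementations, all sample data is shuffled during training to increase the model's resistance to interference and the stability of its output). Training ends when the agreement between the classification data output by the detection model and the classification reference information of the training samples reaches a target such as (but not limited to) 99.5%.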
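The stopping rule described above (iterate until the distance between expected and output values falls within a threshold) can be sketched with a single logistic unit trained by stochastic gradient descent. This is a stand-in chosen only because it has weights to update. It is not how Isolation Forest, OneClassSVM, or Naive Bayes are fitted, and the function name `train`, the hyperparameters, and the use of the maximum absolute error as the "distance" are all assumptions for illustration.

```python
import math
import random


def train(samples, threshold=0.1, lr=0.5, max_epochs=5000, seed=0):
    """Update weights by SGD until every output is within `threshold` of its label."""
    rng = random.Random(seed)
    n = len(samples[0][0])
    w = [0.0] * n
    b = 0.0

    def predict(x):
        z = sum(wi * xi for wi, xi in zip(w, x)) + b
        return 1.0 / (1.0 + math.exp(-z))

    distance = float("inf")
    for _ in range(max_epochs):
        order = samples[:]
        rng.shuffle(order)  # shuffle the samples each epoch, as the text suggests
        for x, y in order:
            err = predict(x) - y  # gradient of the log loss w.r.t. the pre-activation
            for i in range(n):
                w[i] -= lr * err * x[i]
            b -= lr * err
        # Distance between expected and output classification values.
        distance = max(abs(predict(x) - y) for x, y in samples)
        if distance <= threshold:
            break
    return w, b, distance


# Hypothetical binary feature vectors; the label follows the first feature.
data = [([0.0, 0.0], 0), ([0.0, 1.0], 0), ([1.0, 0.0], 1), ([1.0, 1.0], 1)]
weights, bias, final_distance = train(data)
```

The `max_epochs` cap is a safeguard for data that is not separable, where the threshold might never be met.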
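To avoid the error caused by an imbalance between positive and negative sample data, an embodiment of the present application provides a method for judging the obtained detection results with a preset judgment method to determine whether the user who sent the user request is abnormal. Specifically, step S1300 includes the following steps:
S1301: Obtain the judgment categories of the multiple detection results.
The judgment categories are the two results "normal" and "abnormal". This embodiment uses five anomaly detection models and therefore obtains five detection results.
S1302: Perform a weighted operation on the judgment categories obtained by the multiple models according to the preset weight of each model, to obtain a determination of whether the user who sent the user request is abnormal.
The preset weight of each model may be set according to the model's recognition accuracy: models with higher accuracy are given larger weights and models with lower accuracy smaller weights. Assuming that a "normal" result and an "abnormal" result each count as 1, multiplying by the weights and summing gives one value for "normal" and one for "abnormal". The two values are compared, and the result with the larger value is the final detection result.
For example, if the five models have the same accuracy and the same weight, and two of the detection results are normal while three are abnormal, the value for "abnormal" is larger and the final result is abnormal.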
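The weighted judgment in S1302 can be sketched as a weighted vote over the model verdicts. The function name `weighted_vote` and the string labels are assumptions for illustration. Resolving a tie as "normal" is also an assumption, because the text does not say how equal totals are handled.

```python
def weighted_vote(verdicts, weights):
    """Combine per-model verdicts ("normal" or "abnormal") using per-model weights."""
    if len(verdicts) != len(weights):
        raise ValueError("one weight is required per verdict")
    normal = sum(w for v, w in zip(verdicts, weights) if v == "normal")
    abnormal = sum(w for v, w in zip(verdicts, weights) if v == "abnormal")
    # Assumption: equal totals are treated as normal.
    return "abnormal" if abnormal > normal else "normal"


verdicts = ["normal", "normal", "abnormal", "abnormal", "abnormal"]

# Equal weights reproduce the example in the text: three abnormal outvote two normal.
equal = weighted_vote(verdicts, [1, 1, 1, 1, 1])

# Hypothetical accuracy-based weights let two trusted models outweigh three others.
skewed = weighted_vote(verdicts, [3, 3, 1, 1, 1])
```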
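To solve the above technical problem, an embodiment of the present application further provides a user request detection apparatus. Refer to Figure 2, which is a block diagram of the basic structure of the user request detection apparatus of this embodiment.
As shown in Figure 2, a user request detection apparatus includes an acquisition module 2100, a processing module 2200, and an execution module 2300. The acquisition module 2100 is configured to acquire device data of the terminal that sends a user request. The processing module 2200 is configured to construct a feature set from combined features extracted from the device data by using a preset score feature. The execution module 2300 is configured to input the feature set, according to its type, into multiple anomaly detection models to obtain detection results indicating whether the user request is abnormal, and to judge the detection results with a preset judgment method to determine whether the user request is abnormal, where each anomaly detection model is a detection model that has been trained in advance to convergence on a positive-sample or negative-sample feature set and is used to classify the security of a terminal from its feature set.
The user request detection apparatus constructs a feature set from the combined features extracted from the device data by using the score feature and inputs the feature set into the anomaly detection models according to its type. Because feature sets are constructed from text-type device data with nominal attributes, an effective set of classification features can be mined and recognition accuracy improved. In addition, applying a judgment method to the outputs of multiple models yields a more comprehensive detection result, avoids the one-sidedness of a single model, reduces the inaccuracy that sample imbalance causes in a single anomaly detection model, and improves the accuracy of anomaly detection.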
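In some implementations, the processing module includes: a first acquisition submodule, configured to extract combined features from the device data; a first processing submodule, configured to compare the combined features with the preset score feature; a first execution submodule, configured to add a combined feature to a positive sample set when the combined feature is consistent with the score feature; and a second execution submodule, configured to add a combined feature to a negative sample set when the combined feature is inconsistent with the score feature.
In some implementations, the first acquisition submodule includes: a second acquisition submodule, configured to extract multiple single features from the device data; a third acquisition submodule, configured to obtain the association degree of each single feature; and a second processing submodule, configured to use the combination of multiple single features that have the same association degree as the combined feature.
In some implementations, the apparatus further includes: a fourth acquisition submodule, configured to obtain sample data of the terminal; a fifth acquisition submodule, configured to extract combined features from the sample data, where each combined feature is labeled; and a third execution submodule, configured to train preset detection models with the labeled sample data to obtain the anomaly detection models, where the sample data includes positive sample feature data and negative sample feature data.
Optionally, the execution module includes: a sixth acquisition submodule, configured to obtain the judgment categories of the multiple detection results; and a fourth execution submodule, configured to perform a weighted operation on the judgment categories obtained by the multiple models according to the preset weight of each model, to obtain a determination of whether the user who sent the user request is abnormal.
In some implementations, the acquisition module includes: a seventh acquisition submodule, configured to receive the user request sent by the terminal; and an eighth acquisition submodule, configured to extract pre-stored device data from the server according to the identification code in the user request.
In some implementations, the execution module includes a fifth execution submodule, configured to input the feature set into the anomaly detection models trained on positive sample features when the feature set is a positive feature set.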
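To solve the above technical problem, an embodiment of the present application further provides a computer device. Refer to Figure 3, which is a block diagram of the basic structure of the computer device of this embodiment.
Figure 3 is a schematic diagram of the internal structure of the computer device. As shown in Figure 3, the computer device includes a processor, a non-volatile storage medium, a memory, and a network interface connected through a system bus. The non-volatile storage medium of the computer device stores an operating system, a database, and computer-readable instructions. The database may store a sequence of control information, and the computer-readable instructions, when executed by the processor, cause the processor to implement a user request detection method. The processor of the computer device provides computing and control capabilities and supports the operation of the entire computer device. The memory of the computer device may store computer-readable instructions that, when executed by the processor, cause the processor to execute a user request detection method. The network interface of the computer device is used to connect to and communicate with terminals. Those skilled in the art will understand that the structure shown in Figure 3 is only a block diagram of the part of the structure related to the solution of the present application and does not limit the computer device to which the solution is applied; a specific computer device may include more or fewer components than shown, combine certain components, or arrange the components differently.
In this implementation, the processor executes the functions of the acquisition module 2100, the processing module 2200, and the execution module 2300 in Figure 2, and the memory stores the program code and data required to execute these modules. The network interface is used for data transmission with the user terminal or server. The memory in this implementation stores the program code and data required to execute all submodules of the user request detection method, and the server can call this program code and data to execute the functions of all submodules.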
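The present application further provides a storage medium storing computer-readable instructions that, when executed by one or more processors, cause the one or more processors to execute the steps of the user request detection method described in any of the above embodiments.
Those of ordinary skill in the art will understand that all or part of the processes of the methods in the above embodiments can be implemented by a computer program instructing the relevant hardware. The computer program may be stored in a computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. The storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disk, or a read-only memory (ROM), or it may be a random access memory (RAM), etc.
It should be understood that although the steps in the flowcharts of the drawings are displayed in the order indicated by the arrows, the steps are not necessarily executed in that order. Unless explicitly stated herein, the execution of these steps is not strictly limited in order, and they may be executed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment but may be executed at different times, and they are not necessarily executed sequentially but may be executed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
The above are only some implementations of the present application. It should be noted that those of ordinary skill in the art may make improvements and refinements without departing from the principles of the present application, and such improvements and refinements shall also fall within the scope of protection of the present application.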

Claims (20)

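  1. A user request detection method, characterized by comprising:
    acquiring device data of a terminal that sends a user request;
    constructing a feature set from combined features extracted from the device data by using a preset score feature;
    inputting the feature set, according to the type of the feature set, into multiple anomaly detection models to obtain detection results indicating whether the user request is abnormal, and judging the detection results with a preset judgment method to determine whether the user request is abnormal, wherein each anomaly detection model is a detection model that has been trained in advance to convergence on a positive-sample or negative-sample feature set and is used to classify the security of a terminal from its feature set;
    wherein constructing the feature set from the combined features extracted from the device data by using the preset score feature comprises:
    extracting combined features from the device data;
    comparing the combined features with the preset score feature;
    when a combined feature is consistent with the score feature, adding the combined feature to a positive sample set;
    when a combined feature is inconsistent with the score feature, adding the combined feature to a negative sample set.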
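  2. The user request detection method according to claim 1, characterized in that extracting combined features from the device data comprises:
    extracting multiple single features from the device data;
    obtaining the association degree of each single feature;
    using the combination of multiple single features that have the same association degree as the combined feature.
  3. The user request detection method according to claim 1, characterized in that, before inputting the feature set into the anomaly detection models according to the type of the feature set to obtain the detection result of whether the user is abnormal, the method further comprises:
    obtaining sample data of the terminal;
    extracting combined features from the sample data, wherein each combined feature is labeled;
    training preset detection models with the labeled sample data to obtain the anomaly detection models, wherein the sample data comprises positive sample feature data and negative sample feature data.
  4. The user request detection method according to claim 1, characterized in that judging the obtained detection results with the preset judgment method to determine whether the user who sent the user request is abnormal comprises:
    obtaining the judgment categories of the multiple detection results;
    performing a weighted operation on the judgment categories obtained by the multiple models according to the preset weight of each model, to obtain a determination of whether the user who sent the user request is abnormal.
  5. The user request detection method according to claim 1, characterized in that acquiring the device data of the terminal that sends the user request comprises:
    receiving the user request sent by the terminal;
    extracting pre-stored device data from a server according to the identification code in the user request.
  6. The user request detection method according to claim 1, characterized in that inputting the feature set into the multiple anomaly detection models according to the type of the feature set comprises:
    when the feature set is a positive feature set, inputting the feature set into the anomaly detection models trained on positive sample features.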
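  7. A user request detection apparatus, characterized by comprising:
    an acquisition module, configured to acquire device data of a terminal that sends a user request;
    a processing module, configured to construct a feature set from combined features extracted from the device data by using a preset score feature;
    an execution module, configured to input the feature set, according to the type of the feature set, into multiple anomaly detection models to obtain detection results indicating whether the user request is abnormal, and to judge the detection results with a preset judgment method to determine whether the user request is abnormal, wherein each anomaly detection model is a detection model that has been trained in advance to convergence on a positive-sample or negative-sample feature set and is used to classify the security of a terminal from its feature set;
    wherein the processing module comprises: a first acquisition submodule, configured to extract combined features from the device data; a first processing submodule, configured to compare the combined features with the preset score feature; a first execution submodule, configured to add a combined feature to a positive sample set when the combined feature is consistent with the score feature; and a second execution submodule, configured to add a combined feature to a negative sample set when the combined feature is inconsistent with the score feature.
  8. The user request detection apparatus according to claim 7, characterized in that the first acquisition submodule comprises: a second acquisition submodule, configured to extract multiple single features from the device data; a third acquisition submodule, configured to obtain the association degree of each single feature; and a second processing submodule, configured to use the combination of multiple single features that have the same association degree as the combined feature.
  9. The user request detection apparatus according to claim 7, characterized in that the user request detection apparatus further comprises:
    a fourth acquisition submodule, configured to obtain sample data of the terminal; a fifth acquisition submodule, configured to extract combined features from the sample data, wherein each combined feature is labeled; and a third execution submodule, configured to train preset detection models with the labeled sample data to obtain the anomaly detection models, wherein the sample data comprises positive sample feature data and negative sample feature data.
  10. The user request detection apparatus according to claim 7, characterized in that the execution module comprises: a sixth acquisition submodule, configured to obtain the judgment categories of the multiple detection results; and a fourth execution submodule, configured to perform a weighted operation on the judgment categories obtained by the multiple models according to the preset weight of each model, to obtain a determination of whether the user who sent the user request is abnormal.
  11. The user request detection apparatus according to claim 7, characterized in that the acquisition module comprises: a seventh acquisition submodule, configured to receive the user request sent by the terminal; and an eighth acquisition submodule, configured to extract pre-stored device data from a server according to the identification code in the user request.
  12. The user request detection apparatus according to claim 7, characterized in that the execution module comprises a fifth execution submodule, configured to input the feature set into the anomaly detection models trained on positive sample features when the feature set is a positive feature set.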
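  13. A computer device, comprising a memory and a processor, the memory storing computer-readable instructions that, when executed by the processor, cause the processor to execute the following steps:
    acquiring device data of a terminal that sends a user request;
    constructing a feature set from combined features extracted from the device data by using a preset score feature;
    inputting the feature set, according to the type of the feature set, into multiple anomaly detection models to obtain detection results indicating whether the user request is abnormal, and judging the detection results with a preset judgment method to determine whether the user request is abnormal, wherein each anomaly detection model is a detection model that has been trained in advance to convergence on a positive-sample or negative-sample feature set and is used to classify the security of a terminal from its feature set;
    wherein constructing the feature set from the combined features extracted from the device data by using the preset score feature comprises:
    extracting combined features from the device data;
    comparing the combined features with the preset score feature;
    when a combined feature is consistent with the score feature, adding the combined feature to a positive sample set;
    when a combined feature is inconsistent with the score feature, adding the combined feature to a negative sample set.
  14. The computer device according to claim 13, characterized in that extracting combined features from the device data comprises:
    extracting multiple single features from the device data;
    obtaining the association degree of each single feature;
    using the combination of multiple single features that have the same association degree as the combined feature.
  15. The computer device according to claim 13, characterized in that, before inputting the feature set into the anomaly detection models according to the type of the feature set to obtain the detection result of whether the user is abnormal, the computer-readable instructions, when executed by the processor, cause the processor to execute the following steps:
    obtaining sample data of the terminal;
    extracting combined features from the sample data, wherein each combined feature is labeled;
    training preset detection models with the labeled sample data to obtain the anomaly detection models, wherein the sample data comprises positive sample feature data and negative sample feature data.
  16. The computer device according to claim 13, characterized in that judging the obtained detection results with the preset judgment method to determine whether the user who sent the user request is abnormal comprises:
    obtaining the judgment categories of the multiple detection results;
    performing a weighted operation on the judgment categories obtained by the multiple models according to the preset weight of each model, to obtain a determination of whether the user who sent the user request is abnormal.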
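  17. A storage medium storing computer-readable instructions that, when executed by one or more processors, cause the one or more processors to execute the following steps:
    acquiring device data of a terminal that sends a user request;
    constructing a feature set from combined features extracted from the device data by using a preset score feature;
    inputting the feature set, according to the type of the feature set, into multiple anomaly detection models to obtain detection results indicating whether the user request is abnormal, and judging the detection results with a preset judgment method to determine whether the user request is abnormal, wherein each anomaly detection model is a detection model that has been trained in advance to convergence on a positive-sample or negative-sample feature set and is used to classify the security of a terminal from its feature set;
    wherein constructing the feature set from the combined features extracted from the device data by using the preset score feature comprises:
    extracting combined features from the device data;
    comparing the combined features with the preset score feature;
    when a combined feature is consistent with the score feature, adding the combined feature to a positive sample set;
    when a combined feature is inconsistent with the score feature, adding the combined feature to a negative sample set.
  18. The storage medium according to claim 17, characterized in that extracting combined features from the device data comprises:
    extracting multiple single features from the device data;
    obtaining the association degree of each single feature;
    using the combination of multiple single features that have the same association degree as the combined feature.
  19. The storage medium according to claim 17, characterized in that, before inputting the feature set into the anomaly detection models according to the type of the feature set to obtain the detection result of whether the user is abnormal, the computer-readable instructions, when executed by one or more processors, cause the one or more processors to execute the following steps:
    obtaining sample data of the terminal;
    extracting combined features from the sample data, wherein each combined feature is labeled;
    training preset detection models with the labeled sample data to obtain the anomaly detection models, wherein the sample data comprises positive sample feature data and negative sample feature data.
  20. The storage medium according to claim 17, characterized in that judging the obtained detection results with the preset judgment method to determine whether the user who sent the user request is abnormal comprises:
    obtaining the judgment categories of the multiple detection results;
    performing a weighted operation on the judgment categories obtained by the multiple models according to the preset weight of each model, to obtain a determination of whether the user who sent the user request is abnormal.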
PCT/CN2019/118396 2019-01-08 2019-11-14 User request detection method, apparatus, computer device, and storage medium WO2020143322A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910015151.7A CN109936561B (zh) 2019-01-08 2019-01-08 User request detection method, apparatus, computer device, and storage medium
CN201910015151.7 2019-01-08

Publications (1)

Publication Number Publication Date
WO2020143322A1 true WO2020143322A1 (zh) 2020-07-16

Family

ID=66984938

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/118396 WO2020143322A1 (zh) 2019-01-08 2019-11-14 User request detection method, apparatus, computer device, and storage medium

Country Status (2)

Country Link
CN (1) CN109936561B (zh)
WO (1) WO2020143322A1 (zh)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109936561B (zh) * 2019-01-08 2022-05-13 平安科技(深圳)有限公司 用户请求的检测方法、装置、计算机设备及存储介质
CN110443274B (zh) * 2019-06-28 2024-05-07 平安科技(深圳)有限公司 异常检测方法、装置、计算机设备及存储介质
CN110392046B (zh) * 2019-06-28 2021-12-24 平安科技(深圳)有限公司 网络访问的异常检测方法和装置
CN110730164B (zh) * 2019-09-18 2022-09-16 平安科技(深圳)有限公司 安全预警方法及相关设备、计算机可读存储介质
CN110990867B (zh) * 2019-11-28 2023-02-07 上海观安信息技术股份有限公司 基于数据库的数据泄露检测模型的建模方法、装置,泄露检测方法、系统
CN110929799B (zh) * 2019-11-29 2023-05-12 上海盛付通电子支付服务有限公司 用于检测异常用户的方法、电子设备和计算机可读介质
CN111314291A (zh) * 2020-01-15 2020-06-19 北京小米移动软件有限公司 网址安全性检测方法及装置、存储介质
CN111222981A (zh) * 2020-01-16 2020-06-02 中国建设银行股份有限公司 可信度确定方法、装置、设备和存储介质
CN113495749A (zh) * 2020-03-20 2021-10-12 阿里巴巴集团控股有限公司 车载设备的识别方法、装置、系统、设备及可读介质
CN112508095A (zh) * 2020-12-07 2021-03-16 中国平安人寿保险股份有限公司 一种样本处理方法、装置、电子设备及存储介质
CN112929381B (zh) * 2021-02-26 2022-12-23 南方电网科学研究院有限责任公司 一种虚假注入数据的检测方法、装置设备和存储介质
CN113084388B (zh) * 2021-03-29 2023-05-09 广州明珞装备股份有限公司 焊接质量的检测方法、系统、装置及存储介质
CN114866338A (zh) * 2022-06-10 2022-08-05 阿里云计算有限公司 网络安全检测方法、装置及电子设备

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106843941A (zh) * 2016-12-31 2017-06-13 广东欧珀移动通信有限公司 信息处理方法、装置和计算机设备
US20170180418A1 (en) * 2015-12-21 2017-06-22 Symantec Corporation Accurate real-time identification of malicious bgp hijacks
CN106921500A (zh) * 2017-03-22 2017-07-04 深圳先进技术研究院 一种移动设备的身份认证方法及装置
CN108108743A (zh) * 2016-11-24 2018-06-01 百度在线网络技术(北京)有限公司 异常用户识别方法和用于识别异常用户的装置
CN108363811A (zh) * 2018-03-09 2018-08-03 北京京东金融科技控股有限公司 设备识别方法及装置、电子设备、存储介质
CN108647997A (zh) * 2018-04-13 2018-10-12 北京三快在线科技有限公司 一种检测异常数据的方法及装置
CN109886290A (zh) * 2019-01-08 2019-06-14 平安科技(深圳)有限公司 用户请求的检测方法、装置、计算机设备及存储介质
CN109905362A (zh) * 2019-01-08 2019-06-18 平安科技(深圳)有限公司 用户请求的检测方法、装置、计算机设备及存储介质
CN109936561A (zh) * 2019-01-08 2019-06-25 平安科技(深圳)有限公司 用户请求的检测方法、装置、计算机设备及存储介质

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107391569B (zh) * 2017-06-16 2020-09-15 阿里巴巴集团控股有限公司 数据类型的识别、模型训练、风险识别方法、装置及设备
CN107679557B (zh) * 2017-09-19 2020-11-27 平安科技(深圳)有限公司 驾驶模型训练方法、驾驶人识别方法、装置、设备及介质
CN108366045B (zh) * 2018-01-02 2020-09-01 北京奇艺世纪科技有限公司 一种风控评分卡的设置方法和装置
CN108259482B (zh) * 2018-01-04 2019-05-28 平安科技(深圳)有限公司 网络异常数据检测方法、装置、计算机设备及存储介质
CN108563548B (zh) * 2018-03-19 2020-10-16 创新先进技术有限公司 异常检测方法及装置

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114416916A (zh) * 2020-10-12 2022-04-29 中移动信息技术有限公司 异常用户检测方法、装置、设备及存储介质
CN112396513A (zh) * 2020-11-27 2021-02-23 中国银联股份有限公司 一种数据处理的方法及装置
CN112561389A (zh) * 2020-12-23 2021-03-26 北京元心科技有限公司 确定设备检测结果的方法、装置以及电子设备
CN112561389B (zh) * 2020-12-23 2023-11-10 北京元心科技有限公司 确定设备检测结果的方法、装置以及电子设备
CN114268489A (zh) * 2021-12-21 2022-04-01 福建瑞网科技有限公司 一种网络安全防护方法及装置

Also Published As

Publication number Publication date
CN109936561B (zh) 2022-05-13
CN109936561A (zh) 2019-06-25

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19909595

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19909595

Country of ref document: EP

Kind code of ref document: A1