WO2021168617A1 - Processing method and apparatus for service risk management, electronic device, and storage medium - Google Patents

Info

Publication number
WO2021168617A1
WO2021168617A1 PCT/CN2020/076457 CN2020076457W WO2021168617A1 WO 2021168617 A1 WO2021168617 A1 WO 2021168617A1 CN 2020076457 W CN2020076457 W CN 2020076457W WO 2021168617 A1 WO2021168617 A1 WO 2021168617A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
current
service data
business
service
Application number
PCT/CN2020/076457
Inventor
唐煜
Original Assignee
深圳市欢太科技有限公司
Oppo广东移动通信有限公司
Application filed by 深圳市欢太科技有限公司, Oppo广东移动通信有限公司
Priority to PCT/CN2020/076457 (WO2021168617A1)
Priority to CN202080093339.4A (CN115004652B)
Publication of WO2021168617A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60: Protecting data
    • G06F21/62: Protecting access to data via a platform, e.g. using keys or access control rules

Definitions

  • This application relates to the technical field of electronic equipment, and more specifically, to a business risk control processing method, device, electronic equipment, and storage medium.
  • the current business security risk control system mainly starts from business characteristics, segmenting users by business and formulating rules to identify black-market and gray-market users.
  • the black-market users found in this way are identified with high credibility, but many potential black-market users remain hidden among normal users, and these users are difficult to identify with the rule system and user segmentation alone.
  • this application proposes a business risk control processing method, device, electronic equipment, and storage medium to solve the above problems.
  • an embodiment of the present application provides a business risk control processing method, and the method includes:
  • when the device performs current service access, obtaining the current service data of the device; inputting the current service data into the trained prediction model to obtain the current function value output by the trained prediction model; obtaining the current detection result of the current service data based on the current function value and a preset credibility threshold, where the current detection result is used to characterize whether the current service data is malicious data; and determining, based on the current detection result, a processing mode for the current service access of the device.
  • an embodiment of the present application provides a business risk control processing device, the device includes: a current business data acquisition module, used to acquire the current business data of the device when the device is performing current business access; a current function value obtaining module, used to input the current business data into the trained prediction model to obtain the current function value output by the trained prediction model; a current detection result obtaining module, used to obtain the current detection result of the current business data based on the current function value and a preset credibility threshold, where the current detection result is used to characterize whether the current business data is malicious data; and a processing mode determination module, used to determine, based on the current detection result, the processing mode for the current service access of the device.
  • an embodiment of the present application provides an electronic device, including a memory and a processor, the memory is coupled to the processor, the memory stores instructions, and when the instructions are executed by the processor, the processor executes the above method.
  • an embodiment of the present application provides a computer readable storage medium, and the computer readable storage medium stores program code, and the program code can be invoked by a processor to execute the above method.
  • the business risk control processing method, device, electronic equipment, and storage medium provided in the embodiments of this application acquire the current business data of the device when the device performs current business access, input the current business data into the trained prediction model to obtain the current function value output by the trained prediction model, obtain the current detection result of the current business data based on the current function value and the preset credibility threshold, where the current detection result is used to characterize whether the current business data is malicious data, and determine, based on the current detection result, the processing mode for the current business access of the device. In this way, the function value output by the trained prediction model and the preset credibility threshold are used to determine whether the service data is malicious data, and the credibility of malicious data judgment is improved.
  • FIG. 1 shows a schematic flowchart of a business risk control processing method provided by an embodiment of the present application.
  • FIG. 2 shows a schematic flowchart of a business risk control processing method provided by another embodiment of the present application.
  • FIG. 3 shows a schematic flowchart of step S206 of the business risk control processing method shown in FIG. 2 of the present application.
  • FIG. 4 shows a schematic flowchart of step S207 of the business risk control processing method shown in FIG. 2 of the present application.
  • FIG. 5 shows a schematic flowchart of step S2072 of the business risk control processing method shown in FIG. 4 of the present application.
  • FIG. 6 shows a schematic flowchart of a business risk control processing method provided by another embodiment of the present application.
  • FIG. 7 shows a schematic flowchart of a business risk control processing method provided by another embodiment of the present application.
  • FIG. 8 shows a schematic flowchart of a business risk control processing method provided by yet another embodiment of the present application.
  • FIG. 9 shows a schematic flowchart of step S501 of the business risk control processing method shown in FIG. 8 of the present application.
  • FIG. 10 shows a schematic flowchart of step S5012 of the business risk control processing method shown in FIG. 9 of the present application.
  • FIG. 11 shows a block diagram of a business risk control processing device provided by an embodiment of the present application.
  • FIG. 12 shows a block diagram of an electronic device used to execute the business risk control processing method according to the embodiment of the present application.
  • FIG. 13 shows a storage unit, according to an embodiment of the present application, used to store or carry program code for implementing the business risk control processing method.
  • the rule system generally includes a data storage system composed of non-relational databases and relational databases, together with a real-time computing cluster and an offline computing cluster.
  • the real-time computing cluster is used for online computing, and the offline computing cluster is used for periodic execution tasks.
  • the rule engine generates rules through the rule library, which optimizes the matching of rules and increases the efficiency of the real-time risk control system.
  • the rule system has added an index-based and model-based risk control rule evaluation mechanism to ensure the effectiveness of the risk control rules.
  • the rule system has the following problems: 1.
  • through long-term research, the inventor has discovered these problems and proposes the business risk control processing method, device, electronic equipment, and storage medium provided by the embodiments of this application, in which the function value output by the trained prediction model and the preset credibility threshold determine whether the business data is malicious data, improving the credibility of malicious data judgment.
  • the specific business risk control processing method will be described in detail in the subsequent embodiments.
  • FIG. 1 shows a schematic flowchart of a business risk control processing method provided by an embodiment of the present application.
  • the business risk control processing method determines whether the business data is malicious data through the function value output by the trained prediction model and the preset credibility threshold, thereby improving the credibility of malicious data judgment.
  • the business risk control processing method is applied to the business risk control processing device 200 shown in FIG. 11 and the electronic device 100 configured with the business risk control processing device 200 (FIG. 12).
  • the following will take an electronic device as an example to describe the specific process of this embodiment.
  • the electronic device applied in this embodiment may be a smart phone, a tablet computer, a wearable electronic device, etc., which is not limited here.
  • the business risk control processing method may specifically include the following steps:
  • Step S101 Acquire current service data of the device when the device performs current service access.
  • business access may include, for example, application browsing, application download, and application installation in a software store; product browsing, product ordering, and product collection on a shopping platform; and game browsing and game download in a game store, which are not limited here.
  • the device may include, but is not limited to: a mobile terminal, a tablet computer, a desktop computer, a wearable electronic device, etc., which is not limited herein.
  • the current service data of the device when the device performs current service access, can be obtained.
  • the current service data of the device may include: whether the number corresponding to the device appears in two different places in the same time period, whether the device model fails to meet the specifications, the normal active duration of the device, whether violations have been recorded for the number corresponding to the device, the number of violations recorded for that number, and the credit score of that number, etc., which are not limited here.
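  • As a minimal sketch of how the listed items could be packed into a model input, the record below mirrors the fields named above. The class name, field names, units, and the flat numeric encoding are illustrative assumptions, not part of the application.

```python
from dataclasses import dataclass, astuple
from typing import List


@dataclass(frozen=True)
class ServiceData:
    """Hypothetical container for the service data items listed in the text."""
    seen_in_two_places: bool      # number appears in two places in the same time period
    model_nonconforming: bool     # device model does not meet the specifications
    active_hours: float           # normal active duration (unit is an assumption)
    has_violation: bool           # a violation has been recorded for the number
    violation_count: int          # number of recorded violations
    credit_score: float           # credit score of the number

    def to_features(self) -> List[float]:
        # Encode every field as a float so the record can feed a numeric model.
        return [float(v) for v in astuple(self)]


sample = ServiceData(False, False, 120.0, True, 2, 650.0)
features = sample.to_features()
```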
  • Step S102 Input the current service data into the trained prediction model, and obtain the current function value output by the trained prediction model.
  • the current service data can be input into the trained prediction model, where the trained prediction model is obtained through machine learning.
  • the training data set is first collected, where the attributes or features of one type of data in the training data set are distinguished from those of another type of data; the neural network is then trained and modeled on the collected training data set according to a preset algorithm, so that rules are summarized from the training data set and the trained prediction model is obtained.
  • the training data set may include, for example, service data of multiple devices and function values corresponding to the service data of multiple devices.
  • the trained prediction model can be used to output the current function value according to the current service data of the device.
  • the current service data of the device can be input into the trained prediction model, and the current function value output by the trained prediction model can be obtained.
  • the trained prediction model may be stored locally in the electronic device after pre-training is completed. Based on this, after obtaining the current business data of the device, the electronic device can directly call the trained prediction model locally. For example, it can directly send an instruction to the trained prediction model to instruct the trained prediction model to read the current business data of the device from a target storage area, or the electronic device can directly input the current business data of the device into the trained prediction model stored locally. This effectively avoids network factors slowing down the input of the current business data into the trained prediction model, which increases the speed at which the trained prediction model obtains the current business data of the device and improves the user experience.
  • the trained prediction model may also be stored in a server that is in communication with the electronic device after the pre-training is completed. Based on this, after the electronic device obtains the current business data of the device, it can send an instruction through the network to the trained prediction model stored in the server to instruct the trained prediction model to read the current business data of the device through the network, or the electronic device can send the current business data of the device through the network to the trained prediction model stored on the server. Storing the trained prediction model on the server reduces the storage space occupied on the electronic device and reduces the impact on the normal operation of the electronic device.
  • the current function value output by the trained prediction model can be a sigmoid value, where the sigmoid function is an S-shaped function common in biology, also known as the S-shaped growth curve.
  • the sigmoid function is often used as the activation function of neural networks to map variables to the interval between 0 and 1.
  • the current function value output by the trained prediction model based on the current service data of the device is between 0 and 1, where a sigmoid result greater than 0.5 is generally taken as label 1, that is, the current service data whose sigmoid result is greater than 0.5 is regarded as blacklist data (malicious data), and a sigmoid result not greater than 0.5 is taken as label 0, that is, the current business data whose sigmoid result is not greater than 0.5 is regarded as whitelist data (non-malicious data).
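  • For illustration, the sigmoid mapping and the 0.5 labelling rule described above can be sketched as follows. The function names are assumptions, and the two-branch form is only a common way to avoid overflow for large negative inputs; the application does not specify an implementation.

```python
import math


def sigmoid(x: float) -> float:
    """Map any real number into the interval [0, 1]."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x, exp(x) is small, so this form cannot overflow.
    z = math.exp(x)
    return z / (1.0 + z)


def label(sigmoid_value: float) -> int:
    """Label 1 (blacklist) if the value is greater than 0.5, else label 0 (whitelist)."""
    return 1 if sigmoid_value > 0.5 else 0
```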
  • Step S103 Obtain a current detection result of the current service data based on the current function value and a preset credibility threshold, where the current detection result is used to characterize whether the current service data is malicious data.
  • the credibility threshold may be preset and stored as the preset credibility threshold, and the preset credibility threshold is used as the basis for judging the current function value output by the trained prediction model. Therefore, in this embodiment, after obtaining the current function value output by the trained prediction model, the current function value can be compared with the preset credibility threshold to obtain a comparison result, and based on the comparison result, the current detection result of the current business data can be obtained.
  • the preset credibility threshold may be obtained by calculating the business data of multiple devices in the historical time period through the rule system.
  • the preset credibility threshold may be 0.5.
  • the current function value may be compared with 0.5 to obtain the comparison result, and based on the comparison result, the current detection result of the current business data is obtained.
  • When the comparison result indicates that the current function value is less than the preset credibility threshold, the obtained current detection result indicates that the current business data is non-malicious data; for example, the device corresponding to the current business data has reached a sufficient normal active duration, and so on.
  • When the comparison result indicates that the current function value is not less than the preset credibility threshold, the obtained current detection result indicates that the current business data is malicious data; for example, the number corresponding to the device appears in two different places in the same time period, the model of the device does not conform to the specifications, and so on.
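  • The comparison in step S103 can be sketched as below, assuming the function value lies in the interval from 0 to 1 and using the example threshold of 0.5. The function name and the string labels are illustrative assumptions.

```python
def detect(function_value: float, threshold: float = 0.5) -> str:
    """Compare a model output with the preset credibility threshold."""
    if not 0.0 <= function_value <= 1.0:
        raise ValueError("function value must lie in [0, 1]")
    # A value "not less than" the threshold is treated as malicious, as in the text.
    return "non_malicious" if function_value < threshold else "malicious"
```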
  • Step S104 Determine a processing method for the current service access of the device based on the current detection result.
  • the processing method for the current service access of the device may be determined based on the current detection result. For example, when the current detection result is that the current business data of the device is malicious data, the current business access of the device may involve download-count inflation in the software store or fake orders on the shopping platform. Therefore, the current business access of the device can be denied. When the current detection result is that the current business data of the device is non-malicious data, the current business access of the device involves no download-count inflation in the software store and no fake orders on the shopping platform, and is a normal business access. Therefore, the current business access of the device can be performed.
  • the business risk control processing method obtains the current business data of the device when the device performs current business access, inputs the current business data into the trained prediction model to obtain the current function value output by the trained prediction model, obtains the current detection result of the current business data based on the current function value and the preset credibility threshold, where the current detection result is used to characterize whether the current business data is malicious data, and determines the processing mode for the current business access of the device based on the current detection result. In this way, the function value output by the trained prediction model and the preset credibility threshold are used to determine whether the business data is malicious data, and the credibility of malicious data judgment is improved.
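  • Steps S101 to S104 can be strung together roughly as follows. This is a sketch under stated assumptions: `predict` stands in for the trained prediction model (any callable returning a value between 0 and 1), and the names `handle_access`, "deny", and "execute" are hypothetical.

```python
from typing import Callable, Sequence

# Stand-in type for the trained prediction model: features in, value in [0, 1] out.
Predict = Callable[[Sequence[float]], float]


def handle_access(service_data: Sequence[float],
                  predict: Predict,
                  threshold: float = 0.5) -> str:
    """Return the processing mode for the current service access."""
    function_value = predict(service_data)        # step S102
    is_malicious = function_value >= threshold    # step S103
    return "deny" if is_malicious else "execute"  # step S104
```

  • With a model that outputs 0.8, the access is denied; with one that outputs 0.2, it is executed, matching the numeric examples used for the 0.5 threshold.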
  • FIG. 2 shows a schematic flowchart of a business risk control processing method provided by another embodiment of the present application.
  • the current detection result is also used to characterize whether the current business data is uncertain data.
  • the process shown in FIG. 2 will be described in detail below.
  • the business risk control processing method may specifically include the following steps:
  • Step S201 When the device performs current service access, obtain the current service data of the device.
  • Step S202 Input the current service data into the trained prediction model, and obtain the current function value output by the trained prediction model.
  • Step S203 Obtain a current detection result of the current service data based on the current function value and a preset credibility threshold, where the current detection result is used to characterize whether the current service data is malicious data.
  • for the specific description of step S201 to step S203, please refer to step S101 to step S103, which will not be repeated here.
  • Step S204 When the current detection result characterizes that the current service data is malicious data, deny the current service access of the device.
  • When the current detection result characterizes the current business data as malicious data, for example, when the current function value output by the trained prediction model is 0.8 and the preset credibility threshold is 0.5, the current business access of the device can be denied to ensure the authenticity of the application ranking of the software store and the product ranking of the shopping platform, so as to reduce the influence of malicious data on the user's normal selection.
  • Step S205 When the current detection result characterizes that the current service data is non-malicious data, execute the current service access of the device.
  • When the current detection result characterizes the current business data as non-malicious data, for example, when the current function value output by the trained prediction model is 0.2 and the preset credibility threshold is 0.5, the current business access of the device can be performed to provide real data for the application ranking of the software store, the product ranking of the shopping platform, etc., and to provide a reference for the user's choice.
  • Step S206 When the current detection result characterizes that the current service data is uncertain data, obtain other service data when the device is accessing other services.
  • the rule label model has a certain probability of misjudgment, that is, misjudgment may occur for sigmoid results close to 0.5.
  • For example, when the preset credibility threshold is 0.5 and the current function value is 0.6, the current detection result obtained characterizes the current business data as malicious data. However, current business data with a current function value of 0.6 is not necessarily malicious data. Therefore, using 0.5 alone as the basis for judgment may lead to misjudgment.
  • To reduce misjudgment, a calculation can be performed based on the business data of multiple devices in the historical time period, and preset credibility thresholds can be set so that the 0-1 interval is optimized from the original two intervals (0-0.5, 0.5-1) into three intervals; that is, the current business data characterized by the detection result obtained based on the current function value and the preset credibility threshold in this embodiment may be malicious data, uncertain data, or non-malicious data.
  • the preset credibility threshold may include 0.3 and 0.7, and accordingly, the interval of 0-1 may be divided into an interval of 0-0.3, an interval of 0.3-0.7, and an interval of 0.7-1. When the current function value is in the range of 0-0.3, it can be determined that the current business data corresponding to the current function value is non-malicious data; when the current function value is in the range of 0.3-0.7, it can be determined that the current business data corresponding to the current function value is uncertain data; and when the current function value is in the range of 0.7-1, it can be determined that the current business data corresponding to the current function value is malicious data.
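  • The three-interval scheme can be sketched as below with the example thresholds 0.3 and 0.7. The text does not say which interval owns the boundary values, so treating each lower bound as inclusive is an assumption here, as are the function name and labels.

```python
def classify(function_value: float, low: float = 0.3, high: float = 0.7) -> str:
    """Map a function value in [0, 1] to one of three detection results."""
    if not 0.0 <= function_value <= 1.0:
        raise ValueError("function value must lie in [0, 1]")
    if not low < high:
        raise ValueError("low threshold must be below high threshold")
    if function_value < low:
        return "non_malicious"   # interval 0-0.3
    if function_value < high:
        return "uncertain"       # interval 0.3-0.7 (boundary handling assumed)
    return "malicious"           # interval 0.7-1
```

  • Under this scheme the value 0.6, which a single 0.5 threshold would label malicious, falls into the uncertain interval instead.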
  • When the current detection result characterizes that the current service data is uncertain data, other service data of the device when accessing other services can be obtained, where the obtained other business data may include business data of the device in other aspects, violations of rules by the device in other fields, etc., which are not limited here. For example, the other business data of the device when it is downloading games in the game store or interacting with products on the shopping platform can be obtained and used as a reference for the current detection result.
  • FIG. 3 shows a schematic flowchart of step S206 of the business risk control processing method shown in FIG. 2 of the present application.
  • the following will elaborate on the process shown in FIG. 3, and the method may specifically include the following steps:
  • Step S2061 When the current detection result characterizes that the current service data is uncertain data, obtain the service type of the current service data.
  • the business type of the current business data can be acquired.
  • the service type of the current service data can be determined according to the current service access. For example, if the current business access is application browsing in a software store, the business type of the current business data can be determined to be the first type; if the current business access is application download or installation in the software store, the business type can be determined to be the second type; if the current business access is product browsing on the shopping platform, the business type can be determined to be the first type; if the current business access is product ordering on the shopping platform, the business type can be determined to be the second type; and so on.
  • Step S2062 When the service type of the current service data meets the preset service type, obtain other service data when the device is accessing other services.
  • a preset service type may be set in advance and stored, and the preset service type is used as a basis for judging the service type of the current service data. Therefore, in this embodiment, after obtaining the service type of the current service data, the service type of the current service data can be compared with the preset service type to determine whether the service type of the current service data meets the preset service type. If it does, the other service data of the device during other service access is obtained. If it does not, the current detection result already obtained prevails, and the processing mode for the current service access of the device is determined according to the current detection result.
  • the preset service type may be the second type. When the current business access is application download in the software store, application installation in the software store, or product ordering on the shopping platform, it can be determined that the business type of the current business data meets the preset business type; when the current business access is application browsing in the software store or product browsing on the shopping platform, it can be determined that the business type of the current business data does not meet the preset business type.
  • When the business type of the current business data meets the transaction type, the current business access may be related to money and is therefore more important. To reduce the possibility of misjudgment, other service data of the device during other business accesses can also be obtained to improve the accuracy of the current detection result.
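  • The service type check in steps S2061 and S2062 can be sketched as a lookup followed by a comparison. The access identifiers, the enum, and the function name below are hypothetical; only the split into a first type (browsing) and a second type (download, installation, ordering) comes from the text.

```python
from enum import Enum


class ServiceType(Enum):
    FIRST = 1    # browsing-style access
    SECOND = 2   # download, installation, or ordering


# Hypothetical access identifiers mapped to the types given as examples in the text.
ACCESS_TYPES = {
    "store_browse": ServiceType.FIRST,
    "store_download": ServiceType.SECOND,
    "store_install": ServiceType.SECOND,
    "shop_browse": ServiceType.FIRST,
    "shop_order": ServiceType.SECOND,
}


def needs_other_data(access: str,
                     preset: ServiceType = ServiceType.SECOND) -> bool:
    """True if an uncertain result for this access should trigger a second look."""
    return ACCESS_TYPES[access] is preset
```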
  • Step S207 Input the current service data and the other service data into the trained prediction model, and obtain other function values output by the trained prediction model.
  • the trained prediction model can also be used to output other function values based on the current business data and other business data of the device.
  • the current service data and other service data can be input into the trained prediction model to obtain other function values output by the trained prediction model.
  • FIG. 4 shows a schematic flowchart of step S207 of the business risk control processing method shown in FIG. 2 of the present application.
  • the following will elaborate on the process shown in FIG. 4, and the method may specifically include the following steps:
  • Step S2071 Obtain intelligence scores corresponding to other business data when the device is accessing other services, where the intelligence scores are used to characterize the probability that the other business data is non-malicious data.
  • the intelligence score corresponding to the other business data of the device during other business access can be acquired, where the intelligence score is used to characterize or reflect the probability that other business data is not malicious data.
  • media situation analysis can be performed by contacting business-side features, and related policies can be generated for devices with uncertain data, so as to obtain device violations of rules in other areas, and determine whether devices violate rules in other areas.
  • Step S2072 Perform data enhancement processing on the other business data based on the intelligence score to obtain multiple other business data.
  • data enhancement processing (that is, increasing the frequency with which the other business data recurs) can be performed on other business data based on the intelligence score to obtain multiple other business data.
  • data enhancement is an effective way to expand the size of a data sample set. Deep learning is a method based on big data: the larger the scale and the higher the quality of the data, the better, but in practice it is difficult for the collected data to cover all scenarios.
  • different data types have different enhancement methods. For example, data enhancement for image data mainly includes methods such as image rotation, image segmentation, image RGB change, and image scaling.
  • FIG. 5 shows a schematic flowchart of step S2072 of the business risk control processing method shown in FIG. 4 of the present application.
  • the following will elaborate on the process shown in FIG. 5, and the method may specifically include the following steps:
  • Step S20721 Obtain the duration of the intelligence score corresponding to the other service data when the device is accessing other services.
  • the intelligence score corresponding to other business data of the device during other business accesses can be acquired, as well as the duration of that intelligence score.
  • Step S20722 Perform data enhancement processing on the other business data based on the intelligence score and the duration to obtain multiple other business data.
  • data enhancement processing can be performed on the other business data based on the intelligence score and the duration to obtain multiple other business data.
  • the intelligence score and the data enhancement multiple are positively correlated, that is, the higher the intelligence score, the higher the data enhancement multiple, the lower the intelligence score, the lower the data enhancement multiple.
  • the duration and the data enhancement factor are positively correlated, that is, the longer the duration, the higher the data enhancement factor, and the shorter the duration, the lower the data enhancement factor.
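  • The text only states that the data enhancement multiple grows with the intelligence score and with the duration; it gives no formula. The sketch below picks a simple non-decreasing linear rule purely for illustration. The weights, the day unit, the assumption that the score lies between 0 and 1, and the function names are all hypothetical, and the enhancement itself is modeled as repeating the record, following the description of recurring data.

```python
from typing import List, Sequence


def enhancement_multiple(score: float,
                         duration_days: float,
                         score_weight: float = 5.0,
                         duration_weight: float = 0.5) -> int:
    """Illustrative multiple that never decreases as score or duration grows."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score is assumed to lie in [0, 1]")
    if duration_days < 0:
        raise ValueError("duration must be non-negative")
    return 1 + int(score_weight * score + duration_weight * duration_days)


def enhance(other_data: Sequence[float], multiple: int) -> List[List[float]]:
    """Repeat one record `multiple` times to obtain multiple other service data."""
    return [list(other_data) for _ in range(multiple)]


copies = enhance([1.0, 0.0], enhancement_multiple(0.5, 4.0))
```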
  • Step S2073 Input the current service data and the multiple other service data into the trained prediction model, and obtain other function values output by the trained prediction model.
  • Step S208 Obtain other detection results of the current service data and the other service data based on the other function value and the preset credibility threshold, where the other detection results are used to characterize whether the current service data is malicious data.
  • the other function values can be compared with a preset credibility threshold to obtain a comparison result, and based on the comparison result, the other detection results of the current business data and the other business data can be obtained. Since the input data of the trained prediction model changes from the current business data alone to the current business data together with the other business data, the other function value output by the trained prediction model differs from the current function value output based on the current business data alone; that is, the other detection results can differ from the current detection result, so that the purpose of optimizing the detection result and reducing misjudgment can be achieved.
  • Step S209 Determine a processing mode for the current service access of the device based on the other detection results.
  • the processing mode for the current service access of the device may be determined based on the other detection results.
  • FIG. 6 shows a schematic flowchart of a business risk control processing method provided by another embodiment of the present application. The process shown in Figure 6 will be described in detail below.
  • the business risk control processing method may specifically include the following steps:
  • Step S301 When the device performs the current service access, obtain the current service data of the device.
  • Step S302 Input the current service data into the trained prediction model, and obtain the current function value output by the trained prediction model.
  • Step S303 Obtain a current detection result of the current service data based on the current function value and a preset credibility threshold, where the current detection result is used to characterize whether the current service data is malicious data or uncertain data.
  • Step S304 Determine a processing mode for the current service access of the device based on the current detection result.
  • For a detailed description of steps S301 to S304, please refer to steps S101 to S104, which will not be repeated here.
  • Step S305 Obtain multiple first function values output by the trained prediction model in the first time period and multiple second function values output in the second time period, where the first time period and the second time period are adjacent time periods.
  • multiple first function values that the trained prediction model outputs based on business data in the first time period, and multiple second function values that it outputs based on business data in the second time period, can be obtained.
  • The first time period and the second time period are adjacent time periods. It should be noted that the lengths of the first time period and the second time period are not limited here, and the first time period can be before or after the second time period, which is also not limited here.
  • Step S306 Obtain a plurality of first detection results based on the plurality of first function values and a preset credibility threshold, and obtain a plurality of second detection results based on the plurality of second function values and the preset credibility threshold.
  • After the multiple first function values and the multiple second function values are obtained, multiple first detection results can be obtained based on the multiple first function values and the preset credibility threshold, and multiple second detection results can be obtained based on the multiple second function values and the preset credibility threshold.
  • Step S307 Obtain the proportion of the plurality of first detection results that characterize the business data as uncertain data as the first proportion, and obtain the proportion of the plurality of second detection results that characterize the business data as uncertain data as the second proportion.
  • the multiple first detection results include: first detection results that characterize the business data as malicious data, first detection results that characterize the business data as non-malicious data, and first detection results that characterize the business data as uncertain data. Therefore, the proportion of the multiple first detection results that characterize the business data as uncertain data can be obtained as the first proportion. Specifically, the number of first detection results that characterize the business data as uncertain data is used as the numerator, the total number of first detection results is used as the denominator, and the calculation result is used as the first proportion.
  • the multiple second detection results include: second detection results that characterize the business data as malicious data, second detection results that characterize the business data as non-malicious data, and second detection results that characterize the business data as uncertain data. Therefore, the proportion of the multiple second detection results that characterize the business data as uncertain data can be obtained as the second proportion. Specifically, the number of second detection results that characterize the business data as uncertain data is used as the numerator, the total number of second detection results is used as the denominator, and the calculation result is used as the second proportion.
  • Step S308 When the difference between the first ratio and the second ratio is greater than a specified difference, retrain the trained prediction model.
  • A specified difference may be preset and stored, and used as the basis for judging the difference between the first ratio and the second ratio. Therefore, in this embodiment, after the first ratio and the second ratio are obtained, the difference between them can be calculated and compared with the specified difference. When the comparison result indicates that the difference between the first ratio and the second ratio is greater than the specified difference, the uncertain data in the business data of the two adjacent time periods fluctuates greatly, and the trained prediction model needs to be retrained. When the comparison result indicates that the difference is not greater than the specified difference, the uncertain data in the business data of the two adjacent time periods has not changed or has changed only slightly, and there is no need to retrain the trained prediction model.
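  • Steps S305 to S308 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation of the application: the function names, the default `specified_difference` of 0.1, the use of the absolute difference, and the uncertain band of 0.25 to 0.75 (taken from the example threshold ranges given in this description, with the lower bound treated as inclusive) are choices made for the example.

```python
def uncertain_ratio(function_values, low=0.25, high=0.75):
    """Share of function values whose detection result is 'uncertain',
    i.e. values inside the assumed band [low, high)."""
    if not function_values:
        return 0.0
    uncertain = sum(1 for v in function_values if low <= v < high)
    return uncertain / len(function_values)


def needs_retraining(first_values, second_values, specified_difference=0.1):
    """Compare the uncertain ratios of two adjacent time periods and report
    whether they differ by more than the specified difference."""
    first_ratio = uncertain_ratio(first_values)
    second_ratio = uncertain_ratio(second_values)
    return abs(first_ratio - second_ratio) > specified_difference
```

  • For instance, if no function value is uncertain in the first time period and half of them are uncertain in the second, the ratios differ by 0.5 and the sketch reports that retraining is needed.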
  • Compared with the business risk control processing method shown in FIG. 1, the business risk control processing method provided in another embodiment of the present application also monitors the uncertain data in the detection results determined from the function values output by the trained prediction model, and retrains the prediction model when the detection results are abnormal, so as to optimize the prediction model and improve the accuracy of the detection results.
  • FIG. 7 shows a schematic flowchart of a business risk control processing method provided by another embodiment of the present application. The following will elaborate on the process shown in Figure 7.
  • the business risk control processing method may specifically include the following steps:
  • Step S401 Obtain a first training data set, where the first training data set includes first service data of multiple devices and function values corresponding to the first service data of the multiple devices.
  • the embodiment of this application also includes a training method for the prediction model. The prediction model may be trained in advance on the acquired training data set, and each subsequent function value can then be predicted with the trained model, without retraining the prediction model every time a prediction is made.
  • a first training data set may be obtained, and the first training data set includes first service data of multiple devices and function values corresponding to the first service data of multiple devices.
  • the first training data set may be collected in a historical time period.
  • Step S402 Based on the first training data set, use the first service data of the multiple devices as input data and the function values corresponding to the first service data of the multiple devices as output data, and perform training with a machine learning algorithm to obtain the first prediction model as the trained prediction model.
  • a machine learning algorithm may be used for training, so as to obtain the first prediction model as the trained prediction model.
  • the machine learning algorithms used can include: neural networks, Long Short-Term Memory (LSTM) networks, gated recurrent units (GRU), simple recurrent units (SRU), autoencoders, decision trees, random forests, feature mean classification, classification and regression trees (CART), hidden Markov models, the K-Nearest Neighbor (KNN) algorithm, logistic regression models, Bayesian models, Gaussian models, KL divergence (Kullback-Leibler divergence), etc.
  • the following takes a neural network as an example to illustrate the training of the initial model based on the training data set.
  • the first business data of multiple devices in a set of data in the training data set are used as the input samples (input data) of the neural network, and the function values corresponding to the first business data of multiple devices in that set of data are used as the output samples (output data) of the neural network.
  • the neurons in the input layer are fully connected with the neurons in the hidden layer, and the neurons in the hidden layer are fully connected with the neurons in the output layer, which can effectively extract potential features of different granularities.
  • the number of hidden layers can be multiple, so as to better fit the non-linear relationship and make the prediction model obtained by training more accurate.
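  • The fully connected structure described above can be sketched as a forward pass. This is a hedged illustration only: the layer sizes, the random initialization, the ReLU hidden activation, and the names `build_mlp` and `predict` are assumptions for the example, and the training step (for instance gradient descent on the labelled function values) is omitted.

```python
import math
import random


def dense(inputs, weights, biases):
    """Fully connected layer: every input neuron feeds every output neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    return [max(0.0, v) for v in values]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(n_in, n_out, rng):
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_mlp(n_features, hidden_sizes, seed=0):
    """Input layer -> one or more hidden layers -> single output neuron."""
    rng = random.Random(seed)
    sizes = [n_features] + list(hidden_sizes) + [1]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]


def predict(mlp, features):
    """Return a function value in (0, 1) for one encoded business data record."""
    activations = features
    for weights, biases in mlp[:-1]:
        activations = relu(dense(activations, weights, biases))
    weights, biases = mlp[-1]
    return sigmoid(dense(activations, weights, biases)[0])
```

  • Calling `predict(build_mlp(4, [8, 8]), [1.0, 0.0, 0.5, 0.2])` returns an untrained score strictly between 0 and 1; stacking more entries in `hidden_sizes` corresponds to using multiple hidden layers.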
  • the training process of the prediction model may be completed by electronic equipment, or may not be completed by electronic equipment.
  • When the training is not completed by the electronic device, the electronic device can act only as a direct user or an indirect user of the prediction model.
  • the prediction model may periodically or irregularly obtain new training data, and train and update the prediction model.
  • Step S403 When the device performs current service access, obtain the current service data of the device.
  • Step S404 Input the current service data into the trained prediction model, and obtain the current function value output by the trained prediction model.
  • Step S405 Obtain a current detection result of the current service data based on the current function value and a preset credibility threshold, where the current detection result is used to characterize whether the current service data is malicious data.
  • Step S406 Determine a processing mode for the current service access of the device based on the current detection result.
  • For a detailed description of steps S403 to S406, please refer to steps S101 to S104, which will not be repeated here.
  • In the business risk control processing method provided by this embodiment, the first prediction model is obtained as the trained prediction model through the first training data set and the machine learning algorithm, so as to improve the accuracy of the output data that the trained prediction model obtains based on the input data.
  • FIG. 8 shows a schematic flowchart of a business risk control processing method provided by yet another embodiment of the present application. The following will elaborate on the process shown in Figure 8.
  • the business risk control processing method may specifically include the following steps:
  • Step S501 Obtain a first training data set, where the first training data set includes first service data of multiple devices and function values corresponding to the first service data of the multiple devices.
  • FIG. 9 shows a schematic flowchart of step S501 of the business risk control processing method shown in FIG. 8 of the present application.
  • the process shown in FIG. 9 will be described in detail below, and the method may specifically include the following steps:
  • Step S5011 Obtain the first service data of the multiple devices.
  • Step S5012 Add labels to the first service data of the multiple devices respectively based on preset rules to obtain the first service data labels of the multiple devices.
  • this embodiment can also provide a rule system, where the rule system can generate a blacklist and a whitelist based on historical data, so as to add labels to the collected first business data and to set the credibility threshold of the labels based on rule violations in the overall historical data.
  • The rules mainly comprise two aspects:
  • a. The business history generates a blacklist. In business risk control, accurate information such as a user reporting to the official platform that an account has been stolen, the same device number appearing in two different places at the same time, or a device model that does not meet the specifications forms a business blacklist, and the credibility of the blacklist is obtained.
  • b. The business history generates a whitelist. A business whitelist is formed according to the normal active duration of the device, and the credibility of the whitelist is obtained.
  • the first service data of the multiple devices may be labeled based on a preset rule system (preset rules) to obtain the first service data labels of the multiple devices. For example, when the first service data of a device is obtained, a blacklist label or a whitelist label is added to the first service data based on the preset rules. When the preset rules determine that at least part of the information in the first service data of the device does not meet the requirements, the first service data of the device is determined to be blacklist data, and a blacklist label, for example label 1, can be added to it.
  • When the preset rules determine that all of the information in the first service data of the device meets the requirements, the first service data of the device is determined to be whitelist data, and a whitelist label, for example label 0, can be added to it.
  • FIG. 10 shows a schematic flowchart of step S5012 of the business risk control processing method shown in FIG. 9 of the present application.
  • the process shown in FIG. 10 will be described in detail below, and the method may specifically include the following steps:
  • Step S50121 respectively detect whether the first service data of the multiple devices meet the preset rule.
  • the preset rules may include the rules in the rule system that are used to determine that business data meets the blacklist. That is, after the first service data of multiple devices is acquired, it is possible to detect separately whether the first service data of each device meets the blacklist rules.
  • Step S50122 Add a first label to the first service data of devices detected to meet the preset rule, and add a second label to the first service data of devices detected not to meet the preset rule, to obtain the first service data labels of the multiple devices.
  • A detection result is obtained by separately detecting whether the first service data of each of the multiple devices meets the preset rule. According to the detection result, a first label is added to the first service data of devices detected to meet the preset rule, and a second label is added to the first service data of devices detected not to meet the preset rule, so as to obtain the first service data labels of the multiple devices.
  • When the preset rule includes the rules in the rule system used to determine that business data meets the blacklist, a first label, such as label 1, can be added to the first business data of devices detected to meet the blacklist, and a second label, such as label 0, can be added to the first service data of devices detected not to meet the blacklist. The first service data labels of the multiple devices, that is, multiple labels 0 and multiple labels 1, can then be obtained and output.
  • Step S50123 Obtain the proportion of the first business data that meets the preset rule among the first business data of the multiple devices as the first proportion.
  • the first service data of multiple devices includes the first service data of devices that meet the preset rule and the first service data of devices that do not meet the preset rule. Therefore, the number of first service data items that meet the preset rule can be used as the numerator and the total number of first service data items of the multiple devices as the denominator, and the calculation result is used as the first proportion.
  • Step S50124 Acquire the proportion of the first business data that does not meet the preset rule among the first business data of the multiple devices as a second proportion.
  • the first service data of multiple devices includes the first service data of devices that meet the preset rule and the first service data of devices that do not meet the preset rule. Therefore, the number of first service data items that do not meet the preset rule can be used as the numerator and the total number of first service data items of the multiple devices as the denominator, and the calculation result is used as the second proportion.
  • Step S50125 Obtain a credibility threshold based on the first proportion and the second proportion as a preset credibility threshold.
  • the credibility threshold may be obtained based on the first proportion and the second proportion as the preset credibility threshold.
  • the black sample credibility threshold and the white sample credibility threshold may be obtained based on the first proportion and the second proportion as the preset credibility threshold.
  • For example, the credibility thresholds may divide the function values into the ranges 0-0.25, 0.25-0.75, and 0.75-1: business data with a function value in the range 0-0.25 is non-malicious data, business data with a function value in the range 0.25-0.75 is uncertain data, and business data with a function value in the range 0.75-1 is malicious data.
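  • A minimal sketch of applying such thresholds to a function value is given below. The range boundaries 0.25 and 0.75 come from the example above; which side each boundary value falls on is not stated, so treating the lower bound of each range as inclusive is an assumption, as are the function names and the returned strings.

```python
def classify(function_value, low=0.25, high=0.75):
    """Map a function value in [0, 1] to a detection result."""
    if function_value < low:
        return "non-malicious"
    if function_value < high:
        return "uncertain"
    return "malicious"


def processing_mode(detection_result):
    """Choose how to handle the current service access of the device."""
    if detection_result == "malicious":
        return "reject"
    if detection_result == "non-malicious":
        return "execute"
    # Uncertain data: gather other business data and score again.
    return "collect_other_business_data"
```

  • For example, `processing_mode(classify(0.9))` yields `reject`, while a value of 0.5 leads to collecting other business data before a decision is made.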
  • Step S5013 Obtain a first training data set, where the training data set includes first service data labels of multiple devices and function values corresponding to the first service data labels of multiple devices.
  • Step S502 Based on the first training data set, use the first service data of the multiple devices as input data and the function values corresponding to the first service data of the multiple devices as output data, and perform training with a machine learning algorithm to obtain the first prediction model as the trained prediction model.
  • the first service data of multiple devices can be one-hot encoded and used as input data, and the function values corresponding to the first service data of multiple devices can be used as output data. The machine learning algorithm is then trained to obtain the first prediction model as the trained prediction model.
  • One-hot encoding, also known as one-bit effective encoding, mainly uses an N-bit status register to encode N states. Each state has its own independent register bit, and only one bit is valid at any time.
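  • One-hot encoding of discrete business data fields can be sketched as follows. The field names and category lists in the usage note are invented for illustration; only the encoding rule itself (exactly one position set per field) is taken from the description above.

```python
def one_hot(value, categories):
    """Encode one discrete value as a vector with a single 1."""
    vector = [0] * len(categories)
    vector[categories.index(value)] = 1
    return vector


def encode_record(record, schema):
    """Concatenate the one-hot vectors of every field, in schema order."""
    encoded = []
    for field, categories in schema:
        encoded.extend(one_hot(record[field], categories))
    return encoded
```

  • With a hypothetical schema `[("os", ["android", "ios"]), ("network", ["wifi", "4g", "5g"])]`, the record `{"os": "ios", "network": "5g"}` is encoded as `[0, 1, 0, 0, 1]`: one active position per field and zeros elsewhere, which is why the encoded input is sparse.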
  • In machine learning, one-hot encoding is often used to process the discrete features of data, and it turns them into sparse features. Based on these two characteristics, the DeepFM algorithm is used as the base algorithm of the prediction model. DeepFM is a typical Wide&Deep algorithm.
  • the Wide side uses the FM (factorization machine) algorithm, which has the property of Memorization and can memorize the original characteristics of black production (illicit, black-market activity). The Deep side is a deep neural network model; the number of layers and the number of nodes in each layer on the Deep side can be selected according to the feature dimension and the volume of the data (a 3-layer neural network is generally selected, and the number of nodes in each layer is generally the same).
  • The Deep side has the property of Generalization, which can generate new features for predicting black and gray users.
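  • The Wide (FM) and Deep sides described above can be sketched as a single forward pass over the active one-hot features of a sample. This is an untrained, simplified illustration: the embedding size, the three hidden layers of eight nodes, the random initialization, the sigmoid output, and all function and parameter names are assumptions for the example, and training is omitted.

```python
import math
import random


def fm_second_order(embeddings):
    """Pairwise interaction term of a factorization machine.

    Uses sum_{i<j} <v_i, v_j> = 0.5 * sum_k ((sum_i v_ik)^2 - sum_i v_ik^2),
    which applies because every active one-hot feature has value 1.
    """
    total = 0.0
    for k in range(len(embeddings[0])):
        column = [e[k] for e in embeddings]
        total += sum(column) ** 2 - sum(c * c for c in column)
    return 0.5 * total


def deep_forward(inputs, layers):
    """Feed-forward network: ReLU hidden layers, linear scalar output."""
    activations = inputs
    for idx, (weights, biases) in enumerate(layers):
        activations = [sum(w * a for w, a in zip(row, activations)) + b
                       for row, b in zip(weights, biases)]
        if idx < len(layers) - 1:
            activations = [max(0.0, a) for a in activations]
    return activations[0]


def deepfm_predict(active_indices, params):
    """Score one sample given the index of its active feature in each field."""
    embeddings = [params["embedding"][i] for i in active_indices]
    first_order = params["bias"] + sum(params["linear"][i] for i in active_indices)
    wide = first_order + fm_second_order(embeddings)
    deep_input = [x for e in embeddings for x in e]  # concatenated embeddings
    deep = deep_forward(deep_input, params["deep"])
    return 1.0 / (1.0 + math.exp(-(wide + deep)))


def init_params(n_features, n_fields, dim=4, hidden=(8, 8, 8), seed=0):
    """Randomly initialized parameters shared by the Wide and Deep sides."""
    rng = random.Random(seed)

    def rand():
        return rng.uniform(-0.1, 0.1)

    sizes = [n_fields * dim] + list(hidden) + [1]
    return {
        "bias": 0.0,
        "linear": [rand() for _ in range(n_features)],
        "embedding": [[rand() for _ in range(dim)] for _ in range(n_features)],
        "deep": [([[rand() for _ in range(a)] for _ in range(b)], [0.0] * b)
                 for a, b in zip(sizes, sizes[1:])],
    }
```

  • Both sides read the same embedding table, so no separate feature engineering is needed for the Wide side; `deepfm_predict([1, 4], init_params(5, 2))` scores a sample whose two fields activate features 1 and 4.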
  • Step S503 Obtain the detection result of the first service data of the multiple devices based on the function values corresponding to the first service data of the multiple devices and the preset credibility threshold.
  • the function values corresponding to the first service data of the multiple devices can be collected and compared with a preset credibility threshold to obtain a comparison result, and the detection results of the first service data of the multiple devices can be obtained based on the comparison result.
  • Step S504 When the detection result of the first service data of the multiple devices characterizes that the first service data of a target device among the first service data of the multiple devices is uncertain data, acquire the second service data of the target device.
  • So that the trained prediction model is more accurate, the second business data of the target device can be acquired. The acquired second business data of the target device may include business data of the target device in other aspects, rule violations of the device in other areas, and so on, which is not limited here.
  • Step S505 Obtain a second training data set, where the second training data set includes the first service data of the multiple devices, the function values corresponding to the first service data of the multiple devices, the second service data of the target device, and the function value corresponding to the second service data of the target device.
  • a second training data set can be obtained, which includes the first service data of multiple devices, the function values corresponding to the first service data of multiple devices, the second service data of the target device, and the function value corresponding to the second service data of the target device.
  • the second training data set may be collected in a historical time period.
  • Step S506 Based on the second training data set, use the first service data of the multiple devices and the second service data of the target device as input data, use the function values corresponding to the first service data of the multiple devices and the function value corresponding to the second service data of the target device as output data, and perform training with a machine learning algorithm to obtain the second prediction model as the trained prediction model.
  • the second service data of the target device can be obtained repeatedly, so as to continuously reduce the number of detection results that characterize the first service data of a target device as uncertain data. When that number is lower than a specified threshold, or the number of repetitions reaches a specified number of times, training can be stopped and the second prediction model can be used as the trained prediction model for online prediction.
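  • The repeated refinement of steps S503 to S506 can be sketched as a loop. This is an illustrative sketch only: `train`, `predict`, and `fetch_second_data` are hypothetical callables supplied by the caller, and the default stopping values and the uncertain band of 0.25 to 0.75 are assumptions for the example.

```python
def refine_until_confident(train, predict, first_data, fetch_second_data,
                           uncertain_threshold=1, max_rounds=3,
                           low=0.25, high=0.75):
    """Retrain with extra data for uncertain devices until a stop condition holds.

    train(dataset) -> model
    predict(model, record) -> function value in [0, 1]
    fetch_second_data(record) -> list of extra records for that device
    """
    dataset = list(first_data)
    model = train(dataset)  # first prediction model
    for _ in range(max_rounds):
        uncertain = [r for r in first_data if low <= predict(model, r) < high]
        if len(uncertain) < uncertain_threshold:
            break  # few enough uncertain results: stop training
        for record in uncertain:
            dataset.extend(fetch_second_data(record))
        model = train(dataset)  # second prediction model
    return model
```

  • The loop ends either when the number of uncertain results drops below `uncertain_threshold` or after `max_rounds` repetitions, matching the two stop conditions described above.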
  • Step S507 Acquire current service data of the device when the device performs current service access.
  • Step S508 Input the current service data into the trained prediction model, and obtain the current function value output by the trained prediction model.
  • Step S509 Obtain a current detection result of the current service data based on the current function value and a preset credibility threshold, where the current detection result is used to characterize whether the current service data is malicious data.
  • Step S510 Determine a processing mode for the current service access of the device based on the current detection result.
  • For a detailed description of steps S507 to S510, please refer to steps S101 to S104, which will not be repeated here.
  • In the business risk control processing method provided by this embodiment, the first prediction model is obtained by training with the first training data set and the machine learning algorithm, and when the detection results based on the first prediction model include uncertain data, the second training data set is used to train and optimize the model, so as to improve the accuracy of the output data that the trained prediction model obtains based on the input data.
  • the embodiments of the present application can achieve the following effects:
  • (1) Features are crossed and extracted in high dimensions. The resulting features are not interpretable, so it is more difficult for black production users to discover the business rules behind the features and circumvent them.
  • (2) The rules of the scheme take part in setting label credibility and in data enhancement, and black production is not judged directly from the rules. The features used in the offline and online risk control processes are the same, which gives better data consistency.
  • (3) Using DeepFM as the base model gives the system both Memorization and Generalization, so that it can not only remember historical black production information but also cross features to derive the characteristics of potential black production.
  • (4) The credibility of the labels is set according to the rules and combined with the sigmoid function of the model as the output, which can prevent label noise from biasing the entire model and improves the credibility of the model.
  • (5) Using detectors to monitor changes in the state of the uncertain data in the system, and judging from them how likely it is that black production users have changed their attack methods, provides a basis for model retraining.
  • (6) Tracking the uncertain data can further correct the model and at the same time improve the interpretability of the model.
  • FIG. 11 shows a block diagram of a business risk control processing apparatus 200 provided by an embodiment of the present application.
  • the business risk control processing device 200 includes: a current business data acquisition module 210, a current function value acquisition module 220, a current detection result acquisition module 230, and a processing method determination module 240, wherein :
  • the current business data acquisition module 210 is configured to acquire current business data of the device when the device performs current business access.
  • the current function value obtaining module 220 is configured to input the current service data into the trained prediction model, and obtain the current function value output by the trained prediction model.
  • the current detection result obtaining module 230 is configured to obtain the current detection result of the current service data based on the current function value and the preset credibility threshold, wherein the current detection result is used to characterize whether the current service data is malicious data.
  • the processing mode determining module 240 is configured to determine the processing mode for the current service access of the device based on the current detection result.
  • processing method determining module 240 includes: a current service access denial submodule and a current service access execution submodule, wherein:
  • the current service access rejection submodule is configured to reject the current service access of the device when the current detection result indicates that the current service data is malicious data.
  • the current service access execution submodule is configured to execute the current service access of the device when the current detection result characterizes that the current service data is non-malicious data.
  • the current detection result is also used to characterize that the current business data is uncertain data, and the business risk control processing device 200 further includes: an other business data acquisition module, an other function value obtaining module, an other detection result obtaining module, and an other processing method determining module, wherein:
  • the other business data acquisition module is used to acquire other business data when the device is accessing other business when the current detection result characterizes that the current business data is uncertain data.
  • the other business data acquisition module includes: a business type acquisition sub-module and an other business data acquisition sub-module, wherein:
  • the service type obtaining submodule is configured to obtain the service type of the current service data when the current detection result indicates that the current service data is uncertain data.
  • the other business data acquisition submodule is used to acquire other business data when the device is accessing other business when the business type of the current business data meets the preset business type.
  • the other business data acquisition sub-module includes: an other business data acquisition unit, wherein:
  • the other service data obtaining unit is used to obtain other service data when the device is accessing other services when the service type of the current service data meets the transaction type.
  • the other function value obtaining module is configured to input the current service data and the other service data into the trained prediction model to obtain other function values output by the trained prediction model.
  • the other function value obtaining module includes: an intelligence score obtaining submodule, a multiple other business data obtaining submodule, and an other function value obtaining submodule, wherein:
  • the intelligence score obtaining sub-module is used to obtain the intelligence score corresponding to other business data when the device is accessing other services, where the intelligence score is used to characterize the probability that the other business data is not non-malicious data.
  • The multiple other business data obtaining submodule is used to perform data enhancement processing on the other business data based on the intelligence score to obtain multiple other business data.
  • the multiple other business data obtaining submodule includes: a duration acquiring unit and a multiple other business data acquiring unit, wherein:
  • the duration acquisition unit is used to acquire the duration of the intelligence score corresponding to the other business data of the device during other business visits.
  • The multiple other business data acquiring unit is configured to perform data enhancement processing on the other business data based on the intelligence score and the duration to obtain multiple other business data.
  • the other function value obtaining submodule is configured to input the current service data and the multiple other service data into the trained prediction model to obtain other function values output by the trained prediction model.
  • the other detection result obtaining module is configured to obtain other detection results of the current service data and the other service data based on the other function values and the preset credibility threshold, wherein the other detection results are used to characterize whether the current business data is malicious data.
  • the other processing method determining module is configured to determine the processing method for the current service access of the device based on the other detection results.
  • the business risk control processing device 200 further includes: a function value acquisition module, a detection result acquisition module, a ratio acquisition module, and a retraining module, wherein:
  • the function value acquisition module is used to acquire multiple first function values output by the trained prediction model in the first time period and multiple second function values output in the second time period, wherein the The first time period and the second time period are adjacent time periods.
  • the detection result obtaining module is configured to obtain a plurality of first detection results based on the plurality of first function values and a preset credibility threshold, and to obtain a plurality of second detection results based on the plurality of second function values and the preset credibility threshold.
  • The ratio acquisition module is configured to acquire the proportion of the plurality of first detection results that characterize the business data as uncertain data as the first ratio, and to acquire the proportion of the plurality of second detection results that characterize the business data as uncertain data as the second ratio.
  • the retraining module is used for retraining the trained prediction model when the difference between the first ratio and the second ratio is greater than a specified difference.
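The retraining check described by these four modules can be sketched in a few lines. This is a minimal illustration, not the applicant's implementation: the names `uncertain_ratio` and `needs_retraining` are hypothetical, the band `[low, high]` used to decide that a function value is "uncertain" is an assumption (this passage does not say how uncertain data is recognised), and the difference between the two ratios is taken as an absolute value.

```python
from typing import Sequence


def uncertain_ratio(values: Sequence[float], low: float, high: float) -> float:
    """Share of function values inside the assumed uncertain band [low, high]."""
    if not values:
        return 0.0
    uncertain = sum(1 for v in values if low <= v <= high)
    return uncertain / len(values)


def needs_retraining(
    first_values: Sequence[float],
    second_values: Sequence[float],
    specified_difference: float,
    low: float = 0.4,   # assumed lower edge of the uncertain band
    high: float = 0.6,  # assumed upper edge of the uncertain band
) -> bool:
    """Compare the uncertain ratios of two adjacent time periods."""
    first_ratio = uncertain_ratio(first_values, low, high)
    second_ratio = uncertain_ratio(second_values, low, high)
    # Assumption: "difference" is read as an absolute difference.
    return abs(first_ratio - second_ratio) > specified_difference
```

A jump in the share of uncertain outputs between adjacent periods is used here as a signal that the model no longer fits the incoming data.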
  • the business risk control processing device 200 further includes: a first training data set acquisition module and a first prediction model acquisition module, wherein:
  • the first training data set acquisition module is configured to acquire a first training data set, where the first training data set includes first service data of multiple devices and function values corresponding to the first service data of the multiple devices.
  • the first training data set acquisition module includes: a first business data acquisition sub-module, a first business data label acquisition sub-module, and a first training data set acquisition sub-module, wherein:
  • the first service data obtaining submodule is used to obtain the first service data of the multiple devices.
  • the first service data label obtaining sub-module is configured to respectively add labels to the first service data of the multiple devices based on preset rules to obtain the first service data labels of the multiple devices.
  • the first service data label obtaining submodule includes: a prediction rule detection unit and a first service data label obtaining unit, wherein:
  • the prediction rule detection unit is configured to respectively detect whether the first service data of the multiple devices meet the preset rule.
  • the first service data label obtaining unit is configured to add a first label to the first service data of a device that is detected to meet the preset rule, and to add a second label to the first service data of a device that does not meet the preset rule, so as to obtain the first service data labels of the multiple devices.
  • the first service data label obtaining submodule includes: a first proportion obtaining unit, a second proportion obtaining unit, and a credibility threshold obtaining unit, wherein:
  • the first proportion obtaining unit is configured to obtain the proportion of the first business data satisfying the preset rule among the first business data of the multiple devices as the first proportion.
  • the second proportion acquiring unit is configured to acquire the proportion of the first service data that does not satisfy the preset rule among the first business data of the multiple devices as the second proportion.
  • the credibility threshold obtaining unit is configured to obtain a credibility threshold based on the first proportion and the second proportion as a preset credibility threshold.
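The labelling and proportion steps above can be sketched as follows. Everything named here is hypothetical: `preset_rule` is an invented example rule, the record keys are assumptions, and encoding the first label as 1 and the second label as 0 is a choice made only for illustration. The sketch stops at the two proportions, because this passage does not state how the credibility threshold is derived from them.

```python
from typing import Any, Callable, Dict, List, Tuple

Record = Dict[str, Any]


def preset_rule(record: Record) -> bool:
    """Hypothetical rule: flagged if seen in two places at once or previously in violation."""
    return bool(record["seen_in_two_places"]) or int(record["violation_count"]) > 0


def label_records(
    records: List[Record], rule: Callable[[Record], bool]
) -> Tuple[List[int], float, float]:
    """Return labels (1 = meets the rule, 0 = does not) plus both proportions."""
    labels = [1 if rule(r) else 0 for r in records]
    if not labels:
        return labels, 0.0, 0.0
    first_proportion = sum(labels) / len(labels)   # share meeting the rule
    second_proportion = 1.0 - first_proportion      # share not meeting it
    return labels, first_proportion, second_proportion
```

The two returned proportions are the inputs that the credibility threshold obtaining unit would then consume.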
  • the first training data set acquisition sub-module is configured to acquire a first training data set, the training data set including first service data labels of multiple devices, and function values corresponding to the first service data labels of multiple devices.
  • the first prediction model obtaining module is configured to, based on the first training data set, use the first service data of the multiple devices as input data and the function values corresponding to the first service data of the multiple devices as output data, and to train with a machine learning algorithm to obtain the first prediction model as the trained prediction model.
  • the first prediction model obtaining module includes: a first prediction model obtaining sub-module, wherein:
  • the first prediction model obtaining sub-module is configured to, based on the first training data set, perform one-hot processing on the first service data of the multiple devices and use the result as input data, use the function values corresponding to the first service data of the multiple devices as output data, and train with a machine learning algorithm to obtain the first prediction model as the trained prediction model.
  • the first prediction model obtaining sub-module includes: a first prediction model obtaining unit, wherein:
  • the first prediction model obtaining unit is configured to, based on the first training data set, perform one-hot processing on the first service data of the multiple devices and use the result as input data, use the function values corresponding to the first service data of the multiple devices as output data, and train with the DeepFM algorithm to obtain the first prediction model as the trained prediction model.
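The one-hot step that precedes training can be illustrated with a small pure-Python sketch. It covers only the encoding, not DeepFM itself; the helper names `fit_vocab` and `one_hot` and the example field names are hypothetical, and the vocabulary is assumed to be built from the training records.

```python
from typing import Dict, List, Sequence


def fit_vocab(
    rows: Sequence[Dict[str, str]], fields: Sequence[str]
) -> Dict[str, List[str]]:
    """Collect the sorted distinct values of each categorical field."""
    return {f: sorted({row[f] for row in rows}) for f in fields}


def one_hot(
    row: Dict[str, str], fields: Sequence[str], vocab: Dict[str, List[str]]
) -> List[int]:
    """Concatenate one indicator vector per field; unseen values encode as all zeros."""
    vector: List[int] = []
    for f in fields:
        vector.extend(1 if row[f] == value else 0 for value in vocab[f])
    return vector
```

Each categorical field becomes a block of indicator columns, which is the sparse input shape that factorization-machine style models are typically given.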
  • the business risk control processing device 200 further includes: a first detection result acquisition module, a second business data acquisition module, a second training data set acquisition module, and a second prediction model acquisition module, wherein:
  • the first detection result obtaining module is configured to obtain the detection result of the first service data of the multiple devices based on the function value corresponding to the first service data of the multiple devices and the preset credibility threshold.
  • the second business data acquisition module is configured to acquire the second service data of a target device when the detection results of the first service data of the multiple devices characterize the first service data of that target device, among the first service data of the multiple devices, as uncertain data.
  • the second training data set acquisition module is configured to acquire a second training data set, the second training data set including the first service data of the multiple devices, the function values corresponding to the first service data of the multiple devices, the second service data of the target device, and the function value corresponding to the second service data of the target device.
  • the second prediction model obtaining module is configured to, based on the second training data set, use the first service data of the multiple devices and the second service data of the target device as input data, use the function values corresponding to the first service data of the multiple devices and the function value corresponding to the second service data of the target device as output data, and train with a machine learning algorithm to obtain the second prediction model as the trained prediction model.
  • the coupling between the modules may be electrical, mechanical or other forms of coupling.
  • the functional modules in the various embodiments of the present application may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module.
  • the above-mentioned integrated modules can be implemented in the form of hardware or software functional modules.
  • FIG. 12 shows a structural block diagram of an electronic device 100 provided by an embodiment of the present application.
  • the electronic device 100 may be an electronic device capable of running application programs, such as a smart phone, a tablet computer, or an e-book.
  • the electronic device 100 in this application may include one or more of the following components: a processor 110, a memory 120, and one or more application programs, where the one or more application programs may be stored in the memory 120 and configured to be executed by the one or more processors 110, and the one or more programs are configured to execute the method described in the foregoing method embodiments.
  • the processor 110 may include one or more processing cores.
  • the processor 110 uses various interfaces and lines to connect the various parts of the electronic device 100, and performs the various functions of the electronic device 100 and processes data by running or executing the instructions, programs, code sets, or instruction sets stored in the memory 120 and by calling data stored in the memory 120.
  • the processor 110 may be implemented using at least one of Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), and Programmable Logic Array (PLA).
  • the processor 110 may be integrated with one or a combination of a central processing unit (CPU), a graphics processing unit (GPU), a modem, and the like.
  • the CPU mainly processes the operating system, user interface, and application programs; the GPU is used for rendering and drawing the content to be displayed; the modem is used for processing wireless communication. It can be understood that the above-mentioned modem may not be integrated into the processor 110, but may be implemented by a communication chip alone.
  • the memory 120 may include random access memory (RAM) or read-only memory (ROM).
  • the memory 120 may be used to store instructions, programs, codes, code sets or instruction sets.
  • the memory 120 may include a program storage area and a data storage area, where the program storage area may store instructions for implementing the operating system, instructions for implementing at least one function (such as a touch function, a sound playback function, or an image playback function), instructions for implementing the foregoing method embodiments, and the like.
  • the data storage area can also store data (such as phone book, audio and video data, chat record data) created by the electronic device 100 during use.
  • FIG. 13 shows a structural block diagram of a computer-readable storage medium provided by an embodiment of the present application.
  • the computer-readable storage medium 300 stores program code, and the program code can be invoked by a processor to execute the method described in the foregoing method embodiments.
  • the computer-readable storage medium 300 may be an electronic memory such as flash memory, EEPROM (Electrically Erasable Programmable Read Only Memory), EPROM, hard disk, or ROM.
  • the computer-readable storage medium 300 includes a non-transitory computer-readable storage medium.
  • the computer-readable storage medium 300 has storage space for the program code 310 for executing any method steps in the above-mentioned methods. These program codes can be read from or written into one or more computer program products.
  • the program code 310 may be compressed in a suitable form, for example.
  • the business risk control processing method, device, electronic device, and storage medium provided in the embodiments of this application obtain the current business data of a device when the device performs current business access; input the current business data into the trained prediction model to obtain the current function value output by the trained prediction model; obtain the current detection result of the current business data based on the current function value and the preset credibility threshold, where the current detection result is used to characterize whether the current business data is malicious data; and determine the processing method for the device's current business access based on the current detection result. Whether business data is malicious is thus determined from the function value output by the trained prediction model and the preset credibility threshold, which improves the credibility of the malicious-data judgment.

Abstract

The present application relates to the technical field of electronic devices, and discloses a processing method and apparatus for service risk management, an electronic device, and a storage medium. The method comprises: when a device is accessing a current service, acquiring current service data of the device; inputting the current service data into a trained prediction model, so as to obtain a current function value outputted by the trained prediction model; acquiring a current detection result of the current service data on the basis of the current function value and a preset credibility threshold, wherein the current detection result is used to indicate whether or not the current service data is malicious data; and determining a processing procedure for the device accessing the current service on the basis of the current detection result. In an embodiment of the present application, a function value outputted by a trained prediction model and a preset credibility threshold are used to determine whether service data is malicious data, thereby improving the credibility of malicious data determination.

Description

Business risk control processing method, apparatus, electronic device, and storage medium
Technical field
This application relates to the technical field of electronic devices, and more specifically to a business risk control processing method, apparatus, electronic device, and storage medium.
Background
Current business security risk control systems mainly start from the characteristics of the business: they segment users by business and formulate rules to identify black- and gray-market users. Daily business data is first analyzed statistically with big data tools such as Hadoop and Spark, and static or dynamic features related to the business are extracted. A rule base is designed around the business characteristics, users are rated using the rule base and the extracted user features, a risk level is set for each user from the rating result, and permissions are granted according to the risk level, so that certain user behaviors are rejected. The black-market users found in this process are usually identified with high confidence, but many potential black-market users remain among the users classified as normal, and these are difficult to identify with the rule system and user segmentation alone.
Summary of the invention
In view of the above problems, this application proposes a business risk control processing method, apparatus, electronic device, and storage medium to solve them.
In a first aspect, an embodiment of this application provides a business risk control processing method, the method including:
when a device performs current business access, obtaining the current business data of the device; inputting the current business data into a trained prediction model and obtaining the current function value output by the trained prediction model; obtaining a current detection result of the current business data based on the current function value and a preset credibility threshold, where the current detection result is used to characterize whether the current business data is malicious data; and determining, based on the current detection result, the processing method for the device's current business access.
In a second aspect, an embodiment of this application provides a business risk control processing apparatus, the apparatus including: a current business data acquisition module, used to obtain the current business data of a device when the device performs current business access; a current function value acquisition module, used to input the current business data into a trained prediction model and obtain the current function value output by the trained prediction model; a current detection result obtaining module, used to obtain a current detection result of the current business data based on the current function value and a preset credibility threshold, where the current detection result is used to characterize whether the current business data is malicious data; and a processing method determining module, used to determine, based on the current detection result, the processing method for the device's current business access.
In a third aspect, an embodiment of this application provides an electronic device including a memory and a processor, where the memory is coupled to the processor and stores instructions, and the processor executes the above method when the instructions are executed by the processor.
In a fourth aspect, an embodiment of this application provides a computer-readable storage medium in which program code is stored, where the program code can be invoked by a processor to execute the above method.
The business risk control processing method, apparatus, electronic device, and storage medium provided in the embodiments of this application obtain the current business data of a device when the device performs current business access; input the current business data into the trained prediction model to obtain the current function value output by the trained prediction model; obtain the current detection result of the current business data based on the current function value and the preset credibility threshold, where the current detection result is used to characterize whether the current business data is malicious data; and determine the processing method for the device's current business access based on the current detection result. Whether business data is malicious is thus determined from the function value output by the trained prediction model and the preset credibility threshold, which improves the credibility of the malicious-data judgment.
Description of the drawings
To describe the technical solutions in the embodiments of this application more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of this application; those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 shows a schematic flowchart of a business risk control processing method provided by an embodiment of this application;
FIG. 2 shows a schematic flowchart of a business risk control processing method provided by another embodiment of this application;
FIG. 3 shows a schematic flowchart of step S206 of the business risk control processing method shown in FIG. 2;
FIG. 4 shows a schematic flowchart of step S207 of the business risk control processing method shown in FIG. 2;
FIG. 5 shows a schematic flowchart of step S2072 of the business risk control processing method shown in FIG. 4;
FIG. 6 shows a schematic flowchart of a business risk control processing method provided by still another embodiment of this application;
FIG. 7 shows a schematic flowchart of a business risk control processing method provided by a further embodiment of this application;
FIG. 8 shows a schematic flowchart of a business risk control processing method provided by yet another embodiment of this application;
FIG. 9 shows a schematic flowchart of step S501 of the business risk control processing method shown in FIG. 8;
FIG. 10 shows a schematic flowchart of step S5012 of the business risk control processing method shown in FIG. 9;
FIG. 11 shows a module block diagram of a business risk control processing apparatus provided by an embodiment of this application;
FIG. 12 shows a block diagram of an electronic device used to execute the business risk control processing method according to an embodiment of this application;
FIG. 13 shows a storage unit used to store or carry program code implementing the business risk control processing method according to an embodiment of this application.
Detailed description
To enable those skilled in the art to better understand the solutions of this application, the technical solutions in the embodiments of this application are described clearly and completely below with reference to the accompanying drawings.
Regarding the current approach of identifying black-market users through a rule system and user segmentation, the inventor found through research that such a rule system generally comprises a data storage system composed of non-relational and relational databases, and a computing cluster system composed of a real-time computing cluster and an offline computing cluster. The real-time computing cluster is used for online computation, and the offline computing cluster executes tasks periodically. The rule engine generates rules from the rule base, which optimizes rule matching and improves the efficiency of the real-time risk control system. The rule system also incorporates indicator-based and model-based mechanisms for evaluating risk control rules, to ensure that the rules remain effective. However, the rule system has the following problems:
1. Different rules must be formulated for different business needs, so the rules are mutually independent, coordinate poorly, and are complex.
2. Most rules formulated for a business rely on interpretable, business-related cross features, so black-market users can get past them fairly easily when fabricating data.
3. The business risk control system is inconsistent between real-time protection and offline protection. Offline data has more comprehensive business features and is covered by more rules, whereas real-time data, because its sources arrive at inconsistent times, is often covered by only a limited set of rules.
In response to the above problems, the inventor, after long-term research, proposes the business risk control processing method, apparatus, electronic device, and storage medium provided by the embodiments of this application, which determine whether business data is malicious data from the function value output by a trained prediction model and a preset credibility threshold, improving the credibility of the malicious-data judgment. The specific business risk control processing method is described in detail in the following embodiments.
Referring to FIG. 1, FIG. 1 shows a schematic flowchart of a business risk control processing method provided by an embodiment of this application. The method determines whether business data is malicious data from the function value output by a trained prediction model and a preset credibility threshold, improving the credibility of the malicious-data judgment. In a specific embodiment, the method is applied to the business risk control processing apparatus 200 shown in FIG. 11 and to the electronic device 100 (FIG. 12) configured with the business risk control processing apparatus 200. The specific flow of this embodiment is described below taking an electronic device as an example; the electronic device may be a smart phone, a tablet computer, a wearable electronic device, or the like, which is not limited here. The flow shown in FIG. 1 is described in detail below. The business risk control processing method may specifically include the following steps:
Step S101: When a device performs current business access, obtain the current business data of the device.
In some embodiments, business access may include, for example, browsing, downloading, or installing applications in a software store; browsing products, placing orders, or adding products to favorites on a shopping platform; or browsing or downloading games in a game store, which is not limited here. In some embodiments, the device may include, but is not limited to, a mobile terminal, a tablet computer, a desktop computer, or a wearable electronic device, which is not limited here.
In this embodiment, when the device performs current business access, the current business data of the device can be obtained. For example, the current business data can be obtained when the device accesses a software store, or when the device accesses a shopping platform. In some embodiments, the current business data of the device may include: whether the number corresponding to the device appears in two different places during the same time period, whether the device model fails to conform to the specification, the normal active duration of the device, whether a violation has been recorded for the number corresponding to the device, the number of violations recorded for that number, the credit score of that number, and so on, which is not limited here.
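The fields listed for the current business data map naturally onto a small record type. The sketch below is illustrative only: the class name `CurrentServiceData`, the field names, and the choice of hours as the unit of active duration are assumptions, not details from the application.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CurrentServiceData:
    """One device's current business data, using the fields listed in the text."""

    seen_in_two_places: bool      # same device number in two places in one period
    model_nonconforming: bool     # device model does not conform to the specification
    active_duration_hours: float  # normal active duration (unit is an assumption)
    has_violation_record: bool    # a violation was recorded for the device number
    violation_count: int          # number of recorded violations
    credit_score: float           # credit score of the device number

    def to_features(self) -> List[float]:
        """Flatten the record into a numeric vector a model could consume."""
        return [
            float(self.seen_in_two_places),
            float(self.model_nonconforming),
            self.active_duration_hours,
            float(self.has_violation_record),
            float(self.violation_count),
            self.credit_score,
        ]
```

Keeping the record immutable makes it safe to pass the same observation to both a rule check and a model input builder.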
Step S102: Input the current business data into the trained prediction model, and obtain the current function value output by the trained prediction model.
In this embodiment, after the current business data of the device is obtained, it can be input into the trained prediction model. The trained prediction model is obtained through machine learning. Specifically, a training data set is first collected, in which the attributes or features of one class of data differ from those of another class; a neural network is then trained on the collected training data set according to a preset algorithm, so that regularities are learned from the training data set and the trained prediction model is obtained. In this embodiment, the training data set may include, for example, the business data of multiple devices and the function values corresponding to that business data. The trained prediction model can be used to output a current function value from the current business data of a device. In this embodiment, the current business data of the device can be input into the trained prediction model to obtain the current function value that the model outputs.
In some embodiments, the trained prediction model may be stored locally on the electronic device after training is completed. In that case, after obtaining the current business data of the device, the electronic device can call the trained prediction model locally. For example, it can send an instruction to the trained prediction model to instruct it to read the current business data of the device from a target storage area, or it can input the current business data directly into the locally stored model. This avoids network factors slowing down the input of the current business data into the trained prediction model, so the model obtains the data faster and the user experience improves.
In some embodiments, the trained prediction model may instead be stored, after training is completed, on a server that is communicatively connected to the electronic device. In that case, after obtaining the current business data of the device, the electronic device can send an instruction over the network to the trained prediction model stored on the server, instructing it to read the current business data over the network, or it can send the current business data over the network to the model on the server. Storing the trained prediction model on the server reduces the storage space occupied on the electronic device and reduces the impact on its normal operation.
The current function value output by the trained prediction model may be a sigmoid value. The sigmoid function is an S-shaped function common in biology, also called the S-shaped growth curve. In information science, because it is monotonically increasing and its inverse is also monotonically increasing, the sigmoid function is often used as the activation function of a neural network, mapping a variable to the range 0 to 1. In some embodiments, the current function value that the trained prediction model outputs from the current business data of the device lies between 0 and 1. A sigmoid result greater than 0.5 is generally assigned label 1, meaning that the current business data is treated as blacklist data (malicious data); a sigmoid result not greater than 0.5 is assigned label 0, meaning that the current business data is treated as whitelist data (non-malicious data).
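For reference, the sigmoid function is sigmoid(x) = 1 / (1 + e^(-x)). The sketch below shows that mapping together with the labelling convention just described (label 1 above 0.5, label 0 otherwise). The function names are hypothetical, and the two-branch form is only a common way of avoiding overflow for large negative inputs.

```python
import math


def sigmoid(x: float) -> float:
    """Map any real x into the range 0 to 1, branching to avoid overflow in exp."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def label_from_sigmoid(value: float) -> int:
    """Label 1 (blacklist) if the sigmoid result exceeds 0.5, else label 0 (whitelist)."""
    return 1 if value > 0.5 else 0
```

Because the function is monotonically increasing, a larger raw model output always corresponds to a larger function value and therefore a stronger indication of malicious data.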
Step S103: Obtain a current detection result for the current service data based on the current function value and a preset credibility threshold, where the current detection result indicates whether the current service data is malicious data.
In some embodiments, a credibility threshold may be set and stored in advance as the preset credibility threshold, which serves as the criterion for judging the current function value output by the trained prediction model. Accordingly, in this embodiment, after the current function value output by the trained prediction model is obtained, it may be compared with the preset credibility threshold to obtain a comparison result, and the current detection result for the current service data is obtained from that comparison result. The preset credibility threshold may be obtained by having a rule system perform calculations on the service data of multiple devices over a historical time period.
In some embodiments, the preset credibility threshold may be 0.5. In that case, after the current function value output by the trained prediction model is obtained, it may be compared with 0.5 to obtain a comparison result, and the current detection result for the current service data is obtained from that comparison result. When the current function value is less than 0.5, the comparison result indicates that the current function value is below the preset credibility threshold, and the detection result indicates that the current service data is non-malicious data; for example, the device corresponding to the current service data has accumulated a sufficient duration of normal activity. When the current function value is not less than 0.5, the comparison result indicates that the current function value is not below the preset credibility threshold, and the detection result indicates that the current service data is malicious data; for example, the same device number appears in two different places during the same period, or the device model does not conform to the specification.
Step S104: Determine how to handle the device's current service access based on the current detection result.
In some embodiments, after the current detection result for the current service data is obtained, the handling of the device's current service access may be determined from it. For example, when the current detection result indicates that the device's current service data is malicious data, the current service access may involve inflating download counts in a software store or placing fake orders on a shopping platform, so the device's current service access may be rejected. When the current detection result indicates that the device's current service data is non-malicious data, the current service access involves no such download inflation or fake orders and is a normal service access, so the device's current service access may be executed.
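Steps S103 and S104 can be sketched as follows, assuming a single preset credibility threshold of 0.5 and the comparison rule given for step S103 (below the threshold is non-malicious, otherwise malicious). The names `Detection`, `detect`, and `handle_access`, and the string return values, are hypothetical illustrations rather than the application's own interface.

```python
from enum import Enum


class Detection(Enum):
    MALICIOUS = "malicious"
    NON_MALICIOUS = "non-malicious"


def detect(current_value: float, threshold: float = 0.5) -> Detection:
    """Step S103: compare the current function value with the preset threshold."""
    # Below the threshold is non-malicious; not below it is malicious.
    if current_value < threshold:
        return Detection.NON_MALICIOUS
    return Detection.MALICIOUS


def handle_access(result: Detection) -> str:
    """Step S104: reject malicious accesses, execute the rest."""
    return "reject" if result is Detection.MALICIOUS else "execute"
```

With these definitions, a current function value of 0.8 leads to `"reject"` and a value of 0.2 leads to `"execute"`.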
In the business risk control processing method provided by one embodiment of this application, when a device performs a current service access, the device's current service data is obtained and input into a trained prediction model; the current function value output by the trained prediction model is obtained; a current detection result for the current service data is obtained based on the current function value and a preset credibility threshold, where the current detection result indicates whether the current service data is malicious data; and the handling of the device's current service access is determined based on the current detection result. Whether service data is malicious is thus determined from the function value output by the trained prediction model together with the preset credibility threshold, which improves the credibility of the malicious-data judgment.
Please refer to FIG. 2, which shows a schematic flowchart of a business risk control processing method provided by another embodiment of this application. In this embodiment, the current detection result may also indicate that the current service data is uncertain data. The flow shown in FIG. 2 is described in detail below. The business risk control processing method may specifically include the following steps:
Step S201: When a device performs a current service access, obtain the device's current service data.
Step S202: Input the current service data into a trained prediction model, and obtain the current function value output by the trained prediction model.
Step S203: Obtain a current detection result for the current service data based on the current function value and a preset credibility threshold, where the current detection result indicates whether the current service data is malicious data.
For details of steps S201 to S203, refer to steps S101 to S103; they are not repeated here.
Step S204: When the current detection result indicates that the current service data is malicious data, reject the device's current service access.
In some embodiments, when the current detection result is determined to indicate that the current service data is malicious data (for example, the current function value output by the trained prediction model is 0.8 and the preset credibility threshold is 0.5), the device's current service access may be rejected. This preserves the authenticity of application rankings in the software store, product rankings on the shopping platform, and the like, and reduces the influence of malicious data on users' normal choices.
Step S205: When the current detection result indicates that the current service data is non-malicious data, execute the device's current service access.
In some embodiments, when the current detection result is determined to indicate that the current service data is non-malicious data (for example, the current function value output by the trained prediction model is 0.2 and the preset credibility threshold is 0.5), the device's current service access may be executed. This supplies genuine data for application rankings in the software store, product rankings on the shopping platform, and the like, and gives users a reliable reference for their choices.
Step S206: When the current detection result indicates that the current service data is uncertain data, obtain other service data generated when the device performs other service accesses.
In practice, the rule-label model has a certain probability of misjudgment, specifically for results whose sigmoid value is close to 0.5. For example, when the preset credibility threshold is 0.5 and the current function value is 0.6, the judgment rule of the rule-label model yields a current detection result indicating malicious data, because the current function value is greater than the preset credibility threshold. In practice, however, current service data with a current function value of 0.6 is not necessarily malicious, so using 0.5 as the sole criterion may lead to misjudgment.
Therefore, to reduce the probability of misjudgment by the rule-label model, in this embodiment the preset credibility threshold may be calculated from the service data of multiple devices over a historical time period, and the threshold so set refines the 0-1 range from the original two intervals (0-0.5 and 0.5-1) into three intervals. That is, in this embodiment the detection result obtained from the current function value and the preset credibility threshold may indicate that the current service data is malicious data, uncertain data, or non-malicious data. For example, the preset credibility threshold may include 0.3 and 0.7, so that the 0-1 range is divided into the intervals 0-0.3, 0.3-0.7, and 0.7-1. When the current function value falls in the 0-0.3 interval, the corresponding current service data may be determined to be non-malicious data; when it falls in the 0.3-0.7 interval, the corresponding current service data may be determined to be uncertain data; and when it falls in the 0.7-1 interval, the corresponding current service data may be determined to be malicious data.
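The three-interval judgment can be sketched as below, using the example thresholds 0.3 and 0.7. The function name `classify` is hypothetical, and the treatment of values exactly equal to 0.3 or 0.7 is an assumption made here (both are treated as uncertain), since the text does not specify which interval the boundaries belong to.

```python
def classify(value: float, low: float = 0.3, high: float = 0.7) -> str:
    """Map a function value in [0, 1] to one of three detection results."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("function value must lie between 0 and 1")
    if value < low:
        return "non-malicious"
    if value > high:
        return "malicious"
    # Assumption: the boundary values themselves count as uncertain.
    return "uncertain"
```

Under this sketch, the value 0.6 from the misjudgment example is no longer labeled malicious; it is reported as uncertain and can be examined further.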
In some embodiments, when the current detection result indicates that the current service data is uncertain data, other service data generated when the device performs other service accesses may be obtained in order to judge more accurately whether the device's current service data is malicious. The other service data may include the device's service data in other respects, the device's rule violations in other domains, and so on, which is not limited here. For example, when the device's current service access is an application download in a software store and the current detection result indicates that the current service data is uncertain data, other service data generated when the device downloads games in a game store or communicates about products on a shopping platform may be obtained as a reference for the current detection result.
Please refer to FIG. 3, which shows a schematic flowchart of step S206 of the business risk control processing method shown in FIG. 2 of this application. The flow shown in FIG. 3 is described in detail below. The method may specifically include the following steps:
Step S2061: When the current detection result indicates that the current service data is uncertain data, obtain the service type of the current service data.
In some embodiments, when the current detection result indicates that the current service data is uncertain data, the service type of the current service data may be obtained. The service type of the current service data may be determined from the current service access. For example, if the current service access is browsing applications in a software store, the service type of the current service data may be determined to be a first type; if the current service access is downloading or installing an application from a software store, the service type may be determined to be a second type; if the current service access is browsing products on a shopping platform, the service type may be determined to be the first type; and if the current service access is placing an order for a product on a shopping platform, the service type may be determined to be the second type.
Step S2062: When the service type of the current service data satisfies a preset service type, obtain other service data generated when the device performs other service accesses.
In some embodiments, a preset service type may be set and stored in advance to serve as the criterion for judging the service type of the current service data. Accordingly, in this embodiment, after the service type of the current service data is obtained, it may be compared with the preset service type to judge whether it satisfies the preset service type. If it does, other service data generated when the device performs other service accesses is obtained; if it does not, the current detection result already obtained prevails, and the handling of the device's current service access is determined from the current detection result. For example, in this embodiment the preset service type may be the second type. In that case, when the current service access is an application download in a software store, an application installation from a software store, or a product order on a shopping platform, the service type of the current service data may be determined to satisfy the preset service type; when the current service access is browsing applications in a software store or browsing products on a shopping platform, the service type of the current service data may be determined not to satisfy the preset service type.
In some embodiments, when the service type of the current service data is a transaction type, the service type is determined to satisfy the preset service type, and other service data generated when the device performs other service accesses may be obtained. A service type that is a transaction type indicates that the current service access may involve money and is therefore relatively important. To reduce the possibility of misjudgment, other service data from the device's other service accesses may therefore also be obtained to improve the accuracy of the current detection result.
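The gating in steps S2061 and S2062 can be sketched as a lookup followed by a comparison. The access identifiers such as `"app_store_download"`, the table `SERVICE_TYPE_BY_ACCESS`, and the function `needs_other_service_data` are hypothetical names introduced only for illustration; the first/second type assignment follows the examples in the text, with the second type assumed to be the preset service type.

```python
FIRST_TYPE = "first"    # browsing-style accesses
SECOND_TYPE = "second"  # download, install, or order accesses

# Hypothetical mapping from a service access to its service type.
SERVICE_TYPE_BY_ACCESS = {
    "app_store_browse": FIRST_TYPE,
    "app_store_download": SECOND_TYPE,
    "app_store_install": SECOND_TYPE,
    "shop_browse": FIRST_TYPE,
    "shop_order": SECOND_TYPE,
}


def needs_other_service_data(detection: str, access: str,
                             preset_type: str = SECOND_TYPE) -> bool:
    """Return True only for uncertain results whose service type matches the preset type."""
    if detection != "uncertain":
        return False  # a definite result is handled directly
    return SERVICE_TYPE_BY_ACCESS[access] == preset_type
```

An uncertain result on `"shop_order"` would trigger the collection of other service data, whereas an uncertain result on `"shop_browse"` would be handled from the current detection result alone.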
Step S207: Input the current service data and the other service data into the trained prediction model, and obtain another function value output by the trained prediction model.
The trained prediction model may also be used to output another function value from the device's current service data together with its other service data. In some embodiments, after the device's other service data is obtained, the current service data and the other service data may be input into the trained prediction model to obtain the other function value output by the trained prediction model.
Please refer to FIG. 4, which shows a schematic flowchart of step S207 of the business risk control processing method shown in FIG. 2 of this application. The flow shown in FIG. 4 is described in detail below. The method may specifically include the following steps:
Step S2071: Obtain an intelligence score corresponding to the other service data generated when the device performs other service accesses, where the intelligence score indicates the probability that the other service data is not non-malicious data.
In this embodiment, after the other service data generated when the device performs other service accesses is obtained, the intelligence score corresponding to that other service data may be obtained, where the intelligence score indicates or reflects the probability that the other service data is not malicious data. In some embodiments, a media situation analysis may be performed in conjunction with business-side features, and relevant policies may be generated for devices with uncertain data, so as to obtain the device's rule violations in other domains. The intelligence score corresponding to the other service data generated when the device performs other service accesses is then determined from those rule violations.
Step S2072: Perform data augmentation on the other service data based on the intelligence score to obtain multiple pieces of other service data.
In this embodiment, after the intelligence score corresponding to the other service data is obtained, data augmentation (that is, adjusting how often the other service data is repeated) may be performed on the other service data based on the intelligence score to obtain multiple pieces of other service data. Data augmentation is an effective way to enlarge a data sample. Deep learning is a method built on large amounts of data: the larger and the higher in quality the data, the better, yet in practice the collected data rarely covers every scenario. Different data types call for different augmentation methods. For image data, augmentation mainly includes image rotation, image segmentation, RGB changes, and image scaling; for text, it includes synonym replacement, document cropping, word-vector preprocessing, and the use of dictionaries. For the numerical data encountered in business risk control, the main approaches are feature crossing and repetition of sample data, and data augmentation of this kind yields the multiple pieces of other service data.
Please refer to FIG. 5, which shows a schematic flowchart of step S2072 of the business risk control processing method shown in FIG. 4 of this application. The flow shown in FIG. 5 is described in detail below. The method may specifically include the following steps:
Step S20721: Obtain the duration for which the other service data generated when the device performs other service accesses corresponds to the intelligence score.
In some embodiments, after the other service data generated when the device performs other service accesses is obtained, both the intelligence score corresponding to that other service data and the duration for which the other service data corresponds to that intelligence score may be obtained. In some embodiments, a media situation analysis may be performed in conjunction with business-side features, relevant policies may be generated for devices with uncertain data (online), and the situation of those devices over the following days may be observed (offline), so as to obtain the device's rule violations in other domains and the length of time the device maintains its state in that dimension. The intelligence score corresponding to the other service data, and the duration for which the other service data corresponds to that intelligence score, are then determined from those rule violations and that length of time.
Step S20722: Perform data augmentation on the other service data based on the intelligence score and the duration to obtain multiple pieces of other service data.
In this embodiment, after the intelligence score corresponding to the other service data and the duration for which the other service data corresponds to that intelligence score are obtained, data augmentation may be performed on the other service data based on the intelligence score and the duration to obtain multiple pieces of other service data. The intelligence score is positively correlated with the augmentation factor: the higher the intelligence score, the higher the augmentation factor, and the lower the intelligence score, the lower the augmentation factor. In some embodiments, the duration is likewise positively correlated with the augmentation factor: the longer the duration, the higher the augmentation factor, and the shorter the duration, the lower the augmentation factor.
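One way to realize augmentation by sample repetition is sketched below. The text states only that the augmentation factor is positively correlated with the intelligence score and with the duration; the linear formula, the weights `score_weight` and `duration_weight`, the cap `max_factor`, and the function names are all assumptions introduced for illustration.

```python
from typing import Dict, List


def augmentation_factor(score: float, duration_days: float,
                        score_weight: float = 4.0,
                        duration_weight: float = 0.5,
                        max_factor: int = 10) -> int:
    """Assumed linear rule: the factor grows with both score and duration."""
    raw = 1 + score_weight * score + duration_weight * duration_days
    # Clamp to [1, max_factor] so every record is kept at least once.
    return max(1, min(max_factor, int(raw)))


def augment(records: List[Dict[str, float]], score: float,
            duration_days: float) -> List[Dict[str, float]]:
    """Repeat each record 'factor' times (augmentation by sample repetition)."""
    factor = augmentation_factor(score, duration_days)
    return [dict(record) for record in records for _ in range(factor)]
```

Any monotonically increasing rule would satisfy the stated correlation; the linear form is chosen here only because it is easy to inspect.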
Step S2073: Input the current service data and the multiple pieces of other service data into the trained prediction model, and obtain the other function value output by the trained prediction model.
Step S208: Obtain another detection result for the current service data and the other service data based on the other function value and the preset credibility threshold, where the other detection result indicates whether the current service data is malicious data.
In this embodiment, after the other function value output by the trained prediction model is obtained, it may be compared with the preset credibility threshold to obtain a comparison result, and the other detection result for the current service data is obtained from that comparison result. It can be understood that, because the input to the trained prediction model changes from the current service data alone to the current service data together with the other service data, the other function value output from the combined input differs from the current function value output from the current service data alone, and the other detection result accordingly differs from the current detection result. This refines the detection result and reduces misjudgment.
Step S209: Determine how to handle the device's current service access based on the other detection result.
In some embodiments, after the other detection result for the current service data is obtained, the handling of the device's current service access may be determined from that other detection result.
In the business risk control processing method provided by this further embodiment of this application, when the current detection result indicates that the current service data is uncertain, the device's other service data from other respects is additionally obtained and input into the trained prediction model together with the current service data to obtain a detection result, which improves the accuracy of the detection result.
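The overall flow of this embodiment (steps S201 to S209) can be sketched as follows, leaving out the service-type check and the data augmentation for brevity. The model is represented as a plain callable, and `toy_model`, `process_access`, the `"risk"` field, and the `"undetermined"` outcome for a result that remains uncertain are all assumptions; the text does not say how a second uncertain result is handled.

```python
from typing import Callable, Dict, List, Sequence

Record = Dict[str, float]


def classify(value: float, low: float = 0.3, high: float = 0.7) -> str:
    if value < low:
        return "non-malicious"
    if value > high:
        return "malicious"
    return "uncertain"


def process_access(model: Callable[[Sequence[Record]], float],
                   current: Record,
                   fetch_other: Callable[[], List[Record]]) -> str:
    # S202-S203: score the current service data alone.
    result = classify(model([current]))
    if result == "uncertain":
        # S206-S208: fetch other service data and score the combined input.
        result = classify(model([current] + list(fetch_other())))
    if result == "malicious":
        return "reject"    # S204
    if result == "non-malicious":
        return "execute"   # S205
    return "undetermined"  # assumption: not specified in the text


def toy_model(records: Sequence[Record]) -> float:
    """Stand-in for the trained prediction model: mean of a 'risk' field."""
    return sum(r["risk"] for r in records) / len(records)
```

Note that `fetch_other` is called only when the first result is uncertain, so devices with a clear result incur no additional lookups.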
Please refer to FIG. 6, which shows a schematic flowchart of a business risk control processing method provided by yet another embodiment of this application. The flow shown in FIG. 6 is described in detail below. The business risk control processing method may specifically include the following steps:
Step S301: When a device performs a current service access, obtain the device's current service data.
Step S302: Input the current service data into a trained prediction model, and obtain the current function value output by the trained prediction model.
Step S303: Obtain a current detection result for the current service data based on the current function value and a preset credibility threshold, where the current detection result indicates whether the current service data is malicious data or is uncertain data.
Step S304: Determine how to handle the device's current service access based on the current detection result.
For details of steps S301 to S304, refer to steps S101 to S104; they are not repeated here.
Step S305: Obtain multiple first function values output by the trained prediction model in a first time period and multiple second function values output in a second time period, where the first time period and the second time period are adjacent time periods.
Business security involves attack and defense. The ways in which black-market users attack a service are not fixed: they change their attack methods over time and forge new data to deceive the trained prediction model. In this embodiment, therefore, changes in the detection results that indicate uncertain data may be monitored to determine whether the trained prediction model needs to be retrained. In some embodiments, multiple first function values that the trained prediction model outputs from service data in a first time period, and multiple second function values that it outputs from service data in a second time period, may be obtained, where the first time period and the second time period are adjacent. It should be noted that the lengths of the first and second time periods are not limited here, and the first time period may either precede or follow the second time period.
Step S306: Obtain multiple first detection results based on the multiple first function values and the preset credibility threshold, and obtain multiple second detection results based on the multiple second function values and the preset credibility threshold.
In this embodiment, after the multiple first function values output by the trained prediction model in the first time period and the multiple second function values output in the second time period are obtained, multiple first detection results may be obtained from the first function values and the preset credibility threshold, and multiple second detection results may be obtained from the second function values and the preset credibility threshold.
Step S307: Obtain, as a first proportion, the proportion of the multiple first detection results that indicate uncertain data, and obtain, as a second proportion, the proportion of the multiple second detection results that indicate uncertain data.
其中,可以理解的是,多个第一检测结果中包括:表征业务数据为恶意数据的第一检测结果、表征业务数据为非恶意数据的第一检测结果以及表征业务数据为不确定数据的第一检测结果。因此,可以获取多个第一检测结果中表征业务数据为不确定数据的比例作为第一比例,具体地,可以通过将表征业务数据为不确定数据的第一检测结果的数量作为分子,将多个第一检测结果作为分母进行计算获得的计算结果作为第一比例。It is understandable that the multiple first detection results include: the first detection result that characterizes the business data as malicious data, the first detection result that characterizes the business data as non-malicious data, and the first detection result that characterizes the business data as uncertain data. One test result. Therefore, the proportion of the first test results that characterize the business data as uncertain data can be obtained as the first proportion. Specifically, the number of the first test results that characterize the business data as uncertain data can be used as the numerator to increase the number of The first detection result is used as the denominator to calculate the calculation result obtained as the first ratio.
其中,可以理解的是,多个第二检测结果中包括:表征业务数据为恶意数据的第二检测结果、表征业务数据为非恶意数据的第二检测结果以及表征业务数据为不确定数据的第二检测结果。因此,可以获取多个第二检测结果中表征业务数据为不确定数据的比例作为第二比例,具体地,可以通过将表征业务数据为不确定数据的第二检测结果的数量作为分子,将多个第二检测结果作为分母进行计算获得的计算结果作为第二比例。It is understandable that the multiple second detection results include: the second detection result that characterizes the business data as malicious data, the second detection result that characterizes the business data as non-malicious data, and the first detection result that characterizes the business data as uncertain data. 2. Test results. Therefore, the proportion of the multiple second test results that characterize the business data as uncertain data can be obtained as the second proportion. Specifically, the number of the second test results that characterize the business data as uncertain data can be used as the numerator to increase the number of test results. The second detection result is used as the denominator to calculate the calculation result obtained as the second ratio.
步骤S308:当所述第一比例和所述第二比例之间的差值大于指定差值时,对所述已训练的预测模型重新进行训练。Step S308: When the difference between the first ratio and the second ratio is greater than a specified difference, retrain the trained prediction model.
在一些实施方式中,可以预先设置并存储指定差值,该指定差值用于作为第一比例和第二比例之间的差值的判断依据。因此,在本实施例中,在获得第一比例和第二比例后,可以将第一比例和第二比例进行差值计算,以获得第一比例和第二比例之间的差值,并将第一比例和第二比例之间的差值与指定差值进行比较,当比较结果表征第一比例和第二比例之间的差值大于指定差值时,表征在相邻两个时间段的业务数据中的不确定数据出现较大的波动,则需要对已训练的预测模型进行重新训练,当比较结果表征第一比例和第二比例之间的差值不大于指定差值时,表征在相邻两个时间段的业务数据中的不确定数据没有变化或变化较小,则不需要对已训练的预测模型进行重新训练。In some embodiments, a designated difference may be preset and stored, and the designated difference may be used as a basis for determining the difference between the first ratio and the second ratio. Therefore, in this embodiment, after the first ratio and the second ratio are obtained, the difference between the first ratio and the second ratio can be calculated to obtain the difference between the first ratio and the second ratio, and The difference between the first ratio and the second ratio is compared with the specified difference. When the comparison result indicates that the difference between the first ratio and the second ratio is greater than the specified difference, it indicates the difference between two adjacent time periods. If the uncertain data in the business data fluctuates greatly, the trained prediction model needs to be retrained. When the comparison result indicates that the difference between the first proportion and the second proportion is not greater than the specified difference, the characterization is If there is no change or small change in the uncertain data in the business data of two adjacent time periods, there is no need to retrain the trained prediction model.
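The check in steps S306 to S308 can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function names, the treatment of a value as "uncertain" when it lies strictly between a lower and an upper credibility threshold, and the use of the absolute difference (chosen because the first time period may come before or after the second) are all assumptions.

```python
from typing import Sequence


def uncertain_ratio(scores: Sequence[float], low: float, high: float) -> float:
    """Share of function values that fall strictly between the two thresholds."""
    if not scores:
        raise ValueError("at least one function value is required")
    # Assumption: values strictly between the thresholds count as uncertain.
    uncertain = sum(1 for s in scores if low < s < high)
    return uncertain / len(scores)


def needs_retraining(
    first_scores: Sequence[float],
    second_scores: Sequence[float],
    low: float,
    high: float,
    max_diff: float,
) -> bool:
    """Compare the uncertain share of two adjacent time periods (step S308)."""
    first_ratio = uncertain_ratio(first_scores, low, high)
    second_ratio = uncertain_ratio(second_scores, low, high)
    # Assumption: the absolute difference is used, so period order does not matter.
    return abs(first_ratio - second_ratio) > max_diff


# Hypothetical function values for two adjacent time periods.
first_period = [0.10, 0.90, 0.50, 0.05]   # one of four values is uncertain
second_period = [0.40, 0.60, 0.50, 0.90]  # three of four values are uncertain
print(needs_retraining(first_period, second_period, low=0.25, high=0.75, max_diff=0.2))
```

In this toy run the uncertain share moves from 0.25 to 0.75, which exceeds the hypothetical specified difference of 0.2, so the sketch reports that retraining is needed.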
The business risk control processing method provided by yet another embodiment of the present application further monitors the uncertain data among the detection results determined from the function values output by the trained prediction model, and retrains the prediction model when an abnormality in the detection results is detected, so as to optimize the prediction model and improve the accuracy of the detection results.
Please refer to FIG. 7, which shows a schematic flowchart of a business risk control processing method provided by another embodiment of the present application. The process shown in FIG. 7 is described in detail below. The business risk control processing method may specifically include the following steps:
Step S401: Obtain a first training data set, where the first training data set includes first service data of multiple devices and function values corresponding to the first service data of the multiple devices.
For the trained prediction model in the foregoing embodiments, the embodiments of the present application also include a method for training the prediction model. The prediction model may be trained in advance on the obtained training data set; afterwards, each prediction of a function value can be made with the prediction model, without retraining the model for every prediction.
In this embodiment, a first training data set can be obtained, where the first training data set includes first service data of multiple devices and function values corresponding to the first service data of the multiple devices. In some embodiments, the first training data set may be collected over a historical time period.
Step S402: Based on the first training data set, use the first service data of the multiple devices as input data and the function values corresponding to the first service data of the multiple devices as output data, and train with a machine learning algorithm to obtain a first prediction model as the trained prediction model.
In the embodiments of the present application, a machine learning algorithm can be used to train on the first training data set, so as to obtain the first prediction model as the trained prediction model. The machine learning algorithm used may include: a neural network, a long short-term memory (LSTM) network, a gated recurrent unit, a simple recurrent unit, an autoencoder, a decision tree, a random forest, feature mean classification, a classification and regression tree, a hidden Markov model, the k-nearest neighbor (KNN) algorithm, a logistic regression model, a Bayesian model, a Gaussian model, KL divergence (Kullback-Leibler divergence), and the like.
The following takes a neural network as an example to describe how an initial model is trained on the training data set.
The first service data of the multiple devices in a group of data in the training data set is used as the input sample (input data) of the neural network, and the function values corresponding to the first service data of the multiple devices in that group of data are used as the output sample (output data) of the neural network. The neurons in the input layer are fully connected to the neurons in the hidden layer, and the neurons in the hidden layer are fully connected to the neurons in the output layer, so that latent features of different granularities can be extracted effectively. There may be multiple hidden layers, which allows the network to fit nonlinear relationships better and makes the trained prediction model more accurate.
It can be understood that the training process of the prediction model may or may not be completed by the electronic device. When the training process is not completed by the electronic device, the electronic device may act only as a direct user or as an indirect user of the model.
In some embodiments, new training data may be obtained periodically or irregularly, and the prediction model may be trained and updated on it.
Step S403: When a device performs current service access, obtain the current service data of the device.
Step S404: Input the current service data into the trained prediction model, and obtain the current function value output by the trained prediction model.
Step S405: Obtain a current detection result of the current service data based on the current function value and the preset credibility threshold, where the current detection result is used to characterize whether the current service data is malicious data.
Step S406: Determine a processing mode for the current service access of the device based on the current detection result.
For a detailed description of steps S403 to S406, please refer to steps S101 to S104, which will not be repeated here.
The business risk control processing method provided by another embodiment of the present application further trains on the first training data set with a machine learning algorithm to obtain the first prediction model as the trained prediction model, so as to improve the accuracy of the output data that the trained prediction model produces from the input data.
Please refer to FIG. 8, which shows a schematic flowchart of a business risk control processing method provided by yet another embodiment of the present application. The process shown in FIG. 8 is described in detail below. The business risk control processing method may specifically include the following steps:
Step S501: Obtain a first training data set, where the first training data set includes first service data of multiple devices and function values corresponding to the first service data of the multiple devices.
Please refer to FIG. 9, which shows a schematic flowchart of step S501 of the business risk control processing method shown in FIG. 8 of the present application. The process shown in FIG. 9 is described in detail below. The method may specifically include the following steps:
Step S5011: Obtain the first service data of the multiple devices.
Step S5012: Add labels to the first service data of the multiple devices based on a preset rule, respectively, to obtain first service data labels of the multiple devices.
In some embodiments, this embodiment may also provide a rule system. The rule system can generate a blacklist and a whitelist from historical data so as to add labels to the collected first service data, and can set the credibility threshold of the labels according to how often rules are violated overall in the historical data. The rules mainly consist of two parts. a. A blacklist generated from the service history. In business risk control, users report to the service provider after their accounts are stolen, the same device number may appear in two different places within the same period, or the device model may not conform to the specification; such reliable information forms the service blacklist, from which the blacklist credibility is obtained. b. A whitelist generated from the service history. A service whitelist is formed according to how long each device has been normally active, from which the whitelist credibility is obtained.
In this embodiment, after the first service data of the multiple devices is obtained, labels can be added to the first service data of the multiple devices based on the preset rule system (the preset rule), respectively, to obtain the first service data labels of the multiple devices. In some embodiments, when the first service data of a device is obtained, a blacklist label or a whitelist label is added to the first service data based on the preset rule. When the preset rule determines that at least part of the information in the first service data of the device does not comply with the regulations, the first service data of the device is determined to be blacklist data, and a blacklist label, for example label 1, can be added to it. When the preset rule determines that all the information in the first service data of the device complies with the regulations, the first service data of the device is determined to be whitelist data, and a whitelist label, for example label 0, can be added to it.
Please refer to FIG. 10, which shows a schematic flowchart of step S5012 of the business risk control processing method shown in FIG. 9 of the present application. The process shown in FIG. 10 is described in detail below. The method may specifically include the following steps:
Step S50121: Detect whether the first service data of each of the multiple devices satisfies the preset rule.
In some embodiments, after the first service data of the multiple devices is obtained, whether the first service data of each device satisfies the preset rule can be detected. The preset rule may include the rule in the rule system that determines whether service data belongs on the blacklist. In other words, after the first service data of the multiple devices is obtained, whether the first service data of each device qualifies as blacklist data can be detected.
Step S50122: Add a first label to the first service data of each device detected as satisfying the preset rule, and add a second label to the first service data of each device detected as not satisfying the preset rule, to obtain the first service data labels of the multiple devices.
In some embodiments, a detection result is obtained by detecting whether the first service data of each of the multiple devices satisfies the preset rule. According to the detection result, a first label is added to the first service data of each device that satisfies the preset rule, and a second label is added to the first service data of each device that does not satisfy the preset rule, so that the first service data labels of the multiple devices are obtained. When the preset rule includes the rule in the rule system that determines whether service data belongs on the blacklist, the first label, for example label 1, can be added to the first service data of each device that matches the blacklist, and the second label, for example label 0, can be added to the first service data of each device that does not match the blacklist. The first service data labels of the multiple devices are thereby obtained, that is, multiple labels 0 and multiple labels 1 are obtained and output.
Step S50123: Obtain the proportion of the first service data that satisfies the preset rule among the first service data of the multiple devices as a first proportion.
It can be understood that the first service data of the multiple devices includes the first service data of devices that satisfy the preset rule and the first service data of devices that do not satisfy the preset rule. Therefore, the number of items of first service data that satisfy the preset rule can be used as the numerator and the total number of items of first service data of the multiple devices as the denominator, and the quotient is used as the first proportion.
Step S50124: Obtain the proportion of the first service data that does not satisfy the preset rule among the first service data of the multiple devices as a second proportion.
Similarly, the number of items of first service data that do not satisfy the preset rule can be used as the numerator and the total number of items of first service data of the multiple devices as the denominator, and the quotient is used as the second proportion.
Step S50125: Obtain a credibility threshold based on the first proportion and the second proportion as the preset credibility threshold.
In some embodiments, after the first proportion and the second proportion are obtained, a credibility threshold can be obtained based on the first proportion and the second proportion as the preset credibility threshold. Specifically, a black-sample credibility threshold and a white-sample credibility threshold can be obtained based on the first proportion and the second proportion and used as the preset credibility thresholds. When the preset rule includes the rule in the rule system that determines whether service data belongs on the blacklist, the black-sample credibility threshold can be obtained based on the first proportion, and the white-sample credibility threshold can be obtained based on the second proportion.
For example, when the first service data of the multiple devices includes four items, of which three satisfy the preset rule and one does not, the first proportion is 0.75 and the second proportion is 0.25. The credibility thresholds then divide the function values into the intervals 0-0.25, 0.25-0.75, and 0.75-1: service data with a function value in the 0-0.25 interval is non-malicious data, service data with a function value in the 0.25-0.75 interval is uncertain data, and service data with a function value in the 0.75-1 interval is malicious data.
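Steps S50121 to S50125 and the worked example above can be sketched in a few lines. The sketch only mirrors the four-item example: the function names, the `violation` field, and the handling of values that fall exactly on a threshold are assumptions, and the source does not say how the thresholds are derived when the first proportion is smaller than the second, so that case is rejected here rather than guessed.

```python
from typing import Callable, List, Sequence, Tuple


def label_records(records: Sequence[dict], violates_rule: Callable[[dict], bool]) -> List[int]:
    """Label 1 (blacklist) if the preset rule is satisfied, otherwise 0 (whitelist)."""
    return [1 if violates_rule(r) else 0 for r in records]


def thresholds_from_labels(labels: Sequence[int]) -> Tuple[float, float]:
    """Return (white_threshold, black_threshold) from the label proportions."""
    if not labels:
        raise ValueError("at least one label is required")
    first_proportion = sum(1 for x in labels if x == 1) / len(labels)   # satisfies rule
    second_proportion = sum(1 for x in labels if x == 0) / len(labels)  # does not
    white, black = second_proportion, first_proportion
    if white > black:
        # Not covered by the example in the text; left undefined in this sketch.
        raise ValueError("white threshold above black threshold is not handled")
    return white, black


def classify(score: float, white: float, black: float) -> str:
    """Map a function value to a detection result (boundary handling is assumed)."""
    if score <= white:
        return "non_malicious"
    if score >= black:
        return "malicious"
    return "uncertain"


# Hypothetical first service data: three items satisfy the blacklist rule, one does not.
records = [
    {"device": "d1", "violation": True},
    {"device": "d2", "violation": True},
    {"device": "d3", "violation": True},
    {"device": "d4", "violation": False},
]
labels = label_records(records, lambda r: r["violation"])
white, black = thresholds_from_labels(labels)
results = [classify(s, white, black) for s in (0.1, 0.5, 0.9)]
print(labels, (white, black), results)
```

With these inputs the thresholds come out as 0.25 and 0.75, matching the intervals in the example.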
Step S5013: Obtain the first training data set, where the training data set includes the first service data labels of the multiple devices and the function values corresponding to the first service data labels of the multiple devices.
Step S502: Based on the first training data set, use the first service data of the multiple devices as input data and the function values corresponding to the first service data of the multiple devices as output data, and train with a machine learning algorithm to obtain a first prediction model as the trained prediction model.
In some embodiments, based on the first training data set, the first service data of the multiple devices can be one-hot encoded and used as input data, the function values corresponding to the first service data of the multiple devices can be used as output data, and a machine learning algorithm can be used for training to obtain the first prediction model as the trained prediction model.
In some embodiments, based on the first training data set, the first service data of the multiple devices can be one-hot encoded and used as input data, the function values corresponding to the first service data of the multiple devices can be used as output data, and the DeepFM algorithm can be used for training to obtain the first prediction model as the trained prediction model.
Specifically, when dealing with black and gray market users, business security requires both memorization and generalization: the characteristics of known black-market users must be remembered, and new features must be mined from those characteristics to predict potential black-market users. At the same time, labeled business security data has many discrete features and few continuous features. After the discrete features are one-hot encoded, the dimensionality becomes very high, which is unfavorable for algorithms such as tree models. One-hot encoding uses an N-bit state register to encode N states; each state has its own independent register bit, and only one bit is valid at any time. In machine learning it is often used to process the discrete features of data and turn them into sparse features. Based on these two characteristics, the DeepFM algorithm is used as the base algorithm of the prediction model. DeepFM is a typical Wide & Deep algorithm. Its wide side uses the FM algorithm, which provides memorization and can remember the existing characteristics of black-market activity. Its deep side is a deep neural network model, whose number of layers and number of nodes per layer can be chosen according to the feature dimension and the volume of the data (a three-layer neural network is generally used, and the layers generally have the same number of nodes). The deep side provides generalization and can generate new features well, which are used to predict black and gray market users.
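To make the two ingredients above concrete, the following sketch one-hot encodes discrete fields and evaluates the pairwise interaction term of a factorization machine (the FM part of the wide side). It is an illustration only and not the DeepFM model of the application: the field names, the helper names, and the latent vectors are hypothetical, the deep side is omitted, and nothing is trained.

```python
from typing import Dict, List, Sequence, Tuple

Vocabulary = Dict[Tuple[str, str], int]


def build_vocabulary(records: Sequence[dict], fields: Sequence[str]) -> Vocabulary:
    """Assign one vector position to every (field, value) pair seen in the data."""
    vocab: Vocabulary = {}
    for field in fields:
        for value in sorted({r[field] for r in records}):
            vocab[(field, value)] = len(vocab)
    return vocab


def one_hot(record: dict, fields: Sequence[str], vocab: Vocabulary) -> List[int]:
    """Encode a record so that exactly one bit is set per known field value."""
    vector = [0] * len(vocab)
    for field in fields:
        index = vocab.get((field, record[field]))
        if index is not None:  # unseen values leave that field all-zero
            vector[index] = 1
    return vector


def fm_pairwise_term(x: Sequence[float], v: Sequence[Sequence[float]]) -> float:
    """Sum over i<j of <v_i, v_j> * x_i * x_j, using the usual linear-time identity."""
    total = 0.0
    for f in range(len(v[0])):
        weighted = [v[i][f] * x[i] for i in range(len(x))]
        total += sum(weighted) ** 2 - sum(w * w for w in weighted)
    return 0.5 * total


# Hypothetical discrete service data fields.
fields = ["model", "region"]
records = [
    {"model": "A1", "region": "north"},
    {"model": "B2", "region": "south"},
    {"model": "A1", "region": "south"},
]
vocab = build_vocabulary(records, fields)
encoded = one_hot(records[0], fields, vocab)

# Hypothetical latent vectors, one per one-hot position (these would normally be learned).
latent = [[1.0, 2.0], [0.0, 0.0], [3.0, 4.0], [0.0, 0.0]]
interaction = fm_pairwise_term(encoded, latent)
print(encoded, interaction)
```

Because only two positions of the encoded vector are active, the interaction term reduces to the inner product of their two latent vectors, which is how the FM side can score a combination of feature values rather than each value in isolation.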
Step S503: Obtain detection results of the first service data of the multiple devices based on the function values corresponding to the first service data of the multiple devices and the preset credibility threshold.
In this embodiment, after the function values corresponding to the first service data of the multiple devices are collected, the function values can be compared with the preset credibility threshold to obtain comparison results, and the detection results of the first service data of the multiple devices can be obtained based on the comparison results.
Step S504: When the detection results of the first service data of the multiple devices characterize the first service data of a target device among the multiple devices as uncertain data, obtain second service data of the target device.
In some embodiments, when the detection results of the first service data of the multiple devices characterize the first service data of a target device among the multiple devices as uncertain data, second service data of the target device can be obtained in order to make the trained prediction model more accurate and reduce misjudgments. The second service data of the target device may include service data of the target device in other respects, rule violations by the device in other domains, and so on, which is not limited here.
Step S505: Obtain a second training data set, where the second training data set includes the first service data of the multiple devices, the function values corresponding to the first service data of the multiple devices, the second service data of the target device, and the function value corresponding to the second service data of the target device.
In this embodiment, a second training data set can be obtained, where the second training data set includes the first service data of the multiple devices, the function values corresponding to the first service data of the multiple devices, the second service data of the target device, and the function value corresponding to the second service data of the target device. In some embodiments, the second training data set may be collected over a historical time period.
Step S506: Based on the second training data set, use the first service data of the multiple devices and the second service data of the target device as input data, use the function values corresponding to the first service data of the multiple devices and the function value corresponding to the second service data of the target device as output data, and train with a machine learning algorithm to obtain a second prediction model as the trained prediction model.
During training on the second training data set, the second service data of the target device can be obtained repeatedly, so as to keep reducing the number of items of first service data that the detection results characterize as uncertain data. When that number falls below a specified threshold, or the number of repetitions reaches a specified count, training can be stopped and the second prediction model can be used as the trained prediction model for online prediction.
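The stopping rule described for steps S503 to S506 can be sketched as a loop. This is a minimal sketch under stated assumptions: `train` and `fetch_second_data` are hypothetical callbacks standing in for the machine learning algorithm and the collection of second service data, and the toy versions below exist only to make the loop runnable.

```python
from typing import Callable, List, Tuple

Sample = Tuple[dict, float]          # (service data, function value)
Model = Callable[[dict], float]      # maps service data to a function value


def refine_model(
    first_data: List[Sample],
    train: Callable[[List[Sample]], Model],
    fetch_second_data: Callable[[List[dict]], List[Sample]],
    low: float,
    high: float,
    max_uncertain: int,
    max_rounds: int,
) -> Tuple[Model, int]:
    """Retrain with extra data for uncertain devices until one stop condition holds."""
    dataset = list(first_data)
    model = train(dataset)           # first prediction model
    rounds = 0
    while rounds < max_rounds:       # stop when the repetition count is reached
        uncertain = [x for x, _ in first_data if low < model(x) < high]
        if len(uncertain) < max_uncertain:
            break                    # stop when few enough items remain uncertain
        dataset = dataset + fetch_second_data(uncertain)
        model = train(dataset)       # second prediction model
        rounds += 1
    return model, rounds


def toy_train(dataset: List[Sample]) -> Model:
    # Toy stand-in: devices with extra ("resolved") data get their target value.
    resolved = {x["id"]: y for x, y in dataset if x.get("resolved")}
    return lambda x: resolved.get(x["id"], x["score"])


def toy_fetch(uncertain: List[dict]) -> List[Sample]:
    # Toy stand-in: pretend the second service data settles each device as malicious.
    return [({"id": x["id"], "score": x["score"], "resolved": True}, 1.0) for x in uncertain]


first_data = [
    ({"id": 1, "score": 0.1}, 0.0),
    ({"id": 2, "score": 0.5}, 1.0),
    ({"id": 3, "score": 0.9}, 1.0),
]
model, rounds = refine_model(first_data, toy_train, toy_fetch, 0.25, 0.75,
                             max_uncertain=1, max_rounds=5)
print(rounds, model({"id": 2, "score": 0.5}))
```

In the toy run, device 2 is the only uncertain item; one retraining round with its second service data moves it out of the uncertain interval, after which the loop stops.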
Step S507: When a device performs current service access, obtain the current service data of the device.
Step S508: Input the current service data into the trained prediction model, and obtain the current function value output by the trained prediction model.
Step S509: Obtain a current detection result of the current service data based on the current function value and the preset credibility threshold, where the current detection result is used to characterize whether the current service data is malicious data.
Step S510: Determine a processing mode for the current service access of the device based on the current detection result.
For a detailed description of steps S507 to S510, please refer to steps S101 to S104, which will not be repeated here.
The business risk control processing method provided by yet another embodiment of the present application further trains on the first training data set with a machine learning algorithm to obtain the first prediction model and, when the detection results based on the first prediction model include uncertain data, trains and optimizes the model again on the second training data set, so as to improve the accuracy of the output data that the trained prediction model produces from the input data.
Therefore, the embodiments of the present application can achieve the following effects: ① Features are crossed and extracted in high dimensions, so the resulting features are not interpretable, which makes it harder for black-market users to find the business regularities behind the features and defeat them. ② The rules in this solution contribute to setting label credibility and to data augmentation and are not used to judge black-market users directly; the features used in the offline and online risk control processes are the same, so data consistency is better. ③ Using DeepFM as the base model gives the system both memorization and generalization, so it can not only remember historical black-market information but also cross features to derive the characteristics of potential black-market information. ④ When data is fed into the model, thresholds on label credibility are set according to the rules and combined with the sigmoid function of the model as the output, which prevents label noise from biasing the whole model and improves the credibility of the model. ⑤ A detector monitors changes in the state of the uncertain data in the system to judge how likely it is that black-market users have changed their attack methods, which provides a basis for retraining the model. ⑥ Tracking the uncertain data can further correct the model and also improve the interpretability of the model.
Please refer to FIG. 11, which shows a block diagram of the business risk control processing apparatus 200 provided by an embodiment of the present application. The block diagram shown in FIG. 11 is described below. The business risk control processing apparatus 200 includes a current service data acquisition module 210, a current function value acquisition module 220, a current detection result obtaining module 230, and a processing mode determining module 240, wherein:
The current service data acquisition module 210 is configured to acquire the current service data of a device when the device performs a current service access.
The current function value acquisition module 220 is configured to input the current service data into a trained prediction model and acquire the current function value output by the trained prediction model.
The current detection result obtaining module 230 is configured to obtain a current detection result of the current service data based on the current function value and a preset credibility threshold, wherein the current detection result is used to characterize whether the current service data is malicious data.
The processing mode determining module 240 is configured to determine, based on the current detection result, a processing mode for the current service access of the device.
Further, the processing mode determining module 240 includes a current service access rejection submodule and a current service access execution submodule, wherein:
The current service access rejection submodule is configured to reject the current service access of the device when the current detection result characterizes that the current service data is malicious data.
The current service access execution submodule is configured to execute the current service access of the device when the current detection result characterizes that the current service data is non-malicious data.
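The flow of modules 220 to 240 can be sketched as follows. This is a minimal illustration rather than the applicant's implementation: it assumes the function value is a sigmoid output in [0, 1], that the preset credibility threshold lies in (0.5, 1], and that values between `1 - threshold` and `threshold` are treated as uncertain. The names `Detection`, `detect`, and `handle_access` are hypothetical.

```python
from enum import Enum


class Detection(Enum):
    MALICIOUS = "malicious"
    NON_MALICIOUS = "non_malicious"
    UNCERTAIN = "uncertain"


def detect(function_value: float, credibility_threshold: float) -> Detection:
    """Map a model output to a detection result.

    Assumption (not stated in the source): the function value is a
    sigmoid output in [0, 1] and the threshold is symmetric around 0.5.
    """
    if not 0.5 < credibility_threshold <= 1.0:
        raise ValueError("credibility_threshold must be in (0.5, 1]")
    if not 0.0 <= function_value <= 1.0:
        raise ValueError("function_value must be in [0, 1]")
    if function_value >= credibility_threshold:
        return Detection.MALICIOUS
    if function_value <= 1.0 - credibility_threshold:
        return Detection.NON_MALICIOUS
    return Detection.UNCERTAIN


def handle_access(detection: Detection) -> str:
    """Choose a processing mode for the current service access."""
    if detection is Detection.MALICIOUS:
        return "reject"
    if detection is Detection.NON_MALICIOUS:
        return "execute"
    # Uncertain data: defer the decision until more data is collected.
    return "collect_other_service_data"
```

Keeping detection and handling as separate functions mirrors the split between module 230 and module 240, so the threshold logic can change without touching the access decision.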
Further, the current detection result is also used to characterize that the current service data is uncertain data, and the business risk control processing apparatus 200 further includes an other-service-data acquisition module, an other-function-value obtaining module, an other-detection-result obtaining module, and an other-processing-mode determining module, wherein:
The other-service-data acquisition module is configured to acquire, when the current detection result characterizes that the current service data is uncertain data, other service data of the device generated when the device performs other service accesses.
Further, the other-service-data acquisition module includes a service type acquisition submodule and an other-service-data acquisition submodule, wherein:
The service type acquisition submodule is configured to acquire the service type of the current service data when the current detection result characterizes that the current service data is uncertain data.
The other-service-data acquisition submodule is configured to acquire, when the service type of the current service data satisfies a preset service type, other service data of the device generated when the device performs other service accesses.
Further, the other-service-data acquisition submodule includes an other-service-data acquisition unit, wherein:
The other-service-data acquisition unit is configured to acquire, when the service type of the current service data is a transaction type, other service data of the device generated when the device performs other service accesses.
The other-function-value obtaining module is configured to input the current service data and the other service data into the trained prediction model and obtain the other function value output by the trained prediction model.
Further, the other-function-value obtaining module includes an intelligence score acquisition submodule, a multiple-other-service-data acquisition submodule, and an other-function-value obtaining submodule, wherein:
The intelligence score acquisition submodule is configured to acquire the intelligence score corresponding to the other service data of the device generated when the device performs other service accesses, wherein the intelligence score is used to characterize the probability that the other service data is not non-malicious data.
The multiple-other-service-data acquisition submodule is configured to perform data enhancement processing on the other service data based on the intelligence score to obtain multiple pieces of other service data.
Further, the multiple-other-service-data acquisition submodule includes a duration acquisition unit and a multiple-other-service-data acquisition unit, wherein:
The duration acquisition unit is configured to acquire the duration for which the other service data of the device, generated when the device performs other service accesses, has corresponded to the intelligence score.
The multiple-other-service-data acquisition unit is configured to perform data enhancement processing on the other service data based on the intelligence score and the duration to obtain multiple pieces of other service data.
The other-function-value obtaining submodule is configured to input the current service data and the multiple pieces of other service data into the trained prediction model and obtain the other function value output by the trained prediction model.
The other-detection-result obtaining module is configured to obtain an other detection result of the current service data and the other service data based on the other function value and the preset credibility threshold, wherein the other detection result is used to characterize whether the current service data is malicious data.
The other-processing-mode determining module is configured to determine, based on the other detection result, a processing mode for the current service access of the device.
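The data enhancement step can be illustrated with a small sketch. The text states only that enhancement is based on the intelligence score and the duration, and (in the claims) that the score is positively correlated with the enhancement multiple. Everything else here is an assumption made for illustration: enhancement is modeled as simple replication of the record, the duration acts as a weight that saturates at `full_weight_days`, and the names `enhancement_multiple`, `enhance`, `max_multiple`, and `full_weight_days` are hypothetical.

```python
import math


def enhancement_multiple(intelligence_score, duration_days,
                         max_multiple=5, full_weight_days=30.0):
    """Return how many copies of a record to keep (at least 1).

    Assumption: the multiple grows with the intelligence score, and a
    longer duration increases the weight given to that score.
    """
    if not 0.0 <= intelligence_score <= 1.0:
        raise ValueError("intelligence_score must be in [0, 1]")
    if duration_days < 0:
        raise ValueError("duration_days must be non-negative")
    duration_weight = min(duration_days / full_weight_days, 1.0)
    extra = math.floor(intelligence_score * duration_weight * (max_multiple - 1))
    return 1 + extra


def enhance(other_service_data, intelligence_score, duration_days):
    """Replicate one record into multiple pieces of other service data."""
    copies = enhancement_multiple(intelligence_score, duration_days)
    # Each copy is an independent dict so later edits do not alias.
    return [dict(other_service_data) for _ in range(copies)]
```

A real system might perturb features instead of copying records; replication is used here only because it makes the meaning of "multiple" easy to see.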
Further, the business risk control processing apparatus 200 further includes a function value acquisition module, a detection result obtaining module, a proportion acquisition module, and a retraining module, wherein:
The function value acquisition module is configured to acquire multiple first function values output by the trained prediction model in a first time period and multiple second function values output in a second time period, wherein the first time period and the second time period are adjacent time periods.
The detection result obtaining module is configured to obtain multiple first detection results based on the multiple first function values and the preset credibility threshold, and to obtain multiple second detection results based on the multiple second function values and the preset credibility threshold.
The proportion acquisition module is configured to acquire, as a first proportion, the proportion of the multiple first detection results that characterize service data as uncertain data, and to acquire, as a second proportion, the proportion of the multiple second detection results that characterize service data as uncertain data.
The retraining module is configured to retrain the trained prediction model when the difference between the first proportion and the second proportion is greater than a specified difference.
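The retraining trigger described above can be sketched as follows. This is a hedged illustration: detection results are represented as plain strings, the difference is taken as an absolute value (the text does not say whether it is signed), and `uncertain_proportion` and `should_retrain` are hypothetical names.

```python
def uncertain_proportion(detection_results):
    """Fraction of detection results that are marked uncertain."""
    if not detection_results:
        raise ValueError("detection_results must not be empty")
    uncertain = sum(1 for r in detection_results if r == "uncertain")
    return uncertain / len(detection_results)


def should_retrain(first_results, second_results, specified_difference):
    """Compare two adjacent time periods and decide whether to retrain.

    Assumption: the difference between the two proportions is absolute.
    """
    first = uncertain_proportion(first_results)
    second = uncertain_proportion(second_results)
    return abs(first - second) > specified_difference


# Example: the uncertain share moves from 10% to 40% between periods.
period_1 = ["uncertain"] + ["non_malicious"] * 9
period_2 = ["uncertain"] * 4 + ["non_malicious"] * 6
retrain = should_retrain(period_1, period_2, specified_difference=0.2)
```

A sharp change in the uncertain share between adjacent periods is the signal the text associates with attackers changing their method.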
Further, the business risk control processing apparatus 200 further includes a first training data set acquisition module and a first prediction model obtaining module, wherein:
The first training data set acquisition module is configured to acquire a first training data set, the first training data set including first service data of multiple devices and the function values corresponding to the first service data of the multiple devices.
Further, the first training data set acquisition module includes a first service data acquisition submodule, a first service data label obtaining submodule, and a first training data set acquisition submodule, wherein:
The first service data acquisition submodule is configured to acquire the first service data of the multiple devices.
The first service data label obtaining submodule is configured to add labels to the first service data of the multiple devices respectively based on a preset rule, to obtain first service data labels of the multiple devices.
Further, the first service data label obtaining submodule includes a prediction rule detection unit and a first service data label obtaining unit, wherein:
The prediction rule detection unit is configured to detect, for each of the multiple devices, whether its first service data satisfies the preset rule.
The first service data label obtaining unit is configured to add a first label to the first service data of devices detected to satisfy the preset rule and a second label to the first service data of devices detected not to satisfy the preset rule, to obtain the first service data labels of the multiple devices.
Further, the first service data label obtaining submodule includes a first proportion acquisition unit, a second proportion acquisition unit, and a credibility threshold obtaining unit, wherein:
The first proportion acquisition unit is configured to acquire, as a first proportion, the share of the first service data of the multiple devices that satisfies the preset rule.
The second proportion acquisition unit is configured to acquire, as a second proportion, the share of the first service data of the multiple devices that does not satisfy the preset rule.
The credibility threshold obtaining unit is configured to obtain a credibility threshold based on the first proportion and the second proportion and use it as the preset credibility threshold.
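The labeling and proportion steps can be sketched as below. The passage does not specify how the credibility threshold is derived from the two proportions, so the sketch stops at computing them. The rule shown (a request-rate cutoff), the field name `requests_per_minute`, and the function names are hypothetical placeholders.

```python
def label_by_rule(records, rule):
    """Add the first label (1) when the rule is satisfied, else the second (0)."""
    return [1 if rule(record) else 0 for record in records]


def label_proportions(labels):
    """Return (first_proportion, second_proportion) for a list of 0/1 labels."""
    if not labels:
        raise ValueError("labels must not be empty")
    first = sum(labels) / len(labels)
    return first, 1.0 - first


# Hypothetical preset rule: flag devices with an unusually high request rate.
def high_request_rate(record):
    return record["requests_per_minute"] > 100


first_service_data = [
    {"device": "a", "requests_per_minute": 250},
    {"device": "b", "requests_per_minute": 12},
    {"device": "c", "requests_per_minute": 40},
    {"device": "d", "requests_per_minute": 7},
]
labels = label_by_rule(first_service_data, high_request_rate)
first_proportion, second_proportion = label_proportions(labels)
```

Passing the rule in as a function keeps the labeling code unchanged when the preset rule is revised.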
The first training data set acquisition submodule is configured to acquire the first training data set, the training data set including the first service data labels of the multiple devices and the function values corresponding to the first service data labels of the multiple devices.
The first prediction model obtaining module is configured to, based on the first training data set, take the first service data of the multiple devices as input data and the function values corresponding to the first service data of the multiple devices as output data, and train with a machine learning algorithm to obtain a first prediction model as the trained prediction model.
Further, the first prediction model obtaining module includes a first prediction model obtaining submodule, wherein:
The first prediction model obtaining submodule is configured to, based on the first training data set, take the first service data of the multiple devices after one-hot processing as input data and the function values corresponding to the first service data of the multiple devices as output data, and train with a machine learning algorithm to obtain the first prediction model as the trained prediction model.
Further, the first prediction model obtaining submodule includes a first prediction model obtaining unit, wherein:
The first prediction model obtaining unit is configured to, based on the first training data set, take the first service data of the multiple devices after one-hot processing as input data and the function values corresponding to the first service data of the multiple devices as output data, and train with the DeepFM algorithm to obtain the first prediction model as the trained prediction model.
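The one-hot processing applied to the first service data before training can be illustrated with a dependency-free sketch. The field names (`os`, `network`) and the function `one_hot_encode` are hypothetical, and the DeepFM training step itself is not shown.

```python
def one_hot_encode(records, fields):
    """One-hot encode the given categorical fields of each record.

    Returns the encoded rows and the per-field vocabulary so that the
    same column order can be reused for online data.
    """
    vocabulary = {
        field: sorted({record[field] for record in records})
        for field in fields
    }
    encoded = []
    for record in records:
        row = []
        for field in fields:
            # One column per known category; exactly one column is 1.
            row.extend(1 if record[field] == value else 0
                       for value in vocabulary[field])
        encoded.append(row)
    return encoded, vocabulary


service_data = [
    {"os": "android", "network": "wifi"},
    {"os": "ios", "network": "wifi"},
]
rows, vocab = one_hot_encode(service_data, ["os", "network"])
```

Returning the vocabulary alongside the rows is one way to keep offline and online feature layouts identical, which matches the consistency goal stated earlier in the description.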
Further, the business risk control processing apparatus 200 further includes a first detection result obtaining module, a second service data acquisition module, a second training data set acquisition module, and a second prediction model obtaining module, wherein:
The first detection result obtaining module is configured to obtain detection results of the first service data of the multiple devices based on the function values corresponding to the first service data of the multiple devices and the preset credibility threshold.
The second service data acquisition module is configured to acquire second service data of a target device when the detection results of the first service data of the multiple devices characterize that the first service data of the target device, among the first service data of the multiple devices, is uncertain data.
The second training data set acquisition module is configured to acquire a second training data set, the second training data set including the first service data of the multiple devices, the function values corresponding to the first service data of the multiple devices, the second service data of the target device, and the function value corresponding to the second service data of the target device.
The second prediction model obtaining module is configured to, based on the second training data set, take the first service data of the multiple devices and the second service data of the target device as input data and the function values corresponding to the first service data of the multiple devices and the function value corresponding to the second service data of the target device as output data, and train with a machine learning algorithm to obtain a second prediction model as the trained prediction model.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the apparatus and modules described above may refer to the corresponding processes in the foregoing method embodiments, and are not repeated here.
In the several embodiments provided in this application, the coupling between modules may be electrical, mechanical, or of another form.
In addition, the functional modules in the embodiments of the present application may be integrated into one processing module, each module may exist physically on its own, or two or more modules may be integrated into one module. The integrated modules may be implemented in the form of hardware or in the form of software functional modules.
Please refer to FIG. 12, which shows a structural block diagram of an electronic device 100 provided by an embodiment of the present application. The electronic device 100 may be an electronic device capable of running application programs, such as a smartphone, a tablet computer, or an e-book reader. The electronic device 100 in this application may include one or more of the following components: a processor 110, a memory 120, and one or more application programs, wherein the one or more application programs may be stored in the memory 120 and configured to be executed by the one or more processors 110, and the one or more programs are configured to execute the method described in the foregoing method embodiments.
The processor 110 may include one or more processing cores. The processor 110 connects the various parts of the electronic device 100 through various interfaces and lines, and performs the functions of the electronic device 100 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 120 and by calling data stored in the memory 120. Optionally, the processor 110 may be implemented in at least one of the following hardware forms: Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), and Programmable Logic Array (PLA). The processor 110 may integrate one or a combination of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like. The CPU mainly handles the operating system, user interface, and application programs; the GPU is responsible for rendering and drawing the content to be displayed; the modem handles wireless communication. It can be understood that the modem may also not be integrated into the processor 110 and may instead be implemented by a separate communication chip.
The memory 120 may include Random Access Memory (RAM) or Read-Only Memory (ROM). The memory 120 may be used to store instructions, programs, code, code sets, or instruction sets. The memory 120 may include a program storage area and a data storage area, wherein the program storage area may store instructions for implementing the operating system, instructions for implementing at least one function (such as a touch function, a sound playback function, or an image playback function), instructions for implementing the following method embodiments, and the like. The data storage area may store data created by the electronic device 100 during use (such as a phone book, audio and video data, and chat records).
Please refer to FIG. 13, which shows a structural block diagram of a computer-readable storage medium provided by an embodiment of the present application. The computer-readable storage medium 300 stores program code, and the program code can be called by a processor to execute the method described in the foregoing method embodiments.
The computer-readable storage medium 300 may be an electronic memory such as flash memory, EEPROM (Electrically Erasable Programmable Read-Only Memory), EPROM, a hard disk, or ROM. Optionally, the computer-readable storage medium 300 includes a non-transitory computer-readable storage medium. The computer-readable storage medium 300 has storage space for program code 310 that executes any of the method steps in the methods described above. The program code can be read from, or written into, one or more computer program products. The program code 310 may, for example, be compressed in a suitable form.
In summary, with the business risk control processing method, apparatus, electronic device, and storage medium provided in the embodiments of the present application, when a device performs a current service access, the current service data of the device is acquired and input into a trained prediction model, the current function value output by the trained prediction model is acquired, and a current detection result of the current service data is obtained based on the current function value and a preset credibility threshold, wherein the current detection result is used to characterize whether the current service data is malicious data. A processing mode for the current service access of the device is then determined based on the current detection result. Whether service data is malicious is thus determined from the function value output by the trained prediction model together with the preset credibility threshold, which improves the credibility of the malicious data judgment.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions recorded in the foregoing embodiments or make equivalent replacements of some of the technical features, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (20)

  1. A business risk control processing method, characterized in that the method comprises:
    when a device performs a current service access, acquiring current service data of the device;
    inputting the current service data into a trained prediction model, and acquiring a current function value output by the trained prediction model;
    obtaining a current detection result of the current service data based on the current function value and a preset credibility threshold, wherein the current detection result is used to characterize whether the current service data is malicious data;
    determining, based on the current detection result, a processing mode for the current service access of the device.
  2. The method according to claim 1, characterized in that the determining, based on the current detection result, a processing mode for the current service access of the device comprises:
    when the current detection result characterizes that the current service data is malicious data, rejecting the current service access of the device;
    when the current detection result characterizes that the current service data is non-malicious data, executing the current service access of the device.
  3. The method according to claim 1 or 2, characterized in that the current detection result is further used to characterize that the current service data is uncertain data, and after the obtaining a current detection result of the current service data based on the current function value and a preset credibility threshold, the method further comprises:
    when the current detection result characterizes that the current service data is uncertain data, acquiring other service data of the device generated when the device performs other service accesses;
    inputting the current service data and the other service data into the trained prediction model, and obtaining an other function value output by the trained prediction model;
    obtaining an other detection result of the current service data and the other service data based on the other function value and the preset credibility threshold, wherein the other detection result is used to characterize whether the current service data is malicious data;
    determining, based on the other detection result, a processing mode for the current service access of the device.
  4. The method according to claim 3, characterized in that the acquiring, when the current detection result characterizes that the current service data is uncertain data, other service data of the device generated when the device performs other service accesses comprises:
    when the current detection result characterizes that the current service data is uncertain data, acquiring the service type of the current service data;
    when the service type of the current service data satisfies a preset service type, acquiring other service data of the device generated when the device performs other service accesses.
  5. The method according to claim 4, characterized in that the acquiring, when the service type of the current service data satisfies a preset service type, other service data of the device generated when the device performs other service accesses comprises:
    when the service type of the current service data is a transaction type, acquiring other service data of the device generated when the device performs other service accesses.
  6. The method according to any one of claims 3-5, characterized in that the inputting the current service data and the other service data into the trained prediction model, and obtaining an other function value output by the trained prediction model comprises:
    acquiring an intelligence score corresponding to the other service data of the device generated when the device performs other service accesses, wherein the intelligence score is used to characterize the probability that the other service data is not non-malicious data;
    performing data enhancement processing on the other service data based on the intelligence score to obtain multiple pieces of other service data;
    inputting the current service data and the multiple pieces of other service data into the trained prediction model, and obtaining the other function value output by the trained prediction model.
  7. The method according to claim 6, characterized in that the performing data enhancement processing on the other service data based on the intelligence score to obtain multiple pieces of other service data comprises:
    acquiring the duration for which the other service data of the device, generated when the device performs other service accesses, has corresponded to the intelligence score;
    performing data enhancement processing on the other service data based on the intelligence score and the duration to obtain multiple pieces of other service data.
  8. The method according to claim 6 or 7, characterized in that the intelligence score is positively correlated with the multiple of the data enhancement.
  9. The method according to any one of claims 3-8, characterized in that the method further comprises:
    acquiring multiple first function values output by the trained prediction model in a first time period and multiple second function values output in a second time period, wherein the first time period and the second time period are adjacent time periods;
    obtaining multiple first detection results based on the multiple first function values and the preset credibility threshold, and obtaining multiple second detection results based on the multiple second function values and the preset credibility threshold;
    acquiring, as a first proportion, the proportion of the multiple first detection results that characterize service data as uncertain data, and acquiring, as a second proportion, the proportion of the multiple second detection results that characterize service data as uncertain data;
    when the difference between the first proportion and the second proportion is greater than a specified difference, retraining the trained prediction model.
  10. The method according to any one of claims 1-9, characterized in that before the acquiring, when a device performs a current service access, current service data of the device, the method further comprises:
    acquiring a first training data set, the first training data set including first service data of multiple devices and the function values corresponding to the first service data of the multiple devices;
    based on the first training data set, taking the first service data of the multiple devices as input data and the function values corresponding to the first service data of the multiple devices as output data, and training with a machine learning algorithm to obtain a first prediction model as the trained prediction model.
  11. The method according to claim 10, wherein, after the step of, based on the first training data set, using the first service data of the plurality of devices as input data and the function values corresponding to the first service data of the plurality of devices as output data, and training with a machine learning algorithm to obtain the first prediction model as the trained prediction model, the method further comprises:
    obtaining detection results of the first service data of the plurality of devices based on the function values corresponding to the first service data of the plurality of devices and the preset credibility threshold;
    when the detection results indicate that the first service data of a target device among the plurality of devices is uncertain data, acquiring second service data of the target device;
    acquiring a second training data set, wherein the second training data set includes the first service data of the plurality of devices, the function values corresponding to the first service data of the plurality of devices, the second service data of the target device, and a function value corresponding to the second service data of the target device; and
    based on the second training data set, using the first service data of the plurality of devices and the second service data of the target device as input data, and the function values corresponding to the first service data of the plurality of devices and the function value corresponding to the second service data of the target device as output data, and training with a machine learning algorithm to obtain a second prediction model as the trained prediction model.
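
As a rough sketch of how the second training data set in claim 11 might be assembled, the code below appends second service data for every device whose first function value falls in the uncertain band. The band (strictly between one minus the threshold and the threshold) is an assumption, as are the tuple layout and the `fetch_second` callback, which stands in for whatever mechanism supplies the target device's second service data and its function value.

```python
from typing import Callable, Dict, List, Optional, Tuple

# (device_id, service_data, function_value); this layout is an assumption.
Sample = Tuple[str, Dict[str, object], float]


def is_uncertain(value: float, threshold: float) -> bool:
    """Assumed rule: values strictly between 1 - threshold and threshold are uncertain."""
    return (1.0 - threshold) < value < threshold


def build_second_training_set(
    first_set: List[Sample],
    threshold: float,
    fetch_second: Callable[[str], Optional[Tuple[Dict[str, object], float]]],
) -> List[Sample]:
    """Return the first training set plus second service data of uncertain devices."""
    second_set = list(first_set)  # keep every first-round sample
    for device_id, _data, value in first_set:
        if is_uncertain(value, threshold):
            extra = fetch_second(device_id)  # hypothetical data source
            if extra is not None:
                second_data, second_value = extra
                second_set.append((device_id, second_data, second_value))
    return second_set
```

The resulting list would then be fed to the same training routine used for the first prediction model.
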
  12. The method according to claim 10 or 11, wherein acquiring the first training data set, the first training data set including first service data of a plurality of devices and function values corresponding to the first service data of the plurality of devices, comprises:
    acquiring the first service data of the plurality of devices;
    adding labels to the first service data of the plurality of devices respectively based on a preset rule, to obtain first service data labels of the plurality of devices; and
    acquiring the first training data set, wherein the first training data set includes the first service data labels of the plurality of devices and function values corresponding to the first service data labels of the plurality of devices.
  13. The method according to claim 12, wherein adding labels to the first service data of the plurality of devices respectively based on the preset rule, to obtain the first service data labels of the plurality of devices, comprises:
    detecting, for each of the plurality of devices, whether the first service data satisfies the preset rule; and
    adding a first label to the first service data of each device detected to satisfy the preset rule, and adding a second label to the first service data of each device detected not to satisfy the preset rule, to obtain the first service data labels of the plurality of devices.
  14. The method according to claim 13, wherein, after detecting whether the first service data of the plurality of devices satisfies the preset rule, the method further comprises:
    obtaining, as a first proportion, the proportion of the first service data of the plurality of devices that satisfies the preset rule;
    obtaining, as a second proportion, the proportion of the first service data of the plurality of devices that does not satisfy the preset rule; and
    obtaining a credibility threshold based on the first proportion and the second proportion as the preset credibility threshold.
  15. The method according to claim 13 or 14, wherein the first label indicates that the service data is malicious data, and the second label indicates that the service data is non-malicious data.
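
The rule-based labeling of claims 12-15 can be illustrated with a short sketch. The preset rule is passed in as a predicate because the claims do not say what the rule is. The example rule in the usage note and the field name `requests_per_minute` are hypothetical. The claims also do not state how the credibility threshold is computed from the two proportions, so the sketch stops at the proportions.

```python
from typing import Callable, Dict, List, Tuple

FIRST_LABEL = 1   # per claim 15: service data is malicious
SECOND_LABEL = 0  # per claim 15: service data is non-malicious

Record = Dict[str, object]


def label_service_data(
    records: List[Record],
    preset_rule: Callable[[Record], bool],
) -> List[Tuple[Record, int]]:
    """Attach the first label when the rule is satisfied, else the second label."""
    return [
        (record, FIRST_LABEL if preset_rule(record) else SECOND_LABEL)
        for record in records
    ]


def label_proportions(labeled: List[Tuple[Record, int]]) -> Tuple[float, float]:
    """Return (first proportion, second proportion) over the labeled records."""
    total = len(labeled)
    if total == 0:
        return 0.0, 0.0
    first = sum(1 for _, label in labeled if label == FIRST_LABEL) / total
    return first, 1.0 - first
```

For instance, with a hypothetical rule such as `lambda r: r["requests_per_minute"] > 100`, two of four records exceeding the limit would give proportions of 0.5 and 0.5.
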
  16. The method according to any one of claims 10-14, wherein the step of, based on the first training data set, using the first service data of the plurality of devices as input data and the function values corresponding to the first service data of the plurality of devices as output data, and training with a machine learning algorithm to obtain the first prediction model as the trained prediction model, comprises:
    based on the first training data set, using the first service data of the plurality of devices after one-hot (Onehot) processing as input data and the function values corresponding to the first service data of the plurality of devices as output data, and training with a machine learning algorithm to obtain the first prediction model as the trained prediction model.
  17. The method according to claim 16, wherein the step of, based on the first training data set, using the first service data of the plurality of devices after one-hot processing as input data and the function values corresponding to the first service data of the plurality of devices as output data, and training with a machine learning algorithm to obtain the first prediction model as the trained prediction model, comprises:
    based on the first training data set, using the first service data of the plurality of devices after one-hot processing as input data and the function values corresponding to the first service data of the plurality of devices as output data, and training with the DeepFM algorithm to obtain the first prediction model as the trained prediction model.
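
The one-hot step named in claims 16 and 17 can be sketched in plain Python. The sketch covers only the encoding of categorical service-data fields. Training a DeepFM model on the encoded vectors is not shown. The field names in the usage note, the helper names, and the choice to encode unseen values as all zeros are assumptions made for illustration.

```python
from typing import Dict, List, Sequence

Record = Dict[str, object]
Vocabulary = Dict[str, Dict[str, int]]


def fit_vocabulary(records: Sequence[Record], fields: Sequence[str]) -> Vocabulary:
    """Assign each distinct value of each field a position, in sorted order."""
    vocab: Vocabulary = {}
    for field in fields:
        values = sorted({str(record[field]) for record in records})
        vocab[field] = {value: position for position, value in enumerate(values)}
    return vocab


def one_hot(record: Record, vocab: Vocabulary) -> List[int]:
    """Concatenate one 0/1 slot per field; unseen values leave the slot all zeros."""
    vector: List[int] = []
    for field, index in vocab.items():
        slot = [0] * len(index)
        position = index.get(str(record.get(field)))
        if position is not None:
            slot[position] = 1
        vector.extend(slot)
    return vector
```

With hypothetical fields `os` and `net`, a vocabulary fitted on `android`/`ios` and `4g`/`wifi` yields four-element vectors, one two-element slot per field.
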
  18. A service risk control processing apparatus, wherein the apparatus comprises:
    a current service data acquisition module, configured to acquire current service data of a device when the device performs current service access;
    a current function value acquisition module, configured to input the current service data into a trained prediction model and obtain a current function value output by the trained prediction model;
    a current detection result obtaining module, configured to obtain a current detection result of the current service data based on the current function value and a preset credibility threshold, wherein the current detection result is used to characterize whether the current service data is malicious data; and
    a processing mode determination module, configured to determine, based on the current detection result, a processing mode for the current service access of the device.
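
The four modules of claim 18 map naturally onto a small pipeline. The sketch below is one possible arrangement under stated assumptions: the trained prediction model is represented by any callable returning a value in [0, 1], the detection rule is the same three-way threshold rule assumed for the dependent claims, and the processing modes `block`, `verify`, and `allow` are hypothetical, since the claim does not enumerate them.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

ServiceData = Dict[str, object]

# Hypothetical processing modes keyed by detection result.
PROCESSING_MODES = {
    "malicious": "block",
    "uncertain": "verify",
    "non_malicious": "allow",
}


@dataclass
class RiskControlPipeline:
    model: Callable[[ServiceData], float]  # stands in for the trained prediction model
    threshold: float = 0.8                 # preset credibility threshold (assumed value)

    def detect(self, value: float) -> str:
        """Detection result from the function value and the threshold (assumed rule)."""
        if value >= self.threshold:
            return "malicious"
        if value <= 1.0 - self.threshold:
            return "non_malicious"
        return "uncertain"

    def handle(self, service_data: ServiceData) -> Tuple[str, str]:
        """Run one service access through function value, detection, and processing mode."""
        value = self.model(service_data)
        result = self.detect(value)
        return result, PROCESSING_MODES[result]
```

Calling `handle` with the current service data of a device returns both the detection result and the chosen processing mode, which keeps the decision auditable.
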
  19. An electronic device, comprising a memory and a processor, wherein the memory is coupled to the processor and stores instructions, and when the instructions are executed by the processor, the processor performs the method according to any one of claims 1-17.
  20. A computer-readable storage medium, wherein the computer-readable storage medium stores program code, and the program code can be invoked by a processor to perform the method according to any one of claims 1-17.
PCT/CN2020/076457 2020-02-24 2020-02-24 Processing method and apparatus for service risk management, electronic device, and storage medium WO2021168617A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2020/076457 WO2021168617A1 (en) 2020-02-24 2020-02-24 Processing method and apparatus for service risk management, electronic device, and storage medium
CN202080093339.4A CN115004652B (en) 2020-02-24 Business risk control processing method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2021168617A1 (en) 2021-09-02

Family

ID=77491711

Country Status (1)

Country Link
WO (1) WO2021168617A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080120699A1 (en) * 2006-11-17 2008-05-22 Mcafee, Inc. Method and system for assessing and mitigating access control to a managed network
CN107395430A (en) * 2017-08-16 2017-11-24 中国民航大学 A kind of cloud platform dynamic risk access control method
CN108229157A (en) * 2017-12-29 2018-06-29 北京潘达互娱科技有限公司 Server attack early warning method and apparatus
US20190109837A1 (en) * 2016-04-28 2019-04-11 Brendan Xavier Louis Systems and methods of user authentication for data services
CN110798488A (en) * 2020-01-03 2020-02-14 北京东方通科技股份有限公司 Web application attack detection method

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114826715A (en) * 2022-04-15 2022-07-29 咪咕文化科技有限公司 Network protection method, device, equipment and storage medium
CN114826715B (en) * 2022-04-15 2024-03-22 咪咕文化科技有限公司 Network protection method, device, equipment and storage medium
CN116488949A (en) * 2023-06-26 2023-07-25 中国电子信息产业集团有限公司第六研究所 Industrial control system intrusion detection processing method, system, device and storage medium
CN116488949B (en) * 2023-06-26 2023-09-01 中国电子信息产业集团有限公司第六研究所 Industrial control system intrusion detection processing method, system, device and storage medium
CN117132001A (en) * 2023-10-24 2023-11-28 杭银消费金融股份有限公司 Multi-target wind control strategy optimization method and system
CN117132001B (en) * 2023-10-24 2024-01-23 杭银消费金融股份有限公司 Multi-target wind control strategy optimization method and system

Also Published As

Publication number Publication date
CN115004652A (en) 2022-09-02


Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 20921374; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 20921374; Country of ref document: EP; Kind code of ref document: A1)