CN113643119A - Model training method, business wind control method and business wind control device


Info

Publication number
CN113643119A
Authority
CN
China
Prior art keywords
service
user
training
wind control
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110813046.5A
Other languages
Chinese (zh)
Inventor
刘文强
李京昊
杨情
刘扬
陈金辉
朱晨
朱猛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sankuai Online Technology Co Ltd
Original Assignee
Beijing Sankuai Online Technology Co Ltd
Application filed by Beijing Sankuai Online Technology Co Ltd
Priority to CN202110813046.5A
Publication of CN113643119A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof

Landscapes

  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Engineering & Computer Science (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

This specification discloses a model training method, a business wind control method and a business wind control device. First, service data generated in the service periods of a set service that a target user has already executed is acquired. Second, the service data is input into a preset prediction model to predict the service results corresponding to at least part of the remaining service periods of the set service executed by the target user. Then, label information corresponding to the target user is determined according to the service results, and the user-related data of the target user from before the set service was executed is labeled accordingly to generate a training sample. Finally, the training sample is input into a wind control model to be trained to predict a risk identification result of the target user for the set service, and the wind control model is trained with minimizing the deviation between the risk identification result and the label information as the optimization target. Because training samples can be generated through the preset prediction model, the number of training samples increases, the wind control model is trained more fully, and the risk identification results determined through the wind control model become more accurate.

Description

Model training method, business wind control method and business wind control device
Technical Field
This specification relates to the field of computer technology, and in particular to a model training method, a business wind control (that is, service risk control) method and a business wind control device.
Background
With the rapid development of the economy, credit consumption is attracting more and more attention, and personal consumer credit loans of various kinds, such as credit card consumption, personal car loans, student loans and small consumer loans, are increasing. Personal consumer credit requires each credit guarantor to have a reasonably complete credit risk management system; to this end, each credit guarantor may use a prediction model to predict the risk of the services handled for users.
At present, black samples (users who defaulted within a set time) and white samples (users who did not default within the set time) are obtained, and the prediction model is trained to judge, according to a user's service data before applying for a service, whether the user will default after the service has been handled. In the credit loan scenario, the length of time that a training sample must cover is generally long, for example six months or one year, so few training samples meet the time requirement. The number of samples that can be used in actual training is therefore limited, which may lead to low accuracy of the prediction results determined by the prediction model.
Therefore, how to improve the accuracy of the prediction results determined by the prediction model is an urgent problem to be solved.
Disclosure of Invention
This specification provides a model training method, a business wind control method and corresponding devices, so as to partially solve the above problems in the prior art.
The technical scheme adopted by this specification is as follows:
This specification provides a model training method, comprising:
acquiring service data generated in the part of the service periods of a set service that a determined target user has already executed;
inputting the service data into a preset prediction model to predict service results corresponding to at least part of the remaining service periods of the set service executed by the target user, wherein, for each service period, the service result corresponding to that service period indicates whether the target user fulfills the set service in that service period;
determining label information corresponding to the target user according to the service results, and labeling, according to the label information, the user-related data of the target user from before the set service was executed, to generate a training sample;
inputting the training sample into a wind control model to be trained to predict a risk identification result of the target user for the set service, and training the wind control model with minimizing the deviation between the risk identification result and the label information as the optimization target.
Optionally, the set service involves a service application stage;
determining the target user specifically comprises:
determining users who are within the specified application period set for the service application stage as target users.
Optionally, the set service further involves a service execution stage;
before determining the label information corresponding to the service result, the method further comprises:
taking users who are beyond the specified application period set for the service application stage as reference users;
if it is determined, according to the service results corresponding to a reference user in the service execution stage, that there is a service period in the service execution stage in which the reference user did not fulfill the set service, determining that the reference user is a black sample;
if it is determined, according to the service results corresponding to the reference user in the service execution stage, that there is no service period in the service execution stage in which the reference user did not fulfill the set service, determining that the reference user is a white sample;
determining a risk probability threshold according to the determined black samples and white samples;
determining the label information corresponding to the service result specifically comprises:
determining the label information corresponding to the target user according to the risk probability threshold and the risk probability corresponding to the service result.
Optionally, determining that the reference user is a white sample, if it is determined according to the service results corresponding to the reference user in the service execution stage that there is no service period in the service execution stage in which the reference user did not fulfill the set service, specifically comprises:
if it is determined, according to the service results corresponding to the reference user in the service execution stage, that there is no service period in the service execution stage in which the reference user did not fulfill the set service, and the number of service periods in which the reference user has executed the set service in the service execution stage exceeds a set number of periods, determining that the reference user is a white sample.
Optionally, determining the risk probability threshold according to the determined black samples and white samples specifically comprises:
training the prediction model according to the black samples and the white samples to obtain a trained prediction model;
inputting the black samples and the white samples into the trained prediction model, and determining the risk probability threshold according to the output of the trained prediction model.
Optionally, the training samples comprise black samples and white samples;
labeling, according to the label information, the user-related data of the target user from before the set service was executed to generate a training sample specifically comprises:
determining, according to the label information, the number of black samples among the target users, to obtain a black sample count;
determining a white sample count according to a determined black sample proportion and the black sample count;
labeling, according to the label information, white samples matching the white sample count in the user-related data of the target users from before the set service was executed.
Optionally, determining the black sample proportion specifically comprises:
determining the reference users labeled as black samples, and taking their share of all reference users as the black sample proportion.
This specification provides a business wind control method, comprising:
receiving a service application request of a user for a set service;
determining user-related data of the user according to the service application request;
inputting the user-related data into a pre-trained wind control model to predict, according to the user-related data, a service result after the user executes the set service, wherein the service result indicates whether the user will fulfill the set service, and the wind control model is trained by the above model training method;
performing service wind control on the user according to the service result.
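The four steps of the business wind control method can be sketched as follows. This is a minimal illustration only: the names `ApplicationRequest`, `risk_control` and `load_user_data`, the assumption that the wind control model returns a probability of non-fulfillment, and the approve/reject decision with a 0.5 cut-off are all hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ApplicationRequest:
    """Hypothetical service application request."""
    user_id: str
    service: str


def risk_control(
    request: ApplicationRequest,
    load_user_data: Callable[[str], Dict[str, float]],
    wind_control_model: Callable[[Dict[str, float]], float],
    reject_threshold: float = 0.5,  # assumed cut-off, not from the specification
) -> str:
    # Determine the user-related data from the application request.
    user_data = load_user_data(request.user_id)
    # Assumed model output: probability that the user will NOT fulfill the service.
    risk = wind_control_model(user_data)
    # Assumed wind control action: reject high-risk applications.
    return "reject" if risk >= reject_threshold else "approve"
```

Any other wind control action (for example manual review or a reduced credit limit) could replace the approve/reject decision; the specification leaves the action open.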
This specification provides a model training device, comprising:
an acquisition module, configured to acquire service data generated in the part of the service periods of a set service that a determined target user has already executed;
a prediction module, configured to input the service data into a preset prediction model to predict service results corresponding to at least part of the remaining service periods of the set service executed by the target user, wherein, for each service period, the service result corresponding to that service period indicates whether the target user fulfills the set service in that service period;
a labeling module, configured to determine label information corresponding to the target user according to the service results, and to label, according to the label information, the user-related data of the target user from before the set service was executed, to generate a training sample;
a training module, configured to input the training sample into a wind control model to be trained to predict a risk identification result of the target user for the set service, and to train the wind control model with minimizing the deviation between the risk identification result and the label information as the optimization target.
This specification provides a business wind control device, comprising:
a receiving module, configured to receive a service application request of a user for a set service;
a determining module, configured to determine user-related data of the user according to the service application request;
a prediction module, configured to input the user-related data into a pre-trained wind control model to predict, according to the user-related data, a service result after the user executes the set service, wherein the service result indicates whether the user will fulfill the set service, and the wind control model is trained by the above model training method;
a wind control module, configured to perform service wind control on the user according to the service result.
This specification provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above model training method and business wind control method.
This specification provides an electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the above model training method and business wind control method when executing the program.
The technical scheme adopted by the specification can achieve the following beneficial effects:
In the model training method and the business wind control method provided by this specification, service data generated in the part of the service periods of the set service that the determined target user has already executed is first acquired. Second, the service data is input into a preset prediction model to predict service results corresponding to at least part of the remaining service periods of the set service executed by the target user, wherein, for each service period, the service result corresponding to that service period indicates whether the target user fulfills the set service in that service period. Then, label information corresponding to the target user is determined according to the service results, and the user-related data of the target user from before the set service was executed is labeled according to the label information to generate a training sample. Finally, the training sample is input into a wind control model to be trained to predict a risk identification result of the target user for the set service, and the wind control model is trained with minimizing the deviation between the risk identification result and the label information as the optimization target.
As can be seen from the above, the method predicts the service results corresponding to at least part of the remaining service periods of the set service from the service data generated in the service periods that the target user has already executed. In other words, for samples that do not yet meet the training requirements, the service results of the missing service periods can be filled in through the preset prediction model, and the user-related data of the target user from before the set service was executed can be labeled according to those service results to generate training samples. This increases the number of training samples available for the wind control model, allows the wind control model to be trained more fully, and thereby improves the accuracy of the risk identification results determined through the wind control model.
Drawings
The accompanying drawings described here are provided for a further understanding of this specification and constitute a part of it. The illustrative embodiments of this specification and their descriptions serve to explain the specification and do not unduly limit it. In the drawings:
FIG. 1 is a schematic flow chart of a model training method in this specification;
FIG. 2 is a schematic diagram of the model structure of a prediction model provided in an embodiment of this specification;
FIG. 3 is a schematic flow chart of a business wind control method in this specification;
FIG. 4 is a schematic diagram of a model training device provided in this specification;
FIG. 5 is a schematic diagram of a business wind control device provided in this specification;
FIG. 6 is a schematic diagram of an electronic device corresponding to FIG. 1 provided in this specification.
Detailed Description
In order to make the objects, technical solutions and advantages of this specification clearer, the technical solutions of this specification are described clearly and completely below with reference to specific embodiments and the accompanying drawings. The embodiments described are only some, not all, of the embodiments of this specification. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this specification without creative effort fall within the protection scope of this specification.
The technical solutions provided by the embodiments of the present description are described in detail below with reference to the accompanying drawings.
Fig. 1 is a schematic flow chart of a method for model training in this specification, which includes the following steps:
S100: Acquire service data generated in the part of the service periods of the set service that the determined target user has already executed.
In the embodiments of this specification, the execution subject of the model training method may be a server or an electronic device such as a desktop computer. For convenience of description, the model training method provided in this specification is described below with the server as the execution subject.
In this embodiment, the server may acquire service data generated in the part of the service periods of the set service that the determined target user has already executed. The set service mentioned here may be a service for which user behavior needs to be evaluated, for example a service for which it must be evaluated whether the user will commit a violation or default (loan services, etc.), or a service for which the user's spending capacity must be evaluated (credit lease services, credit card services, etc.).
The service data includes user-related data corresponding to the user in the service application stage and service data corresponding to the user in the service execution stage. The user-related data reflects the user's information and behavior characteristics before applying for the service; in a credit loan service, for example, it includes user information (such as marital status, age, sex, occupation and personal income) and behavior characteristics (such as consumption data for recent months) at the time the user applies for the loan. The service data of the service execution stage reflects the user's behavior characteristics after the service has been successfully handled; in a credit loan service, for example, it includes data on each repayment after the loan is granted (such as, for recent months, the maximum number of days overdue, the minimum number of days overdue, the loan balance, the overdue amount, the number of early repayments, the number of withdrawals, the total withdrawal amount, the ratio of the withdrawal amount to the credit limit, the borrowing interest rate and the loan term).
The service period mentioned here is a period, determined according to the actual needs of the service, in which the user executes the service. A service period may be a preset length of time; this specification does not limit that length, which may be determined according to actual requirements, for example three months, six months or one year.
In the embodiments of this specification, the set service involves a service application stage and a service execution stage. The server determines users who are within the specified application period set for the service application stage as target users, and takes users who are beyond that period as reference users. The service application stage mentioned here may refer to the stage in which the user applies to execute the service, and the specified application period may refer to a specified length of time elapsed since the user applied for the service, for example six months from the application. The service application stage and the service execution stage may partially overlap in time. The server can predict, through the prediction model, the service results corresponding to the remaining service periods in which the target user has not yet executed the set service, and can train the prediction model to be trained using the reference users.
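The split into target users and reference users can be sketched as below. This is a minimal sketch under stated assumptions: `months_between` and `split_users` are hypothetical helper names, elapsed time is counted in whole calendar months, and a user whose elapsed time exactly equals the specified application period is treated as a reference user, a boundary case the specification does not settle.

```python
from datetime import date
from typing import Dict, List, Tuple


def months_between(start: date, end: date) -> int:
    """Whole calendar months elapsed from start to end."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the final month is not yet complete
    return months


def split_users(
    application_dates: Dict[str, date],
    today: date,
    specified_months: int = 6,  # example value used in the text
) -> Tuple[List[str], List[str]]:
    """Return (target_users, reference_users)."""
    target, reference = [], []
    for user_id, applied_on in application_dates.items():
        if months_between(applied_on, today) < specified_months:
            target.append(user_id)      # still within the application period
        else:
            reference.append(user_id)   # beyond the application period
    return target, reference
```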
S102: Input the service data into a preset prediction model to predict service results corresponding to at least part of the remaining service periods of the set service executed by the target user, wherein, for each service period, the service result corresponding to that service period indicates whether the target user fulfills the set service in that service period.
In practical applications, a training sample used by the server to train the wind control model needs to contain a certain amount of service data; that is, it needs to contain service data generated over a specified number of service periods in which the user has executed the set service, for example six months or one year. Because this length of time is long, few training samples meet the time requirement. Therefore, the server can use the prediction model to predict, from the service data generated in the service periods that the target user has already executed, the service results corresponding to at least part of the remaining service periods, so as to generate training samples and increase the number of training samples that meet the requirement.
In this embodiment, the server may input the service data into a preset prediction model to predict the service results corresponding to at least part of the remaining service periods of the set service executed by the target user. For each service period, the service result corresponding to that service period indicates whether the target user fulfills the set service in that service period. For example, in a credit loan service, the service result indicates whether the user defaults on each repayment (timely repayment, overdue repayment, non-repayment, etc.).
That is, if the service data generated in the service periods in which the target user has executed the set service does not cover the set number of service periods, the server may predict the service results corresponding to at least part of the remaining service periods, so that the data for the target user covers the set number of service periods.
For example, if the required length is six months and only three months of service data are available (service data generated in the executed service periods), the server may predict, through the prediction model, the service results for the following three months (the remaining service periods), so that the target user's data meets the time requirement and can be used to train the wind control model.
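The completion step in this example can be sketched as follows. This is only an illustration: `complete_service_results` and `predict_next` are hypothetical names, the prediction model is stood in for by an arbitrary callable, and rolling the prediction forward one service period at a time is an assumption rather than something the specification prescribes.

```python
from typing import Callable, List


def complete_service_results(
    observed: List[bool],
    required_periods: int,
    predict_next: Callable[[List[bool]], bool],
) -> List[bool]:
    """Fill in the remaining service periods with predicted results.

    observed: one entry per executed service period (True = fulfilled).
    predict_next: stand-in for the prediction model; maps the history so
    far to the predicted result of the next service period.
    """
    completed = list(observed)  # do not mutate the caller's list
    while len(completed) < required_periods:
        completed.append(predict_next(completed))
    return completed
```

With three observed months and `required_periods=6`, the function returns six entries: the three observed results followed by three predicted ones.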
S104: Determine label information corresponding to the target user according to the service results, and label, according to the label information, the user-related data of the target user from before the set service was executed, to generate a training sample.
In the prior art, when training a wind control model, the server usually considers only the label information within a set time. For example, suppose the set time is six months: if a user fulfills the set service throughout the first six months but fails to fulfill it in the seventh month, the server still labels the user as a white sample during model training. Such a user is in fact significantly more similar to black-sample users than to white-sample users, so this labeling interferes with the wind control model's ability to identify black samples and weakens its ability to recognize the features of users who fail to fulfill the set service at different times. Therefore, the server may re-determine the label information corresponding to this part of the samples, so as to improve the accuracy of the risk identification result determined by the wind control model.
In this embodiment of the specification, if the server determines, according to the service results corresponding to a reference user in the service execution stage, that there is a service period in the service execution stage in which the reference user did not fulfill the set service, it may determine that the reference user is a black sample. If it determines, according to those service results, that there is no such service period, it determines that the reference user is a white sample.
That is, the server determines the label information of a reference user according to the label information of all service periods in the service results corresponding to that reference user in the service execution stage: if the label of any one service period is a black sample, the final label of the reference user is a black sample. Failing to fulfill the set service, as mentioned above, may mean that the user does not execute the set service within a set time. For example, in a credit loan service, if the user has not repaid by the repayment date after taking the loan, the user is considered not to have fulfilled the set service. As another example, in a credit loan service whose service expiration time is one month, if the user takes a loan and has not repaid it after the loan due date, the user is considered not to have fulfilled the set service.
After some reference users successfully apply for the service, they may never execute it or may begin executing it late. For these users, the time since application reaches the specified application period, but the number of service periods in which the set service has been executed does not reach the set number of periods. In other words, such reference users may show no default only because their service execution stage is short or has not started, and would therefore be determined to be white samples. These reference users do not actually meet the requirements for white samples used to train the wind control model, so when determining white samples the server needs to check whether the number of service periods in which the reference user has executed the set service exceeds the set number of periods.
If the server determines, according to the service results corresponding to the reference user in the service execution stage, that there is no service period in the service execution stage in which the reference user did not fulfill the set service, and that the number of service periods in which the reference user has executed the set service exceeds the set number of periods, it may determine that the reference user is a white sample. For example, if the set number of periods corresponds to six months, and the reference user has fulfilled the set service throughout a service execution stage that has lasted a full six months, the reference user may be determined to be a white sample.
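The black/white labeling rule for reference users can be sketched as follows. The function name `label_reference_user` and the choice to return `None` for users with too short a history are assumptions made for illustration. The comparison uses "at least" the set number of periods, which matches the six-month example above; the claim wording ("exceeds") could also be read as a strict comparison.

```python
from typing import List, Optional


def label_reference_user(
    period_results: List[bool],
    min_periods: int = 6,  # example value used in the text
) -> Optional[str]:
    """Label a reference user from per-period results (True = fulfilled).

    Returns "black" if any period was not fulfilled, "white" if every
    period was fulfilled and enough periods were executed, and None when
    the history is too short to qualify as a white sample.
    """
    if not all(period_results):
        return "black"  # one unfulfilled period is enough
    if len(period_results) >= min_periods:
        return "white"
    return None  # no default, but too few executed periods
```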
In this way, the server can re-label the label information corresponding to the reference users, which improves the prediction model's ability to recognize the features of reference users who fail to fulfill the set service at different times, so that the service results determined by the prediction model are more accurate.
In this embodiment, the server may train the prediction model according to the black samples and the white samples to obtain the trained prediction model.
Specifically, the server may input the service data corresponding to a reference user and the label information corresponding to that reference user into the prediction model to be trained, predict the service result corresponding to the next service period of the set service from the service data generated in the service periods that the reference user has already executed, and train the prediction model with minimizing the deviation between the service result and the label information corresponding to the reference user as the optimization target. Through multiple rounds of iterative training, the deviation is continually reduced until it converges within a numerical range, at which point the training of the prediction model is complete. The label information may be defined per user: for example, if the reference user fails to fulfill the set service in any service period, the label information corresponding to the reference user is a black sample. The label information may also be defined per service period: for example, if the reference user fulfills the set service in the first service period, the label information corresponding to the first service period of the reference user is a white sample, and if the reference user does not fulfill the set service in the second service period, the label information corresponding to the second service period of the reference user is a black sample.
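Training by minimizing the deviation between the predicted service result and the label can be illustrated with a small logistic regression fitted by gradient descent. This is a sketch under stated assumptions only: the specification does not name a model family or a loss (FIG. 2 shows the actual model structure), so logistic regression, cross-entropy as the "deviation", and all function names, features and hyperparameters here are illustrative choices. Label 1 stands for a black sample and 0 for a white sample.

```python
import math
from typing import List, Tuple

Sample = Tuple[List[float], int]  # (features, label); 1 = black, 0 = white


def predict_risk(weights: List[float], bias: float, features: List[float]) -> float:
    """Risk probability from a logistic model (assumed model family)."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(weights: List[float], bias: float, samples: List[Sample]) -> float:
    """Mean cross-entropy, used here as the 'deviation' to minimize."""
    eps = 1e-12
    total = 0.0
    for features, label in samples:
        p = predict_risk(weights, bias, features)
        total -= label * math.log(p + eps) + (1 - label) * math.log(1 - p + eps)
    return total / len(samples)


def train(samples: List[Sample], epochs: int = 200, lr: float = 0.5):
    """Batch gradient descent over multiple rounds of iteration."""
    n = len(samples[0][0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n, 0.0
        for features, label in samples:
            err = predict_risk(weights, bias, features) - label
            for i, x in enumerate(features):
                grad_w[i] += err * x
            grad_b += err
        weights = [w - lr * g / len(samples) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(samples)
    return weights, bias
```

Features are assumed to be scaled to a small range (for example 0 to 1) so that the exponential stays well within floating-point limits.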
In practical applications, whether the label corresponding to a target user is a black sample is often decided against a risk probability threshold set by human experience: if the risk probability corresponding to the service result determined by the server through the prediction model is greater than the risk probability threshold, the label corresponding to the target user is determined to be a black sample, and if the risk probability is less than the risk probability threshold, the label is determined to be a white sample. Because a risk probability threshold determined by subjective human experience may not meet the actual requirements of the service, the server may instead determine a risk probability threshold that better fits the service logic according to the service data corresponding to the reference users, so as to improve the accuracy of the service results determined by the prediction model.
In the embodiments of this specification, the server may determine a risk probability threshold from the determined black samples and white samples through the trained prediction model, and then determine the label information corresponding to the target user according to the risk probability threshold and the risk probability corresponding to the service result.
In an embodiment of the specification, the risk probability threshold determined by the server includes a black sample threshold corresponding to a black sample and a white sample threshold corresponding to a white sample.
Specifically, the server may input the service data corresponding to the reference user whose label information is the black sample into the trained prediction model, obtain the risk probability corresponding to the service result corresponding to the reference user whose label information is the black sample, determine the mean value of the risk probabilities corresponding to the service result corresponding to the reference user whose label information is the black sample, and use the mean value as the black sample threshold. Similarly, the server may use the average of the risk probabilities corresponding to the service results corresponding to the reference users whose label information is the white sample as the white sample threshold according to the above method.
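The mean-based thresholds described above can be sketched as follows. This is a minimal illustration, not the implementation of this specification: the function name `mean_thresholds`, the string labels "black" and "white", and the example probabilities are all assumptions made for the example, and the risk probabilities are taken as already produced by the trained prediction model.

```python
from statistics import mean


def mean_thresholds(risk_probs, labels):
    """Return (black_threshold, white_threshold).

    risk_probs: risk probabilities output by the trained prediction model
                for the reference users (hypothetical input format).
    labels:     "black" or "white" for each reference user (hypothetical
                encoding of the label information).
    """
    black = [p for p, y in zip(risk_probs, labels) if y == "black"]
    white = [p for p, y in zip(risk_probs, labels) if y == "white"]
    if not black or not white:
        raise ValueError("need at least one black and one white reference user")
    # Each threshold is the mean risk probability of its own class.
    return mean(black), mean(white)


# Made-up probabilities for five reference users.
probs = [0.9, 0.7, 0.2, 0.1, 0.3]
labels = ["black", "black", "white", "white", "white"]
black_thr, white_thr = mean_thresholds(probs, labels)
```

Averaging within each class tends to leave a gap between the two thresholds whenever the model separates the classes at all, so target users whose risk probability falls inside the gap receive neither label.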
As another example, the server may determine the black sample threshold and the white sample threshold from the service results corresponding to the reference users by means of a Bayesian optimization algorithm.
In practical applications, the proportion of black samples among all training samples is basically fixed. The server can therefore determine the number of white samples among the target users from the proportion of black samples in the training samples, and mark that number of target users as white samples.
In this embodiment, the training samples include black samples and white samples. First, the server may determine, according to the label information, the number of target users that are black samples, to obtain the black sample count. Second, the server determines the white sample count from the determined black sample ratio and the black sample count. Finally, according to the label information, the server marks that number of white samples among the user-related data of the target users before they executed the set service. The specific formula for obtaining the number of white samples is as follows:
Figure BDA0003169147280000121
In the above formula, W represents the number of white samples among the target users, C represents the number of black samples among the target users, and R represents the black sample ratio among the reference users. As can be seen from the formula, the server may determine the number of white samples among the target users from the number of black samples among the target users and the black sample ratio. The number of black samples among the target users mentioned here may be the number of users who have failed to perform the set service in the service execution stage.
The server may determine the reference users labeled as black samples and use their proportion among all reference users as the black sample ratio.
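The formula itself is an image that is not reproduced in this text, so the following is only one reading consistent with the definitions of W, C and R given above. It assumes that R is the fraction of black samples among all samples, so that the total is C / R and the white sample count is W = C(1 - R) / R. The function name `white_sample_count` and the rounding to an integer are assumptions made for the example.

```python
def white_sample_count(black_count: int, black_ratio: float) -> int:
    """Estimate the number of white samples W among the target users.

    Assumes black_ratio (R) is the share of black samples in the whole
    population, so total = C / R and W = total - C = C * (1 - R) / R.
    This reading is an assumption; the original formula is not shown.
    """
    if black_count < 0:
        raise ValueError("black_count must be non-negative")
    if not 0.0 < black_ratio <= 1.0:
        raise ValueError("black_ratio must be in (0, 1]")
    return round(black_count * (1.0 - black_ratio) / black_ratio)


# With 50 black samples and a black sample ratio of 20%,
# the assumed formula gives 200 white samples.
w = white_sample_count(50, 0.2)
```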
Secondly, the server can input the service data corresponding to the target user into the trained prediction model, and predict the risk probability corresponding to the service result corresponding to at least part of the rest service period of the set service executed by the target user. And finally, determining a white sample threshold value according to the risk probability and the number of white samples.
For example, if the number of target users whose label information is a white sample is determined to be W, the server may sort, in ascending order, the predicted risk probabilities corresponding to the service results of at least part of the remaining service periods of the set service for the target users, use the risk probability of the target user at position W in the ranking as the white sample threshold, and label the target users according to the service results determined by the prediction model for those remaining service periods.
The server may also convert the risk probabilities corresponding to the service results of at least part of the remaining service periods of the set service for the target users into model prediction scores, sort the model prediction scores of the target users in ascending order, use the model prediction score of the target user at position W in the ranking as the white sample threshold, and label the target users according to the service results determined by the prediction model.
As can be seen from the above, the server trains the prediction model with the service data of the reference users, determines the black sample threshold and the white sample threshold of the prediction model from the service data of the reference users, and uses the trained prediction model to label the user-related data of the target users before they executed the set service, thereby generating training samples.
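The rank-based choice of the white sample threshold can be sketched as follows. The function name `white_threshold_by_rank` and the sample probabilities are hypothetical; the sketch only assumes that position W is counted from 1 in the ascending ranking.

```python
def white_threshold_by_rank(risk_probs, white_count):
    """Return the risk probability at 1-based position W in ascending order.

    risk_probs:  predicted risk probabilities of the target users.
    white_count: W, the number of target users expected to be white samples.
    """
    if not 1 <= white_count <= len(risk_probs):
        raise ValueError("white_count must be between 1 and the number of users")
    ranked = sorted(risk_probs)      # ascending: lowest risk first
    return ranked[white_count - 1]   # position W, counted from 1


# Five target users, three of which are expected to be white samples.
threshold = white_threshold_by_rank([0.8, 0.1, 0.4, 0.3, 0.6], 3)
```

The same function applies unchanged to model prediction scores, provided the conversion from risk probability to score preserves the ordering; that property is an assumption here, since the text does not specify the conversion.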
It should be noted that the prediction model may have various forms, for example, an eXtreme Gradient Boosting (xgboost) model, a Gradient Boosting Decision Tree (GBDT), and the like, and the description does not specifically limit the prediction model.
In this embodiment, after determining the black sample threshold and the white sample threshold, the server may input service data corresponding to the target user into the post-training prediction model to determine a risk probability corresponding to a service result corresponding to at least part of the remaining service period of the set service executed by the target user, label the target user as a black sample if the risk probability is greater than the black sample threshold, and label the target user as a white sample if the risk probability is less than the white sample threshold. The model structure of the prediction model is shown in fig. 2.
Fig. 2 is a schematic diagram of a model structure of a prediction model provided in an embodiment of the present disclosure.
In fig. 2, the server may divide the users into target users and reference users according to whether each user is within the specified application period set in the service application stage. For a reference user, the server may mark the reference user as a black sample if the reference user has failed to perform the set service, and as a white sample if the number of service cycles for which the reference user has executed the set service exceeds the set number of cycles. Moreover, a reference user whose number of executed service cycles does not exceed the set number of cycles has too little service data to meet the data volume requirement of a training sample, which could make the wind control recognition result determined by the wind control model inaccurate. The service data of such reference users is therefore discarded.
The server then predicts, through the prediction model, the risk probability corresponding to the service result of at least part of the remaining service periods of the set service for each target user, marks a target user whose risk probability is greater than the black sample threshold as a black sample, and marks a target user whose risk probability is less than the white sample threshold as a white sample.
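The reference-user labeling rule can be sketched as follows. The representation of a user's history as a list of booleans, the function name `label_reference_user`, and the use of `None` for a discarded user are assumptions made for the example.

```python
from typing import List, Optional


def label_reference_user(performed: List[bool], min_cycles: int) -> Optional[str]:
    """Label one reference user from its per-cycle history.

    performed[i] is True if the user performed the set service in cycle i
    (hypothetical encoding). min_cycles is the set number of cycles.
    Returns "black", "white", or None when the user is discarded.
    """
    if not all(performed):
        return "black"   # at least one cycle without performance
    if len(performed) > min_cycles:
        return "white"   # always performed, and enough cycles observed
    return None          # too little service data: discard


labels = [
    label_reference_user([True, True, False], 2),
    label_reference_user([True, True, True, True], 3),
    label_reference_user([True, True], 3),
]
```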
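The two-threshold labeling of target users can be sketched as follows. The text does not say what happens to a target user whose risk probability lies between the two thresholds; leaving such a user unlabeled (returning `None`) is an assumption of this sketch, as is the function name `label_target_user`.

```python
from typing import Optional


def label_target_user(risk_prob: float,
                      black_threshold: float,
                      white_threshold: float) -> Optional[str]:
    """Label a target user from the predicted risk probability.

    The black check runs first, so if the thresholds were ever inverted
    (white_threshold > black_threshold) a high-risk user is still black.
    """
    if risk_prob > black_threshold:
        return "black"
    if risk_prob < white_threshold:
        return "white"
    return None  # assumption: users between the thresholds stay unlabeled


results = [label_target_user(p, 0.8, 0.2) for p in (0.95, 0.05, 0.5)]
```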
In this embodiment, the server may determine, according to the service result, tag information corresponding to the target user, and label, according to the tag information, user-related data of the target user before executing the set service, so as to generate a training sample. And training the wind control model to be trained according to the training sample.
S106: inputting the training sample into a to-be-trained wind control model to predict a risk recognition result of the target user for the set service, and training the wind control model by taking the minimized deviation between the risk recognition result and the label information as an optimization target.
In this embodiment, the server may input the training sample into the wind control model to be trained to predict a risk identification result of the target user for the set service, and train the wind control model with minimizing the deviation between the risk identification result and the label information as the optimization target. Through multiple rounds of iterative training, the deviation is continuously reduced until it converges within a numerical range, at which point the training of the wind control model is complete.
As can be seen from the above process, the method can use the prediction model to predict, from the service data generated in the service periods in which the target user has already executed the set service, the service result corresponding to at least part of the remaining service periods. That is to say, the service results of the remaining service periods of the target user can be filled in by the preset prediction model, and the user-related data of the target user before the set service was executed can be labeled according to those service results to generate training samples. This increases the number of training samples available for training the wind control model, so that the wind control model is trained more fully. In addition, the method can determine, from the service data corresponding to the reference users, thresholds for distinguishing black samples from white samples that better match the business logic, which improves the accuracy of the determined black and white samples and, in turn, the accuracy of the risk identification result determined by the wind control model.
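The iterative training described above can be illustrated with a deliberately small stand-in. The text does not fix the form of the wind control model, so the sketch below swaps in a plain logistic regression trained by gradient descent on the log loss, which plays the role of the deviation between the risk identification result and the label information. All names, the single feature, and the hyperparameters are assumptions made for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_risk(weights, bias, x):
    """Risk identification result: probability that the sample is black."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def log_loss(weights, bias, features, labels):
    """Mean deviation between predictions and labels (1 = black, 0 = white)."""
    eps = 1e-12  # keeps log() away from zero
    total = 0.0
    for x, y in zip(features, labels):
        p = min(max(predict_risk(weights, bias, x), eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(features)


def train(features, labels, lr=0.5, epochs=500):
    """Multiple rounds of gradient descent that reduce the log loss."""
    weights, bias = [0.0] * len(features[0]), 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for x, y in zip(features, labels):
            err = predict_risk(weights, bias, x) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Made-up training samples: one feature per user, label 1 = black sample.
X = [[0.1], [0.2], [0.8], [0.9]]
y = [0, 0, 1, 1]
loss_before = log_loss([0.0], 0.0, X, y)
w, b = train(X, y)
loss_after = log_loss(w, b, X, y)
```

In practice a gradient-boosted tree model or another classifier could take the place of the logistic regression; the loop only illustrates the idea of repeatedly reducing the deviation until it settles.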
In the embodiment of the present description, after the training of the wind control model is completed, the business wind control can be performed on the user through the wind control model, and a specific process is shown in fig. 3.
Fig. 3 is a flowchart illustrating a method for service wind control in this specification.
S300: and receiving a service application request of a user for the set service.
S302: and determining user related data of the user according to the service application request.
S304: inputting the user-related data into a pre-trained wind control model so as to predict, according to the user-related data, a service result after the user executes the set service, wherein the service result is used for indicating whether the user will perform the set service, and the wind control model is obtained by training through the above model training method.
S306: and carrying out service wind control on the user according to the service result.
In this embodiment, the server may receive a service application request of the user for the set service. For example, if the set service is a loan service, the server may receive a loan application from the user. The server then determines the user-related data of the user according to the service application request, for example the user information (such as marital status, age, sex, occupation and personal income) and the behavior characteristics (such as consumption data of recent months) at the time the user applies for the loan. Next, the server inputs the user-related data into the pre-trained wind control model so as to predict, according to the user-related data, the service result after the user executes the set service, where the service result indicates whether the user will perform the set service. Finally, the server carries out service wind control on the user according to the service result.
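Steps S300 to S306 can be sketched as a small pipeline. Everything concrete here is an assumption made for illustration: the `Application` type, the idea that the wind control model is a callable returning a risk probability, the 0.5 cut-off, and the "approve"/"reject" actions, none of which are specified in the text.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Application:
    """A service application request (hypothetical structure)."""
    user_id: str
    user_data: Dict[str, float]  # e.g. age, income, recent consumption


def risk_control(application: Application,
                 model: Callable[[Dict[str, float]], float],
                 risk_threshold: float = 0.5) -> str:
    # S302: determine the user-related data from the request.
    features = application.user_data
    # S304: predict the service result with the trained wind control model.
    risk = model(features)
    # S306: act on the result (the concrete actions are assumptions).
    return "reject" if risk > risk_threshold else "approve"


def toy_model(features: Dict[str, float]) -> float:
    """Stand-in for a trained model: lower income means higher risk."""
    return 0.9 if features.get("income", 0.0) < 1000.0 else 0.1


# S300: receive two made-up loan applications.
decisions = [
    risk_control(Application("u1", {"age": 30.0, "income": 500.0}), toy_model),
    risk_control(Application("u2", {"age": 41.0, "income": 8000.0}), toy_model),
]
```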
From the above, the server can generate training samples through the prediction model, increase the number of the training samples for training the wind control model, and improve the training degree of the wind control model, and determine the threshold value, which is more in line with the business logic, for distinguishing the black sample from the white sample by referring to the business data corresponding to the user, so as to improve the accuracy of the determined black sample and white sample, so that the wind control model trained in this way is applied to the actual business, and the accuracy of risk identification can be effectively improved.
Based on the same idea, the present specification further provides a corresponding model training apparatus, as shown in fig. 4.
Fig. 4 is a schematic diagram of an apparatus for model training provided in the present specification, including:
an obtaining module 400, configured to obtain service data generated in the part of the service periods of the set service that a determined target user has already executed;
a predicting module 402, configured to input the service data into a preset predicting model to predict a service result corresponding to at least part of remaining service periods in which the target user executes the set service, where, for each service period, the service result corresponding to the service period indicates whether the target user performs to execute the set service in the service period;
a labeling module 404, configured to determine, according to the service result, label information corresponding to the target user, and label, according to the label information, user-related data of the target user before executing the set service, so as to generate a training sample;
a training module 406, configured to input the training sample into a to-be-trained wind control model, so as to predict a risk identification result of the target user for the set service, and train the wind control model with minimizing a deviation between the risk identification result and the tag information as an optimization target.
Optionally, the obtaining module 400 is specifically configured to determine that the set service relates to a service application stage, and determine a user located within a specified application deadline set in the service application stage as a target user.
Optionally, the set service further relates to a service execution stage, and the predicting module 402 is specifically configured to: use a user located outside the specified application deadline set in the service application stage as a reference user; determine that the reference user is a black sample if it is determined, according to the service result corresponding to the reference user in the service execution stage, that there is a service period in the service execution stage in which the reference user did not perform the set service; determine that the reference user is a white sample if it is determined, according to that service result, that there is no such service period in the service execution stage; determine a risk probability threshold according to the determined black samples and white samples; and determine the label information corresponding to the target user according to the risk probability threshold and the risk probability corresponding to the service result.
Optionally, the predicting module 402 is specifically configured to determine that the reference user is a white sample if it is determined that, according to a service result corresponding to the reference user in the service execution stage, there is no service cycle in which the reference user does not perform the set service in the service execution stage, and in the service execution stage, the number of service cycles in which the reference user has performed the set service exceeds a set number of cycles.
Optionally, the prediction module 402 is specifically configured to train the prediction model according to the black sample and the white sample to obtain a trained prediction model, and input the black sample and the white sample to the trained prediction model to determine the risk probability threshold according to an output result of the trained prediction model.
Optionally, the predicting module 402 is specifically configured to determine, according to the label information, the number of black samples in the target user to obtain the number of black samples, determine, according to the determined ratio of the black samples and the number of the black samples, the number of white samples to serve as the number of white samples, and mark, according to the label information, the white samples that meet the number of the white samples in the user-related data of each target user before the set service is executed.
Optionally, the predicting module 402 is specifically configured to determine a reference user labeled as a black sample, and a user ratio among all reference users is used as the black sample ratio.
Fig. 5 is a schematic diagram of a service wind control apparatus provided in this specification, including:
a receiving module 500, configured to receive a service application request of a user for the set service;
a determining module 502, configured to determine user-related data of the user according to the service application request;
a prediction module 504, configured to input the user-related data into a pre-trained wind control model, so as to predict, according to the user-related data, a service result after the user executes the set service, where the service result is used to indicate whether the user will perform the set service, and the wind control model is obtained by training through a model training method;
and a wind control module 506, configured to perform service wind control on the user according to the service result.
The present specification also provides a computer-readable storage medium storing a computer program, the computer program being operable to perform the model training method provided in fig. 1 above and the business wind control method provided in fig. 3 above.
This specification also provides a schematic block diagram of an electronic device, shown in fig. 6, corresponding to fig. 1. As shown in fig. 6, at the hardware level, the electronic device includes a processor, an internal bus, a network interface, a memory, and a non-volatile memory, and may also include hardware required for other services. The processor reads the corresponding computer program from the non-volatile memory into the memory and then runs it to implement the model training method described in fig. 1 and the business wind control method described in fig. 3. Of course, besides a software implementation, the present specification does not exclude other implementations, such as logic devices or a combination of software and hardware. That is, the execution subject of the processing flow is not limited to logic units and may also be hardware or logic devices.
In the 1990s, an improvement in a technology could be clearly distinguished as either an improvement in hardware (e.g., an improvement in a circuit structure such as a diode, a transistor, or a switch) or an improvement in software (an improvement in a method flow). However, as technology has advanced, many of today's improvements in method flows can be regarded as direct improvements in hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. It therefore cannot be said that an improvement in a method flow cannot be realized by a physical hardware module. For example, a Programmable Logic Device (PLD), such as a Field Programmable Gate Array (FPGA), is an integrated circuit whose logic functions are determined by the user programming the device. A designer "integrates" a digital system onto a PLD by programming it, without asking a chip manufacturer to design and fabricate an application-specific integrated circuit chip. Furthermore, instead of manually fabricating integrated circuit chips, such programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development, while the source code to be compiled must be written in a specific programming language called a Hardware Description Language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language), of which VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used.
It will also be apparent to those skilled in the art that a hardware circuit implementing a logical method flow can be readily obtained simply by performing a little logic programming of the method flow in one of the hardware description languages described above and programming it into an integrated circuit.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of a memory. Those skilled in the art will also appreciate that, in addition to implementing the controller as pure computer-readable program code, the same functionality can be implemented by logically programming the method steps so that the controller takes the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Such a controller may therefore be regarded as a hardware component, and the means included in it for performing the various functions may also be regarded as structures within the hardware component. The means for performing the functions may even be regarded both as software modules implementing the method and as structures within the hardware component.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being divided into various units by function, and are described separately. Of course, the functions of the various elements may be implemented in the same one or more software and/or hardware implementations of the present description.
As will be appreciated by one skilled in the art, embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the description may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the description may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The description has been presented with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the description. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of volatile memory in a computer readable medium, Random Access Memory (RAM) and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
This description may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The embodiments in the present specification are described in a progressive manner. For the parts that are the same or similar among the embodiments, reference may be made from one embodiment to another, and each embodiment focuses on its differences from the other embodiments. In particular, since the system embodiment is substantially similar to the method embodiment, its description is relatively brief, and for the relevant points reference may be made to the corresponding description of the method embodiment.
The above description is only an example of the present specification, and is not intended to limit the present specification. Various modifications and alterations to this description will become apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present specification should be included in the scope of the claims of the present specification.

Claims (12)

1. A method of model training, comprising:
acquiring service data generated in the part of the service periods of a set service that a determined target user has already executed;
inputting the service data into a preset prediction model to predict a service result corresponding to at least part of the rest service periods of the set service executed by the target user, wherein for each service period, the service result corresponding to the service period indicates whether the target user performs to execute the set service in the service period;
determining label information corresponding to the target user according to the service result, and labeling user related data of the target user before executing the set service according to the label information to generate a training sample;
inputting the training sample into a to-be-trained wind control model to predict a risk recognition result of the target user for the set service, and training the wind control model by taking the minimized deviation between the risk recognition result and the label information as an optimization target.
2. The method of claim 1, wherein the set service relates to a service application stage;
determining a target user specifically comprises:
and determining the users within the appointed application period set in the service application stage as target users.
3. The method of claim 2, wherein the set service further involves a service execution stage;
before determining the label information corresponding to the service result, the method further comprises:
taking users outside the specified application period set in the service application stage as reference users;
if it is determined, according to the service result corresponding to the reference user in the service execution stage, that there is a service period in the service execution stage in which the reference user does not perform the set service, determining that the reference user is a black sample;
if it is determined, according to the service result corresponding to the reference user in the service execution stage, that there is no service period in the service execution stage in which the reference user does not perform the set service, determining that the reference user is a white sample;
determining a risk probability threshold according to the determined black samples and white samples;
determining the label information corresponding to the service result specifically comprises:
determining the label information corresponding to the target user according to the risk probability threshold and the risk probability corresponding to the service result.
4. The method of claim 3, wherein, if it is determined according to the service result corresponding to the reference user in the service execution stage that there is no service period in the service execution stage in which the reference user does not perform the set service, determining that the reference user is a white sample comprises:
if it is determined, according to the service result corresponding to the reference user in the service execution stage, that there is no service period in the service execution stage in which the reference user does not perform the set service, and the number of service periods in which the reference user has executed the set service in the service execution stage exceeds a set number of periods, determining that the reference user is a white sample.
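The black/white rule for reference users in claims 3 and 4 can be sketched as follows. The function name, the string labels, and the `None` return value for users with too little history are illustrative assumptions, not terms from the source.

```python
def classify_reference_user(results, min_periods=0):
    """Return "black", "white", or None for a reference user.

    results: one boolean per service period in the service execution stage,
        True if the user performed the set service in that period.
    min_periods: the set number of periods that must be exceeded before a
        user with a clean record counts as a white sample (claim 4).
    """
    if not all(results):
        return "black"   # at least one period was not performed
    if len(results) > min_periods:
        return "white"   # every period performed, and enough of them
    return None          # clean record so far, but too short to call white
```

Under this reading, `classify_reference_user([True] * 6, min_periods=3)` yields a white sample, while a user with only two performed periods is left unlabeled rather than treated as white.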
5. The method of claim 3, wherein determining the risk probability threshold according to the determined black samples and white samples comprises:
training the prediction model according to the black samples and the white samples to obtain a trained prediction model;
inputting the black samples and the white samples into the trained prediction model, and determining the risk probability threshold according to an output result of the trained prediction model.
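The claim does not state how the threshold is derived from the model output. One possible reading, shown here only as an assumption, is to score the black and white reference samples with the trained prediction model and pick the cut-off that classifies the most of them correctly. The function names and the accuracy criterion are hypothetical.

```python
def choose_threshold(black_scores, white_scores):
    """Pick the cut-off that classifies the most reference samples correctly.

    A score at or above the threshold is treated as black (risky).
    Ties are resolved in favour of the lowest candidate threshold.
    """
    candidates = sorted(set(black_scores) | set(white_scores))

    def correct(threshold):
        return (sum(s >= threshold for s in black_scores)
                + sum(s < threshold for s in white_scores))

    return max(candidates, key=correct)

def label_target(risk_probability, threshold):
    """Label a target user by comparing its risk probability to the threshold."""
    return "black" if risk_probability >= threshold else "white"
```

Other criteria (for example a fixed recall on black samples) would fit the claim wording equally well; the sketch only shows that the threshold is a function of the model's scores on the two reference groups.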
6. The method of claim 4, wherein the training samples comprise black samples and white samples;
labeling the user related data of the target user before executing the set service according to the label information to generate a training sample, specifically comprising:
determining, according to the label information, the number of black samples among the target users, to obtain a black sample count;
determining a white sample count according to a determined black sample ratio and the black sample count;
labeling, according to the label information, white samples matching the white sample count in the user-related data of the target users from before the set service was executed.
7. The method of claim 6, wherein determining the black sample ratio specifically comprises:
determining the reference users labeled as black samples, and taking their proportion among all the reference users as the black sample ratio.
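The sample-balancing arithmetic of claims 6 and 7 can be sketched as follows. The formula for the white sample count assumes the goal is for black samples to make up the same share of the training set as they do among the reference users, and choosing the lowest-risk target users as white samples is likewise an assumption; the claims specify neither.

```python
def black_sample_ratio(reference_labels):
    """Share of reference users labeled black (claim 7)."""
    black = sum(label == "black" for label in reference_labels)
    return black / len(reference_labels)

def white_sample_count(black_count, black_ratio):
    """White samples needed so that blacks make up black_ratio of the set."""
    return round(black_count * (1 - black_ratio) / black_ratio)

def select_white_samples(candidates, count):
    """Assumed rule: keep the `count` lowest-risk candidates as white samples.

    candidates: (user_id, risk_probability) pairs for non-black target users.
    """
    ranked = sorted(candidates, key=lambda pair: pair[1])
    return [user_id for user_id, _ in ranked[:count]]
```

For example, if 2 of 10 reference users are black, the ratio is 0.2, so 5 black target users call for 20 white samples.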
8. A method for service wind control, comprising:
receiving a service application request of a user for a set service;
determining user-related data of the user according to the service application request;
inputting the user-related data into a pre-trained wind control model to predict, according to the user-related data, a service result after the user executes the set service, wherein the service result indicates whether the user will perform the set service, and the wind control model is obtained by training through the method of any one of claims 1 to 7;
performing service wind control on the user according to the service result.
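The request flow of claim 8 can be sketched as follows. The request shape, the in-memory `user_store`, the callable `risk_model`, the 0.5 threshold, and the approve/reject decision are all hypothetical; the claim only requires that some wind control action follows from the predicted service result.

```python
def handle_application(request, user_store, risk_model, threshold=0.5):
    """Return a wind control decision for one service application request."""
    user_id = request["user_id"]
    user_data = user_store[user_id]            # user-related data for the applicant
    risk_probability = risk_model(user_data)   # output of the trained wind control model
    will_perform = risk_probability < threshold
    return {
        "user_id": user_id,
        "risk": risk_probability,
        "decision": "approve" if will_perform else "reject",
    }

# Illustrative stand-ins for the data store and the trained model.
user_store = {"u1": {"overdue_count": 0}, "u2": {"overdue_count": 4}}

def toy_risk_model(user_data):
    return min(1.0, 0.2 * user_data["overdue_count"])

decision = handle_application({"user_id": "u2"}, user_store, toy_risk_model)
```

In practice the decision could be graded (for example, a lower credit limit or a manual review) instead of a binary approve/reject.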
9. An apparatus for model training, comprising:
an acquisition module, configured to acquire service data generated in some of the service periods during which a determined target user executes a set service;
a prediction module, configured to input the service data into a preset prediction model to predict a service result corresponding to at least part of the remaining service periods in which the target user executes the set service, wherein, for each service period, the service result corresponding to that service period indicates whether the target user performs the set service in that service period;
a labeling module, configured to determine label information corresponding to the target user according to the service result, and to label, according to the label information, user-related data of the target user from before the set service was executed, so as to generate a training sample;
a training module, configured to input the training sample into a wind control model to be trained so as to predict a risk recognition result of the target user for the set service, and to train the wind control model with minimizing the deviation between the risk recognition result and the label information as the optimization target.
10. An apparatus for service wind control, comprising:
a receiving module, configured to receive a service application request of a user for a set service;
a determining module, configured to determine user-related data of the user according to the service application request;
a prediction module, configured to input the user-related data into a pre-trained wind control model, so as to predict, according to the user-related data, a service result after the user executes the set service, wherein the service result indicates whether the user will perform the set service, and the wind control model is obtained by training according to the method of any one of claims 1 to 7;
a wind control module, configured to perform service wind control on the user according to the service result.
11. A computer-readable storage medium, wherein the storage medium stores a computer program which, when executed by a processor, implements the method of any one of claims 1 to 7 or the method of claim 8.
12. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the method of any one of claims 1 to 7 or the method of claim 8.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110813046.5A CN113643119A (en) 2021-07-19 2021-07-19 Model training method, business wind control method and business wind control device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110813046.5A CN113643119A (en) 2021-07-19 2021-07-19 Model training method, business wind control method and business wind control device

Publications (1)

Publication Number Publication Date
CN113643119A true CN113643119A (en) 2021-11-12

Family

ID=78417724

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110813046.5A Pending CN113643119A (en) 2021-07-19 2021-07-19 Model training method, business wind control method and business wind control device

Country Status (1)

Country Link
CN (1) CN113643119A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114154891A (en) * 2021-12-08 2022-03-08 中国建设银行股份有限公司 Retraining method and retraining device for risk control model
CN114254588A (en) * 2021-12-16 2022-03-29 马上消费金融股份有限公司 Data tag processing method and device
CN114943307A (en) * 2022-06-28 2022-08-26 支付宝(杭州)信息技术有限公司 Model training method and device, storage medium and electronic equipment
CN116028820A (en) * 2023-03-20 2023-04-28 支付宝(杭州)信息技术有限公司 Model training method and device, storage medium and electronic equipment
CN116308738A (en) * 2023-02-10 2023-06-23 之江实验室 Model training method, business wind control method and device
CN116578877A (en) * 2023-07-14 2023-08-11 之江实验室 Method and device for model training and risk identification of secondary optimization marking

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114154891A (en) * 2021-12-08 2022-03-08 中国建设银行股份有限公司 Retraining method and retraining device for risk control model
CN114254588A (en) * 2021-12-16 2022-03-29 马上消费金融股份有限公司 Data tag processing method and device
CN114254588B (en) * 2021-12-16 2023-10-13 马上消费金融股份有限公司 Data tag processing method and device
CN114943307A (en) * 2022-06-28 2022-08-26 支付宝(杭州)信息技术有限公司 Model training method and device, storage medium and electronic equipment
CN116308738A (en) * 2023-02-10 2023-06-23 之江实验室 Model training method, business wind control method and device
CN116308738B (en) * 2023-02-10 2024-03-08 之江实验室 Model training method, business wind control method and device
CN116028820A (en) * 2023-03-20 2023-04-28 支付宝(杭州)信息技术有限公司 Model training method and device, storage medium and electronic equipment
CN116028820B (en) * 2023-03-20 2023-07-04 支付宝(杭州)信息技术有限公司 Model training method and device, storage medium and electronic equipment
CN116578877A (en) * 2023-07-14 2023-08-11 之江实验室 Method and device for model training and risk identification of secondary optimization marking
CN116578877B (en) * 2023-07-14 2023-12-26 之江实验室 Method and device for model training and risk identification of secondary optimization marking

Similar Documents

Publication Publication Date Title
CN113643119A (en) Model training method, business wind control method and business wind control device
CN108764915B (en) Model training method, data type identification method and computer equipment
CN110674188A (en) Feature extraction method, device and equipment
CN115238826B (en) Model training method and device, storage medium and electronic equipment
CN115146731A (en) Model training method, business wind control method and business wind control device
CN114997472A (en) Model training method, business wind control method and business wind control device
CN112966186A (en) Model training and information recommendation method and device
CN114332873A (en) Training method and device for recognition model
CN110033092B (en) Data label generation method, data label training device, event recognition method and event recognition device
CN113222649A (en) Method and device for recommending service execution mode
CN113886033A (en) Task processing method and device
CN112561162A (en) Information recommendation method and device
CN112966577A (en) Method and device for model training and information providing
CN112183584A (en) Method and device for model training and business execution
CN110516918B (en) Risk identification method and risk identification device
CN116308738B (en) Model training method, business wind control method and device
CN112163962A (en) Method and device for model training and business wind control
CN110033383B (en) Data processing method, device, medium and apparatus
CN111369293A (en) Advertisement bidding method and device and electronic equipment
CN115729714A (en) Resource allocation method, device, storage medium and electronic equipment
CN113010562B (en) Information recommendation method and device
CN115660105A (en) Model training method, business wind control method and business wind control device
CN115130621A (en) Model training method and device, storage medium and electronic equipment
CN115049433A (en) Model training method and device, storage medium and electronic equipment
CN109389157B (en) User group identification method and device and object group identification method and device

Legal Events

Date Code Title Description
PB01 Publication