CN113935495A

CN113935495A - Training method, using method, device and equipment of mobility prediction model

Info

Publication number: CN113935495A
Application number: CN202111029216.7A
Authority: CN
Inventors: 李骏琪; 邵俊; 万友平
Original assignee: Shenzhen Suoxinda Data Technology Co ltd
Current assignee: Shenzhen Suoxinda Data Technology Co ltd
Priority date: 2021-09-01
Filing date: 2021-09-01
Publication date: 2022-01-14

Abstract

The embodiment of the invention discloses a training method of a mobility prediction model, which is characterized in that a first mobility prediction model is trained by utilizing non-private data and a category label at a server end, so that the finally obtained model can identify the category of a current user when the model predicts a user to be predicted, the learning capability of the model is enhanced, and the prediction accuracy is improved; the first mobility prediction model is trained by utilizing the private data at the client to obtain a second mobility prediction model, so that the stability and generalization performance of the model for real-time prediction can be enhanced, the adaptability, effectiveness and accuracy of the second mobility prediction model for predicting each user can be improved, and the accuracy of the mobility prediction of the user through machine learning is enhanced; and the second mobility prediction model can use less data of the single user to carry out rapid iteration on the client, so that the data cost of the model is reduced, personalized prediction of each user can be ensured, and the prediction accuracy is improved.

Description

Training method, using method, device and equipment of mobility prediction model

Technical Field

The invention relates to the technical field of machine learning, in particular to a training method, a using method, a device and equipment of a mobility prediction model.

Background

In the prior art, machine learning methods such as LightGBM are used to model and predict user mobility for early warning of overdue risk, so that intervention for overdue repayment can begin in advance and bad-debt losses are reduced. However, the prior art needs a large amount of data for modeling and subsequent iteration, cannot eliminate interference from changes in data distribution, and has weak stability. Moreover, the prior art places high demands on sample features; in financial scenarios where users' private data features are inconvenient to collect, the training and prediction effects suffer, and the prediction accuracy needs to be improved.

Disclosure of Invention

The invention mainly aims to provide a training method and device of a mobility prediction model, computer equipment and a storage medium, which can solve the problem of low accuracy in prediction of user mobility in the prior art.

In order to achieve the above object, a first aspect of the present invention provides a method for training a mobility prediction model, the method being applied to a server, and the method including:

obtaining a first sample data set of a target user, wherein the first sample data set comprises non-private data of the target user and a category label of the target user;

inputting a training set included in the first sample data set into a gradient boosting model for single-step prediction, and determining a first loss; obtaining a first-order gradient according to the first loss and a gradient algorithm;

inputting a test set included in the first sample data set into the gradient boosting model for single-step prediction to obtain a second loss; determining a second-order gradient according to the second loss, the first-order gradient and a learning rate;

performing a single-step update on the gradient boosting model by using the second-order gradient to obtain an updated gradient boosting model; returning to the step of obtaining the first sample data set of the target user until the number of returns reaches a preset number of iterations, and determining the finally obtained gradient boosting model as a first mobility prediction model;

and distributing the first mobility prediction model to a client corresponding to each user to be predicted.

In a possible implementation manner, after determining the finally obtained gradient boosting model as the first mobility prediction model, the method further includes:

carrying out mobility prediction on each candidate user by using the first mobility prediction model to obtain the mobility corresponding to each candidate user;

determining a statistical mean value and a statistical variance by using the mobility corresponding to each candidate user and a preset statistical algorithm;

and distributing the statistical mean and the statistical variance to a client corresponding to each user to be predicted, wherein the client is used for determining the risk signal strength based on the statistical mean and the statistical variance, and the risk signal strength is used for indicating the risk strength of overdue repayment of the user to be predicted.

In a feasible implementation manner, after distributing the first mobility prediction model to the client corresponding to each user to be predicted, the method further includes:

receiving risk prompt information reported by a client corresponding to each user to be predicted, wherein the risk prompt information comprises risk signal strength;

determining a risk control level corresponding to the risk signal strength according to the risk signal strength and a preset level determination rule;

and executing a corresponding risk control operation on the user to be predicted based on the risk control level.
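The level-determination step above can be sketched as follows. This is only an illustration under stated assumptions: the text does not specify the preset rule, so the ascending thresholds, the level names and the risk control operations below are all hypothetical.

```python
from bisect import bisect_right

# Hypothetical preset rule: ascending strength thresholds split the
# risk signal strength into four bands, one band per level.
THRESHOLDS = [1.0, 2.0, 3.0]
LEVELS = ["low", "medium", "high", "critical"]

# Hypothetical risk control operation for each level.
OPERATIONS = {
    "low": "monitor",
    "medium": "send repayment reminder",
    "high": "reduce credit limit",
    "critical": "freeze account and start collection",
}


def risk_control_level(signal_strength: float) -> str:
    """Map a reported risk signal strength to a risk control level."""
    return LEVELS[bisect_right(THRESHOLDS, signal_strength)]


def risk_control_operation(signal_strength: float) -> str:
    """Look up the operation to execute for a reported signal strength."""
    return OPERATIONS[risk_control_level(signal_strength)]
```

A threshold table keeps the rule configurable: changing the bands does not require changing the lookup code.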

In one possible implementation, before obtaining the first sample data set of the target user, the method further includes:

acquiring non-private data of sample users, and clustering the candidate users that have reached a risk performance period according to the non-private data to obtain category labels corresponding to the candidate users;

sequencing the non-private data of the candidate user according to a data generation time sequence to obtain a sample data sequence, wherein the sample data sequence comprises the non-private data of the candidate user and a category label of the candidate user;

the obtaining the first sample dataset of the target user comprises:

randomly extracting the sample data sequence by using a random extraction rule to obtain a first sample data set, wherein the random extraction rule comprises a preset class extraction number;

and dividing the first sample data set according to a preset dividing proportion, and determining a training set and a test set corresponding to the first sample data set.

In order to achieve the above object, a second aspect of the present invention provides a method for using a mobility prediction model, the method being applied to a client, and the method including:

receiving a first mobility prediction model sent by the server, and obtaining first privacy sample data of a user to be predicted corresponding to the client, wherein the first privacy sample data comprises non-privacy data of the user to be predicted and privacy data of the user to be predicted, and the first mobility prediction model is obtained based on a training method of the mobility prediction model in the first aspect;

determining a second mobility prediction model corresponding to the user to be predicted according to the first mobility prediction model and the first privacy sample data;

and determining whether the user to be predicted has overdue risks or not by using the second mobility prediction model and a preset risk judgment rule.

In a feasible implementation manner, the determining, according to the first mobility prediction model and the first privacy sample data, a second mobility prediction model corresponding to the user to be predicted includes:

inputting the first privacy sample data into the first mobility prediction model to obtain a second mobility prediction model;

determining the data updating times of the private data;

and when the data updating times reach a preset updating time threshold, inputting second privacy sample data into the second mobility prediction model to obtain a third mobility prediction model, wherein the second privacy sample data comprises the privacy data of the user to be predicted and the non-privacy data of the user to be predicted, of which the data updating times reach the updating time threshold.

In a feasible implementation manner, the determining, by using the second mobility prediction model and a preset risk judgment rule, whether the user to be predicted has an overdue risk includes:

analyzing and predicting the mobility of the user to be predicted by using the first privacy sample data and the second mobility prediction model to obtain an overall average value of the user to be predicted;

analyzing and predicting the mobility of the user to be predicted by using the second privacy sample data and the third mobility prediction model to obtain a sample mean and a sample variance of the user to be predicted;

determining an overdue check value corresponding to the user to be predicted according to the overall mean value, the sample variance and the data updating times;

if the overdue check value is smaller than a safety threshold, determining that the user to be predicted has overdue risk;

and if the overdue check value is greater than or equal to the safety threshold, determining that the user to be predicted does not have overdue risk.
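The threshold comparison above can be illustrated with a short sketch. The exact formula for the overdue check value is not given in this passage; the t-like statistic below, which also uses the sample mean obtained in the preceding step, is an assumption chosen only to make the comparison concrete, and the function names are hypothetical.

```python
import math


def overdue_check_value(overall_mean, sample_mean, sample_variance, n_updates):
    """Assumed t-like statistic; the text does not state the actual formula."""
    if n_updates <= 0 or sample_variance <= 0:
        raise ValueError("need a positive update count and a positive variance")
    return (sample_mean - overall_mean) / math.sqrt(sample_variance / n_updates)


def has_overdue_risk(check_value, safety_threshold):
    """Risk is flagged only when the check value is below the safety threshold."""
    return check_value < safety_threshold
```

Under this assumption, a sample mean well below the overall mean gives a strongly negative check value, which falls below a negative safety threshold and flags the user.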

In one possible implementation, the method further includes:

if the user to be predicted has overdue risks, acquiring a statistical mean value and a statistical variance sent by a server;

determining the risk signal strength of the overdue risk by using the statistical mean, the statistical variance, the sample mean and the sample variance, wherein the risk signal strength is used for indicating the risk strength of overdue repayment of the user;

generating risk prompt information by using the risk signal strength, and reporting the risk prompt information to a server;

and if the user to be predicted does not have overdue risk, taking the third mobility prediction model as a prediction model used for performing mobility prediction on the user to be predicted next time, taking the sample mean value as the overall mean value, and continuing to execute the step of determining the data updating times of the private data.

In order to achieve the above object, a third aspect of the present invention provides an apparatus for training a mobility prediction model, the apparatus being applied to a server, the apparatus including:

a data determination module, configured to obtain a first sample data set of a target user, wherein the first sample data set comprises non-private data of the target user and a category label of the target user;

a first training module, configured to input a training set included in the first sample data set into a gradient boosting model for single-step prediction and determine a first loss, and to obtain a first-order gradient according to the first loss and a gradient algorithm;

a second training module, configured to input a test set included in the first sample data set into the gradient boosting model for single-step prediction to obtain a second loss, and to determine a second-order gradient according to the second loss, the first-order gradient and a learning rate;

a model iteration module, configured to perform a single-step update on the gradient boosting model by using the second-order gradient to obtain an updated gradient boosting model, to return to the step of obtaining the first sample data set of the target user until the number of returns reaches a preset number of iterations, and to determine the finally obtained gradient boosting model as a first mobility prediction model;

a model distribution module, configured to distribute the first mobility prediction model to the client corresponding to each user to be predicted.

In order to achieve the above object, a fourth aspect of the present invention provides an apparatus for using a mobility prediction model, the apparatus being applied to a client, and the apparatus including:

a data acquisition module, configured to receive a first mobility prediction model sent by the server and to obtain first privacy sample data of a user to be predicted corresponding to the client, wherein the first privacy sample data comprises non-privacy data of the user to be predicted and privacy data of the user to be predicted, and the first mobility prediction model is obtained based on the training method of the mobility prediction model in the first aspect;

a model determination module, configured to determine a second mobility prediction model corresponding to the user to be predicted according to the first mobility prediction model and the first privacy sample data;

a risk determination module, configured to determine whether the user to be predicted has overdue risk by using the second mobility prediction model and a preset risk judgment rule.

To achieve the above object, a fifth aspect of the present invention provides a computer-readable storage medium storing a computer program, which, when executed by a processor, causes the processor to perform the steps as shown in the first aspect, the second aspect or any possible implementation manner.

To achieve the above object, a sixth aspect of the present invention provides a computer device, comprising a memory and a processor, the memory storing a computer program, the computer program, when executed by the processor, causing the processor to perform the steps as set forth in the first aspect, the second aspect or any feasible implementation.

The embodiment of the invention has the following beneficial effects:

The invention provides a method for training a mobility prediction model, which is applied to a server and comprises the following steps: acquiring a first sample data set of a target user, wherein the first sample data set comprises non-private data of the target user and a category label of the target user; inputting a training set included in the first sample data set into a gradient boosting model for single-step prediction, and determining a first loss; obtaining a first-order gradient according to the first loss and a gradient algorithm; inputting a test set included in the first sample data set into the gradient boosting model for single-step prediction to obtain a second loss; determining a second-order gradient according to the second loss, the first-order gradient and the learning rate; performing a single-step update on the gradient boosting model by using the second-order gradient to obtain an updated gradient boosting model; returning to the step of obtaining the first sample data set of the target user until the number of returns reaches a preset number of iterations, and determining the finally obtained gradient boosting model as a first mobility prediction model; and distributing the first mobility prediction model to the client corresponding to each user to be predicted.
The method has the following advantages. Training the first mobility prediction model at the server with non-private data and category labels allows category-related features to be learned during training, so that the finally obtained model can identify the category of the current user when predicting a user to be predicted, which enhances the learning capability of the model and improves the prediction accuracy. Training the first mobility prediction model at the client with private data to obtain the second mobility prediction model enhances the stability and generalization performance of the model for real-time prediction, and improves the adaptability, effectiveness and accuracy of the second mobility prediction model for each user, thereby enhancing the accuracy of predicting user mobility through machine learning. In addition, the second mobility prediction model can iterate rapidly on the client with a small amount of data from a single user, which reduces the data cost of the model, ensures personalized prediction for each user, and improves the prediction accuracy.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.

Wherein:

FIG. 1 is a flowchart of a method for training a mobility prediction model according to an embodiment of the present invention;

FIG. 2 is another flowchart of a method for training a mobility prediction model according to an embodiment of the present invention;

FIG. 3 is a flowchart of a method for using a mobility prediction model according to an embodiment of the present invention;

FIG. 4 is another flowchart of a method for using a mobility prediction model according to an embodiment of the present invention;

FIG. 5 is a block diagram of a training apparatus for a mobility prediction model according to an embodiment of the present invention;

FIG. 6 is a block diagram of an apparatus for using a mobility prediction model according to an embodiment of the present invention;

FIG. 7 is a block diagram of a computer device according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Referring to FIG. 1, FIG. 1 is a flowchart of a method for training a mobility prediction model according to an embodiment of the present invention. The method shown in FIG. 1 is applied to a server and specifically includes the following steps:

101. Obtaining a first sample data set of a target user, wherein the first sample data set comprises non-private data of the target user and a category label of the target user;

It should be noted that the mobility prediction model is subsequently trained at the server on a first sample data set that includes the non-private data and category labels of the target users. Because category-related features can be learned during training, and can be trained with higher weights, the finally obtained model can identify the category of the current user when it predicts a user to be predicted, which enhances the learning capability of the model and improves the prediction accuracy. There are a plurality of target users, and each target user may be a registered user of a financial client. The server may be the data center corresponding to the financial client. The non-private data may be behavior or footprint data uploaded to the data center, such as usage data, financial product data or loan behavior data generated by each user at the client; it includes, but is not limited to, financial behavior data generated by a user of the financial client, loan approval data, credit investigation data and external data, and it is not private in nature. The category labels may be obtained by clustering with a clustering algorithm, or by screening the sample users with existing customer group labels.

102. Inputting a training set included in the first sample data set into a gradient boosting model for single-step prediction, and determining a first loss; obtaining a first-order gradient according to the first loss and a gradient algorithm;

103. Inputting a test set included in the first sample data set into the gradient boosting model for single-step prediction to obtain a second loss; determining a second-order gradient according to the second loss, the first-order gradient and a learning rate;

In a possible implementation manner, the first sample data set includes a training set and a test set. The training set may be input into the gradient boosting model for single-step prediction to obtain a first loss, where the first loss is a loss value of model training. This prediction does not update the gradient boosting model; it only computes the gradient of the first loss and stores the resulting first-order gradient. The first-order gradient is then used in step 103 to calculate the second-order gradient with which the gradient boosting model is updated. Further, the test set included in the first sample data set is input into the gradient boosting model for single-step prediction to obtain a second loss, which is also a loss value of model training, and the second-order gradient used for updating the gradient boosting model is obtained from the first-order gradient, the second loss and a preset learning rate. Specifically, the second-order gradient may be obtained by multiplying the first-order gradient by the difference obtained by subtracting the learning rate from the second loss. The gradient boosting model may be an XGBoost model. The preset learning rate may be 0.001.
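The combination rule described above can be written down directly. The sketch below follows the wording of the text literally (first-order gradient multiplied by the second loss minus the learning rate); the function name and the representation of the gradient as a list of floats are assumptions made for illustration.

```python
from typing import List, Sequence


def second_order_gradient(first_order_grad: Sequence[float],
                          second_loss: float,
                          learning_rate: float = 0.001) -> List[float]:
    """Scale each first-order gradient component by (second_loss - learning_rate)."""
    scale = second_loss - learning_rate
    return [g * scale for g in first_order_grad]
```

With the default learning rate of 0.001 the scale factor is almost equal to the second loss, so a poorly fitting model (large test loss) produces a proportionally larger update.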

104. Performing a single-step update on the gradient boosting model by using the second-order gradient to obtain an updated gradient boosting model; returning to the step of obtaining the first sample data set of the target user until the number of returns reaches a preset number of iterations, and determining the finally obtained gradient boosting model as a first mobility prediction model;

It can be understood that the second-order gradient for updating the gradient boosting model can be determined after steps 102 and 103. The gradient boosting model is then updated in a single step by using the second-order gradient to obtain an updated gradient boosting model, and step 101 is executed again until the number of updates reaches the preset number of iterations. This completes the multi-step training of the gradient boosting model through multiple single-step updates, and the gradient boosting model obtained by the last update is used as the first mobility prediction model. The first mobility prediction model is a mobility prediction model obtained based on non-private data and category labels, and may also be referred to as a meta model M1. The preset number of iterations may be 200, 300, 400, and so on, which is not limited by the examples herein.
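The loop over steps 101 to 104 can be sketched as follows. This is a minimal illustration and not the patented implementation: a small logistic model with a weight vector stands in for the gradient boosting model (an XGBoost model is not updated this way), the update by subtraction is an assumption, and `sample_fn` is a hypothetical callback that returns a fresh training set and test set on each iteration.

```python
import math


def predict(w, x):
    """Logistic prediction for one feature vector (stand-in model)."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(w, data):
    """Mean binary cross-entropy of the stand-in model on (x, y) pairs."""
    eps = 1e-12
    total = 0.0
    for x, y in data:
        p = predict(w, x)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(data)


def gradient(w, data):
    """Gradient of the mean cross-entropy with respect to the weights."""
    g = [0.0] * len(w)
    for x, y in data:
        err = predict(w, x) - y
        for j, xj in enumerate(x):
            g[j] += err * xj / len(data)
    return g


def train_meta_model(sample_fn, n_features, n_iterations=200, learning_rate=0.001):
    w = [0.0] * n_features
    for _ in range(n_iterations):
        train_set, test_set = sample_fn()        # step 101: draw a new data set
        g1 = gradient(w, train_set)              # step 102: gradient only, no update
        second_loss = log_loss(w, test_set)      # step 103: loss on the test set
        g2 = [g * (second_loss - learning_rate) for g in g1]
        w = [wi - gi for wi, gi in zip(w, g2)]   # step 104: assumed subtraction
    return w
```

Only one update is applied per iteration, and it is driven by both the training gradient and the test loss, which is what distinguishes this loop from plain gradient descent on the training set.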

105. And distributing the first mobility prediction model to a client corresponding to each user to be predicted.

In the embodiment of the present invention, the first mobility prediction model obtained in steps 101, 102, 103, and 104 is distributed to the client corresponding to the user to be predicted, where the client may further train the first mobility prediction model and perform overdue risk prediction based on the non-private data generated on the client by the corresponding user to be predicted.

The invention provides a method for training a mobility prediction model, which is applied to a server and comprises the following steps: acquiring a first sample data set of a target user, wherein the first sample data set comprises non-private data of the target user and a category label of the target user; inputting a training set included in the first sample data set into a gradient boosting model for single-step prediction, and determining a first loss; obtaining a first-order gradient according to the first loss and a gradient algorithm; inputting a test set included in the first sample data set into the gradient boosting model for single-step prediction to obtain a second loss; determining a second-order gradient according to the second loss, the first-order gradient and the learning rate; performing a single-step update on the gradient boosting model by using the second-order gradient to obtain an updated gradient boosting model; returning to the step of obtaining the first sample data set of the target user until the number of returns reaches a preset number of iterations, and determining the finally obtained gradient boosting model as a first mobility prediction model; and distributing the first mobility prediction model to the client corresponding to each user to be predicted.
The method has the following advantages. Training the first mobility prediction model at the server with non-private data and category labels allows category-related features to be learned during training, so that the finally obtained model can identify the category of the current user when predicting a user to be predicted, which enhances the learning capability of the model and improves the prediction accuracy. In addition, because a single-step update mode is adopted, the mobility prediction model can be obtained from a small amount of sample data in a few iterations, which ensures the prediction accuracy while reducing the required data volume.

Referring to FIG. 2, FIG. 2 is another flowchart of a method for training a mobility prediction model according to an embodiment of the present invention. As shown in FIG. 2, the method is applied to a server and specifically includes the following steps:

201. Obtaining a first sample data set of a target user, wherein the first sample data set comprises non-private data of the target user and a category label of the target user;

it should be noted that, the content of step 201 is similar to that of step 101 shown in fig. 1, and for avoiding repetition, details are not repeated here, and the content described in step 101 may be referred to specifically.

In a possible implementation, before step 201, the method further includes: acquiring non-private data of sample users, and clustering the candidate users that have reached a risk performance period according to the non-private data of the sample users to obtain category labels corresponding to the candidate users; and sorting the non-private data of the candidate users by data generation time to obtain a sample data sequence, wherein the sample data sequence comprises the non-private data of the candidate users and the category labels of the candidate users.

It should be noted that, before step 201, the non-private data of all sample users at the server needs to be acquired, and the non-private data of the candidate users among the sample users who are in a risk performance period is clustered by a clustering algorithm to obtain the category label corresponding to each candidate user. If each sample user already has a corresponding customer group label, the candidate users may instead be grouped by the customer group label to obtain their category labels, which is not limited in this example. Further, the non-private data of the candidate users is sorted by data generation time to obtain a sorted sample data sequence; it can be understood that the sample data sequence includes the non-private data and the category labels corresponding to the candidate users. The number of user categories obtained by clustering is about 100 to 500; the specific number may be adjusted according to actual requirements, and this range is only exemplary and not limiting. The risk performance period can be determined by the account age, repayment period and observation point of the user. Specifically, the account age refers to the asset deposit month, the repayment period may be a predetermined repayment time, the observation point is a certain time point between the free month and the repayment time, and the risk performance period is the period of time from the observation point to the repayment time.
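The construction of the sample data sequence can be sketched as follows. This is only an illustration: the record fields (`user_id`, `generated_at`, `features`) and both function names are hypothetical, and the category labels are assumed to have been produced beforehand by a clustering algorithm or taken from customer group labels.

```python
from datetime import date


def in_risk_performance_period(day, observation_point, repayment_time):
    """True when `day` lies between the observation point and the repayment time."""
    return observation_point <= day <= repayment_time


def build_sample_sequence(records, category_labels):
    """Attach each candidate user's category label, then sort by generation time."""
    labelled = [
        {**rec, "category": category_labels[rec["user_id"]]}
        for rec in records
        if rec["user_id"] in category_labels  # users without a label are not candidates
    ]
    return sorted(labelled, key=lambda rec: rec["generated_at"])


records = [
    {"user_id": "u2", "generated_at": date(2021, 3, 5), "features": [0.7]},
    {"user_id": "u1", "generated_at": date(2021, 1, 9), "features": [0.2]},
    {"user_id": "u3", "generated_at": date(2021, 2, 1), "features": [0.4]},
]
category_labels = {"u1": 0, "u2": 1}  # hypothetical output of the clustering step
sequence = build_sample_sequence(records, category_labels)
```

Here `u3` is dropped because it has no category label, and the remaining records are ordered by their generation date.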

After obtaining the sample data sequence, the obtaining the first sample data set of the target user in step 201 may specifically include:

i. randomly extracting the sample data sequence by using a random extraction rule to obtain a first sample data set, wherein the random extraction rule comprises a preset class extraction number;

further, randomly extracting candidate users in the sample data sequence obtained by clustering according to a preset class extraction number to obtain partial class labels and target users corresponding to the extracted partial class labels, and determining non-private data and class labels of the target users as a first sample data set D1. Specifically, the preset number of category extractions may be 5 to 10 categories of users.

ii. And dividing the first sample data set according to a preset dividing proportion, and determining a training set and a test set corresponding to the first sample data set.

After the first sample data set D1 is obtained, it may be divided according to a preset division ratio to obtain a training set and a test set. Specifically, the user categories of the first sample data set are divided into a training set and a test set according to a user category ratio of 4:1. For example, assuming that 10 user categories are extracted, the training set may include 8 user categories and the test set may include 2 user categories, which is not limited by this example.
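Steps i and ii can be sketched as follows. The function names and the record layout (a `category` field per record) are assumptions made for illustration; the 4:1 ratio and the extraction of 10 categories follow the example in the text.

```python
import random


def extract_categories(sequence, n_categories, rng):
    """Step i: randomly pick category labels and keep the records of those categories."""
    categories = sorted({rec["category"] for rec in sequence})
    chosen = rng.sample(categories, n_categories)
    chosen_set = set(chosen)
    dataset = [rec for rec in sequence if rec["category"] in chosen_set]
    return dataset, chosen


def split_by_category(dataset, chosen, train_parts=4, test_parts=1):
    """Step ii: split whole user categories, not individual records, at 4:1."""
    n_train = len(chosen) * train_parts // (train_parts + test_parts)
    train_cats = set(chosen[:n_train])
    train = [rec for rec in dataset if rec["category"] in train_cats]
    test = [rec for rec in dataset if rec["category"] not in train_cats]
    return train, test
```

Splitting by category means that the test set contains only user categories that were not seen in the training set of the same iteration.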

202. Inputting a training set included in the first sample data set into a gradient boosting model for single-step prediction, and determining a first loss; obtaining a first-order gradient according to the first loss and a gradient algorithm;

203. Inputting a test set included in the first sample data set into the gradient boosting model for single-step prediction to obtain a second loss; determining a second-order gradient according to the second loss, the first-order gradient and a learning rate;

204. Performing a single-step update on the gradient boosting model by using the second-order gradient to obtain an updated gradient boosting model; returning to the step of obtaining the first sample data set of the target user until the number of returns reaches a preset number of iterations, and determining the finally obtained gradient boosting model as a first mobility prediction model;

It should be noted that steps 202, 203, and 204 are similar to steps 102, 103, and 104 shown in FIG. 1; to avoid repetition, details are not repeated here, and reference may be made to the contents described in steps 102, 103, and 104. Returning to the step of obtaining the first sample data set of the target user means returning to steps i and ii.

In a possible implementation, step 204 is followed by:

a. carrying out mobility prediction on each candidate user by using the first mobility prediction model to obtain the mobility corresponding to each candidate user;

b. determining a statistical mean value and a statistical variance by using the mobility corresponding to each candidate user and a preset statistical algorithm;

c. and distributing the statistical mean and the statistical variance to a client corresponding to each user to be predicted, wherein the client is used for determining the risk signal strength based on the statistical mean and the statistical variance, and the risk signal strength is used for indicating the risk strength of overdue repayment of the user to be predicted.

It should be noted that steps a, b, and c are specifically as follows: mobility prediction is performed on each candidate user in the risk presentation period by using the trained first mobility prediction model; statistical analysis is performed on the mobility of each candidate user, and a statistical mean and a statistical variance corresponding to the server side are determined by using a statistical algorithm, where the statistical algorithm includes, but is not limited to, mean calculation and variance calculation; and the statistical mean and the statistical variance obtained through the analysis are sent to the client of each user to be predicted, so that the client can calculate the risk signal strength from the statistical mean and the statistical variance.
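A minimal sketch of step b is given below, assuming the mobility predicted for each candidate user is already available as a list of numbers. The function name `server_statistics` and the sample values are hypothetical, and the choice of the sample variance (n - 1 denominator) is an assumption, since the description does not say which variance is intended.

```python
import statistics


def server_statistics(mobilities):
    """Return the statistical mean and variance of the predicted mobilities."""
    mean = statistics.fmean(mobilities)
    # Sample variance (n - 1 denominator) is an assumption; the description
    # does not say whether the sample or population variance is intended.
    variance = statistics.variance(mobilities)
    return mean, variance


# Hypothetical mobility predictions for four candidate users.
mu, s = server_statistics([0.2, 0.4, 0.6, 0.8])
print(mu, s)
```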

205. Distributing the first mobility prediction model to a client corresponding to each user to be predicted;

it should be noted that, the content of step 205 is similar to that of step 105 shown in fig. 1, and for avoiding repetition, no further description is provided here, and the content described in step 105 may be referred to specifically.

206. Receiving risk prompt information reported by a client corresponding to each user to be predicted, wherein the risk prompt information comprises risk signal strength;

it should be noted that, if the server receives the risk prompt information reported by any client, it indicates that the user to be predicted corresponding to the client has overdue risk, where the risk prompt information may include, in addition to the risk signal strength, footprint information such as information of ordering financial products, user category information, and behavior information of the user on the financial client. The risk signal strength is used to indicate the risk strength of overdue repayment of the user.

207. Determining a wind control grade corresponding to the risk signal strength according to the risk signal strength and a preset grade determination rule;

Furthermore, the wind control grade corresponding to the risk signal strength is determined according to the preset grade determination rule and the risk signal strength, so that each user to be predicted who has an overdue risk can be intervened with in advance and overdue repayment can be prevented. For example, the preset grade determination rule may be: sorting the risk signal strengths from large to small and determining the wind control grade based on the sorting result, so that a larger risk signal strength corresponds to a higher wind control grade. The preset grade determination rule may also be: dividing the sorting result into value range intervals and configuring a grade table over those intervals to determine the wind control grade of the corresponding user. Alternatively, the preset grade determination rule may be: after the risk signal strengths are sorted, combining the sorting result with the category label of the user in the risk prompt information to determine the risk grade corresponding to each user. These examples are not limiting.
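The first two example rules can be sketched as follows. The function names, the interval boundaries (0.8 and 0.5), and the grade labels are hypothetical; the interval rule is applied here directly to the risk signal strength, which is one possible reading of the value range intervals.

```python
def rank_levels(strengths):
    """Rule 1: sort users by risk signal strength, strongest first."""
    ordered = sorted(strengths, key=strengths.get, reverse=True)
    # Grade 1 is the highest wind control grade in this sketch.
    return {user: rank for rank, user in enumerate(ordered, start=1)}


def interval_level(strength, table=((0.8, "high"), (0.5, "medium"))):
    """Rule 2: look the strength up in a value-range interval table."""
    for lower_bound, level in table:
        if strength >= lower_bound:
            return level
    return "low"


strengths = {"u1": 0.35, "u2": 0.92, "u3": 0.61}  # hypothetical reports
print(rank_levels(strengths))
print({u: interval_level(s) for u, s in strengths.items()})
```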

208. And executing corresponding wind control operation on the user to be predicted based on the wind control grade.

It can be understood that users with different wind control grades may be subject to different wind control operations, and the wind control operation for a user with a higher wind control grade has a higher processing priority. Wind control operations include, but are not limited to, reminding the user daily from the predicted risk day and, if the risk state has not been released within a subsequent preset time period, suspending the user's loan authority on the client, as well as other wind control means. It will be appreciated that the preset time period lies between the time of risk reporting and the repayment due date.

The invention provides a training method of a mobility prediction model, which is applied to a server. Multi-step training of a gradient lifting model is carried out at the server side by using non-private data and category labels, so that the finally obtained first mobility prediction model can identify the category of the current user and fit the relevant feature data of the current category with a higher weight, which enhances the learning capacity of the model. Because a single-step updating mode is adopted, the mobility prediction model can be obtained from a small amount of sample data through a few iterations, which ensures prediction precision while reducing the requirement on data quantity. In addition, corresponding measures can be taken according to the risk signal strength in the risk prompt information reported by the client, wind control operations can be implemented, and overdue risks can be managed and controlled in time.

Referring to fig. 3, fig. 3 is a flowchart illustrating a method for using a mobility prediction model according to an embodiment of the present invention, where the method shown in fig. 3 is applied to a client, and the method specifically includes the following steps:

301. receiving a first mobility prediction model sent by the server and acquiring first privacy sample data of a user to be predicted corresponding to the client, wherein the first privacy sample data comprises non-privacy data of the user to be predicted and privacy data of the user to be predicted, and the first mobility prediction model is obtained based on a training method of the mobility prediction model in fig. 1;

It should be noted that, when the client receives the first mobility prediction model sent by the server, the client may further obtain the first privacy sample data, which comprises the non-privacy data and the privacy data of the user to be predicted, where the privacy data includes, but is not limited to, the usage frequency of the loan APP, various consumption amount data of the user, and the like.

302. Determining a second mobility prediction model corresponding to the user to be predicted according to the first mobility prediction model and the first privacy sample data;

further, a second mobility prediction model is determined according to the first mobility prediction model and the first privacy sample data, the second mobility prediction model is used for performing mobility prediction on the user to be predicted, and the second mobility prediction model is an individualized mobility prediction model obtained based on the first privacy sample data including the privacy data of the user to be predicted. Therefore, when the user to be predicted is predicted based on the second mobility prediction model, the method is more suitable for the characteristics of the user to be predicted, and the accuracy of the mobility prediction of the user to be predicted by the second mobility prediction model is improved.

303. And determining whether the user to be predicted has overdue risks or not by using the second mobility prediction model and a preset risk judgment rule.

In the embodiment of the invention, the overdue risk of each user to be predicted can be judged through the second mobility prediction model corresponding to each client and the preset risk judgment rule, so that each user to be predicted can obtain the second mobility prediction model based on the corresponding privacy data, and personalized prediction and judgment of each user to be predicted are realized.

The invention provides a use method of a mobility prediction model, which is applied to a client and comprises the following steps: receiving a first mobility prediction model sent by a server, and acquiring first privacy sample data of a user to be predicted corresponding to a client, wherein the first privacy sample data comprises non-privacy data of the user to be predicted and privacy data of the user to be predicted, and the first mobility prediction model is obtained based on a training method of the mobility prediction model in the figure 1; determining a second mobility prediction model corresponding to the user to be predicted according to the first mobility prediction model and the first privacy sample data; and determining whether the user to be predicted has overdue risks or not by using a second mobility prediction model and a preset risk judgment rule. The second mobility prediction model is obtained by training the first mobility prediction model sent by the server by using the private data, so that the stability and generalization performance of the model for real-time prediction can be enhanced, the second mobility prediction model is a private self-adaptive model of a user to be predicted, the self-adaptability, the actual effect and the accuracy of the second mobility prediction model for predicting each user are improved, and the accuracy of the mobility prediction of the user through machine learning is enhanced; and the second mobility prediction model can use less data of the single user to carry out rapid iteration on the client, so that the data cost of the model is reduced, personalized prediction of each user can be guaranteed, and overdue risk identification is more accurate. The user privacy data is used on the client side for light-weight training without uploading to the server, and the protection of the user privacy is facilitated on the basis of improving the speed and reducing the load.

Referring to fig. 4, fig. 4 is another flowchart of a method for using a mobility prediction model according to an embodiment of the present invention, and as shown in fig. 4, the method is applied to a client, and the method specifically includes the following steps:

401. receiving a first mobility prediction model sent by the server and acquiring first privacy sample data of a user to be predicted corresponding to the client, wherein the first privacy sample data comprises non-privacy data of the user to be predicted and privacy data of the user to be predicted, and the first mobility prediction model is obtained based on a training method of the mobility prediction model in fig. 1;

it should be noted that, the content of step 401 is similar to that of step 301 shown in fig. 3, and for avoiding repetition, no further description is provided here, and the content described in step 301 may be referred to specifically.

402. Inputting the first privacy sample data into the first mobility prediction model to obtain a second mobility prediction model;

403. determining the data updating times of the private data;

Further, after the second mobility prediction model is obtained, the data update times of the private data of the user to be predicted need to be determined, so that the second mobility prediction model on the client side is updated and adjusted when the data update times reach a preset update time threshold. It can be understood that each time the data is updated, the data update times are incremented by 1, and the updates are accumulated until the preset update time threshold is reached.

404. When the data updating times reach a preset updating time threshold, inputting second privacy sample data into the second mobility prediction model to obtain a third mobility prediction model, wherein the second privacy sample data comprises the privacy data of the user to be predicted and the non-privacy data of the user to be predicted, of which the data updating times reach the updating time threshold;

It should be noted that, when the number of data updates reaches the preset update time threshold, the second privacy sample data, which includes the current privacy data and the current non-privacy data at that time, is input into the second mobility prediction model to obtain a third mobility prediction model. Specifically, the preset update time threshold may be a count value, such as 1 time or 2 times, or may be an update frequency value, such as several times per minute or several times per hour, which is not limited herein. After the third mobility prediction model is obtained, the data update times are cleared.
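The counting logic of steps 403 and 404 can be sketched as a small counter, assuming the threshold is a plain count. The class name `UpdateCounter` and the threshold value of 3 are hypothetical.

```python
class UpdateCounter:
    """Counts private-data updates and signals when fine-tuning is due."""

    def __init__(self, threshold):
        self.threshold = threshold  # preset update time threshold
        self.count = 0

    def record_update(self):
        """Add 1 for one data update; return True when the threshold is hit."""
        self.count += 1
        return self.count >= self.threshold

    def reset(self):
        """Clear the count after the third model has been obtained."""
        self.count = 0


counter = UpdateCounter(threshold=3)
due = [counter.record_update() for _ in range(3)]
print(due)
counter.reset()
```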

405. Determining whether the user to be predicted has overdue risks or not by using the second mobility prediction model and a preset risk judgment rule;

in a possible implementation manner, step 405 may specifically include the following steps:

A. analyzing and predicting the mobility of the user to be predicted by using the first privacy sample data and the second mobility prediction model to obtain an overall average value of the user to be predicted;

It can be understood that the mobility of the user to be predicted is predicted by using the first privacy sample data and the second mobility prediction model, and statistical analysis is performed on the predicted mobility to obtain the overall average value (that is, the overall mean) of the user to be predicted, where the overall average value is the mean mobility of the user over the corresponding post-expression time window when the user to be predicted receives the first mobility prediction model. The first privacy sample data refers to the non-privacy data and privacy data generated on the client of the user to be predicted when the first mobility prediction model is received.

B. Analyzing and predicting the mobility of the user to be predicted by using the second privacy sample data and the third mobility prediction model to obtain a sample mean and a sample variance of the user to be predicted;

Furthermore, the mobility of the user to be predicted is predicted by using the second privacy sample data and the third mobility prediction model, and statistical analysis is performed on the predicted mobility to obtain a sample mean and a sample variance of the user to be predicted, where the second privacy sample data refers to the non-privacy data and privacy data generated on the client by the user to be predicted once the number of updates has reached the preset update time threshold. Specifically, each time the privacy data is updated, the mobility of the user is predicted, and the sample mean and sample variance are obtained by mean calculation and statistical analysis over the mobility values obtained each time.

Illustratively, taking the preset update time threshold as the update frequency f, after the data features of the private data on the user client have been updated f times, the second mobility prediction model M2 is fine-tuned, that is, trained once, to obtain a third mobility prediction model M3. The third mobility prediction model is then used to predict the mobility of the user in the current state multiple times, giving $r_{11}, r_{12}, r_{13}, \ldots, r_{1f}$, where the mobility $r_{11}$ is obtained based on the privacy data and non-privacy data after the first update under the update frequency f, the mobility $r_{12}$ is obtained based on the privacy data and non-privacy data after the second update, and so on, until the mobility $r_{1f}$, which is obtained based on the privacy data and non-privacy data after the last update. In other words, each prediction uses the data features as updated at the current time and before. The predicted mobility values are then averaged to obtain the sample mean μ₁, and the sample variance S₁ is computed from them.

C. Determining a overdue check value corresponding to the user to be predicted according to the overall mean value, the sample variance and the data updating times;

for example, t-distribution single-sample mean detection may be performed on the prediction result by using the overall mean, the sample variance, and the data update times, and the obtained detection value is used as an overdue detection value corresponding to the user to be predicted, specifically, step C may be determined by the following algorithm:

wherein A is overdue check value, f is data update frequency, mu₁Is the sample mean, μ₀Is the overall mean, S₁Is the sample variance.
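The formula itself is not reproduced in this text. For reference only, the textbook one-sample t statistic written with the symbols defined above has the following form. Treating it as the formula of step C, including the sign convention and the use of $S_1$ as a variance rather than a standard deviation, is an assumption.

```latex
A = \frac{\mu_1 - \mu_0}{\sqrt{S_1 / f}}
```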

D. If the overdue check value is smaller than a safety threshold, determining that the user to be predicted has overdue risk;

E. and if the overdue check value is greater than or equal to the safety threshold, determining that the user to be predicted does not have overdue risk.

Further, a safety threshold $t_{\alpha, n-1}$ is determined by looking up the critical value table of the t-distribution single-sample mean test, where α is the desired prediction confidence. When the t-distribution test is carried out with n samples, n-1 is the degree of freedom of the samples, and the safety threshold can then be looked up in the t critical value table using α and the degree of freedom. If the overdue check value is smaller than the safety threshold, the user mobility mean (sample mean) is significantly larger than the user mobility mean before updating (overall mean), and it is determined that the user to be predicted has an overdue risk; otherwise, if the overdue check value is greater than or equal to the safety threshold, it is determined that the user to be predicted does not have an overdue risk.
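Steps B to E can be sketched as follows, under loudly stated assumptions: the check value is taken to be the textbook one-sample t statistic, S₁ is treated as a sample variance with an n - 1 denominator, and the safety threshold is passed in by the caller after a table lookup. The function names, the mobility values, and the threshold are hypothetical, and the sign convention of the actual formula is not confirmed by this text.

```python
import math
import statistics


def overdue_check(mobilities, overall_mean):
    """Return (sample_mean, sample_variance, check_value) for f predictions."""
    f = len(mobilities)                       # data update times
    mu1 = statistics.fmean(mobilities)        # sample mean
    s1 = statistics.variance(mobilities)      # sample variance (n - 1)
    # Assumed form: textbook one-sample t statistic against the overall mean.
    check_value = (mu1 - overall_mean) / math.sqrt(s1 / f)
    return mu1, s1, check_value


def has_overdue_risk(check_value, safety_threshold):
    """Steps D and E: risk exists when the check value is below the threshold."""
    return check_value < safety_threshold


# Hypothetical predictions r_11..r_1f and a threshold read from a t table
# by the caller; neither value comes from the description.
mu1, s1, a = overdue_check([0.30, 0.32, 0.28, 0.31, 0.29], overall_mean=0.50)
print(has_overdue_risk(a, safety_threshold=-2.132))
```

The sketch does not guard against a zero sample variance, which would occur if every prediction were identical; a practical implementation would need to handle that case.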

406. If the user to be predicted has overdue risks, acquiring a statistical mean value and a statistical variance sent by a server;

407. determining the risk signal strength of the overdue risk by using the statistical mean, the statistical variance, the sample mean and the sample variance, wherein the risk signal strength is used for indicating the risk strength of overdue repayment of the user;

408. generating risk prompt information by using the risk signal strength, and reporting the risk prompt information to a server;

in a feasible implementation manner, the step 407-.

Specifically, the risk signal strength can be determined by the following calculation formula:

In the formula, $R_S$ is the risk signal strength, S₁ is the sample variance, μ₁ is the sample mean, μ is the statistical mean, and S is the statistical variance.

409. And if the user to be predicted does not have overdue risk, taking the third mobility prediction model as a prediction model used for performing mobility prediction on the user to be predicted next time, taking the sample mean value as the overall mean value, and continuing to execute the step of determining the data updating times of the private data.

It can be understood that, if the user to be predicted does not have an overdue risk, the risk does not need to be reported to the server. In this case, the third mobility prediction model is used as the prediction model for the next mobility prediction of the user to be predicted (that is, the second mobility prediction model is updated to the third mobility prediction model), the sample mean μ₁ is taken as the overall mean μ₀, and the mobility of the user continues to be predicted to judge the overdue risk. It can also be understood that, whether the user to be predicted is determined to have an overdue risk or not, once the mobility prediction model has been updated, the data update times need to be cleared so that accumulation starts again for the next round of risk judgment.
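The state rollover of step 409 can be sketched as below. The `ClientState` dataclass, the `finish_round` function, and the string stand-ins for models are hypothetical. Leaving the model and overall mean unchanged when a risk is reported is an assumption, since the text only spells out the no-risk branch in detail.

```python
from dataclasses import dataclass


@dataclass
class ClientState:
    model: str            # stand-in for the current prediction model
    overall_mean: float   # mu_0
    update_count: int = 0


def finish_round(state, third_model, sample_mean, at_risk):
    """Roll the client state over after one round of risk judgment."""
    if not at_risk:
        # Step 409: the third model becomes the next prediction model and
        # the sample mean mu_1 becomes the new overall mean mu_0.
        state.model = third_model
        state.overall_mean = sample_mean
    # The update count is cleared in either case to start the next round.
    state.update_count = 0
    return state


state = ClientState(model="M2", overall_mean=0.50, update_count=3)
finish_round(state, third_model="M3", sample_mean=0.48, at_risk=False)
print(state)
```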

The invention provides a use method of a mobility prediction model, which is applied to a client. The first mobility prediction model is sent to the client and is trained at the client by using the private data to obtain a second mobility prediction model, which can enhance the stability and generalization performance of the model for real-time prediction. The second mobility prediction model is a private self-adaptive model dedicated to the user to be predicted, which improves the adaptability, effectiveness and accuracy of the second mobility prediction model in predicting each user and enhances the accuracy of predicting user mobility through machine learning. The second mobility prediction model can also iterate rapidly on the client using a small amount of data from the single user, which reduces the data cost of the model, ensures personalized prediction for each user, and improves prediction accuracy. Furthermore, timely and effective automatic updating of the second mobility prediction model is achieved through the data update times of the private data and the update time threshold. Whether the user has an overdue risk is determined by using the overall mean obtained with the second mobility prediction model before the data update and the sample mean and sample variance obtained with the third mobility prediction model after the data update, which improves the accuracy of risk identification. The user's private data is used on the client side for lightweight training without being uploaded to the server, which helps protect user privacy while improving speed and reducing load. When a risk exists, risk prompt information containing the risk signal strength is generated and reported to the server, so that the server can carry out wind control in time.

Referring to fig. 5, fig. 5 is a block diagram of a structure of a training apparatus for a mobility prediction model according to an embodiment of the present invention, where the apparatus shown in fig. 5 is applied to a server, and the apparatus specifically includes the following modules:

the data determination module 501: configured to obtain a first sample data set of a target user, wherein the first sample data set comprises non-private data of the target user and a category label of the target user;

the first training module 502: configured to input a training set included in the first sample data set into a gradient lifting model for single-step prediction and determine a first loss, and to obtain a first-order gradient according to the first loss and a gradient algorithm;

the second training module 503: configured to input a test set included in the first sample data set into the gradient lifting model for single-step prediction to obtain a second loss, and to determine a second-order gradient according to the second loss, the first-order gradient and a learning rate;

the model iteration module 504: configured to perform single-step updating on the gradient lifting model by using the second-order gradient to obtain an updated gradient lifting model, to return to the step of obtaining the first sample data set of the target user until the number of times of returning execution reaches a preset iteration number, and to determine the finally obtained gradient lifting model as a first mobility prediction model;

the model distribution module 505: configured to distribute the first mobility prediction model to the client corresponding to each user to be predicted.

It should be noted that the functions of each module shown in fig. 5 are similar to those shown in each step shown in fig. 1, and for avoiding repetition, details are not described here, and the details shown in each step shown in fig. 1 may be referred to specifically.

The invention provides a device for training a mobility prediction model, which is applied to a server and comprises the following modules: a data determination module, configured to obtain a first sample data set of a target user, wherein the first sample data set comprises non-private data of the target user and a category label of the target user; a first training module, configured to input a training set included in the first sample data set into a gradient lifting model for single-step prediction and determine a first loss, and to obtain a first-order gradient according to the first loss and a gradient algorithm; a second training module, configured to input a test set included in the first sample data set into the gradient lifting model for single-step prediction to obtain a second loss, and to determine a second-order gradient according to the second loss, the first-order gradient and the learning rate; a model iteration module, configured to perform single-step updating on the gradient lifting model by using the second-order gradient to obtain an updated gradient lifting model, to return to the step of obtaining the first sample data set of the target user until the number of times of returning execution reaches a preset iteration number, and to determine the finally obtained gradient lifting model as a first mobility prediction model; and a model distribution module, configured to distribute the first mobility prediction model to the client corresponding to each user to be predicted.
The device carries out multi-step training of the gradient lifting model at the server side by using non-private data and category labels, so that the finally obtained first mobility prediction model can identify the category of the current user and fit the relevant feature data of the current category with a higher weight, which enhances the learning capacity of the model. In addition, because a single-step updating mode is adopted, the mobility prediction model can be obtained from a small amount of sample data through a few iterations, which ensures prediction accuracy while reducing the requirement on data quantity.

Referring to fig. 6, fig. 6 is a block diagram illustrating a structure of a device for using a mobility prediction model according to an embodiment of the present invention, where the device shown in fig. 6 is applied to a client, and the device specifically includes the following modules:

the data acquisition module 601: configured to receive a first mobility prediction model sent by the server and to obtain first privacy sample data of a user to be predicted corresponding to the client, wherein the first privacy sample data comprises non-privacy data of the user to be predicted and privacy data of the user to be predicted, and the first mobility prediction model is obtained based on the training method of the mobility prediction model in the first aspect;

the model determination module 602: configured to determine a second mobility prediction model corresponding to the user to be predicted according to the first mobility prediction model and the first privacy sample data;

the risk determination module 603: configured to determine whether the user to be predicted has an overdue risk by using the second mobility prediction model and a preset risk judgment rule.

It should be noted that the functions of each module shown in fig. 6 are similar to those shown in each step shown in fig. 3, and for avoiding repetition, details are not described here, and the details shown in each step shown in fig. 3 may be referred to specifically.

The invention provides a device for using a mobility prediction model, which is applied to a client and comprises: a data acquisition module, configured to receive a first mobility prediction model sent by a server and to obtain first privacy sample data of a user to be predicted corresponding to the client, wherein the first privacy sample data comprises non-privacy data of the user to be predicted and privacy data of the user to be predicted, and the first mobility prediction model is obtained based on the training method of the mobility prediction model in fig. 1; a model determination module, configured to determine a second mobility prediction model corresponding to the user to be predicted according to the first mobility prediction model and the first privacy sample data; and a risk determination module, configured to determine whether the user to be predicted has an overdue risk by using the second mobility prediction model and a preset risk judgment rule. The second mobility prediction model is obtained by training the first mobility prediction model sent by the server with the private data, which can enhance the stability and generalization performance of the model for real-time prediction. The second mobility prediction model is a private self-adaptive model of the user to be predicted, which improves the adaptability, effectiveness and accuracy of the second mobility prediction model in predicting each user and enhances the accuracy of predicting user mobility through machine learning. The second mobility prediction model can also iterate rapidly on the client using a small amount of data from the single user, which reduces the data cost of the model, ensures personalized prediction for each user, and makes overdue risk identification more accurate.
The user's private data is used on the client side for lightweight training without being uploaded to the server, which helps protect user privacy while improving speed and reducing load.

FIG. 7 is a diagram illustrating an internal structure of a computer device in one embodiment. The computer device may specifically be a terminal, and may also be a server. As shown in fig. 7, the computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium of the computer device stores an operating system and may also store a computer program which, when executed by the processor, causes the processor to carry out the above-mentioned method. The internal memory may also have stored therein a computer program which, when executed by the processor, causes the processor to perform the method described above. Those skilled in the art will appreciate that the architecture shown in fig. 7 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.

In one embodiment, a computer device is proposed, comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps as shown in any of fig. 1, fig. 2, fig. 3, or fig. 4.

In one embodiment, a computer-readable storage medium is provided, storing a computer program, which, when executed by a processor, causes the processor to perform the steps as shown in any of fig. 1, fig. 2, fig. 3, or fig. 4.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a non-volatile computer-readable storage medium and which, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

The technical features of the above embodiments can be combined arbitrarily. For brevity, not every possible combination of these technical features is described; however, any combination that involves no contradiction should be considered to fall within the scope of this specification.

The embodiments above express only several implementations of the present application. Their description is relatively specific and detailed, but it should not be construed as limiting the scope of the present application. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the scope of protection of this patent shall be subject to the appended claims.

Claims

1. A method for training a mobility prediction model, wherein the method is applied to a server, and the method comprises the following steps:

2. The method of claim 1, wherein after determining the resulting gradient boost model as the first mobility prediction model, further comprising:

3. The method according to claim 1, wherein the distributing the first mobility prediction model to the client corresponding to each user to be predicted further comprises:

4. The method of claim 1, wherein obtaining the first sample data set of the target user further comprises:

the obtaining the first sample dataset of the target user comprises:

5. A method for using a mobility prediction model, wherein the method is applied to a client, and the method comprises the following steps:

receiving a first mobility prediction model sent by the server, and acquiring first privacy sample data of a user to be predicted corresponding to the client, wherein the first privacy sample data comprises non-privacy data of the user to be predicted and privacy data of the user to be predicted, and the first mobility prediction model is obtained by the training method of the mobility prediction model according to claim 1;

6. The method according to claim 5, wherein the determining a second mobility prediction model corresponding to the user to be predicted according to the first mobility prediction model and the first privacy sample data includes:

determining the number of data updates of the private data;

7. The method according to claim 6, wherein the determining whether the user to be predicted has an overdue risk by using the second mobility prediction model and a preset risk judgment rule includes:

8. The method of claim 7, further comprising:

9. An apparatus for training a mobility prediction model, the apparatus being applied to a server, the apparatus comprising:

10. An apparatus for using a mobility prediction model, the apparatus being applied to a client, the apparatus comprising:

a data acquisition module, configured to receive a first mobility prediction model sent by the server and to obtain first privacy sample data of a user to be predicted corresponding to the client, wherein the first privacy sample data comprises non-privacy data of the user to be predicted and privacy data of the user to be predicted, and the first mobility prediction model is obtained by the training method of the mobility prediction model according to claim 1;

11. A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to carry out the steps of the method according to any one of claims 1 to 4 or 5 to 8.

12. A computer device comprising a memory and a processor, characterized in that the memory stores a computer program which, when executed by the processor, causes the processor to carry out the steps of the method according to any one of claims 1 to 4 or 5 to 8.