CN113837863A

CN113837863A - Business prediction model creation method and device and computer readable storage medium

Info

Publication number: CN113837863A
Application number: CN202111138614.2A
Authority: CN
Inventors: 顾凌云; 谢旻旗; 张涛; 黄以增
Original assignee: Shanghai IceKredit Inc
Current assignee: Shanghai IceKredit Inc
Priority date: 2021-09-27
Filing date: 2021-09-27
Publication date: 2021-12-24
Anticipated expiration: 2041-09-27
Also published as: CN113837863B

Abstract

According to the business prediction model creation method, the business prediction model creation apparatus, and the computer-readable storage medium, a plurality of auxiliary data sets similar to a target data set are first found; sample data is then drawn from the plurality of auxiliary data sets to obtain a sample data set, and a business state model is obtained by training on the sample data set; default probabilities are then obtained through the business state model, and a modeling data set is determined based on the default probabilities; a weight parameter is then determined based on the target data set and the modeling data set; finally, a business prediction model is created from the modeling data set and the weight parameter. In this scheme, auxiliary data sets similar to the target data set are used, the modeling data set is screened out in a quantitative manner, and the weights of the samples in the modeling data set are adjusted, so that the samples in the modeling data set are closer to the samples of the business for which the business prediction model is to be created, and the created business prediction model has stronger prediction capability and stability.

Description

Business prediction model creation method and device and computer readable storage medium

Technical Field

The present application relates to the field of artificial intelligence technologies, and in particular, to a method and an apparatus for creating a business prediction model, and a computer-readable storage medium.

Background

Model development generally requires a large amount of sample data. When a business has only just been launched, little sample data (business objects and business state labels) may be available, so that a model either cannot be developed from the existing sample data, or the developed model has biased prediction capability and unstable performance.

Disclosure of Invention

In order to overcome at least the above-mentioned deficiencies in the prior art, the present application aims to provide a method, an apparatus and a computer-readable storage medium for creating a business prediction model, so as to solve the above-mentioned technical problems.

In a first aspect, an embodiment of the present application provides a method for creating a business prediction model, which is applied to a computer device, and the method includes:

acquiring a target data set of a service prediction model to be created;

acquiring a plurality of auxiliary data sets meeting preset service similar conditions with the target data set based on the target data set;

extracting sample data from the plurality of auxiliary data sets to obtain a sample data set;

training according to the sample data set to obtain a service state model for predicting the service state of the service object in the sample data;

predicting the target data set and a plurality of auxiliary data sets by adopting the service state model to obtain default probabilities of the target data set and each auxiliary data set;

determining a modeling data set from the sample data set based on the default probabilities of the target data set and each auxiliary data set;

determining a weight parameter according to the target data set and the modeling data set;

creating the business prediction model based on the modeling dataset and the weight parameters.

According to this scheme, a target data set of a business prediction model to be created is first obtained, and a plurality of auxiliary data sets similar to the target data set are found; sample data is then drawn from the plurality of auxiliary data sets to obtain a sample data set, and a business state model is obtained by training on the sample data set; the default probabilities of the target data set and of each auxiliary data set are then obtained through the business state model, and a modeling data set is determined based on the default probabilities; a weight parameter is then determined based on the target data set and the modeling data set; finally, a business prediction model is created from the modeling data set and the weight parameter. In this scheme, auxiliary data sets similar to the target data set are used, the modeling data set is screened out in a quantitative manner, and the weights of the samples in the modeling data set are adjusted, so that the samples in the modeling data set are closer to the samples of the business for which the business prediction model is to be created. The business prediction model can therefore be created even when the target data set contains little data, and the created business prediction model has stronger prediction capability and stability.

In a possible implementation manner, in the step of obtaining, based on the target data set, a plurality of auxiliary data sets that satisfy a preset business similarity condition with the target data set, the preset business similarity condition includes:

each auxiliary data set has the same predictor variables available for creating the business prediction model as the target data set; and

the sample data of each auxiliary data set comprises a business state label of the business object.

In a possible implementation manner, the step of extracting sample data from the plurality of auxiliary data sets to obtain a sample data set includes:

extracting the same preset amount of sample data from each auxiliary data set to obtain the sample data set;

wherein the step of extracting the same preset amount of sample data from each auxiliary data set comprises:

detecting whether the number of sample data in each auxiliary data set is greater than or equal to the preset number;

if the number is greater than or equal to the preset number, extracting the preset number of sample data from the auxiliary data set by sampling without replacement;

and if the number is smaller than the preset number, extracting the preset number of sample data from the auxiliary data set by sampling with replacement.

In a possible implementation manner, the step of determining a modeling data set from the sample data set based on the default probability of the target data set and each auxiliary data set includes:

taking the default probability of the target data set as basic data, taking the default probabilities of the plurality of auxiliary data sets as test data, and calculating the population stability indicator of each auxiliary data set according to the basic data and the test data;

and taking the auxiliary data set with the minimum population stability indicator value as the modeling data set.

In a possible implementation manner, in the step of calculating the population stability indicator of each auxiliary data set according to the basic data and the test data, the basic data are grouped, and the test data are grouped according to a threshold standard of the grouping of the basic data, wherein the number of the groups of the basic data is the same as the number of the groups of the test data;

the population stability indicator psi is calculated as follows:

$$\mathrm{psi} = \sum_{i=1}^{n} (A_i - E_i)\,\ln\frac{A_i}{E_i}$$

where n is the number of groups, i is the serial number of the group, A_i is the proportion of samples falling in the i-th group of the test data, and E_i is the proportion of samples falling in the i-th group of the basic data.

In one possible implementation manner, in the step of determining the weight parameter according to the target data set and the modeling data set, the formula for determining the weight parameter is as follows:

wherein β is a one-dimensional weight parameter array comprising the weight parameters β₁, β₂, …, β_j; m is the number of samples of the modeling data set; x'_j is the j-th sample of the modeling data set; n is the number of samples of the target data set; x_i is the i-th sample of the target data set; φ represents Euler's formula; and the constraints of the quadratic programming are that β₁, β₂, …, β_j are each greater than or equal to 0 and that β₁, β₂, …, β_j sum to 1.

In a possible implementation manner, the step of creating the business prediction model based on the modeling data set and the weight parameter includes:

and taking the sample data in the modeling data set as a modeling sample, and taking the weight parameter as the weight of the sample data in the modeling data set to perform model creation to obtain the business prediction model.

In one possible implementation, the business state model and the business prediction model are logistic regression models.

In a second aspect, an embodiment of the present application further provides a device for creating a business prediction model, which is applied to a computer device, where the device includes:

the first acquisition module is used for acquiring a target data set of a service prediction model to be created;

the second acquisition module is used for acquiring a plurality of auxiliary data sets which meet the preset service similarity conditions with the target data set based on the target data set;

the sample extraction module is used for extracting sample data from the plurality of auxiliary data sets to obtain a sample data set;

the model training module is used for training according to the sample data set to obtain a service state model used for predicting the service state of the service object in the sample data;

the default probability prediction module is used for predicting the target data set and the plurality of auxiliary data sets by adopting the business state model to obtain default probabilities of the target data set and each auxiliary data set;

a modeling data set determination module for determining a modeling data set from the sample data set based on the default probabilities of the target data set and each auxiliary data set;

the weight parameter determining module is used for determining weight parameters according to the target data set and the modeling data set;

and the model creating module is used for creating the business prediction model based on the modeling data set and the weight parameters.

In a third aspect, an embodiment of the present application further provides a computer-readable storage medium, where instructions are stored in the computer-readable storage medium, and when the instructions are executed, the computer is caused to execute the method for creating the service prediction model in the first aspect or any one of the possible implementation manners of the first aspect.

In a fourth aspect, an embodiment of the present application further provides a computer device, where the computer device includes a processor, a computer-readable storage medium, and a communication unit. The computer-readable storage medium, the communication unit, and the processor are connected through a bus system; the communication unit is configured to be communicatively connected to at least one terminal device; the computer-readable storage medium is configured to store a program, an instruction, or code; and the processor is configured to execute the program, the instruction, or the code in the computer-readable storage medium, so as to implement the method for creating a business prediction model in the first aspect or any possible implementation manner of the first aspect.

Based on any one of the above aspects, a target data set of a business prediction model to be created is first obtained, and a plurality of auxiliary data sets similar to the target data set are found; sample data is then drawn from the plurality of auxiliary data sets to obtain a sample data set, and a business state model is obtained by training on the sample data set; the default probabilities of the target data set and of each auxiliary data set are then obtained through the business state model, and a modeling data set is determined based on the default probabilities; a weight parameter is then determined based on the target data set and the modeling data set; finally, a business prediction model is created from the modeling data set and the weight parameter. In this scheme, auxiliary data sets similar to the target data set are used, the modeling data set is screened out in a quantitative manner, and the weights of the samples in the modeling data set are adjusted, so that the samples in the modeling data set are closer to the samples of the business for which the business prediction model is to be created. The business prediction model can therefore be created even when the target data set contains little data, and the created business prediction model has stronger prediction capability and stability.

Drawings

In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings needed in the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and therefore should not be regarded as limiting its scope; those skilled in the art can obtain other related drawings from these drawings without inventive effort.

Fig. 1 is a schematic flowchart of a method for creating a business prediction model according to an embodiment of the present application;

Fig. 2 is a schematic diagram of the functional modules of a business prediction model creation apparatus according to an embodiment of the present application;

Fig. 3 is a schematic hardware structure diagram of a computer device according to an embodiment of the present application.

Detailed Description

The present application will now be described in detail with reference to the drawings, and the specific operations in the method embodiments may also be applied to the apparatus embodiments or the system embodiments.

In the prior art, in order to solve the technical problems in the background art, one possible solution is to use sample data of other relatively mature services for modeling, however, due to differences in service contents, a service prediction model formed by directly using sample data of other relatively mature services for modeling has the problems of poor prediction capability and poor stability.

Taking the credit business of a financial institution as an example, a business prediction model is usually used to predict the default (business state) probability of a business object (customer). The business prediction model used in the credit approval stage is usually called an application scoring model, and its scoring result generally serves as a basis for approving or rejecting an application. However, developing a business prediction model generally requires a large amount of sample data. In the early stage of a newly launched credit business (such as a large loan business), the number of samples from qualifying business objects is small, post-loan repayment performance is insufficient (the prediction labels of the samples are not yet clear), and usable post-loan samples are lacking, so that a model either cannot be developed, or the developed model has biased prediction capability and unstable performance.

To overcome the deficiencies of the foregoing technical solutions, the inventors provide the following solution. Referring to Fig. 1, Fig. 1 is a schematic flowchart of a business prediction model creation method provided in an embodiment of the present application. The method provided in this embodiment may be executed by a computer device. For convenience of explanation, the method is described in detail below in conjunction with one possible application scenario, a financial loan scenario; it is understood that the technical solution provided in the present application may also be applied to other scenarios, for example, product information promotion based on big data. The following describes the business prediction model creation method provided by the present application, taking the financial loan scenario as an example.

The flow steps of the business prediction model creation method are explained in detail with reference to fig. 1.

Step S11, acquiring a target data set of the business prediction model to be created.

In this step, the service prediction model to be created may be a model for performing service prediction on a new service, where the new service refers to a service in which the service development time is less than a preset time (e.g., 3 months), and the new service may also refer to a service in which the number of sample data generated in the service scene is less than a preset number (e.g., 1000). The target data set refers to a set of sample data generated in a new service scenario.

Step S12, acquiring, based on the target data set, a plurality of auxiliary data sets that satisfy preset business similarity conditions with the target data set.

In this embodiment of the present application, the preset business similarity conditions may include the following.

Taking the financial loan scenario as an example, the business similarity conditions satisfied by the auxiliary data sets S₁, S₂, …, S_n and the target data set S₀ may be as follows:

the auxiliary data sets S₁, S₂, …, S_n and the target data set S₀ have some of the same independent variable (also called predictor variable) fields available for modeling, such as the borrower's basic information and derived fields from the credit report; and

the auxiliary data sets S₁, S₂, …, S_n have good and bad customer labels for modeling, generated according to post-loan repayment performance (business state), that is, a dependent variable (also called response variable or target variable). Because the business corresponding to the target data set S₀ has been running for only a short time and its post-loan repayment performance is insufficient, the target data set S₀ may lack good and bad customer labels.
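As a non-limiting illustration, the two similarity conditions above can be checked per candidate data set as sketched below. The function name meets_similarity_condition, the field names, the label field is_bad and the min_shared threshold are all hypothetical and are not specified by the embodiment.

```python
def meets_similarity_condition(aux_fields, target_fields,
                               label_field="is_bad", min_shared=1):
    """Return True if a candidate data set shares predictor fields with the
    target data set and carries a good/bad customer label.

    label_field and min_shared are illustrative assumptions."""
    shared = (set(aux_fields) & set(target_fields)) - {label_field}
    return label_field in aux_fields and len(shared) >= min_shared


# Target data set S0: predictor fields only, no label yet.
target_fields = ["age", "income", "credit_inquiries"]

# Candidate data sets described by their field names (toy values).
candidates = {
    "S1": ["age", "income", "credit_inquiries", "is_bad"],
    "S2": ["age", "income", "is_bad"],
    "S3": ["age", "income", "credit_inquiries"],   # no label
    "S4": ["device_type", "is_bad"],               # no shared predictors
}

auxiliary = [name for name, fields in candidates.items()
             if meets_similarity_condition(fields, target_fields)]
```

In this toy example only S1 and S2 qualify, since S3 has no label and S4 shares no predictor field with the target data set.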

Step S13, sample data is extracted from the plurality of auxiliary data sets to obtain a sample data set.

In the embodiment of the present application, the same preset number of sample data may be extracted from each auxiliary data set (S₁, S₂, …, S_n) to obtain a sample data set S.

Specifically, the step of extracting the same preset number of sample data from each auxiliary data set (S₁, S₂, …, S_n) to obtain the sample data set S comprises the following steps:

detecting whether the number of sample data in each auxiliary data set (S₁, S₂, …, S_n) is greater than or equal to the preset number (e.g., 10000 pieces);

if the number is detected to be greater than or equal to the preset number, extracting the preset number of sample data from the auxiliary data set by sampling without replacement;

if the number is detected to be smaller than the preset number, extracting the preset number of sample data from the auxiliary data set by sampling with replacement.

In sampling without replacement, each unit drawn from the population is not put back after it has been examined and recorded; the number of units in the population therefore decreases by one with every draw, and the probability of each remaining unit being drawn changes from draw to draw. In sampling with replacement, units are drawn one at a time, and each drawn unit is put back into the population before the next draw.
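The two sampling modes can be sketched as follows using only the Python standard library. The function name draw_samples, the fixed random seed and the toy data sets are assumptions made for illustration and are not part of the embodiment.

```python
import random


def draw_samples(dataset, preset_number, rng):
    """Draw exactly preset_number records from one auxiliary data set."""
    if len(dataset) >= preset_number:
        # Enough records: sample without replacement, so no record repeats.
        return rng.sample(dataset, preset_number)
    # Too few records: sample with replacement, so records may repeat.
    return rng.choices(dataset, k=preset_number)


rng = random.Random(0)          # fixed seed only to make the sketch repeatable
large = list(range(50))         # stands in for an auxiliary data set with enough records
small = list(range(100, 105))   # stands in for an auxiliary data set with too few records
preset_number = 10

from_large = draw_samples(large, preset_number, rng)
from_small = draw_samples(small, preset_number, rng)
sample_set = from_large + from_small   # the sample data set S
```

Each auxiliary data set contributes the same number of records to S, so a large auxiliary data set does not dominate the business state model trained on S.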

Step S14, training according to the sample data set to obtain a business state model for predicting the business state of the business object in the sample data.

In the embodiment of the application, the sample data set S is used for training to obtain a business state model capable of predicting the repayment behavior of a business object (such as a loan customer), that is, whether the customer will be overdue in repayment.

Specifically, during model training, the model parameters may be adjusted according to the difference between the label of the input sample data and the label that the model outputs for that sample data; when the two are substantially consistent, model training ends and the trained business state model is obtained.
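The training procedure described above, in which parameters are adjusted according to the difference between the sample label and the model output, can be sketched for a logistic regression model as follows. The learning rate, the epoch count, the single toy feature and the function names are illustrative assumptions, and a fixed number of epochs stands in for the stopping criterion.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi   # difference between model output and label
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        # Adjust the parameters against the averaged gradient.
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def predict_default_probability(w, b, x):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Toy sample data set S: one scaled feature, label 1 = overdue repayment.
X = [[0.1], [0.2], [0.3], [0.4], [0.6], [0.7], [0.8], [0.9]]
y = [0, 0, 0, 1, 0, 1, 1, 1]
w, b = train_logistic(X, y)
p_low = predict_default_probability(w, b, [0.1])
p_high = predict_default_probability(w, b, [0.9])
```

The trained model can then be applied to every record of the target data set and of each auxiliary data set to obtain the default probabilities used in the following steps.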

Step S15, predicting the target data set and a plurality of auxiliary data sets by adopting a business state model to obtain default probabilities of the target data set and each auxiliary data set.

Specifically, the default probability of the target data set may be used as basic data, the default probabilities of the plurality of auxiliary data sets may be used as test data, and a population stability indicator of each auxiliary data set may be calculated according to the basic data and the test data;

and taking the auxiliary data set with the minimum population stability indicator value as the modeling data set, wherein the population stability indicator measures the deviation between the actual distribution (the test data) and the expected distribution (the basic data) of the model's predicted values.

In the embodiment of the application, the basic data are grouped, and the test data are grouped according to the threshold standard of the grouping of the basic data, wherein the grouping number of the basic data is the same as the grouping number of the test data;

the population stability indicator psi is calculated as follows:

$$\mathrm{psi} = \sum_{i=1}^{n} (A_i - E_i)\,\ln\frac{A_i}{E_i}$$

where n is the number of groups, i is the serial number of the group, A_i is the proportion of samples falling in the i-th group of the test data, and E_i is the proportion of samples falling in the i-th group of the basic data. The population stability indicators of the auxiliary data sets are recorded as psi₁, psi₂, …, psi_n.
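A minimal sketch of this calculation is given below. It assumes the commonly used form of the population stability indicator, the sum over groups of (A_i − E_i)·ln(A_i / E_i), equal-frequency grouping of the basic data, and a small floor eps for empty groups; the grouping rule, the floor and the function names are assumptions not stated in the embodiment.

```python
import math


def quantile_edges(base_scores, n_groups):
    """Cut points splitting the basic data into n_groups equal-frequency groups."""
    ordered = sorted(base_scores)
    return [ordered[len(ordered) * k // n_groups] for k in range(1, n_groups)]


def group_proportions(scores, edges, eps=1e-6):
    counts = [0] * (len(edges) + 1)
    for s in scores:
        idx = sum(1 for e in edges if s >= e)   # index of the group containing s
        counts[idx] += 1
    # eps (an assumption) keeps empty groups from causing log(0) or division by zero.
    return [max(c / len(scores), eps) for c in counts]


def population_stability_index(base_scores, test_scores, n_groups=5):
    # The test data is grouped with the thresholds derived from the basic data.
    edges = quantile_edges(base_scores, n_groups)
    E = group_proportions(base_scores, edges)   # basic data: target data set
    A = group_proportions(test_scores, edges)   # test data: one auxiliary data set
    return sum((a - e) * math.log(a / e) for a, e in zip(A, E))


# Toy default probabilities produced by the business state model.
target_pd = [i / 100 for i in range(100)]
similar_pd = [i / 100 + 0.005 for i in range(100)]
shifted_pd = [min(i / 100 + 0.3, 0.99) for i in range(100)]

psi_similar = population_stability_index(target_pd, similar_pd)
psi_shifted = population_stability_index(target_pd, shifted_pd)
```

An auxiliary data set whose default probabilities are distributed like those of the target data set yields a value near zero, while a shifted distribution yields a larger value.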

Step S16, based on the target data set and the default probability of each auxiliary data set, a modeling data set is determined from the sample data set.

The auxiliary data set corresponding to the minimum value among psi₁, psi₂, …, psi_n is taken as the modeling data set T.
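The selection itself reduces to taking the minimum, as the following short sketch shows. The data set names and indicator values are purely illustrative.

```python
# Hypothetical population stability indicators of three auxiliary data sets.
psi_by_dataset = {"S1": 0.27, "S2": 0.04, "S3": 0.11}

# The auxiliary data set with the smallest indicator becomes the modeling data set T.
modeling_name = min(psi_by_dataset, key=psi_by_dataset.get)
```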

Step S17, determining a weight parameter according to the target data set and the modeling data set.

In the embodiment of the present application, the formula for determining the weight parameter is as follows:

wherein β is a one-dimensional weight parameter array comprising the weight parameters β₁, β₂, …, β_j; m is the number of samples of the modeling data set T; x'_j is the j-th sample of the modeling data set T; n is the number of samples of the target data set S₀; x_i is the i-th sample of the target data set S₀; φ represents Euler's formula; and the constraints of the quadratic programming are that β₁, β₂, …, β_j are each greater than or equal to 0 and that β₁, β₂, …, β_j sum to 1.
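The formula itself is not reproduced in this text, so the following is only a sketch under stated assumptions. It assumes a kernel-mean-matching style objective, minimizing the squared distance between the β-weighted mean of φ(x'_j) over the modeling data set and the plain mean of φ(x_i) over the target data set, subject to the two constraints named above. It further assumes that φ is the implicit feature map of a Gaussian kernel and solves the quadratic program with Frank-Wolfe iterations; none of these choices is confirmed by the source.

```python
import math


def gaussian_kernel(a, b, gamma=1.0):
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def matching_weights(modeling, target, gamma=1.0, iterations=200):
    """Assumed objective: min || sum_j beta_j phi(x'_j) - (1/n) sum_i phi(x_i) ||^2
    subject to beta_j >= 0 and sum_j beta_j = 1 (Frank-Wolfe with line search)."""
    m, n = len(modeling), len(target)
    K = [[gaussian_kernel(a, c, gamma) for c in modeling] for a in modeling]
    kappa = [sum(gaussian_kernel(a, x, gamma) for x in target) / n
             for a in modeling]
    beta = [1.0 / m] * m                      # start from uniform weights
    for _ in range(iterations):
        Kb = [sum(K[j][k] * beta[k] for k in range(m)) for j in range(m)]
        grad = [2.0 * (Kb[j] - kappa[j]) for j in range(m)]
        s = min(range(m), key=grad.__getitem__)   # best vertex of the simplex
        # Directional derivative and curvature along d = e_s - beta.
        gd = grad[s] - sum(g * bj for g, bj in zip(grad, beta))
        dKd = K[s][s] - 2.0 * Kb[s] + sum(bj * kb for bj, kb in zip(beta, Kb))
        if gd > -1e-12 or dKd <= 0.0:
            break                             # no further descent possible
        step = min(1.0, -gd / (2.0 * dKd))    # exact line search, clipped to [0, 1]
        beta = [(1.0 - step) * bj for bj in beta]
        beta[s] += step                       # stays non-negative and sums to 1
    return beta


# Toy data: three modeling samples near the target samples, two far away.
modeling_set = [[0.0], [0.1], [0.2], [2.0], [2.1]]
target_set = [[0.05], [0.15]]
beta = matching_weights(modeling_set, target_set)
```

Under these assumptions, modeling samples that resemble the target samples receive most of the weight, which matches the stated purpose of bringing the weighted modeling samples closer to the target customer group.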

Step S18, creating the business prediction model based on the modeling data set and the weight parameter.

In the embodiment of the application, sample data in the modeling data set is used as a modeling sample, and a weight parameter is used as the weight of the sample data in the modeling data set for model creation, so that the business prediction model is obtained.
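Using the weight parameter as the sample weight can be sketched as a weighted log loss, in which each sample's contribution to the gradient is scaled by its weight. The example assumes a logistic regression model; the toy data, the weight values, the learning rate and the function names are illustrative assumptions.

```python
import math


def fit_weighted_logistic(X, y, sample_weight, lr=0.5, epochs=3000):
    """Gradient descent on the weighted log loss sum_j beta_j * loss_j."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xj, yj, bj in zip(X, y, sample_weight):
            z = sum(wk * xk for wk, xk in zip(w, xj)) + b
            err = bj * (1.0 / (1.0 + math.exp(-z)) - yj)   # weighted residual
            grad_w = [g + err * xk for g, xk in zip(grad_w, xj)]
            grad_b += err
        w = [wk - lr * g for wk, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    z = sum(wk * xk for wk, xk in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


# Toy modeling data set T with weights beta that sum to 1.
X = [[0.0], [1.0], [0.0], [1.0]]
y = [0, 1, 1, 0]
beta = [0.4, 0.4, 0.1, 0.1]   # the first two samples are assumed closest to the target group
w, b = fit_weighted_logistic(X, y, beta)
p_at_0 = predict(w, b, [0.0])
p_at_1 = predict(w, b, [1.0])
```

With equal weights the four toy samples would contradict each other and the fitted probability would be 0.5 everywhere; with the weights above, the fit follows the heavily weighted samples, approaching 0.2 at x = 0 and 0.8 at x = 1.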

According to the business prediction model creation method provided by the embodiment of the application, the auxiliary data set similar to the target data set is used, the modeling data set is screened out in a quantitative mode (the group stability index is adopted to determine the modeling data set), the sample weight in the modeling data set is adjusted (the weighted modeling sample data is closer to the target customer group, the sample deviation is reduced, and the model prediction capability and stability are improved), so that the samples in the modeling data set are closer to the samples of the business corresponding to the business prediction model to be created, the business prediction model can be created under the condition that the data volume of the target data set is less, and the created business prediction model has stronger prediction capability and stability.

Further, in the embodiment of the present application, the business state model and the business prediction model may each be a logistic regression model, a binary classification model, a random forest model, a gradient boosting decision tree model, or the like. Preferably, both the business state model and the business prediction model are logistic regression models; compared with the other models, a logistic regression model is more interpretable and can reduce the risk of overfitting.

Referring to Fig. 2, Fig. 2 is a schematic diagram of the functional modules of a business prediction model creation apparatus according to an embodiment of the present application. In this embodiment, the functional modules of the business prediction model creation apparatus 20 may be divided according to the method embodiment executed by the computer device; that is, the following functional modules of the business prediction model creation apparatus 20 may be used to execute the method embodiments executed by the computer device. The business prediction model creation apparatus 20 may include a first obtaining module 21, a second obtaining module 22, a sample extraction module 23, a model training module 24, a default probability prediction module 25, a modeling data set determination module 26, a weight parameter determination module 27, and a model creation module 28. The functions of these functional modules are described in detail below.

The first obtaining module 21 is configured to obtain a target data set of a service prediction model to be created.

The service prediction model to be created may be a model for performing service prediction on a new service, where the new service refers to a service in which the service development time is less than a preset time (e.g., 3 months), and the new service may also refer to a service in which the number of sample data generated in the service scenario is less than a preset number (e.g., 1000). The target data set refers to a set of sample data generated in a new service scenario.

A second obtaining module 22, configured to obtain, based on the target data set, multiple auxiliary data sets that satisfy a preset service similarity condition with the target data set.

A sample extraction module 23, configured to extract sample data from the multiple auxiliary data sets to obtain a sample data set.

Specifically, the sample extraction module 23 extracts the same preset number of sample data from each auxiliary data set (S₁, S₂, …, S_n) to obtain the sample data set S; if the number of sample data in an auxiliary data set is detected to be smaller than the preset number, the preset number of sample data is extracted from that auxiliary data set by sampling with replacement.

And the model training module 24 is configured to train according to the sample data set to obtain a service state model for predicting a service state of a service object in the sample data.

And a default probability prediction module 25, configured to predict the target data set and the multiple auxiliary data sets by using the service state model, so as to obtain default probabilities of the target data set and each auxiliary data set.

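The population stability indicator psi of each auxiliary data set is calculated in the same way as in the method embodiment:

$$\mathrm{psi} = \sum_{i=1}^{n} (A_i - E_i)\,\ln\frac{A_i}{E_i}$$

where n is the number of groups, A_i is the proportion of samples falling in the i-th group of the test data, and E_i is the proportion of samples falling in the i-th group of the basic data.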

A modeling data set determining module 26, configured to determine a modeling data set from the sample data set based on the default probability of the target data set and each auxiliary data set.

A weight parameter determining module 27, configured to determine a weight parameter according to the target data set and the modeling data set.

In this embodiment, the formula used by the weight parameter determining module 27 to determine the weight parameter may be as follows:

wherein β is a one-dimensional weight parameter array comprising the weight parameters β₁, β₂, …, β_j; m is the number of samples of the modeling data set T; x'_j is the j-th sample of the modeling data set T; n is the number of samples of the target data set S₀; x_i is the i-th sample of the target data set S₀; φ represents Euler's formula; and the constraints of the quadratic programming are that β₁, β₂, …, β_j are each greater than or equal to 0 and that β₁, β₂, …, β_j sum to 1.

A model creation module 28, configured to create the business prediction model based on the modeling dataset and the weight parameter.

It should be noted that the division of the modules in the above apparatus or system is only a logical division, and the actual implementation may be wholly or partially integrated into one physical entity or may be physically separated. And these modules can be implemented in the form of software (e.g., open source software) that can be invoked by a processor; or may be implemented entirely in hardware; and part of the modules can be realized in the form of calling software by a processor, and part of the modules can be realized in the form of hardware. For example, the model creating module 28 may be implemented by a single processor, for example, the model creating module may be stored in a memory of the device or system in the form of program code, and a certain processor of the device or system calls and executes the functions of the model creating module 28, and the implementation of other modules is similar and will not be described herein again. In addition, the modules can be wholly or partially integrated together or can be independently realized. The processor described herein may be an integrated circuit with signal processing capability, and in the implementation process, each step or each module in the above technical solutions may be implemented in the form of an integrated logic circuit in the processor or a software program executed.

Referring to fig. 3, fig. 3 is a schematic diagram illustrating a hardware structure of a computer device 10 for implementing the business prediction model creating method according to the embodiment of the present disclosure, where the computer device 10 may be implemented on a cloud server. As shown in fig. 3, computer device 10 may include a processor 11, a computer-readable storage medium 12, a bus 13, and a communication unit 14.

In a specific implementation process, at least one processor 11 executes computer-executable instructions stored in the computer-readable storage medium 12 (for example, the modules included in the business prediction model creation apparatus 20 shown in Fig. 2), so that the processor 11 may execute the business prediction model creation method according to the above method embodiment. The processor 11, the computer-readable storage medium 12, and the communication unit 14 are connected through the bus 13, and the processor 11 may be used to control data reception and transmission of the communication unit 14.

For the specific implementation process of the processor 11, reference may be made to the above method embodiments executed by the computer device 10; the implementation principle and technical effects are similar and are not described again here.

Computer-readable storage medium 12 may include random access memory and may also include non-volatile storage, such as at least one disk storage.

The bus 13 may be divided into an address bus, a data bus, a control bus, and so on. The buses in the figures of the present application are drawn for ease of illustration and are not limited to only one bus or one type of bus.

In addition, an embodiment of the present application further provides a readable storage medium storing computer-executable instructions; when a processor executes the computer-executable instructions, the business prediction model creation method described above is implemented.

In summary, according to the method, the apparatus, and the computer-readable storage medium for creating a business prediction model provided in the embodiments of the present application, first, a target data set of the business prediction model to be created is obtained, and a plurality of auxiliary data sets similar to the target data set are found; then, samples are drawn from the plurality of auxiliary data sets to obtain a sample data set, and a business state model is trained on the sample data set; then, the default probability of the target data set and of each auxiliary data set is obtained through the business state model, and a modeling data set is determined based on the default probability; then, a weight parameter is determined based on the target data set and the modeling data set; and finally, the business prediction model is created from the modeling data set and the weight parameter. In this scheme, auxiliary data sets similar to the target data set are used, the modeling data set is screened out quantitatively, and the weights of the samples in the modeling data set are adjusted, so that the samples in the modeling data set are closer to the samples of the business corresponding to the business prediction model to be created. As a result, the business prediction model can be created even when the target data set contains little data, and the created business prediction model has stronger prediction capability and stability.
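As a non-limiting illustration, the quantitative screening of auxiliary data sets by default probability may be sketched in Python as follows. The sketch rests on several assumptions that are not stated in this passage: the population stability indicator is taken in its conventional form, the sum over groups of (test share − base share) × ln(test share / base share); the default probabilities of the target data set serve as the base data and those of each auxiliary data set as the test data; groups are deciles of the base data; and an auxiliary data set is kept when its indicator is below 0.1. The names `psi`, `select_auxiliary_sets`, `n_groups`, `eps`, and `threshold` are hypothetical.

```python
import numpy as np


def psi(base_scores, test_scores, n_groups=10, eps=1e-6):
    """Conventional population stability index between two score samples.

    Group thresholds are derived from the base data only and reused for the
    test data, so both samples are split into the same number of groups.
    """
    base = np.asarray(base_scores, dtype=float)
    test = np.asarray(test_scores, dtype=float)
    # Interior quantile cut points of the base data (assumed grouping rule).
    cuts = np.quantile(base, np.linspace(0.0, 1.0, n_groups + 1)[1:-1])
    # Map every score to a group index in 0 .. n_groups - 1.
    base_idx = np.searchsorted(cuts, base, side="right")
    test_idx = np.searchsorted(cuts, test, side="right")
    base_share = np.bincount(base_idx, minlength=n_groups) / base.size
    test_share = np.bincount(test_idx, minlength=n_groups) / test.size
    # Clip empty groups to a small positive share to avoid log(0).
    base_share = np.clip(base_share, eps, None)
    test_share = np.clip(test_share, eps, None)
    return float(np.sum((test_share - base_share) * np.log(test_share / base_share)))


def select_auxiliary_sets(target_scores, aux_scores_by_name, threshold=0.1):
    """Keep auxiliary sets whose PSI against the target is below the threshold."""
    return [
        name
        for name, scores in aux_scores_by_name.items()
        if psi(target_scores, scores) < threshold
    ]


# Synthetic default probabilities, for demonstration only.
rng = np.random.default_rng(0)
target = rng.beta(2, 8, size=2000)
auxiliary = {
    "similar": rng.beta(2, 8, size=5000),  # same score distribution as target
    "shifted": rng.beta(5, 5, size=5000),  # clearly different distribution
}
print(select_auxiliary_sets(target, auxiliary))
```

In this sketch the scores are passed in directly; in the scheme described above they would be the default probabilities output by the business state model for the target data set and for each auxiliary data set.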

The embodiments described above are only a part of the embodiments of the present application, and not all of the embodiments. The components of the embodiments of the present application, as generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the detailed description of the embodiments of the present application provided in the accompanying drawings is not intended to limit the scope of the application, but is merely representative of selected embodiments of the application. Based on this, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

1. A business prediction model creation method applied to a computer device, the method comprising:

acquiring a target data set of a business prediction model to be created;

2. The business prediction model creation method according to claim 1, wherein in the step of obtaining, based on the target data set, a plurality of auxiliary data sets satisfying a preset business similarity condition with the target data set, the preset business similarity condition includes:

3. The business prediction model creation method according to claim 1, wherein the step of extracting sample data from the plurality of auxiliary data sets to obtain a sample data set comprises:

4. The business prediction model creation method according to claim 1, wherein the step of determining a modeling data set from the sample data set based on the default probability of the target data set and of each auxiliary data set comprises:

5. The business prediction model creation method according to claim 4, wherein in the step of calculating the population stability indicator of each auxiliary data set based on the base data and the test data, the base data are grouped, and the test data are grouped according to the grouping thresholds of the base data, wherein the number of groups of the base data is the same as the number of groups of the test data;

the population stability indicator psi is calculated as follows:

6. The business prediction model creation method according to claim 5, wherein in the step of determining a weight parameter from the target data set and the modeling data set, the formula for determining the weight parameter is as follows:

7. The business prediction model creation method of claim 6, wherein the step of creating the business prediction model based on the modeling dataset and the weight parameter comprises:

8. The method of creating a business prediction model of claim 7, wherein the business state model and the business prediction model are logistic regression models.

9. An apparatus for creating a business prediction model, applied to a computer device, the apparatus comprising:

10. A computer-readable storage medium having stored therein instructions that, when executed, cause a computer device to perform the business prediction model creation method of any of claims 1-8.