CN113610176A - Cross-scene migration classification model forming method and device and readable storage medium - Google Patents
- Publication number
- CN113610176A (application CN202110939361.2A)
- Authority
- CN
- China
- Prior art keywords
- sample set
- target domain
- training sample
- model
- classifier
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Abstract
According to the cross-scene migration classification model forming method and device and the readable storage medium, a source domain sample set and a target domain training sample set form a training sample set, and the training sample set and the target domain test set follow different distributions. By adjusting the sample weights of the source domain samples and the target domain training samples, the source domain samples whose distribution is closest to the target domain are found; at the same time, the influence of the target domain sample loss is amplified, the weight of effective data is increased, and the weight of ineffective data is reduced. Whether iteration should end is judged from the model effect parameters, and the best learner obtained during training is used for decision-making. Compared with the prior art, in which a plurality of weak classifiers are boosted and the final decision is obtained by a comprehensive vote over the second half of them, the present scheme makes decisions with the learner that performs best during training, so the entire transfer learning process effectively occurs only during model training, which reduces development difficulty.
Description
Technical Field
The application relates to the technical field of artificial intelligence, in particular to a method and a device for forming a cross-scene migration classification model and a readable storage medium.
Background
The traditional machine learning model is established on the assumption that the training set and the test set obey the same distribution. In many cases this assumption does not hold: sometimes the training set is out of date, and re-labeling data is costly. It is therefore desirable to train a classifier on several training sets with different distributions and still obtain a good classification effect on the test set. For example, a unit (e.g., a bank) may have launched a new service (e.g., a large-amount loan service) only recently; the number of samples in that scene is then insufficient and bad samples are scarce. The bank nevertheless urgently needs a model for the new service scene, and forcing a model under these conditions may yield weak predictive capability and an unstable model.
Disclosure of Invention
In order to overcome at least the above-mentioned deficiencies in the prior art, the present application aims to provide a cross-scene migration classification model forming method, apparatus and readable storage medium that solve the above technical problems.
In a first aspect, an embodiment of the present application provides a method for forming a cross-scene migration classification model, which is applied to a computer device, and the method includes:
initializing sample set weight parameters and determining a classification algorithm and iteration times, wherein the sample set comprises a training sample set consisting of a source domain sample set and a target domain training sample set and a target domain test set consisting of target domain test samples;
calling the classification algorithm, and obtaining a classifier on the target domain test set based on the weight distribution condition of each sample in the training sample set and the target domain test set;
calculating the error rate of the classifier in the training sample set of the target domain, and adjusting the weight of the training sample set based on the error rate;
calculating model effect parameters of the classifier on the target domain test set, and storing corresponding iteration labels;
and detecting whether the model effect parameter satisfies an iteration end condition; when the iteration end condition is not satisfied, returning to the step of calling the classification algorithm and obtaining a classifier on the target domain test set based on the weight distribution of each sample in the training sample set and the target domain test set, until the model effect parameter satisfies the iteration end condition; and taking the classifier corresponding to the model effect parameter that satisfies the iteration end condition as the trained classification model.
According to the scheme provided by the application, a source domain sample set and a target domain training sample set form a training sample set, and the training sample set and the target domain test set follow different distributions. By adjusting the sample weights of the source domain samples and the target domain training samples, the source domain samples whose distribution is closest to the target domain are found; at the same time, the influence of the target domain sample loss is amplified, the weight of effective data is increased, and the weight of ineffective data is reduced. Whether iteration should end is judged from the model effect parameters, and the best learner obtained during training is used for decision-making. Compared with the prior-art transfer learning algorithm, in which a plurality of weak classifiers are boosted and the final decision is obtained by a comprehensive vote over the second half of them, a single model (classifier) is used here for the final prediction, so the method suits more actual services.
In one possible implementation, in the step of initializing the sample set weight parameter and determining the classification algorithm and the number of iterations:
initializing a weight vector W1 of the training sample set and a weight adjustment parameter β;
wherein, in the weight vector W1 = (w11, …, w(n+m)1), n is the number of samples in the source domain sample set, m is the number of samples in the target domain training sample set, and N is the number of iterations.
In one possible implementation, the weight distribution Pt over the training sample set satisfies the following formula:
Pt = Wt / (Σi=1..n+m wit)
where wit denotes the i-th component of Wt, and t = 1, …, N is the corresponding number of iterations.
In one possible implementation, the step of calculating an error rate of the classifier in the training sample set of the target domain and adjusting the weight of the training sample set based on the error rate includes:
calculating the error rate of the classifier in the target domain training sample set;
correcting a weight adjustment parameter based on the error rate;
adjusting the weights of the samples in the training sample set based on the corrected weight adjustment parameters;
calculating the error rate εt of the classifier in the target domain training sample set by the following formula:
εt = (Σi=n+1..n+m wit · |ht(xi) − c(xi)|) / (Σi=n+1..n+m wit)
corrected weight adjustment parameter βt:
βt = εt/(1 − εt)
the weight distribution of the samples in the adjusted training sample set satisfies the following conditions:
wi(t+1) = wit · β^|ht(xi) − c(xi)|, for the source domain samples i = 1, …, n;
wi(t+1) = wit · βt^(−|ht(xi) − c(xi)|), for the target domain samples i = n+1, …, n+m;
wherein ht(x) is the predicted label probability and c(x) is the labeled label probability.
In a possible implementation manner, in the step of detecting whether the model effect parameter satisfies an iteration end condition, the iteration end condition includes any one of the following three conditions:
the first model effect parameter meets a corresponding judgment rule;
the second model effect parameter meets a corresponding judgment rule; or
the first model effect parameter or the second model effect parameter satisfies a corresponding judgment rule.
In a possible implementation manner, the source domain samples are consumption loan samples with amounts smaller than a first consumption amount, and the target domain samples are consumption loan samples with amounts larger than a second consumption amount, wherein the first consumption amount is not larger than the second consumption amount.
In one possible implementation, the classification algorithm includes an extreme gradient boosting model.
In a second aspect, an embodiment of the present application further provides an apparatus for forming a cross-scene migration classification model, which is applied to a computer device, and the apparatus includes:
the initialization module is used for initializing weight parameters of a sample set and determining a classification algorithm and iteration times, wherein the sample set comprises a training sample set consisting of a source domain sample set and a target domain training sample set and a target domain test set consisting of target domain test samples;
the calling module is used for calling the classification algorithm and obtaining a classifier on the target domain test set based on the weight distribution condition of each sample in the training sample set and the target domain test set;
the calculation and adjustment module is used for calculating the error rate of the classifier in the training sample set of the target domain and adjusting the weight of the training sample set based on the error rate;
the calculation and storage module is used for calculating the model effect parameters of the classifier on the target domain test set and storing the corresponding iteration labels;
and the detection module is used for detecting whether the model effect parameter meets the iteration end condition, repeatedly executing the functions of the calling module, the calculation and adjustment module and the calculation and storage module when the iteration end condition is not met, until the model effect parameter meets the iteration end condition, and taking the classifier corresponding to the model effect parameter that meets the iteration end condition as the trained classification model.
In a third aspect, an embodiment of the present application provides a readable storage medium, where instructions are stored in the readable storage medium, and when the instructions are executed, the instructions cause a computer to execute the method for forming a cross-scene migration classification model in the first aspect or any one of the possible implementation manners of the first aspect.
Based on any one of the above aspects, the source domain sample set and the target domain training sample set are combined into a training sample set, and the training sample set and the target domain test set follow different distributions. By adjusting the sample weights of the source domain samples and the target domain training samples, the source domain samples whose distribution is closest to the target domain are found; at the same time, the influence of the target domain sample loss is amplified, the weight of effective data is increased, and the weight of ineffective data is reduced. Whether iteration should end is judged from the model effect parameters, and the best learner obtained during training is used for decision-making. Compared with the prior-art transfer learning algorithm, in which a plurality of weak classifiers are boosted and the final decision is obtained by a comprehensive vote over the second half of them, a single model (classifier) is used here for the final prediction, so the method suits more actual services.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that need to be used in the embodiments are briefly described below. It should be understood that the following drawings only illustrate some embodiments of the present application and should therefore not be considered as limiting the scope; for those skilled in the art, other related drawings can be obtained from these drawings without inventive effort.
Fig. 1 is a schematic flowchart of a cross-scene migration classification model forming method according to an embodiment of the present application;
FIG. 2 is a flowchart illustrating a sub-step of step S13 in FIG. 1;
fig. 3 is a functional module schematic diagram of a cross-scene migration classification model forming apparatus according to an embodiment of the present application;
fig. 4 is a schematic hardware structure diagram of a computer device according to an embodiment of the present application.
Detailed Description
The present application will now be described in detail with reference to the drawings, and the specific operations in the method embodiments may also be applied to the apparatus embodiments or the system embodiments.
In order to solve the technical problems described in the background art, the prior art screens effective data out of the source domain data through a transfer learning algorithm (the TrAdaBoost algorithm) and filters out data that does not match the target domain. A weight adjustment mechanism is established through the Boosting method: for a source domain sample, the closer the predicted value is to the label, the larger the weight; for a target domain sample the opposite holds, and the larger the difference between the predicted value and the label, the larger the weight. The strategy is to find, among the source domain samples, those whose distribution is closest to the target domain, while amplifying the influence of the target domain sample loss, increasing the weight of effective data and reducing the weight of ineffective data. The original TrAdaBoost algorithm boosts a plurality of weak classifiers and lets the second half of them cast a comprehensive vote to obtain the final decision. The whole transfer learning process may therefore occur in both the model training and testing processes, and the model development difficulty is high.
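The opposite weight updates for source-domain and target-domain samples described above can be sketched as follows (a minimal illustration only; the helper name and the concrete beta values are hypothetical, and errors are assumed to lie in [0, 1]):

```python
# Illustrative direction of the TrAdaBoost-style weight update.
def update_weight(weight, error, beta_src, beta_tgt, is_source):
    """error = |h(x) - c(x)| in [0, 1].
    Source rows use beta_src in (0, 1): larger error -> smaller new weight
    (mismatched source data is filtered out). Target rows use beta_tgt > 1:
    larger error -> larger new weight (hard target samples get more attention)."""
    base = beta_src if is_source else beta_tgt
    return weight * base ** error

# A source sample with a large error loses weight; a target sample gains it.
w_src = update_weight(1.0, 0.9, beta_src=0.5, beta_tgt=2.0, is_source=True)
w_tgt = update_weight(1.0, 0.9, beta_src=0.5, beta_tgt=2.0, is_source=False)
```

With a zero error, the weight is left unchanged in both domains, which matches the "closer prediction, larger weight" rule for the source domain.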
In order to overcome the above-mentioned deficiencies, the inventor provides the following solution. Referring to fig. 1, fig. 1 is a schematic flow diagram of the cross-scene migration classification model forming method provided in an embodiment of the present application; the method may be executed by a computer device. For convenience of explaining the technical solution of the present application, the method is described in detail below in conjunction with a possible application scenario, namely a financial loan scenario. It is understood that the technical solution provided in this application may also be applied to other scenarios, for example, product information promotion based on big data. The cross-scene migration classification model forming method provided by the application is described below taking the financial loan scenario as an example.
The flow steps of the cross-scene migration classification model forming method are explained in detail with reference to fig. 1.
And step S11, initializing sample set weight parameters and determining a classification algorithm and iteration times.
The sample set may include a training sample set T (T = A ∪ B) composed of a source domain sample set A and a target domain training sample set B, and a target domain test set S composed of target domain test samples, where the source domain refers to existing old knowledge and the target domain is the new knowledge to be learned. In the embodiment of the application, the source domain samples may be consumption loan samples with amounts smaller than a first consumption amount, and the target domain samples may be consumption loan samples with amounts larger than a second consumption amount, where the first consumption amount is not larger than the second consumption amount.
In this step, the step of initializing the sample set weight parameters specifically includes:
initializing a weight vector W1 of the training sample set and a weight adjustment parameter β;
wherein, in the weight vector W1 = (w11, …, w(n+m)1), n is the number of samples in the source domain sample set, m is the number of samples in the target domain training sample set, and N is the number of iterations.
In the embodiment of the present application, the classification algorithm may use an extreme gradient boost model (XGBoost model).
In the prior art, the sample weight vectors of the source domain and the target domain are sometimes initialized with all ones. Considering that the source domain data volume is often sufficient while that of the target domain is not, the present method makes the total weight of the data of the two domains equal; with all-ones initialization the model would obviously be biased toward the source domain. By contrast, the initialization method of the present application is more reasonable.
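The equal-total-weight initialization described above can be sketched as follows (a minimal sketch that follows the description, with each domain given total weight 1/2; the function name is illustrative):

```python
def init_weights(n, m):
    """Each domain receives total weight 1/2, so the source domain cannot
    dominate merely by having more rows (unlike all-ones initialization)."""
    w_source = [1.0 / (2 * n)] * n   # source-domain sample weights
    w_target = [1.0 / (2 * m)] * m   # target-domain training sample weights
    return w_source + w_target

# 1000 source rows vs. only 50 target rows still contribute equally in total.
w = init_weights(n=1000, m=50)
```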
And step S12, calling the classification algorithm, and obtaining a classifier on the target domain test set based on the weight distribution condition of each sample in the training sample set and the target domain test set.
Wherein the weight distribution Pt over the training sample set satisfies the following formula:
Pt = Wt / (Σi=1..n+m wit)
where wit denotes the i-th component of Wt, and t = 1, …, N is the corresponding number of iterations.
A classifier ht is thereby obtained on the target domain test set S.
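Obtaining the classifier of step S12 amounts to normalizing the current weight vector into the distribution Pt and fitting the base learner with per-sample weights; a minimal sketch of the normalization (the `clf.fit(X, y, sample_weight=...)` convention mentioned in the comment follows common libraries such as XGBoost and scikit-learn and is illustrative, not the patent's exact code):

```python
def normalize(weights):
    """P_t: the current weight vector scaled to sum to 1; used as the
    per-sample weight distribution when fitting the base classifier,
    e.g. clf.fit(X, y, sample_weight=normalize(w)) in XGBoost-style APIs."""
    total = sum(weights)
    return [w / total for w in weights]

p_t = normalize([2.0, 1.0, 1.0])
```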
And step S13, calculating the error rate of the classifier in the training sample set of the target domain, and adjusting the weight of the training sample set based on the error rate.
Referring to fig. 2, in the embodiment, step S13 can be implemented by the following sub-steps.
And a substep S131, calculating the error rate εt of the classifier in the target domain training sample set:
εt = (Σi=n+1..n+m wit · |ht(xi) − c(xi)|) / (Σi=n+1..n+m wit)
where ht(x) is the predicted label probability and c(x) is the labeled label probability.
And a substep S132 of correcting the weight adjustment parameter based on the error rate.
Corrected weight adjustment parameter βt:
βt=εt/(1-εt)
and a substep S133 of adjusting the weights of the samples in the training sample set based on the corrected weight adjustment parameters.
The weight distribution of the samples in the adjusted training sample set satisfies the following conditions:
wi(t+1) = wit · β^|ht(xi) − c(xi)|, for the source domain samples i = 1, …, n;
wi(t+1) = wit · βt^(−|ht(xi) − c(xi)|), for the target domain samples i = n+1, …, n+m.
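Sub-steps S131 through S133 can be sketched together as follows (a hedged, TrAdaBoost-style reconstruction: the fixed source-domain factor β and the iteration count N below are assumptions, since the patent text does not print that formula; h holds predicted label probabilities and c the true 0/1 labels):

```python
import math

def adjust_weights(w, h, c, n, N=10):
    """w: weights for n source rows followed by m target rows.
    Returns (new weights, target-domain error rate eps_t)."""
    m = len(w) - n
    # Sub-step S131: weighted error rate on the target-domain training rows only.
    tgt_total = sum(w[n:])
    eps = sum(w[n + i] * abs(h[n + i] - c[n + i]) for i in range(m)) / tgt_total
    # Sub-step S132: corrected weight adjustment parameter beta_t = eps/(1 - eps).
    beta_t = eps / (1.0 - eps)
    # Fixed source-domain factor beta < 1 (classic TrAdaBoost choice, assumed).
    beta = 1.0 / (1.0 + math.sqrt(2.0 * math.log(n) / N))
    # Sub-step S133: down-weight mispredicted source rows,
    # up-weight mispredicted target rows.
    new_w = [w[i] * beta ** abs(h[i] - c[i]) for i in range(n)]
    new_w += [w[n + i] * beta_t ** (-abs(h[n + i] - c[n + i])) for i in range(m)]
    return new_w, eps

# Two source rows (predicted perfectly) and two target rows (one mispredicted).
new_w, eps = adjust_weights([0.25, 0.25, 0.25, 0.25],
                            [0.0, 1.0, 0.2, 1.0], [0, 1, 1, 1], n=2)
```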
and step S14, calculating the model effect parameters of the classifier on the target domain test set, and storing the corresponding iteration labels.
In this step, the step of implementing step S14 is as follows:
firstly, calculating an area value (AUC value) under a subject working characteristic curve, and using the area value as a first model effect parameter of the model effect parameter, wherein the subject working characteristic curve is a curve drawn by using a True Positive Rate (TPR) as a vertical coordinate and a False Positive Rate (FPR) as a horizontal coordinate.
Then, a maximum value (ks ═ max (abs (TPR-FPR))) of an absolute value of a difference between the ordinate and the abscissa in the subject's work characteristic curve is calculated, and the maximum value is set as a second model effect parameter of the model effect parameters.
And finally, recording the iteration times corresponding to the model effect parameters.
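The two effect parameters of step S14 can be computed without plotting any curve; a self-contained sketch (AUC via the rank identity "probability that a random positive outscores a random negative", KS as max|TPR − FPR| swept over the observed score thresholds; the function name is illustrative):

```python
def auc_ks(y_true, y_score):
    """Return (AUC, KS) for binary labels and real-valued scores."""
    pos = [s for s, y in zip(y_score, y_true) if y == 1]
    neg = [s for s, y in zip(y_score, y_true) if y == 0]
    # AUC: fraction of (positive, negative) pairs ranked correctly (ties = 1/2).
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    auc = wins / (len(pos) * len(neg))
    # KS: sweep thresholds over observed scores, track max |TPR - FPR|.
    ks = 0.0
    for thr in set(y_score):
        tpr = sum(1 for p in pos if p >= thr) / len(pos)
        fpr = sum(1 for q in neg if q >= thr) / len(neg)
        ks = max(ks, abs(tpr - fpr))
    return auc, ks

auc, ks = auc_ks([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```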
And step S15, detecting whether the model effect parameter meets the iteration end condition.
If the iteration end condition is satisfied, the process proceeds to step S16, and if the iteration end condition is not satisfied, the process returns to step S12.
In step S15, the iteration end condition includes any one of the following three:
the first model effect parameter meets a corresponding judgment rule;
the second model effect parameter meets a corresponding judgment rule; or
the first model effect parameter or the second model effect parameter satisfies a corresponding judgment rule.
And step S16, taking the classifier corresponding to the condition that the model effect parameter meets the iteration end as a trained classification model.
According to the cross-scene migration classification model forming method, a source domain sample set and a target domain training sample set form a training sample set, and the training sample set and the target domain test set follow different distributions. By adjusting the sample weights of the source domain samples and the target domain training samples, the source domain samples whose distribution is closest to the target domain are found; at the same time, the influence of the target domain sample loss is amplified, the weight of effective data is increased, and the weight of ineffective data is reduced. Whether iteration should end is judged from the model effect parameters, and the best learner obtained during training is used for decision-making. Compared with the prior-art transfer learning algorithm, in which a plurality of weak classifiers are boosted and the final decision is obtained by a comprehensive vote over the second half of them, the learner that performs best during training makes the decision, the entire transfer learning process effectively occurs only during model training, and the development difficulty is reduced. Meanwhile, the single model (classifier) is used for final prediction, so the method suits more actual services.
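Putting steps S11 through S16 together, the training loop with best-iteration selection can be sketched as follows (a hypothetical outline: `fit_classifier` and `effect` stand in for the XGBoost fit and the AUC/KS computation, the S13 weight update is elided as a comment, and "keep the best of N rounds" is one way to realize the patent's end condition):

```python
def train_transfer_model(fit_classifier, effect, X, y, n, N=10):
    """X, y: n source rows followed by the target-domain training rows.
    fit_classifier(X, y, weights) -> model; effect(model) -> score to maximize."""
    m = len(y) - n
    w = [1.0 / (2 * n)] * n + [1.0 / (2 * m)] * m  # step S11: equal-total-weight init
    best_model, best_score, best_round = None, float("-inf"), -1
    for t in range(N):
        total = sum(w)
        p = [wi / total for wi in w]               # weight distribution P_t
        model = fit_classifier(X, y, p)            # step S12: weighted fit
        # (step S13's error-rate computation and weight update would go here)
        score = effect(model)                      # step S14: AUC or KS
        if score > best_score:                     # steps S15/S16: keep the best
            best_model, best_score, best_round = model, score, t
    return best_model, best_round

# Tiny demonstration with stand-in fit/effect functions (hypothetical):
def _fake_fit(X, y, p):
    _fake_fit.t = getattr(_fake_fit, "t", 0) + 1
    return _fake_fit.t                             # "model" = its round number

best, best_round = train_transfer_model(
    _fake_fit, lambda mdl: -abs(mdl - 3), X=[[0]] * 4, y=[0, 1, 0, 1], n=2, N=5)
```

The demonstration scores round 3 (index 2) highest, so that single classifier, not a vote over several rounds, is returned as the trained model.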
Referring to fig. 3, fig. 3 is a schematic diagram of the functional modules of a cross-scene migration classification model forming apparatus provided in an embodiment of the present disclosure. In this embodiment, the functional modules of the cross-scene migration classification model forming apparatus 20 may be divided according to the method embodiment executed by the computer device; that is, the following functional modules may be used to execute the method embodiments executed by the computer device. The cross-scene migration classification model forming apparatus 20 may include an initialization module 21, a calling module 22, a calculation and adjustment module 23, a calculation and storage module 24, and a detection module 25. The functions of these functional modules are described in detail below.
And the initialization module 21 is used for initializing the sample set weight parameters and determining a classification algorithm and iteration times.
The sample set may include a training sample set T (T = A ∪ B) composed of a source domain sample set A and a target domain training sample set B, and a target domain test set S composed of target domain test samples, where the source domain refers to existing old knowledge and the target domain is the new knowledge to be learned. In the embodiment of the application, the source domain samples may be consumption loan samples with amounts smaller than a first consumption amount, and the target domain samples may be consumption loan samples with amounts larger than a second consumption amount, where the first consumption amount is not larger than the second consumption amount.
In the embodiment of the present application, the function of the initialization module 21 may be as follows:
initializing a weight vector W1 of the training sample set and a weight adjustment parameter β;
wherein, in the weight vector W1 = (w11, …, w(n+m)1), n is the number of samples in the source domain sample set, m is the number of samples in the target domain training sample set, and N is the number of iterations.
In the embodiment of the present application, the classification algorithm may use an extreme gradient boost model (XGBoost model).
In the prior art, the sample weight vectors of the source domain and the target domain are sometimes initialized with all ones. Considering that the source domain data volume is often sufficient while that of the target domain is not, the present method makes the total weight of the data of the two domains equal; with all-ones initialization the model would obviously be biased toward the source domain. By contrast, the initialization method of the present application is more reasonable.
And the calling module 22 is configured to call the classification algorithm, and obtain a classifier on the target domain test set based on the weight distribution of each sample in the training sample set and the target domain test set.
Wherein the weight distribution Pt over the training sample set satisfies the following formula:
Pt = Wt / (Σi=1..n+m wit)
where wit denotes the i-th component of Wt, and t = 1, …, N is the corresponding number of iterations.
A classifier ht is thereby obtained on the target domain test set S.
And a calculating and adjusting module 23, configured to calculate the error rate of the classifier in the target domain training sample set, and adjust the weight of the training sample set based on the error rate.
In the embodiment of the present application, the functions of the calculation and adjustment module 23 may be as follows.
First, the error rate εt of the classifier in the target domain training sample set is calculated:
εt = (Σi=n+1..n+m wit · |ht(xi) − c(xi)|) / (Σi=n+1..n+m wit)
where ht(x) is the predicted label probability and c(x) is the labeled label probability.
Then, a weight adjustment parameter is corrected based on the error rate.
Corrected weight adjustment parameter βt:
βt=εt/(1-εt)
and finally, adjusting the weights of the samples in the training sample set based on the corrected weight adjustment parameters.
The weight distribution of the samples in the adjusted training sample set satisfies the following conditions:
wi(t+1) = wit · β^|ht(xi) − c(xi)|, for the source domain samples i = 1, …, n;
wi(t+1) = wit · βt^(−|ht(xi) − c(xi)|), for the target domain samples i = n+1, …, n+m.
and the calculating and storing module 24 is configured to calculate a model effect parameter of the classifier on the target domain test set, and store a corresponding iteration label.
The calculation and storage module 24 implements the following functions:
firstly, calculating an area value (AUC value) under a subject working characteristic curve, and using the area value as a first model effect parameter of the model effect parameter, wherein the subject working characteristic curve is a curve drawn by using a True Positive Rate (TPR) as a vertical coordinate and a False Positive Rate (FPR) as a horizontal coordinate.
Then, a maximum value (ks ═ max (abs (TPR-FPR))) of an absolute value of a difference between the ordinate and the abscissa in the subject's work characteristic curve is calculated, and the maximum value is set as a second model effect parameter of the model effect parameters.
And finally, recording the iteration times corresponding to the model effect parameters.
A detection module 25, configured to detect whether the model effect parameter meets the iteration end condition; when the condition is not met, the functions of the calling module 22, the calculation and adjustment module 23, and the calculation and storage module 24 are repeatedly executed until the model effect parameter meets the iteration end condition, and the classifier corresponding to the model effect parameter that meets the iteration end condition is used as the trained classification model.
In the process of detecting whether the model effect parameter satisfies the iteration end condition by the detection module 25, the iteration end condition used may include any one of the following three conditions:
the first model effect parameter meets a corresponding judgment rule;
the second model effect parameter meets a corresponding judgment rule; or
the first model effect parameter or the second model effect parameter satisfies a corresponding judgment rule.
It should be noted that the division of the modules in the above apparatus or system is only a logical division; in an actual implementation the modules may be wholly or partially integrated into one physical entity or physically separated. These modules may be implemented entirely in software (e.g., open source software) invoked by a processor, entirely in hardware, or partly in software invoked by a processor and partly in hardware. For example, the detection module 25 may be implemented by a single processor: it may be stored in a memory of the apparatus or system in the form of program code, and a certain processor of the apparatus or system calls and executes its functions. The implementation of the other modules is similar and is not described here again. In addition, the modules may be wholly or partially integrated, or implemented independently. The processor described herein may be an integrated circuit with signal processing capability; in the implementation process, each step or module of the above technical solutions may be completed by an integrated logic circuit in the processor or by an executed software program.
Referring to fig. 4, fig. 4 is a schematic diagram of a hardware structure of a computer device 10 for implementing the cross-scene migration classification model forming method according to the embodiment of the present disclosure, where the computer device 10 may be implemented on a cloud server. As shown in fig. 4, computer device 10 may include a processor 11, a readable storage medium 12, a bus 13, and a communication unit 14.
In a specific implementation, at least one processor 11 executes computer-executable instructions stored in the readable storage medium 12 (for example, the modules included in the cross-scene migration classification model forming apparatus 20 shown in fig. 3), so that the processor 11 can execute the cross-scene migration classification model forming method of the above method embodiment. The processor 11, the readable storage medium 12, and the communication unit 14 are connected through the bus 13, and the processor 11 may be configured to control data reception and transmission of the communication unit 14.
For the specific implementation of the processor 11, reference may be made to the method embodiments executed by the computer device 10 described above; the implementation principle and technical effect are similar and are not described again here.
The bus 13 may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, the bus is drawn as a single line in the figures of the present application, but this does not mean that there is only one bus or only one type of bus.
In addition, an embodiment of the present application further provides a readable storage medium in which computer-executable instructions are stored; when a processor executes the instructions, the above method for forming a cross-scene migration classification model is implemented.
In summary, according to the method, apparatus, and readable storage medium for forming a cross-scene migration classification model provided in the embodiments of the present application, a source domain sample set and a target domain training sample set are combined into a training sample set whose distribution differs from that of the target domain test set. By adjusting the sample weights of the source domain samples and the target domain training samples, the source domain samples closest to the target domain are found while the influence of target domain sample loss is amplified, so that the weights of effective data increase and the weights of invalid data decrease. Whether iteration has finished is judged via the model effect parameters, and the best classifier obtained during training is taken as the final decision model. In contrast to prior-art transfer learning algorithms, which train a number of weak classifiers and make the final decision by a comprehensive vote of the latter half of them, the present method uses a single model (classifier) for the final prediction, which makes it applicable to more practical business scenarios.
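The weight-adjustment loop summarized here resembles the classic TrAdaBoost scheme (an assumption inferred from the β_t = ε_t/(1 − ε_t) correction given in the claims, not stated verbatim in the patent). Purely as an illustrative sketch, with a one-feature threshold stump standing in for the extreme gradient boosting classifier, plain target-set accuracy standing in for the AUC/KS model effect parameters, and all names invented for illustration, the iteration might look like:

```python
import math

def stump_fit(X, y, w):
    # Brute-force one-dimensional threshold stump as a stand-in weak learner.
    best, best_err = (0, 0.0, 1), float("inf")  # (feature, threshold, polarity)
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            for pol in (1, -1):
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if (pol * (xi[j] - t) >= 0) != (yi == 1))
                if err < best_err:
                    best_err, best = err, (j, t, pol)
    j, t, pol = best
    return lambda x: 1 if pol * (x[j] - t) >= 0 else 0

def tradaboost(Xs, ys, Xt, yt, rounds=10):
    """Source samples shrink in weight when misclassified; target samples
    are amplified, mirroring the patent's summary of the weight adjustment."""
    X, y = Xs + Xt, ys + yt
    n, m = len(Xs), len(Xt)
    w = [1.0 / (n + m)] * (n + m)
    beta_src = 1.0 / (1.0 + math.sqrt(2.0 * math.log(n) / rounds))
    best_h, best_acc = None, -1.0
    for _ in range(rounds):
        total = sum(w)
        h = stump_fit(X, y, [wi / total for wi in w])
        # Error rate measured on the target-domain training samples only.
        err = sum(w[n + i] * abs(h(Xt[i]) - yt[i]) for i in range(m)) / sum(w[n:])
        err = min(max(err, 1e-10), 0.499)   # keep beta_t well-defined
        beta_t = err / (1.0 - err)          # the corrected adjustment parameter
        for i in range(n):                  # down-weight misleading source samples
            w[i] *= beta_src ** abs(h(X[i]) - y[i])
        for i in range(m):                  # amplify hard target samples
            w[n + i] *= beta_t ** -abs(h(Xt[i]) - yt[i])
        acc = sum(h(Xt[i]) == yt[i] for i in range(m)) / m
        if acc > best_acc:                  # keep the single best classifier
            best_acc, best_h = acc, h
    return best_h
```

Keeping only the single best-scoring classifier, rather than voting over the latter half of all weak classifiers, reflects the single-model decision mode described in the summary above.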
The embodiments described above are only a part of the embodiments of the present application, and not all of the embodiments. The components of the embodiments of the present application, as generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the detailed description of the embodiments of the present application provided in the accompanying drawings is not intended to limit the scope of the application, but is merely representative of selected embodiments of the application. Based on this, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (10)
1. A method for forming a cross-scene migration classification model is applied to a computer device and comprises the following steps:
initializing sample set weight parameters and determining a classification algorithm and iteration times, wherein the sample set comprises a training sample set consisting of a source domain sample set and a target domain training sample set and a target domain test set consisting of target domain test samples;
calling the classification algorithm, and obtaining a classifier on the target domain test set based on the weight distribution condition of each sample in the training sample set and the target domain test set;
calculating the error rate of the classifier in the training sample set of the target domain, and adjusting the weight of the training sample set based on the error rate;
calculating model effect parameters of the classifier on the target domain test set, and storing corresponding iteration labels;
and detecting whether the model effect parameter meets an iteration end condition; when the condition is not met, returning to the step of calling the classification algorithm and obtaining a classifier on the target domain test set based on the weight distribution of each sample in the training sample set and the target domain test set, until the model effect parameter meets the iteration end condition; and taking the classifier for which the model effect parameter meets the iteration end condition as the trained classification model.
2. The method of forming a cross-scene migration classification model according to claim 1, wherein in the steps of initializing sample set weight parameters and determining classification algorithms and iteration times:
initializing a weight vector W₁ of the training sample set and a weight adjustment parameter β;
4. The method of claim 3, wherein the step of calculating an error rate of the classifier in the target domain training sample set and adjusting the weights of the training sample set based on the error rate comprises:
calculating the error rate of the classifier in the target domain training sample set;
correcting a weight adjustment parameter based on the error rate;
adjusting the weights of the samples in the training sample set based on the corrected weight adjustment parameters;
calculating the error rate ε_t of the classifier in the target domain training sample set according to the following formula:
the corrected weight adjustment parameter β_t is:

β_t = ε_t / (1 − ε_t)
the weight distribution of the samples in the adjusted training sample set satisfies the following conditions:
wherein h (x) is the predicted label probability, and c (x) is the labeled label probability.
5. The method of claim 4, wherein the step of calculating model effect parameters of the classifier on the target domain test set and storing corresponding iteration labels comprises:
calculating the area under the receiver operating characteristic (ROC) curve, and taking the area value as a first model effect parameter among the model effect parameters;
calculating the maximum absolute difference between the ordinate and the abscissa along the receiver operating characteristic curve, and taking this maximum value as a second model effect parameter among the model effect parameters;
recording the number of iterations corresponding to the model effect parameter;
wherein the receiver operating characteristic curve is a curve drawn with the true positive rate as the ordinate and the false positive rate as the abscissa.
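The two model effect parameters named above, the area under the ROC curve and the maximum |TPR − FPR| gap (commonly called the KS statistic), can be computed together in one pass over score-sorted samples. A minimal pure-Python sketch follows; the function name is invented for illustration, and tied scores are handled only approximately:

```python
def roc_metrics(scores, labels):
    """Compute AUC (first model effect parameter) and the KS statistic
    max |TPR - FPR| (second model effect parameter) from raw scores and
    binary labels (1 = positive)."""
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])  # high scores first
    pos = sum(labels)
    neg = len(labels) - pos
    tp = fp = 0
    auc = ks = 0.0
    prev_fpr = prev_tpr = 0.0
    for score, label in pairs:
        if label == 1:
            tp += 1
        else:
            fp += 1
        tpr, fpr = tp / pos, fp / neg
        auc += (fpr - prev_fpr) * (tpr + prev_tpr) / 2  # trapezoid rule
        ks = max(ks, abs(tpr - fpr))                    # KS = max gap
        prev_fpr, prev_tpr = fpr, tpr
    return auc, ks
```

Either metric can then drive the iteration-end check, which matches the claim's option of using the first parameter, the second, or either one.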
6. The method as claimed in claim 5, wherein in the step of detecting whether the model effect parameter satisfies an iteration end condition, the iteration end condition includes any one of the following three conditions:
the first model effect parameter meets a corresponding judgment rule;
the second model effect parameter meets a corresponding judgment rule; or
the first model effect parameter or the second model effect parameter satisfies a corresponding judgment rule.
7. The method for forming the cross-scene migration classification model according to any one of claims 1-6, wherein the source domain samples are samples whose loan consumption is smaller than a first consumption amount, and the target domain samples are samples whose loan consumption is larger than a second consumption amount, the first consumption amount being not larger than the second consumption amount.
8. The method of forming a cross-scene migration classification model according to claim 7, wherein the classification algorithm includes an extreme gradient boosting model.
9. An apparatus for forming a cross-scene migration classification model, applied to a computer device, the apparatus comprising:
the initialization module is used for initializing weight parameters of a sample set and determining a classification algorithm and iteration times, wherein the sample set comprises a training sample set consisting of a source domain sample set and a target domain training sample set and a target domain test set consisting of target domain test samples;
the calling module is used for calling the classification algorithm and obtaining a classifier on the target domain test set based on the weight distribution condition of each sample in the training sample set and the target domain test set;
the calculation and adjustment module is used for calculating the error rate of the classifier in the training sample set of the target domain and adjusting the weight of the training sample set based on the error rate;
the calculation and storage module is used for calculating the model effect parameters of the classifier on the target domain test set and storing the corresponding iteration labels;
and the detection module is used for detecting whether the model effect parameter meets an iteration end condition; when the condition is not met, the functions of the calling module, the calculation and adjustment module, and the calculation and storage module are executed repeatedly until the model effect parameter meets the iteration end condition, and the classifier for which the model effect parameter meets the iteration end condition is taken as the trained classification model.
10. A readable storage medium having stored therein instructions that, when executed, cause a computer device to perform the cross-scene migration classification model forming method of any of the preceding claims 1-8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110939361.2A CN113610176A (en) | 2021-08-16 | 2021-08-16 | Cross-scene migration classification model forming method and device and readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113610176A (en) | 2021-11-05 |
Family
ID=78308703
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110939361.2A Pending CN113610176A (en) | 2021-08-16 | 2021-08-16 | Cross-scene migration classification model forming method and device and readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113610176A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103761311A (en) * | 2014-01-23 | 2014-04-30 | 中国矿业大学 | Sentiment classification method based on multi-source field instance migration |
CN107944874A (en) * | 2017-12-13 | 2018-04-20 | 阿里巴巴集团控股有限公司 | Air control method, apparatus and system based on transfer learning |
CN110223164A (en) * | 2019-06-10 | 2019-09-10 | 卓尔智联(武汉)研究院有限公司 | Air control method and system based on transfer learning, computer installation, storage medium |
2021-08-16: application CN202110939361.2A filed (publication CN113610176A); status Pending.
Non-Patent Citations (1)
Title |
---|
王远江 (Wang Yuanjiang): "Research on the Application of Transfer Learning in Cash Loan Default Prediction" (迁移学习在现金贷违约预测中的应用研究), China Master's Theses Full-text Database, Social Sciences II, no. 8, page 2 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109299716B (en) | Neural network training method, image segmentation method, device, equipment and medium | |
CN111275175B (en) | Neural network training method, device, image classification method, device and medium | |
US11475130B2 (en) | Detection of test-time evasion attacks | |
EP3767536A1 (en) | Latent code for unsupervised domain adaptation | |
CN111062413A (en) | Road target detection method and device, electronic equipment and storage medium | |
Zhang et al. | Aligning infinite-dimensional covariance matrices in reproducing kernel hilbert spaces for domain adaptation | |
CN110852447A (en) | Meta learning method and apparatus, initialization method, computing device, and storage medium | |
US20220092411A1 (en) | Data prediction method based on generative adversarial network and apparatus implementing the same method | |
US20140156569A1 (en) | Method and apparatus for improving resilience in customized program learning network computational environments | |
CN112215298A (en) | Model training method, device, equipment and readable storage medium | |
EP4343616A1 (en) | Image classification method, model training method, device, storage medium, and computer program | |
CN116894985A (en) | Semi-supervised image classification method and semi-supervised image classification system | |
CN116912568A (en) | Noise-containing label image recognition method based on self-adaptive class equalization | |
AU2021290143B2 (en) | Machine learning module training using input reconstruction techniques and unlabeled transactions | |
CN113902944A (en) | Model training and scene recognition method, device, equipment and medium | |
CN112200666A (en) | Feature vector processing method and related device | |
CN113610176A (en) | Cross-scene migration classification model forming method and device and readable storage medium | |
CN116246140A (en) | Res-50 and CBAM fused automatic earthquake fault identification method | |
CN114140238A (en) | Abnormal transaction data identification method and device, computer equipment and storage medium | |
CN117523218A (en) | Label generation, training of image classification model and image classification method and device | |
WO2022060709A1 (en) | Discriminative machine learning system for optimization of multiple objectives | |
CN114036145A (en) | Data set balancing method and device and computer readable storage medium | |
CN113610121B (en) | Cross-domain task deep learning identification method | |
Mohammed et al. | Multilevel Artificial Intelligence Classification of Faulty Image Data for Enhancing Sensor Reliability | |
CN111898465B (en) | Method and device for acquiring face recognition model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||