CN114998029A - Model calibration method and device, electronic equipment and computer readable medium - Google Patents

Info

Publication number
CN114998029A
Authority
CN
China
Prior art keywords
risk control
model
grouping
score
groups
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210629001.7A
Other languages
Chinese (zh)
Inventor
康业猛
唐亚平
董立武
敖兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jingdong Technology Holding Co Ltd
Original Assignee
Jingdong Technology Holding Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jingdong Technology Holding Co Ltd filed Critical Jingdong Technology Holding Co Ltd
Priority to CN202210629001.7A priority Critical patent/CN114998029A/en
Publication of CN114998029A publication Critical patent/CN114998029A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08 Insurance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Optimization (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Mathematical Analysis (AREA)
  • Finance (AREA)
  • Computational Mathematics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Economics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Strategic Management (AREA)
  • Computer Hardware Design (AREA)
  • Development Economics (AREA)
  • Evolutionary Biology (AREA)
  • Geometry (AREA)
  • General Business, Economics & Management (AREA)
  • Operations Research (AREA)
  • Probability & Statistics with Applications (AREA)
  • Marketing (AREA)
  • Algebra (AREA)
  • Technology Law (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Complex Calculations (AREA)

Abstract

The application discloses a model calibration method and apparatus, an electronic device, and a computer-readable medium, relating to the field of computer technology. The method comprises: in response to an abnormal model output, calling a calibration program to obtain the corresponding risk control model identifier, and obtaining the corresponding risk control score sets according to the risk control model identifier; grouping each risk control score set to obtain first risk control score groups; determining the bad sample rate and the mean risk control score of each first risk control score group; grouping the first risk control score groups according to the bad sample rate to obtain second risk control score groups; and fitting the mean risk control scores of the second risk control score groups to generate fitting data, determining the model parameters of the regression model corresponding to the risk control model identifier based on the fitting data, and calibrating the model corresponding to the risk control model identifier according to the model parameters. This improves the risk-ranking performance of the model and the accuracy of its output.

Description

Model calibration method and device, electronic equipment and computer readable medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a model calibration method and apparatus, an electronic device, and a computer-readable medium.
Background
At present, in consumer-finance risk control scenarios, scorecard techniques use big data and machine learning models to score customer groups and focus on the risk ranking of the output. In customer-group modeling, however, several submodels are built and each evaluates a different customer group, and the risk control evaluation results they output differ considerably.
In the process of implementing the present application, the inventors found that at least the following problem exists in the prior art:
in customer-group modeling, after several submodels are obtained and used to perform risk control evaluation on different customer groups, the output risk control evaluation results differ considerably.
Disclosure of Invention
In view of this, embodiments of the present application provide a model calibration method and apparatus, an electronic device, and a computer-readable medium, which can solve the problem that, in existing customer-group modeling, the risk control evaluation results output by several submodels for different customer groups differ considerably.
To achieve the above object, according to one aspect of the embodiments of the present application, a model calibration method is provided, comprising:
in response to an abnormal model output, calling a calibration program to obtain the corresponding risk control model identifier, and obtaining the corresponding risk control score sets according to the risk control model identifier;
grouping each risk control score set to obtain first risk control score groups;
determining the bad sample rate and the mean risk control score of each first risk control score group;
grouping the first risk control score groups according to the bad sample rate to obtain second risk control score groups;
and fitting the mean risk control scores of the second risk control score groups to generate fitting data, determining the model parameters of the regression model corresponding to the risk control model identifier based on the fitting data, and calibrating the model corresponding to the risk control model identifier according to the model parameters.
Optionally, grouping each risk control score set includes:
acquiring the sample identifier corresponding to each risk control score in each risk control score set;
and grouping the risk control scores based on the sample identifiers and a preset bad sample rate threshold.
Optionally, grouping each risk control score set includes:
performing equal-frequency grouping on each risk control score set based on a preset number of groups.
Optionally, grouping the first risk control score groups according to the bad sample rate includes:
clustering the bad sample rates to generate clusters;
and grouping the first risk control score groups according to the clusters.
Optionally, grouping the first risk control score groups according to the clusters includes:
placing the first risk control score groups that belong to the same cluster into the same group.
Optionally, fitting the mean risk control scores of the second risk control score groups includes:
mapping the mean risk control score of each second risk control score group to a preset score range.
Optionally, before grouping each risk control score set, the method further comprises:
sorting the risk control scores in each risk control score set to obtain sorted risk control score sets; and
grouping each risk control score set includes:
grouping the sorted risk control score sets.
In addition, the present application also provides a model calibration apparatus, comprising:
an acquisition unit, configured to, in response to an abnormal model output, call a calibration program to obtain the corresponding risk control model identifier and obtain the corresponding risk control score sets according to the risk control model identifier;
a first grouping unit, configured to group each risk control score set to obtain first risk control score groups;
a calculation unit, configured to determine the bad sample rate and the mean risk control score of each first risk control score group;
a second grouping unit, configured to group the first risk control score groups according to the bad sample rate to obtain second risk control score groups;
and a calibration unit, configured to fit the mean risk control scores of the second risk control score groups to generate fitting data, determine the model parameters of the regression model corresponding to the risk control model identifier based on the fitting data, and calibrate the model corresponding to the risk control model identifier according to the model parameters.
Optionally, the first grouping unit is further configured to:
acquire the sample identifier corresponding to each risk control score in each risk control score set;
and group the risk control scores based on the sample identifiers and a preset bad sample rate threshold.
Optionally, the first grouping unit is further configured to:
perform equal-frequency grouping on each risk control score set based on a preset number of groups.
Optionally, the second grouping unit is further configured to:
cluster the bad sample rates to generate clusters;
and group the first risk control score groups according to the clusters.
Optionally, the second grouping unit is further configured to:
place the first risk control score groups that belong to the same cluster into the same group.
Optionally, the calibration unit is further configured to:
map the mean risk control score of each second risk control score group to a preset score range.
Optionally, the model calibration apparatus further comprises a sorting unit configured to:
sort the risk control scores in each risk control score set to obtain sorted risk control score sets; and
the first grouping unit is further configured to:
group the sorted risk control score sets.
Additionally, the present application provides a model calibration electronic device, comprising: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the model calibration method described above.
In addition, the present application also provides a computer readable medium, on which a computer program is stored, which when executed by a processor, implements the model calibration method as described above.
One embodiment of the above invention has the following advantages or beneficial effects: in response to an abnormal model output, a calibration program is called to obtain the corresponding risk control model identifier, and the corresponding risk control score sets are obtained according to the risk control model identifier; each risk control score set is grouped to obtain first risk control score groups; the bad sample rate and the mean risk control score of each first risk control score group are determined; the first risk control score groups are grouped according to the bad sample rate to obtain second risk control score groups; and the mean risk control scores of the second risk control score groups are fitted to generate fitting data, the model parameters of the regression model corresponding to the risk control model identifier are determined based on the fitting data, and the model corresponding to the risk control model identifier is calibrated according to the model parameters. The risk control score sets are first grouped once; the resulting groups are then regrouped at unequal frequency according to their bad sample rates; the risk control scores in the regrouped groups are fitted to obtain fitting data; and the model parameters of each regression model are obtained from the fitting data. When the obtained model parameters are applied to each regression model, the models output results in the same range for different input data, which improves the risk-ranking performance of the models and the accuracy of their output.
Further effects of the above-mentioned non-conventional alternatives will be described below in connection with the embodiments.
Drawings
The drawings are included to provide a further understanding of the application and are not to be construed as limiting the application. Wherein:
FIG. 1 is a schematic diagram of a main flow of a model calibration method according to a first embodiment of the present application;
FIG. 2 is a schematic diagram of a main flow of a model calibration method according to a second embodiment of the present application;
FIG. 3a is a diagram illustrating output results of a model before model calibration according to an embodiment of the present disclosure;
FIG. 3b is a diagram illustrating the output of the model after model calibration according to the model calibration method of the embodiment of the present application;
FIG. 4 is a schematic diagram of the main elements of a model calibration apparatus according to an embodiment of the present application;
FIG. 5 is an exemplary system architecture diagram to which embodiments of the present application may be applied;
FIG. 6 is a schematic structural diagram of a computer system suitable for implementing a terminal device or a server according to an embodiment of the present application.
Detailed Description
The following describes exemplary embodiments of the present application with reference to the accompanying drawings. Various details of the embodiments are included to aid understanding and should be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present application. Descriptions of well-known functions and constructions are likewise omitted below for clarity and conciseness. In the technical solution of the present application, the acquisition, storage, use, and processing of data comply with the relevant national laws and regulations.
FIG. 1 is a schematic diagram of the main flow of a model calibration method according to a first embodiment of the present application. As shown in FIG. 1, the model calibration method includes:
Step S101: in response to an abnormal model output, call a calibration program to obtain the corresponding risk control model identifier, and obtain the corresponding risk control score sets according to the risk control model identifier.
The model in the embodiment of the present application may be two or more models being trained, and the following description will take two models being trained as an example. In the embodiment of the present application, linear regression is used when performing model calibration, but the model used when performing model training is not limited to the linear regression model.
In this embodiment, the execution subject of the model calibration method (for example, a server) may receive the result output by a model through a wired or wireless connection. When the execution subject receives the result and detects that it is abnormal, it calls a calibration program to obtain the risk control model identifier corresponding to the result, such as the number or name of the risk control model. The execution subject can then obtain the corresponding risk control score set according to the risk control model identifier. A risk control score set may be the set of risk control scores, one per customer, that the model corresponding to the risk control model identifier outputs for the input customer group data.
Specifically, the model output may be judged abnormal when, for example, two models both output 0.1 but the two values carry different meanings; the curves generated from the outputs of the two models in the same coordinate system do not coincide and differ considerably; and the curve obtained by fitting those two curves still does not coincide with the target distribution curve.
For example, two models perform risk control analysis on two customer groups obtained by dividing the same sample set. When the execution subject detects that the curves formed by the risk control scores output by the models (e.g., curves A and D in FIG. 3a) do not coincide in the same coordinate system, and that the fitted curve obtained from them (e.g., curve B in FIG. 3a) does not coincide with the preset true curve, that is, the target distribution curve (e.g., curve C in FIG. 3a), the execution subject may determine that the model output is abnormal. The preset true curve may be a standard curve obtained by fitting the pre-labeled curves A and D in FIG. 3a. In the ideal state, curve B and curve C in FIG. 3a coincide. The goal of the embodiments of the present application is to bring the output of the model for each customer group to this ideal state through training, that is, to make the risk rankings the same.
Taking FIG. 3a as an example, curve C is the target distribution curve used as the reference for model calibration, and the purpose of the embodiments of the present application is to make curve B, the fitted curve of curves A and D, approach curve C as closely as possible. For example, curves A and D in FIG. 3a are obtained by model A and model D performing risk control analysis on customer group A and customer group D, respectively, so the correspondences are curve A, model A, customer group A and curve D, model D, customer group D. When the execution subject detects that curves A and D do not coincide, and that the fitted curve B does not coincide with the target distribution curve C and differs from it by more than a threshold, it may determine that the model output is abnormal.
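The threshold check described above can be sketched as follows. This is only an illustration: the application does not say how the difference between curves is measured, so the largest pointwise absolute difference is assumed here, and the function names, the sample values, and the threshold of 0.05 are hypothetical.

```python
def max_curve_gap(fitted, target):
    """Largest pointwise absolute difference between two curves
    sampled at the same x positions (assumed distance measure)."""
    if len(fitted) != len(target):
        raise ValueError("curves must be sampled at the same points")
    return max(abs(f - t) for f, t in zip(fitted, target))


def output_is_abnormal(fitted, target, threshold=0.05):
    """Flag the model output as abnormal when the fitted curve
    differs from the target curve by more than the threshold."""
    return max_curve_gap(fitted, target) > threshold


curve_b = [0.02, 0.10, 0.25, 0.40]  # fitted curve B (illustrative values)
curve_c = [0.02, 0.05, 0.12, 0.20]  # target distribution curve C (illustrative values)
print(output_is_abnormal(curve_b, curve_c))
```

A different distance measure, such as a mean squared difference, could be substituted without changing the rest of the flow.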
Step S102: group each risk control score set to obtain first risk control score groups.
Specifically, grouping each risk control score set includes:
acquiring the sample identifier corresponding to each risk control score in each risk control score set;
and grouping the risk control scores based on the sample identifiers and a preset bad sample rate threshold.
The execution subject may first obtain the sample identifier corresponding to each risk control score in each risk control score set in order to determine which customer each score belongs to; the sample identifier may be, for example, a customer name or a customer mobile phone number. The execution subject can then obtain the sample evaluation identifier of each customer from the sample identifier. The sample evaluation identifier may be 0 or 1; for example, 1 may denote a customer with poor credit and 0 a customer with good credit. Based on the sample identifiers and sample evaluation identifiers, the execution subject can group the risk control scores randomly, provided that the bad sample rate in each group stays below the preset bad sample rate threshold. The bad sample rate of a group is the ratio of the number of bad samples in the group to the total number of samples in the group. The preset bad sample rate threshold may be 0.01 or another value; the embodiments of the present application do not limit it.
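The bad sample rate and the threshold condition can be illustrated with a short sketch. It assumes that each group is given as a list of sample evaluation identifiers (1 for a bad sample, 0 for a good one); the function names are hypothetical and the default threshold of 0.01 is the example value mentioned above.

```python
def bad_sample_rate(labels):
    """Ratio of bad samples (label 1) to all samples in one group."""
    if not labels:
        raise ValueError("group is empty")
    return sum(labels) / len(labels)


def groups_within_threshold(groups, threshold=0.01):
    """True when the bad sample rate of every group is below the threshold."""
    return all(bad_sample_rate(labels) < threshold for labels in groups)


# A group of 200 samples with one bad sample has a bad sample rate of 1/200.
group_a = [1] + [0] * 199
group_b = [0] * 200
print(bad_sample_rate(group_a), groups_within_threshold([group_a, group_b]))
```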
For example, the execution subject may start with 10 groups, add 5 groups on each attempt, and go up to 50 groups, and in this way determine the final number of groups, subject to the bad sample rate of each group not exceeding the preset bad sample rate threshold. A concrete implementation is as follows:
# group the sorted data, trying 10, 15, ..., 50 groups
g1 = []  # collects (bad rate, p mean) tuples for p11
g2 = []  # collects (bad rate, p mean) tuples for p22
for g in range(10, 51, 5):
    pass  # loop body: see the implementation of step S103
where g is the number of groups used in each attempt.
Alternatively, grouping each risk control score set includes performing equal-frequency grouping on each risk control score set based on a preset number of groups.
In equal-frequency grouping, with the number of groups set to 10 for example, the risk control scores in each risk control score set are divided into 10 groups that each contain the same number of scores.
Specifically, before each risk control score set is grouped, the method further includes sorting the risk control scores in each risk control score set, in either descending or ascending order, to obtain sorted risk control score sets. Grouping each risk control score set then means grouping the sorted risk control score sets.
Suppose training data X~Y exist. After data analysis and preliminary verification of model training, the data set is divided into two customer groups, X1~Y1 and X2~Y2, and a model is built for each, giving [model]1 and [model]2, two submodels with different risk orderings. With these two inconsistently performing models in hand, X1~Y1 and X2~Y2 are scored by the corresponding [model]1 and [model]2 to obtain p1 and p2, which are then sorted to obtain p11 and p22. p11 and p22 can each be divided into 20, 30, or more groups.
For example, the execution subject may perform the following operations:
# build the mapping data set
# input:  (p1, label1); (p2, label2)
# output: (p11, p22)
# ***********************************
# start:
# sort the input data
p11 = sort(p1, label1)
p22 = sort(p2, label2)
This yields the sorted risk control scores of each risk control score set.
Specifically, after the risk control scores in each risk control score set are sorted, the sorted scores in each set are grouped. For illustration, the sorted scores in risk control score set 1 are 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, and those in risk control score set 2 are 0.9, 0.7, 0.4, 0.3, 0.2, 0.1. The execution subject may divide risk control score set 1 into [0.6, 0.5], [0.4, 0.3], [0.2, 0.1], where the bad sample rate of each group is 0. It may divide risk control score set 2 into [0.9, 0.7], [0.4, 0.3], [0.2, 0.1], where the bad sample rates of the groups are 0.005, 0.003, and 0.002, respectively.
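The sorting and equal-frequency grouping of risk control score set 1 can be reproduced with the sketch below. The helper names `sort_scores` and `equal_frequency_groups` are hypothetical, descending order is assumed, and the labels are taken to be all 0 to match the stated bad sample rate of 0 for every group.

```python
def sort_scores(scores, labels):
    """Pair each score with its label and sort by score, highest first."""
    return sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)


def equal_frequency_groups(sorted_pairs, n_groups):
    """Split sorted pairs into n_groups consecutive chunks of (nearly) equal size."""
    if not 1 <= n_groups <= len(sorted_pairs):
        raise ValueError("n_groups must be between 1 and the number of samples")
    base, extra = divmod(len(sorted_pairs), n_groups)
    groups, start = [], 0
    for i in range(n_groups):
        size = base + (1 if i < extra else 0)  # spread any remainder over the first groups
        groups.append(sorted_pairs[start:start + size])
        start += size
    return groups


p1 = [0.3, 0.6, 0.1, 0.5, 0.2, 0.4]
label1 = [0, 0, 0, 0, 0, 0]
p11 = sort_scores(p1, label1)
groups = equal_frequency_groups(p11, 3)
print([[score for score, _ in group] for group in groups])
# [[0.6, 0.5], [0.4, 0.3], [0.2, 0.1]]
```

When the number of samples is not a multiple of the number of groups, this sketch lets the first groups hold one extra sample; the application does not specify how a remainder is handled.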
Step S103: determine the bad sample rate and the mean risk control score of each first risk control score group.
After the execution subject groups each risk control score set to obtain the first risk control score groups, it may receive the bad sample rate (bad rate) and the mean risk control score (p mean) of each group, as returned by the group function.
Step S103 is implemented as follows:
# the group function groups the data and returns the bad rate and the p mean of each group
_g1 = group(p11, g)
_g2 = group(p22, g)
g1.extend(_g1)
g2.extend(_g2)
# data structure of g1 and g2: [(bad_rate_1, p_1), ..., (bad_rate_n, p_n)]
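The application names the `group` function but does not show its body. One possible implementation, assuming that its first argument is a list of sorted (score, label) pairs and that groups are formed by equal-frequency slicing, is sketched below.

```python
def group(sorted_pairs, g):
    """Split sorted (score, label) pairs into g consecutive groups and
    return one (bad_rate, p_mean) tuple per non-empty group."""
    n = len(sorted_pairs)
    stats = []
    for i in range(g):
        chunk = sorted_pairs[i * n // g:(i + 1) * n // g]
        if not chunk:
            continue  # more groups requested than samples available
        bad_rate = sum(label for _, label in chunk) / len(chunk)
        p_mean = sum(score for score, _ in chunk) / len(chunk)
        stats.append((bad_rate, p_mean))
    return stats


# Illustrative data: four samples, already sorted by score in descending order.
p11 = [(0.75, 1), (0.5, 0), (0.25, 0), (0.125, 0)]
print(group(p11, 2))
# [(0.5, 0.625), (0.0, 0.1875)]
```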
Step S104: group the first risk control score groups according to the bad sample rate to obtain second risk control score groups.
Among the first risk control score groups obtained from the risk control score sets, groups with similar bad sample rates (bad rates) are placed into the same group to form a new group, that is, a second risk control score group, also called a fitting data group. This is implemented as follows:
# traverse g1 and g2, compare bad rates, and combine the p means of groups whose bad rates are close (within a certain range) into a fitting data set (p_g1, p_g2)
Figure BDA0003679074710000091
The returned data are the fitting data groups. Here p is a risk control score, the p mean is the mean risk control score of a first risk control score group, and the bad rate is the bad sample rate of a first risk control score group.
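Because the listing for this step survives only as a figure reference, the following sketch is a reconstruction from the comment above and not the original code. It assumes that g1 and g2 are lists of (bad rate, p mean) tuples and that "close" means an absolute difference no larger than a tolerance; the function name and the tolerance of 0.001 are hypothetical.

```python
def build_fitting_pairs(g1, g2, tolerance=0.001):
    """Pair the p means of groups from g1 and g2 whose bad rates
    differ by at most the tolerance."""
    pairs = []
    for bad_rate_1, p_mean_1 in g1:
        for bad_rate_2, p_mean_2 in g2:
            if abs(bad_rate_1 - bad_rate_2) <= tolerance:
                pairs.append((p_mean_1, p_mean_2))
    return pairs


g1 = [(0.005, 0.55), (0.002, 0.15)]   # illustrative (bad rate, p mean) tuples
g2 = [(0.0052, 0.80), (0.010, 0.35)]
print(build_fitting_pairs(g1, g2))
# [(0.55, 0.8)]
```

Each returned pair (p_g1, p_g2) is one point for the later regression of p2 on p1.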
Step S105: fit the mean risk control scores of the second risk control score groups to generate fitting data, determine the model parameters of the regression model corresponding to the risk control model identifier based on the fitting data, and calibrate the model corresponding to the risk control model identifier according to the model parameters.
Specifically, fitting the mean risk control scores of the second risk control score groups includes mapping the mean risk control score of each second risk control score group to a preset score range. For example, curve C shown in FIG. 3a is generated from the scores of the second risk control score groups after they are mapped to the preset score range. When, as shown in FIG. 3b, curve B coincides with curve C, model A and model D have the same linear regression performance; the model parameters at that point are recorded and updated into the models, thereby training the risk-ranking performance of the models.
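How fitted parameters might be applied to a score while keeping the result inside a preset score range can be sketched as follows. The linear form, the clamping step, the function name, and the range of 0 to 1 are all assumptions made for illustration; the application only states that the scores are mapped to a preset range and that the recorded parameters are updated into the model.

```python
def calibrate_score(score, slope, intercept, score_range=(0.0, 1.0)):
    """Apply a fitted linear mapping, then clamp the result
    into the preset score range (assumed behavior)."""
    low, high = score_range
    mapped = slope * score + intercept
    return min(max(mapped, low), high)


# Illustrative parameters: slope 2.0 and intercept 0.25.
print(calibrate_score(0.25, 2.0, 0.25))  # stays inside the range
print(calibrate_score(0.5, 2.0, 0.25))   # would exceed the range, so it is clamped
```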
In this embodiment, in response to an abnormal model output, a calibration program is called to obtain the corresponding risk control model identifier, and the corresponding risk control score sets are obtained according to the risk control model identifier; each risk control score set is grouped to obtain first risk control score groups; the bad sample rate and the mean risk control score of each first risk control score group are determined; the first risk control score groups are grouped according to the bad sample rate to obtain second risk control score groups; and the mean risk control scores of the second risk control score groups are fitted to generate fitting data, the model parameters of the regression model corresponding to the risk control model identifier are determined based on the fitting data, and the model corresponding to the risk control model identifier is calibrated according to the model parameters. The risk control score sets are first grouped once; the resulting groups are then regrouped at unequal frequency according to their bad sample rates; the risk control scores in the regrouped groups are fitted to obtain fitting data; and the model parameters of each regression model are obtained from the fitting data. When the obtained model parameters are applied to each regression model, the models output results in the same range for different input data, which improves the risk-ranking performance of the models and the accuracy of their output.
FIG. 2 is a schematic diagram of the main flow of a model calibration method according to a second embodiment of the present application. As shown in FIG. 2, the model calibration method includes:
Step S201: in response to an abnormal model output, call a calibration program to obtain the corresponding risk control model identifier, and obtain the corresponding risk control score sets according to the risk control model identifier.
Step S202: group each risk control score set to obtain first risk control score groups.
Step S203: determine the bad sample rate and the mean risk control score of each first risk control score group.
Step S204: cluster the bad sample rates to generate clusters.
Specifically, the execution subject may treat similar bad sample rates as one class and place the first risk control score groups corresponding to those bad sample rates into the same cluster.
Step S205: group the first risk control score groups according to the clusters to obtain second risk control score groups.
The first risk control score groups in the same cluster are placed into the same group to obtain the second risk control score groups. This is in effect unequal-frequency grouping: the number of first risk control score groups in each second risk control score group may differ or may be the same. The bad sample rates of the first risk control score groups within each second risk control score group are therefore similar, which improves the risk-ranking performance of the model.
Specifically, grouping the first risk control score groups according to the clusters includes placing the first risk control score groups that belong to the same cluster into the same group.
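The application does not name a clustering algorithm. As a simple stand-in, the sketch below clusters one-dimensional bad sample rates by sorting them and starting a new cluster whenever the gap to the previous rate exceeds a limit; the function name and the limit of 0.001 are hypothetical, and any other clustering method could take its place.

```python
def cluster_by_bad_rate(group_stats, max_gap=0.001):
    """Cluster (bad_rate, p_mean) tuples: sort by bad rate and start a new
    cluster whenever the gap to the previous bad rate exceeds max_gap."""
    clusters = []
    previous = None
    for item in sorted(group_stats, key=lambda stat: stat[0]):
        if previous is None or item[0] - previous > max_gap:
            clusters.append([])
        clusters[-1].append(item)
        previous = item[0]
    return clusters


# Illustrative (bad rate, p mean) tuples for four first risk control score groups.
first_groups = [(0.0050, 0.80), (0.0020, 0.20), (0.0052, 0.75), (0.0024, 0.25)]
second_groups = cluster_by_bad_rate(first_groups)
print(second_groups)
```

Each inner list is one second risk control score group, and the lists may have different lengths, which matches the unequal-frequency grouping described above.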
And step S206, fitting the wind control score mean values corresponding to the second wind control score groups to generate fitting data, further determining model parameters of the regression model corresponding to the wind control model identification based on the fitting data, and calibrating the model corresponding to the wind control model identification according to the model parameters.
As a complete embodiment, the model calibration method is implemented as follows:
1. Assume training data X-Y exist. After data analysis and preliminary verification by model training, the data set is divided into two customer groups, X1-Y1 and X2-Y2, and a data model is built for each, yielding model 1 and model 2.
2. X1-Y1 and X2-Y2 are scored by the corresponding model 1 and model 2 to obtain the predicted scores p1 and p2, and p1 and p2 are then sorted to obtain p11 and p22.
3. Flow 1 is invoked to obtain the fitting data.
4. A linear regression equation is trained to obtain the mapping p2 = f(p1). The fit of the linear equation p2 = f(p1) is evaluated by the coefficient of determination R² of the linear regression. R² reflects what percentage of the fluctuation in Y can be described by the fluctuation in X, i.e., what percentage of the variation of the dependent variable Y (e.g., the wind control score mean in this application) can be explained by the controlled independent variable X (e.g., the bad sample rate in this application).
5. The results verify that, after mapping, curves B and C coincide, as can be seen from fig. 3B.
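Item 4 above can be sketched as an ordinary least-squares fit of a straight line together with its coefficient of determination. The sketch assumes the fitting data are paired group-level mean scores from model 1 and model 2. That pairing, the function names, and the numbers are hypothetical, since flow 1 is only given as a figure.

```python
def fit_line(x, y):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def r_squared(x, y, slope, intercept):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

# Hypothetical group-level mean scores from model 1 (p1) and model 2 (p2).
p1_means = [0.10, 0.20, 0.30, 0.40]
p2_means = [0.25, 0.45, 0.65, 0.85]

slope, intercept = fit_line(p1_means, p2_means)
r2 = r_squared(p1_means, p2_means, slope, intercept)

def map_p1_to_p2(p1):
    """Apply the fitted mapping p2 = f(p1) to a model 1 score."""
    return slope * p1 + intercept
```

An R² close to 1 indicates that the linear mapping explains almost all of the variation in the fitted means; a low value would suggest that a straight line is not an adequate mapping between the two score scales.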
Flow 1, the overall procedure for obtaining the fitting data, is as follows:
Figure BDA0003679074710000121
For example, curves A and D of FIG. 3 may be obtained as follows:
The execution subject divides the samples into 10 buckets according to the model output: samples with a predicted value from 0 to 0.1 go into one bucket, samples with a predicted value from 0.1 to 0.2 go into the next, and so on. The 10 buckets form the abscissa, and the fraction of positive samples in each bucket is calculated as the ordinate. The curve plotted in this way is a reliability diagram, which can be used for evaluation.
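The bucketing just described can be sketched as follows. The function name and the sample data are hypothetical, and buckets that receive no samples are simply omitted from the result, which is a choice made here for illustration.

```python
def reliability_points(probs, labels, n_buckets=10):
    """Return (bucket_lower_edge, positive_fraction) for each non-empty bucket."""
    counts = [0] * n_buckets
    positives = [0] * n_buckets
    for p, y in zip(probs, labels):
        # A prediction of exactly 1.0 is placed in the last bucket.
        idx = min(int(p * n_buckets), n_buckets - 1)
        counts[idx] += 1
        positives[idx] += y
    return [
        (i / n_buckets, positives[i] / counts[i])
        for i in range(n_buckets)
        if counts[i] > 0
    ]

# Hypothetical predictions and 0/1 labels.
probs = [0.05, 0.08, 0.15, 0.55, 0.58, 0.95]
labels = [0, 1, 0, 1, 1, 1]
points = reliability_points(probs, labels)
```

For a well-calibrated model the positive fraction in each bucket is close to the predicted values falling in that bucket, so the plotted points lie near the diagonal.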
Fig. 4 is a schematic diagram of the main units of a model calibration apparatus according to an embodiment of the present application. As shown in fig. 4, the model calibration apparatus 400 includes an acquisition unit 401, a first grouping unit 402, a calculation unit 403, a second grouping unit 404, and a calibration unit 405.
The acquisition unit 401 is configured to, in response to an abnormal model output, invoke a calibration program to obtain the corresponding wind control model identifier, and obtain the corresponding wind control score sets according to the wind control model identifier.
The first grouping unit 402 is configured to group each wind control score set to obtain first wind control score groups.
The calculation unit 403 is configured to determine the bad sample rate and the wind control score mean value corresponding to each first wind control score group.
The second grouping unit 404 is configured to group the first wind control score groups according to the bad sample rates to obtain second wind control score groups.
The calibration unit 405 is configured to fit the wind control score mean values corresponding to the second wind control score groups to generate fitting data, determine model parameters of the regression model corresponding to the wind control model identifier based on the fitting data, and calibrate the model corresponding to the wind control model identifier according to the model parameters.
In some embodiments, the first grouping unit 402 is further configured to: acquire the sample identifier corresponding to each wind control score in each wind control score set; and group the wind control scores based on the sample identifiers and a preset bad sample rate threshold.
In some embodiments, the first grouping unit 402 is further configured to: perform equal-frequency grouping on each wind control score set based on a preset number of groups.
In some embodiments, the second grouping unit 404 is further configured to: cluster the bad sample rates to generate clusters; and group the first wind control score groups according to the clusters.
In some embodiments, the second grouping unit 404 is further configured to: place the first wind control score groups corresponding to each cluster into the same group.
In some embodiments, the calibration unit 405 is further configured to: map each wind control score mean value corresponding to each second wind control score group to a preset score range.
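The text does not say how the mean values are mapped to the preset score range. One simple possibility, shown here purely as an assumption, is a linear min-max rescaling. The function name, the default range, and the sample values are hypothetical.

```python
def map_to_range(values, low=0.0, high=100.0):
    """Linearly rescale values so that min(values) -> low and max(values) -> high."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # Degenerate case: all means are equal, so return the midpoint of the range.
        return [(low + high) / 2.0 for _ in values]
    return [low + (v - vmin) / (vmax - vmin) * (high - low) for v in values]

# Hypothetical mean scores of three second groups.
group_means = [10.0, 20.0, 40.0]
mapped = map_to_range(group_means)
```

A linear rescaling preserves the ordering of the group means, so the risk ranking between groups is unchanged after the mapping.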
In some embodiments, the model calibration apparatus further includes a sorting unit, not shown in fig. 4, configured to: sort the wind control scores in each wind control score set to obtain sorted wind control score sets. The first grouping unit is then further configured to: group the sorted wind control score sets.
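Sorting followed by equal-frequency grouping can be sketched as below. The sketch assumes each sample is a (score, is_bad) pair and spreads any remainder over the leading groups; the function name and that remainder rule are illustrative assumptions.

```python
def equal_frequency_groups(samples, n_groups):
    """Sort (score, is_bad) pairs by score and split them into n_groups of near-equal size."""
    if not 1 <= n_groups <= len(samples):
        raise ValueError("n_groups must be between 1 and the number of samples")
    ordered = sorted(samples, key=lambda s: s[0])
    base, extra = divmod(len(ordered), n_groups)
    groups, start = [], 0
    for i in range(n_groups):
        size = base + (1 if i < extra else 0)  # the first `extra` groups get one more sample
        groups.append(ordered[start:start + size])
        start += size
    return groups

# Hypothetical unsorted samples.
samples = [(0.9, 1), (0.1, 0), (0.5, 0), (0.3, 0), (0.7, 1),
           (0.2, 0), (0.8, 1), (0.4, 0), (0.6, 1), (0.05, 0)]
groups = equal_frequency_groups(samples, 3)
```

Because the samples are sorted first, every score in one group is no greater than every score in the next group, so each group covers a contiguous score interval.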
It should be noted that, in the present application, the model calibration method and the model calibration apparatus have corresponding relation in the specific implementation contents, and therefore, the repeated contents are not described again.
Fig. 5 shows an exemplary system architecture 500 to which the model calibration method or the model calibration apparatus of the embodiments of the present application may be applied.
As shown in fig. 5, the system architecture 500 may include terminal devices 501, 502, 503, a network 504, and a server 505. The network 504 serves as a medium for providing communication links between the terminal devices 501, 502, 503 and the server 505. The network 504 may include various connection types, such as wired or wireless communication links, or fiber optic cables.
The user may use the terminal devices 501, 502, 503 to interact with a server 505 over a network 504 to receive or send messages or the like. The terminal devices 501, 502, 503 may have installed thereon various communication client applications, such as shopping-like applications, web browser applications, search-like applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only).
The terminal devices 501, 502, 503 may be various electronic devices that have a model calibration processing screen and support web browsing, including but not limited to smart phones, tablet computers, laptop computers, desktop computers, and the like.
The server 505 may be a server providing various services, such as a background management server (for example only) that provides support when a user detects an abnormal model output using the terminal devices 501, 502, 503. The background management server can, in response to the abnormal model output, invoke a calibration program to obtain the corresponding wind control model identifier, and obtain the corresponding wind control score sets according to the wind control model identifier; group each wind control score set to obtain first wind control score groups; determine the bad sample rate and the wind control score mean value corresponding to each first wind control score group; group the first wind control score groups according to the bad sample rates to obtain second wind control score groups; and fit the wind control score mean values corresponding to the second wind control score groups to generate fitting data, determine model parameters of the regression model corresponding to the wind control model identifier based on the fitting data, and calibrate the model corresponding to the wind control model identifier according to the model parameters. The wind control score sets are first grouped once; the resulting groups are then regrouped at unequal frequency according to their bad sample rates; the wind control scores in the regrouped groups are fitted to obtain fitting data; and the model parameters of each regression model are obtained from the fitting data. When the obtained model parameters are applied to each regression model, model results in the same range can be output for different input data, which improves the accuracy of the model output.
It should be noted that the model calibration method provided in the embodiment of the present application is generally executed by the server 505, and accordingly, the model calibration apparatus is generally disposed in the server 505.
It should be understood that the number of terminal devices, networks, and servers in fig. 5 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 6, shown is a block diagram of a computer system 600 suitable for use in implementing a terminal device of an embodiment of the present application. The terminal device shown in fig. 6 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present application.
As shown in fig. 6, the computer system 600 includes a Central Processing Unit (CPU) 601, which can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. The RAM 603 also stores various programs and data necessary for the operation of the computer system 600. The CPU 601, ROM 602, and RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a Cathode Ray Tube (CRT) or Liquid Crystal Display (LCD), a speaker, and the like; a storage section 608 including a hard disk or the like; and a communication section 609 including a network interface card such as a LAN card, a modem, or the like. The communication section 609 performs communication processing via a network such as the internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 610 as necessary, so that a computer program read from it can be installed in the storage section 608 as necessary.
In particular, according to embodiments disclosed herein, the processes described above with reference to the flow diagrams may be implemented as computer software programs. For example, embodiments disclosed herein include a computer program product comprising a computer program carried on a computer readable medium, the computer program containing program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. When the computer program is executed by the Central Processing Unit (CPU) 601, the above-described functions defined in the system of the present application are performed.
It should be noted that the computer readable medium shown in the present application may be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. A computer readable signal medium, by contrast, may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software or hardware. The described units may also be provided in a processor, which may be described as: a processor including an acquisition unit, a first grouping unit, a calculation unit, a second grouping unit, and a calibration unit. The names of these units do not, in some cases, constitute a limitation on the units themselves.
As another aspect, the present application also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus. The computer readable medium carries one or more programs which, when executed by a device, cause the device to: in response to an abnormal model output, invoke a calibration program to obtain the corresponding wind control model identifier, and obtain the corresponding wind control score sets according to the wind control model identifier; group each wind control score set to obtain first wind control score groups; determine the bad sample rate and the wind control score mean value corresponding to each first wind control score group; group the first wind control score groups according to the bad sample rates to obtain second wind control score groups; and fit the wind control score mean values corresponding to the second wind control score groups to generate fitting data, determine model parameters of the regression model corresponding to the wind control model identifier based on the fitting data, and calibrate the model corresponding to the wind control model identifier according to the model parameters.
According to the technical scheme of the embodiments of the present application, the wind control score sets are first grouped once; the resulting groups are then regrouped at unequal frequency according to their bad sample rates; the wind control scores in the regrouped groups are fitted to obtain fitting data; and the model parameters of each regression model are obtained from the fitting data. When the obtained model parameters are applied to each regression model, model results in the same range can be output for different input data, which improves the accuracy of the model output.
The above-described embodiments are not intended to limit the scope of the present disclosure. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (12)

1. A method of model calibration, comprising:
responding to the abnormal model output, calling a calibration program to obtain a corresponding wind control model identification, and obtaining a corresponding wind control evaluation set according to the wind control model identification;
grouping the wind control evaluation sets to obtain first wind control evaluation groups;
determining a bad sample rate and a wind control score mean value corresponding to each first wind control score group;
grouping the first wind control evaluation groups according to the bad sample rate to obtain second wind control evaluation groups;
fitting the wind control score mean values corresponding to the second wind control score groups to generate fitting data, determining model parameters of the regression model corresponding to the wind control model identification based on the fitting data, and calibrating the model corresponding to the wind control model identification according to the model parameters.
2. The method of claim 1, wherein the grouping each of the sets of wind control scores comprises:
acquiring a sample identifier corresponding to each wind control score in each wind control score set;
and grouping the wind control scores based on the sample identifications and a preset bad sample rate threshold value.
3. The method of claim 1, wherein the grouping each of the sets of wind control scores comprises:
and performing equal-frequency grouping on each wind control evaluation set based on a preset grouping number.
4. The method of claim 1, wherein the grouping the first wind control score groups according to the bad sample rate comprises:
clustering the bad sample rate to generate a cluster;
and grouping the first wind control evaluation group according to the clustering cluster.
5. The method of claim 4, wherein the grouping the first wind control score group according to the cluster comprises:
and dividing the first wind control evaluation groups corresponding to each cluster into the same group.
6. The method of claim 1, wherein fitting the mean values of the wind control scores corresponding to the second wind control score groups comprises:
and mapping each wind control score mean value corresponding to each second wind control score group to a preset score range.
7. The method of claim 1, wherein prior to the grouping each of the sets of wind control scores, the method further comprises:
sorting the wind control scores in each wind control score set to obtain each sorted wind control score set; and
the grouping of each of the sets of wind control scores includes:
and grouping the sorted wind control evaluation sets.
8. A model calibration device, comprising:
an acquisition unit configured to, in response to an abnormal model output, invoke a calibration program to obtain a corresponding wind control model identification, and obtain a corresponding wind control score set according to the wind control model identification;
a first grouping unit configured to group each wind control score set to obtain first wind control score groups;
a calculating unit configured to determine a bad sample rate and a wind control score mean value corresponding to each first wind control score group;
a second grouping unit configured to group the first wind control score groups according to the bad sample rate to obtain second wind control score groups;
and a calibration unit configured to fit each wind control score mean value corresponding to each second wind control score group to generate fitting data, determine model parameters of the regression model corresponding to the wind control model identification based on the fitting data, and calibrate the model corresponding to the wind control model identification according to the model parameters.
9. The apparatus of claim 8, wherein the first grouping unit is further configured to:
acquiring a sample identifier corresponding to each wind control score in each wind control score set;
and grouping the wind control scores based on the sample identifications and a preset bad sample rate threshold value.
10. The apparatus of claim 8, wherein the first grouping unit is further configured to:
and performing equal-frequency grouping on each wind control evaluation set based on a preset grouping number.
11. A model calibration electronic device, comprising:
one or more processors;
a storage device to store one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-7.
12. A computer-readable medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210629001.7A CN114998029A (en) 2022-06-06 2022-06-06 Model calibration method and device, electronic equipment and computer readable medium

Publications (1)

Publication Number Publication Date
CN114998029A true CN114998029A (en) 2022-09-02

Family

ID=83032120

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210629001.7A Pending CN114998029A (en) 2022-06-06 2022-06-06 Model calibration method and device, electronic equipment and computer readable medium

Country Status (1)

Country Link
CN (1) CN114998029A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination