CN110348993A - Method and device for determining a label for a risk assessment model, and electronic device - Google Patents

Method and device for determining a label for a risk assessment model, and electronic device

Info

Publication number
CN110348993A
Authority
CN
China
Prior art keywords
sample data
overdue
rate
label
recall rate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910578914.9A
Other languages
Chinese (zh)
Other versions
CN110348993B (en
Inventor
熊庄
苏绥绥
常富洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Qiyu Information Technology Co Ltd
Original Assignee
Beijing Qiyu Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qiyu Information Technology Co Ltd
Priority to CN201910578914.9A
Publication of CN110348993A
Application granted
Publication of CN110348993B
Active (current legal status)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof

Abstract

The present invention provides a method and a device for determining a label for a risk assessment model, and an electronic device. The method for determining the label comprises: obtaining sample data to be calibrated; obtaining the accuracy rate and the recall rate of the sample data under different overdue performance periods; based on the accuracy rate and the recall rate, selecting a target overdue performance period from the different overdue performance periods and using it as the label; and calibrating the sample data according to the selected label. With this technical solution, the sample data can be classified and calibrated without waiting for it to perform completely, so the sample data can be predicted in a timely and accurate manner and its timeliness is improved.

Description

Method and device for determining a label for a risk assessment model, and electronic device
Technical field
The present invention relates to the field of internet financial technology, and in particular to a method for determining a label for a risk assessment model, a device for determining a label for a risk assessment model, an electronic device, and a computer-readable storage medium.
Background
With the rapid development of the economy, consumer credit is attracting more and more attention. Personal consumer loans such as credit card spending, personal car loans, student loans and small consumer loans keep increasing, and they are growing quickly. This rapid growth requires each credit guarantor to have a more complete credit risk management system, so guarantors use risk assessment models to predict and control business risk. The validity of the risk assessment model directly affects the accuracy of the risk evaluation result.
At present, the sample data used to build risk assessment models generally has poor timeliness. This is especially true of samples with a long loan term, for example 12 months: at least a year is needed for such samples to perform completely before each one can be calibrated as a "good customer" or a "bad customer" and the label determined, and only then can the risk assessment model be built from them. Because the samples are already old when the model is built, the resulting risk assessment model is not very good and its evaluation results are not accurate enough.
Summary of the invention
The present invention aims to solve the problem that the sample data used in existing risk assessment modeling cannot be classified and calibrated in time and therefore has poor timeliness.
To solve the above technical problem, a first aspect of the present invention provides a method for determining a label for a risk assessment model, comprising: obtaining sample data to be calibrated; obtaining the accuracy rate and the recall rate of the sample data under different overdue performance periods; based on the accuracy rate and the recall rate, selecting a target overdue performance period from the different overdue performance periods and using it as the label; and calibrating the sample data according to the selected label.
In this technical solution, the sample data to be calibrated and its accuracy rate and recall rate under different overdue performance periods are obtained, and a suitable label is selected from the different overdue performance periods according to the accuracy rate and the recall rate in order to calibrate the sample data. Calibration does not have to wait until the sample data has performed completely, so the sample data can be predicted in a timely and accurate manner and its timeliness is improved.
In the above technical solution, preferably, the step of selecting a target overdue performance period from the different overdue performance periods based on the accuracy rate and the recall rate specifically comprises: building a statistical table containing each of the different overdue performance periods together with its corresponding accuracy rate and recall rate; and selecting the target overdue performance period according to the statistical table.
In this technical solution, building the statistical table makes it easy to compare the accuracy rate and the recall rate under the different conditions, which helps the target overdue performance period to be selected accurately.
In any of the above technical solutions, preferably, the step of selecting the target overdue performance period according to the statistical table specifically comprises: selecting from the statistical table the overdue performance period whose accuracy rate is greater than a first threshold and whose recall rate is greater than a second threshold, and using it as the target overdue performance period.
In this technical solution, a higher accuracy rate generally comes with a lower recall rate, and a higher recall rate with a lower accuracy rate. The accuracy rates and recall rates in the statistical table are therefore compared, and the overdue performance period for which both are relatively high is selected as the label. This keeps the selected label reasonable and, as far as possible, ensures the accuracy of the subsequent classification and calibration of the sample data.
In any of the above technical solutions, preferably, the step of obtaining the accuracy rate of the sample data under different overdue performance periods specifically comprises: calculating the accuracy rate of the sample data under the different overdue performance periods according to the formula a = (TP + TN) / (TP + FP + TN + FN), where a denotes the accuracy rate, TP denotes the number of samples predicted to be good customers that are actually good customers, FP denotes the number of samples predicted to be good customers that are actually bad customers, TN denotes the number of samples predicted to be bad customers that are actually bad customers, and FN denotes the number of samples predicted to be bad customers that are actually good customers.
In any of the above technical solutions, preferably, the step of obtaining the recall rate of the sample data under different overdue performance periods specifically comprises: calculating the recall rate of the sample data under the different overdue performance periods according to the formula b = TP / (TP + FN), where b denotes the recall rate and TP, FP, TN and FN have the same meanings as in the accuracy rate formula: TP is the number of samples predicted good and actually good, FP the number predicted good and actually bad, TN the number predicted bad and actually bad, and FN the number predicted bad and actually good.
In any of the above technical solutions, preferably, each of the different overdue performance periods comprises an installment number and a number of days overdue.
In any of the above technical solutions, preferably, the step of obtaining sample data to be calibrated specifically comprises: retrieving the sample data from a database; and/or retrieving the sample data from a third-party lending platform.
In any of the above technical solutions, preferably, the method further comprises: performing machine learning model training on the calibrated sample data to build the risk assessment model.
In this technical solution, the risk assessment model is built from sample data that is timely and accurately classified, which helps as far as possible to obtain a better model and thus improves the accuracy of the model's evaluation results.
In order to solve the above-mentioned technical problem, what the second aspect of the present invention proposed that a kind of wind discusses and select model workers type label determines dress It sets, comprising: first acquisition unit, for obtaining sample data to be calibrated;Second acquisition unit, for obtaining the sample number According to the accuracy rate and recall rate under the different overdue performance phases;Processing unit, for being based on the accuracy rate and the recall rate, The target overdue performance phase is filtered out from the overdue performance of the difference is interim, and as label;Unit is demarcated, for according to sieve The label selected, demarcates the sample data.
In the technical scheme, by obtaining sample data and sample data to be calibrated in the different overdue performance phases Under accuracy rate and recall rate, according to accuracy rate and recall rate from Bu Tong it is overdue show it is interim filter out suitable label, with right Sample data is demarcated, and is demarcated again after showing completely without waiting for sample data, can be timely and accurately to sample Data are predicted, the timeliness of sample data is enhanced.
In any of the above-described technical solution, it is preferable that the processing unit includes: statistical form construction unit, is used for structure Building includes the overdue statistical form for showing each of interim accuracy rate corresponding with its of overdue performance phase and recall rate of the difference; Screening unit, for screening the target overdue performance phase according to the statistical form.
In the technical scheme, statistical form building convenient under different condition accuracy rate and recall rate carry out screening ratio It is right, and then guarantee accurately to filter out the target overdue performance phase.
In any of the above-described technical solution, it is preferable that the screening unit is specifically used for: being screened from the statistical form Accuracy rate is greater than first threshold out and recall rate is greater than the second threshold corresponding overdue performance phase, overdue as the target The performance phase.
In the technical scheme, recall rate is lower when accuracy rate is higher in general, accuracy rate if recall rate is higher It is lower, so filtering out accuracy rate by comparing accuracy rate and recall rate in statistical form and recall rate being all relatively high One group of corresponding overdue performance phase as label, to ensure to screen the reasonability of label, ensures subsequent to sample to the full extent The accuracy of notebook data classification calibration.
In any of the above technical solutions, preferably, the second acquisition unit is specifically configured to: calculate the accuracy rate of the sample data under the different overdue performance periods according to the formula a = (TP + TN) / (TP + FP + TN + FN), where a denotes the accuracy rate, TP denotes the number of samples predicted to be good customers that are actually good customers, FP denotes the number of samples predicted to be good customers that are actually bad customers, TN denotes the number of samples predicted to be bad customers that are actually bad customers, and FN denotes the number of samples predicted to be bad customers that are actually good customers.
In any of the above technical solutions, preferably, the second acquisition unit is specifically configured to: calculate the recall rate of the sample data under the different overdue performance periods according to the formula b = TP / (TP + FN), where b denotes the recall rate and TP, FP, TN and FN have the same meanings as in the accuracy rate formula: TP is the number of samples predicted good and actually good, FP the number predicted good and actually bad, TN the number predicted bad and actually bad, and FN the number predicted bad and actually good.
In any of the above technical solutions, preferably, each of the different overdue performance periods comprises an installment number and a number of days overdue.
In any of the above technical solutions, preferably, the first acquisition unit is specifically configured to: retrieve the sample data from a database; and/or retrieve the sample data from a third-party lending platform.
In any of the above technical solutions, preferably, the device further comprises: a model construction unit, configured to perform machine learning model training on the calibrated sample data to build the risk assessment model.
In this technical solution, the risk assessment model is built from sample data that is timely and accurately classified, which helps as far as possible to obtain a better model and thus improves the accuracy of the model's evaluation results.
In order to solve the above-mentioned technical problem, third aspect present invention proposes a kind of electronic equipment, comprising: processor and The memory of computer executable instructions is stored, the executable instruction makes the processor execute such as above-mentioned skill when executed Method described in any one of art scheme.
In order to solve the above-mentioned technical problem, fourth aspect present invention proposes a kind of computer readable storage medium, wherein The computer-readable recording medium storage one or more program, one or more of programs when being executed by a processor, Realize the method as described in any one of above-mentioned technical proposal.
Since present invention employs calculate accuracy rate and recall rate of the sample data under the different overdue performance phases, and foundation Accuracy rate and recall rate filter out suitable label from Bu Tong overdue performance is interim, to be demarcated to sample data, therefore this Invention can carry out contingency table timing to sample data, demarcate again after showing completely without waiting for sample data, energy It is enough that timely and accurately sample data is predicted, to enhance the timeliness of sample data.
Brief description of the drawings
To make the technical problem solved by the present invention, the technical means adopted and the technical effects achieved clearer, specific embodiments of the present invention are described in detail below with reference to the drawings. It should be noted that the drawings described below are only drawings of exemplary embodiments of the present invention; those skilled in the art can derive drawings of other embodiments from them without creative effort.
Fig. 1 shows a schematic flow chart of a method for determining a label for a risk assessment model according to an embodiment of the present invention;
Fig. 2 shows a schematic block diagram of a device for determining a label for a risk assessment model according to an embodiment of the present invention;
Fig. 3 shows a schematic block diagram of an electronic device according to an embodiment of the present invention;
Fig. 4 shows a schematic block diagram of a computer-readable storage medium according to an embodiment of the present invention.
Detailed description of the embodiments
Exemplary embodiments of the present invention are now described more fully with reference to the drawings. The exemplary embodiments can, however, be implemented in many forms, and the present invention should not be understood as limited to the embodiments set forth here. Rather, these exemplary embodiments are provided so that the present invention is more thorough and complete and so that the inventive concept is conveyed fully to those skilled in the art. Identical reference numerals in the figures denote identical or similar elements, components or parts, so repeated descriptions of them are omitted.
Provided that the technical concept of the present invention is respected, the features, structures, characteristics or other details described in one specific embodiment may be combined in any suitable manner in one or more other embodiments.
In the description of specific embodiments, the features, structures, characteristics or other details described are intended to give those skilled in the art a full understanding of the embodiments. It is not excluded, however, that those skilled in the art may practice the technical solution of the present invention without one or more of the particular features, structures, characteristics or other details.
The flow charts shown in the drawings are merely illustrative. They need not include all content and operations/steps, and the steps need not be executed in the order described. For example, some operations/steps may be split and others may be merged or partly merged, so the actual order of execution may change according to the actual situation.
The block diagrams shown in the drawings are merely functional entities and do not necessarily correspond to physically separate entities. That is, these functional entities may be implemented in software, in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
Identical reference numerals in the drawings denote identical or similar elements, components or parts, so repeated descriptions of them may be omitted below. It should also be understood that although ordinal terms such as first, second and third may be used here to describe various devices, elements, components or parts, those devices, elements, components or parts should not be limited by these terms, which serve only to distinguish one from another. For example, a first device could also be called a second device without departing from the substance of the technical solution of the present invention. In addition, the term "and/or" includes any one of the associated listed items and all combinations of one or more of them.
The following problem is commonly encountered when producing model labels. When the loan term is long, for example 12 months, waiting for the sample data to perform completely, so that all the "bad customers" have shown themselves, takes at least a year. This means the model would be built from samples that are at least a year old, and a model obtained this way is clearly poor. To obtain a better model, it must be trained on timely sample data. Because the true quality of such sample data has not yet shown itself, a label is needed to predict its classification. To obtain a suitable label, the accuracy rate and the recall rate are introduced. The accuracy rate is the proportion of all data that is predicted correctly, and the recall rate is the proportion of the data that is actually good that is predicted correctly. The relationship between the two is that the higher the accuracy rate, the lower the recall rate, and the higher the recall rate, the lower the accuracy rate. A suitable label is selected using the accuracy rate and the recall rate under different overdue performance periods, and the sample data is classified by prediction with it, which improves the timeliness of the sample data. Specifically, the label determination process, as shown in Fig. 1, comprises:
Step S102: obtain sample data to be calibrated.
The sample data may be retrieved from a database or from a third-party lending platform (for example, information in a lending app on the client), or the sample data in the database and in the third-party lending platform may be combined, to ensure that the sample data is comprehensive.
Step S104: obtain the accuracy rate and the recall rate of the sample data under different overdue performance periods. An overdue performance period may comprise an installment number and a number of days overdue, for example one day overdue at installment 6.
Specifically, in formulas (1) and (2), TP denotes the number of samples predicted to be good customers that are actually good customers, FP denotes the number of samples predicted to be good customers that are actually bad customers, TN denotes the number of samples predicted to be bad customers that are actually bad customers, and FN denotes the number of samples predicted to be bad customers that are actually good customers.
The accuracy rate a of the sample data under the different overdue performance periods is calculated according to formula (1):
a = (TP + TN) / (TP + FP + TN + FN)    (1)
The recall rate b of the sample data under the different overdue performance periods is calculated according to formula (2):
b = TP / (TP + FN)    (2)
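As a minimal sketch of formulas (1) and (2), the Python below computes both rates from the four counts, treating "good customer" as the positive class as in the definitions above. The class and function names are hypothetical, and returning 0.0 when a denominator is zero is an assumption that the text does not address.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for one overdue performance period (names are illustrative)."""
    tp: int  # predicted good, actually good
    fp: int  # predicted good, actually bad
    tn: int  # predicted bad, actually bad
    fn: int  # predicted bad, actually good


def accuracy_rate(c: ConfusionCounts) -> float:
    """Formula (1): a = (TP + TN) / (TP + FP + TN + FN)."""
    total = c.tp + c.fp + c.tn + c.fn
    # Assumption: an empty sample set yields 0.0 instead of raising.
    return (c.tp + c.tn) / total if total else 0.0


def recall_rate(c: ConfusionCounts) -> float:
    """Formula (2): b = TP / (TP + FN)."""
    actual_good = c.tp + c.fn
    # Assumption: no actually-good samples yields 0.0 instead of raising.
    return c.tp / actual_good if actual_good else 0.0
```

One such pair of rates would be computed for each candidate overdue performance period.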
Step S106: based on the accuracy rate and the recall rate, select a target overdue performance period from the different overdue performance periods and use it as the label.
The specific screening process is as follows. A statistical table is built containing each of the different overdue performance periods together with its corresponding accuracy rate and recall rate, and the overdue performance period whose accuracy rate is greater than a first threshold and whose recall rate is greater than a second threshold is selected from the statistical table as the target overdue performance period. A higher accuracy rate generally comes with a lower recall rate, and a higher recall rate with a lower accuracy rate. The accuracy rates and recall rates in the statistical table are therefore compared, and the overdue performance period for which both are relatively high is selected as the label. This keeps the selected label reasonable and, as far as possible, ensures the accuracy of the subsequent classification and calibration of the sample data.
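The screening step can be sketched in Python as a filter over the statistical table. The row type, the function name, the threshold values and every number in the example table are made up for illustration; the text gives no concrete rates or thresholds, and it does not say how to choose when several rows pass both thresholds, so this sketch simply returns all of them.

```python
from typing import List, NamedTuple


class PeriodStats(NamedTuple):
    """One row of the statistical table (hypothetical layout)."""
    installment: int      # installment number, e.g. 6
    days_overdue: int     # days overdue at that installment
    accuracy_rate: float
    recall_rate: float


def screen_periods(table: List[PeriodStats],
                   first_threshold: float,
                   second_threshold: float) -> List[PeriodStats]:
    """Keep rows whose accuracy rate exceeds the first threshold
    and whose recall rate exceeds the second threshold."""
    return [row for row in table
            if row.accuracy_rate > first_threshold
            and row.recall_rate > second_threshold]


# Illustrative values only; they are not taken from the patent.
example_table = [
    PeriodStats(6, 1, 0.70, 0.95),
    PeriodStats(6, 7, 0.78, 0.90),
    PeriodStats(6, 15, 0.84, 0.86),
    PeriodStats(6, 30, 0.90, 0.88),
]
selected = screen_periods(example_table, first_threshold=0.85,
                          second_threshold=0.85)
```

With these made-up numbers only the row for 30 days overdue at installment 6 clears both thresholds.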
The following example uses the number of days overdue when the sample data is at installment 6.
The accuracy rates and recall rates corresponding to the different overdue performance periods (1 day, 7 days, 15 days and 30 days overdue at installment 6) are shown in Table 1:

Overdue performance period           Accuracy rate   Recall rate
1 day overdue at installment 6       a1              b1
7 days overdue at installment 6      a2              b2
15 days overdue at installment 6     a3              b3
30 days overdue at installment 6     a4              b4

Table 1

The accuracy rate and the recall rate under each overdue performance period in Table 1 are compared. If the accuracy rate a4 and the recall rate b4 for 30 days overdue at installment 6 are both relatively high, then 30 days overdue at installment 6 can be chosen as the label, and sample data that is 30 days overdue at installment 6 is defined as a bad customer. It should be noted that installment 6 is used here only as an example and does not limit the overdue performance period.
Step S108: calibrate the sample data according to the selected label.
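A minimal Python sketch of this calibration step follows, assuming each sample records the days overdue observed at the installment of the selected label. The field names are hypothetical, and treating "at least" the target number of days overdue as bad is an assumption; the text only says that sample data overdue by the target number of days is defined as a bad customer.

```python
from typing import Dict, List


def calibrate_samples(samples: List[Dict], target_days_overdue: int) -> List[Dict]:
    """Attach a 'label' field to each sample.

    Each sample is assumed to carry 'days_overdue', the days overdue
    observed at the installment of the selected label (hypothetical field).
    """
    calibrated = []
    for sample in samples:
        # Assumption: reaching or exceeding the target days overdue means bad.
        is_bad = sample["days_overdue"] >= target_days_overdue
        calibrated.append({**sample, "label": "bad" if is_bad else "good"})
    return calibrated
```

The function returns new dictionaries rather than modifying the inputs, so the uncalibrated sample data stays available for comparison.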
In this embodiment, the sample data to be calibrated and its accuracy rate and recall rate under different overdue performance periods are obtained, and a suitable label is selected from the different overdue performance periods according to the accuracy rate and the recall rate in order to calibrate the sample data. Calibration does not have to wait until the sample data has performed completely, so the sample data can be predicted in a timely and accurate manner and its timeliness is improved.
Further, the method also comprises: performing machine learning model training on the calibrated sample data to build the risk assessment model. Building the risk assessment model from sample data that is timely and accurately classified helps as far as possible to obtain a better model and thus improves the accuracy of the model's evaluation results.
Those skilled in the art will understand that all or part of the steps of the above embodiment may be implemented as a program (computer program) executed by a computer data processing device. When the computer program is executed, the above method provided by the present invention can be carried out. The computer program may be stored in a computer-readable storage medium, which may be a readable storage medium such as a magnetic disk, an optical disc, a ROM or a RAM, or a storage array composed of multiple storage media, such as a disk or tape storage array. The storage medium is not limited to centralized storage; it may also be distributed storage, such as cloud storage based on cloud computing.
The device embodiment of the present invention is described below. The device can be used to carry out the method embodiment of the present invention. Details described in the device embodiment should be regarded as supplementing the method embodiment above; for details not disclosed in the device embodiment, refer to the method embodiment above.
The following problem is commonly encountered when producing model labels. When the loan term is long, for example 12 months, waiting for the sample data to perform completely, so that all the "bad customers" have shown themselves, takes at least a year. This means the model would be built from samples that are at least a year old, and a model obtained this way is clearly poor. To obtain a better model, it must be trained on timely sample data. Because the true quality of such sample data has not yet shown itself, a label is needed to predict its classification. To obtain a suitable label, the accuracy rate and the recall rate are introduced. The accuracy rate is the proportion of all data that is predicted correctly, and the recall rate is the proportion of the data that is actually good that is predicted correctly. The relationship between the two is that the higher the accuracy rate, the lower the recall rate, and the higher the recall rate, the lower the accuracy rate. A suitable label is selected using the accuracy rate and the recall rate under different overdue performance periods, and the sample data is classified by prediction with it, which improves the timeliness of the sample data. Specifically, in the device that implements label determination, as shown in Fig. 2, the device 200 for determining a label for a risk assessment model comprises: a first acquisition unit 202, a second acquisition unit 204, a processing unit 206 and a calibration unit 208.
The first acquisition unit 202 is configured to obtain sample data to be calibrated. The sample data may be retrieved from a database or from a third-party lending platform (for example, information in a lending app on the client), or the sample data in the database and in the third-party lending platform may be combined, to ensure that the sample data is comprehensive.
The second acquisition unit 204 is configured to obtain the accuracy rate and the recall rate of the sample data under different overdue performance periods. An overdue performance period may comprise an installment number and a number of days overdue, for example one day overdue at installment 6.
In formulas (1) and (2) below, TP denotes the number of samples predicted to be good customers that are actually good customers, FP denotes the number of samples predicted to be good customers that are actually bad customers, TN denotes the number of samples predicted to be bad customers that are actually bad customers, and FN denotes the number of samples predicted to be bad customers that are actually good customers.
Specifically, the second acquisition unit 204 calculates the accuracy rate a of the sample data under the different overdue performance periods according to formula (1):
a = (TP + TN) / (TP + FP + TN + FN)    (1)
The second acquisition unit 204 calculates the recall rate b of the sample data under the different overdue performance periods according to formula (2):
b = TP / (TP + FN)    (2)
The processing unit 206 is configured to select, based on the accuracy rate and the recall rate, a target overdue performance period from the different overdue performance periods and use it as the label.
Specifically, the processing unit 206 comprises a statistical table construction unit 2062 and a screening unit 2064. The statistical table construction unit 2062 builds a statistical table containing each of the different overdue performance periods together with its corresponding accuracy rate and recall rate, and the screening unit 2064 selects from the statistical table the overdue performance period whose accuracy rate is greater than a first threshold and whose recall rate is greater than a second threshold as the target overdue performance period. A higher accuracy rate generally comes with a lower recall rate, and a higher recall rate with a lower accuracy rate. The accuracy rates and recall rates in the statistical table are therefore compared, and the overdue performance period for which both are relatively high is selected as the label. This keeps the selected label reasonable and, as far as possible, ensures the accuracy of the subsequent classification and calibration of the sample data.
The following example uses the number of overdue days of the sample data at installment 6:
The accuracy and recall corresponding to the different overdue performance periods (1 day overdue at installment 6, 7 days overdue at installment 6, 15 days overdue at installment 6, and 30 days overdue at installment 6) are shown in Table 1:
Overdue performance period          Accuracy   Recall
1 day overdue at installment 6      a1         b1
7 days overdue at installment 6     a2         b2
15 days overdue at installment 6    a3         b3
30 days overdue at installment 6    a4         b4
Table 1
By comparing the accuracy and recall under each overdue performance period in Table 1, if the accuracy a4 and the recall b4 for 30 days overdue at installment 6 are both relatively high, then 30 days overdue at installment 6 can be chosen as the label, and sample data that is 30 days overdue at installment 6 is defined as a bad customer. It should be noted that installment 6 is used here only as an example and does not limit the overdue performance period.
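The threshold screening over a statistical table such as Table 1 could be sketched as follows. This is a minimal illustration only: the `PeriodStats` type, the function name, the numeric accuracy and recall values, and the two threshold values are all hypothetical placeholders standing in for a1 to a4 and b1 to b4, since the source gives no concrete numbers.

```python
from typing import List, NamedTuple


class PeriodStats(NamedTuple):
    """One row of the statistical table (hypothetical structure)."""
    installment: int    # installment number, e.g. 6
    overdue_days: int   # number of overdue days, e.g. 30
    accuracy: float     # value a for this period
    recall: float       # value b for this period


def screen_target_periods(
    table: List[PeriodStats],
    first_threshold: float,
    second_threshold: float,
) -> List[PeriodStats]:
    """Keep rows whose accuracy exceeds the first threshold
    and whose recall exceeds the second threshold."""
    return [
        row
        for row in table
        if row.accuracy > first_threshold and row.recall > second_threshold
    ]


# Placeholder values for illustration; they are not taken from the patent.
stats_table = [
    PeriodStats(6, 1, 0.60, 0.90),
    PeriodStats(6, 7, 0.70, 0.85),
    PeriodStats(6, 15, 0.78, 0.80),
    PeriodStats(6, 30, 0.85, 0.82),
]

targets = screen_target_periods(stats_table, first_threshold=0.8, second_threshold=0.8)
print(targets)
```

With these placeholder numbers only the row for 30 days overdue at installment 6 passes both thresholds, which mirrors the choice made in the example above. How to break ties when several rows pass is not specified in the source and is left open here.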
The calibration unit 208 is configured to calibrate the sample data according to the selected label.
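A minimal sketch of such a calibration step is given below, assuming each sample records its overdue days per installment. The data layout, the function name `calibrate`, the "good"/"bad" strings, and the use of "greater than or equal to" as the comparison are assumptions made for illustration; the source only states that samples overdue by the target number of days at the target installment are defined as bad.

```python
from typing import Dict


def calibrate(
    samples: Dict[str, Dict[int, int]],
    target_installment: int,
    target_overdue_days: int,
) -> Dict[str, str]:
    """Label each sample as "bad" or "good" using the selected label.

    `samples` maps a sample id to {installment number: overdue days}.
    """
    labels: Dict[str, str] = {}
    for sample_id, overdue_by_installment in samples.items():
        days = overdue_by_installment.get(target_installment)
        if days is None:
            # Assumption: samples not yet observed at the target
            # installment are left unlabeled.
            continue
        # Assumption: reaching the target overdue days counts as bad.
        labels[sample_id] = "bad" if days >= target_overdue_days else "good"
    return labels


# Hypothetical sample records for illustration.
samples = {
    "u1": {6: 0},
    "u2": {6: 45},
    "u3": {5: 3},  # has not reached installment 6 yet
}
print(calibrate(samples, target_installment=6, target_overdue_days=30))
```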
In this embodiment, the sample data to be calibrated and its accuracy and recall under different overdue performance periods are obtained, a suitable label is selected from the different overdue performance periods according to the accuracy and recall, and the sample data is calibrated with that label. Calibration does not have to wait until the sample data has fully performed, so the sample data can be predicted in a timely and accurate manner, which improves the timeliness of the sample data.
Further, the device 200 for determining a label for a risk assessment model also includes a model construction unit 210, configured to perform machine learning training on the calibrated sample data in order to build the risk assessment model. Building the risk assessment model from timely and accurately classified sample data helps ensure, as far as possible, that a good model is obtained, which in turn improves the accuracy of the model's evaluation results.
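The source does not name a specific learning algorithm for the model construction unit. Purely as an illustration, the sketch below trains a logistic regression by stochastic gradient descent on calibrated samples, with label 1 for bad and 0 for good. The choice of logistic regression, the function names, the hyperparameters, and the single toy feature are all assumptions and not the patented method.

```python
import math
from typing import List, Sequence, Tuple


def _sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(
    features: Sequence[Sequence[float]],
    labels: Sequence[int],
    learning_rate: float = 0.1,
    epochs: int = 500,
) -> Tuple[List[float], float]:
    """Fit weights and bias with per-sample gradient steps on log loss."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            error = _sigmoid(z) - y  # gradient of log loss w.r.t. z
            weights = [w - learning_rate * error * xi for w, xi in zip(weights, x)]
            bias -= learning_rate * error
    return weights, bias


def predict_bad_probability(weights: Sequence[float], bias: float, x: Sequence[float]) -> float:
    """Estimated probability that a sample is a bad customer."""
    return _sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Toy calibrated data: one hypothetical risk feature per sample.
X = [[0.1], [0.2], [0.8], [0.9]]
y = [0, 0, 1, 1]  # 1 = bad, 0 = good, as produced by calibration
w, b = train_logistic(X, y)
print(predict_bad_probability(w, b, [0.9]), predict_bad_probability(w, b, [0.1]))
```

In practice an established library would normally replace this hand-written loop; it is kept dependency-free here only so that the example stays self-contained.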
Those skilled in the art will understand that the modules in the above device embodiment may be distributed in the device as described, or may be changed accordingly and distributed in one or more devices different from the above embodiment. The modules of the above embodiment may be combined into one module or further split into multiple sub-modules.
An electronic device embodiment of the present invention is described below. The electronic device can be regarded as a concrete physical implementation of the method and device embodiments of the present invention described above. Details described in the electronic device embodiment should be regarded as supplementing the above method or device embodiments; details not disclosed in the electronic device embodiment can be implemented with reference to the above method or device embodiments.
Fig. 3 is a structural block diagram of an exemplary embodiment of an electronic device according to the present invention. The electronic device 300 according to this embodiment is described below with reference to Fig. 3. The electronic device 300 shown in Fig. 3 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present invention.
As shown in Fig. 3, the electronic device 300 takes the form of a general-purpose computing device. The components of the electronic device 300 may include, but are not limited to: at least one processing unit 310, at least one storage unit 320, a bus 330 connecting different system components (including the storage unit 320 and the processing unit 310), a display unit 340, and the like.
The storage unit stores program code that can be executed by the processing unit 310, so that the processing unit 310 performs the steps according to the various exemplary embodiments of the present invention described in the method section of this specification above. For example, the processing unit 310 may perform the steps shown in Fig. 1.
The storage unit 320 may include a readable medium in the form of a volatile storage unit, such as a random access memory unit (RAM) 3201 and/or a cache memory unit 3202, and may further include a read-only memory unit (ROM) 3203.
The storage unit 320 may also include a program/utility 3204 having a set of (at least one) program modules 3205. Such program modules 3205 include, but are not limited to: an operating system, one or more application programs, other program modules, and program data. Each of these examples, or some combination of them, may include an implementation of a network environment.
The bus 330 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus structures.
The electronic device 300 may also communicate with one or more external devices 400 (such as a keyboard, a pointing device, a Bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 300, and/or with any device (such as a router, a modem, etc.) that enables the electronic device 300 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 350. The electronic device 300 may also communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through a network adapter 360. The network adapter 360 may communicate with the other modules of the electronic device 300 through the bus 330. It should be understood that, although not shown in the figure, other hardware and/or software modules may be used in conjunction with the electronic device 300, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems.
From the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described in the present invention may be implemented in software, or in software combined with the necessary hardware. The technical solution according to the embodiments of the present invention can therefore be embodied in the form of a software product, which may be stored on a computer-readable storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) or on a network, and which includes instructions that cause a computing device (such as a personal computer, a server, or a network device) to execute the above method according to the present invention. When the computer program is executed by a data processing device, the computer-readable medium is able to implement the above method of the present invention, namely: obtaining sample data to be calibrated; obtaining the accuracy and recall of the sample data under different overdue performance periods; selecting, based on the accuracy and recall, a target overdue performance period from the different overdue performance periods and using it as the label; and calibrating the sample data according to the selected label.
Fig. 4 is a schematic diagram of a computer-readable storage medium of the present invention. As shown in Fig. 4, the computer program may be stored on one or more computer-readable media. A computer-readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
The computer-readable storage medium may include a data signal propagated in baseband or as part of a carrier wave, in which readable program code is carried. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. The readable medium may also be any readable medium other than a readable storage medium, and such a readable medium can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on the readable storage medium may be transmitted over any suitable medium, including but not limited to wireless, wired, optical cable, RF, or any suitable combination of the above.
The program code for carrying out the operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server. Where a remote computing device is involved, it may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computing device (for example, through the Internet using an Internet service provider).
In summary, the present invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that, in practice, a general-purpose data processing device such as a microprocessor or a digital signal processor (DSP) may be used to implement some or all of the functions of some or all of the components according to embodiments of the present invention. The present invention may also be implemented as a device or device program (for example, a computer program and a computer program product) for executing part or all of the method described herein. Such a program implementing the present invention may be stored on a computer-readable medium or may take the form of one or more signals. Such a signal may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
The specific embodiments described above further explain the purpose, technical solutions, and beneficial effects of the present invention in detail. It should be understood that the present invention is not inherently related to any particular computer, virtual device, or electronic device, and that various general-purpose devices may also implement it. The above is only a specific embodiment of the present invention and is not intended to limit it; any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (10)

  1. A method for determining a label for a risk assessment model, characterized by comprising:
    obtaining sample data to be calibrated;
    obtaining the accuracy and recall of the sample data under different overdue performance periods;
    selecting, based on the accuracy and the recall, a target overdue performance period from the different overdue performance periods and using it as a label;
    calibrating the sample data according to the selected label.
  2. The method for determining a label for a risk assessment model according to claim 1, characterized in that the step of selecting, based on the accuracy and the recall, a target overdue performance period from the different overdue performance periods specifically comprises:
    building a statistical table containing each of the different overdue performance periods together with its corresponding accuracy and recall;
    screening for the target overdue performance period according to the statistical table.
  3. The method for determining a label for a risk assessment model according to any one of claims 1-2, characterized in that the step of screening for the target overdue performance period according to the statistical table specifically comprises:
    selecting from the statistical table the overdue performance period whose accuracy is greater than a first threshold and whose recall is greater than a second threshold as the target overdue performance period.
  4. The method for determining a label for a risk assessment model according to any one of claims 1-3, characterized in that the step of calculating the accuracy of the sample data under the different overdue performance periods specifically comprises:
    calculating the accuracy of the sample data under the different overdue performance periods according to the following formula:
    a = (TP + TN) / (TP + FP + TN + FN);
    wherein a denotes accuracy, TP denotes a sample predicted to be a good customer that is actually a good customer, FP denotes a sample predicted to be a good customer that is actually a bad customer, TN denotes a sample predicted to be a bad customer that is actually a bad customer, and FN denotes a sample predicted to be a bad customer that is actually a good customer.
  5. The method for determining a label for a risk assessment model according to any one of claims 1-4, characterized in that the step of obtaining the recall of the sample data under the different overdue performance periods specifically comprises:
    calculating the recall of the sample data under the different overdue performance periods according to the following formula:
    b = TP / (TP + FN);
    wherein b denotes recall, TP denotes a sample predicted to be a good customer that is actually a good customer, FP denotes a sample predicted to be a good customer that is actually a bad customer, TN denotes a sample predicted to be a bad customer that is actually a bad customer, and FN denotes a sample predicted to be a bad customer that is actually a good customer.
  6. The method for determining a label for a risk assessment model according to any one of claims 1-5, characterized in that the different overdue performance periods comprise an installment number and a number of overdue days.
  7. The method for determining a label for a risk assessment model according to any one of claims 1-6, characterized in that the step of obtaining sample data to be calibrated specifically comprises:
    retrieving the sample data from a database; and/or
    retrieving the sample data from a third-party loan platform.
  8. A device for determining a label for a risk assessment model, characterized by comprising:
    a first acquisition unit, configured to obtain sample data to be calibrated;
    a second acquisition unit, configured to obtain the accuracy and recall of the sample data under different overdue performance periods;
    a processing unit, configured to select, based on the accuracy and the recall, a target overdue performance period from the different overdue performance periods and to use it as a label;
    a calibration unit, configured to calibrate the sample data according to the selected label.
  9. An electronic device, wherein the electronic device comprises:
    a processor; and
    a memory storing computer-executable instructions which, when executed, cause the processor to perform the method according to any one of claims 1-7.
  10. A computer-readable storage medium, wherein the computer-readable storage medium stores one or more programs which, when executed by a processor, implement the method according to any one of claims 1-7.
CN201910578914.9A 2019-06-28 2019-06-28 Determination method and determination device for label for wind assessment model and electronic equipment Active CN110348993B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910578914.9A CN110348993B (en) 2019-06-28 2019-06-28 Determination method and determination device for label for wind assessment model and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910578914.9A CN110348993B (en) 2019-06-28 2019-06-28 Determination method and determination device for label for wind assessment model and electronic equipment

Publications (2)

Publication Number Publication Date
CN110348993A true CN110348993A (en) 2019-10-18
CN110348993B CN110348993B (en) 2023-12-22

Family

ID=68177378

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910578914.9A Active CN110348993B (en) 2019-06-28 2019-06-28 Determination method and determination device for label for wind assessment model and electronic equipment

Country Status (1)

Country Link
CN (1) CN110348993B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130138554A1 (en) * 2011-11-30 2013-05-30 Rawllin International Inc. Dynamic risk assessment and credit standards generation
CN103699628A (en) * 2013-12-20 2014-04-02 北京百度网讯科技有限公司 Multiple tag obtaining method and device
CN107909097A (en) * 2017-11-08 2018-04-13 阿里巴巴集团控股有限公司 The update method and device of sample in sample storehouse
CN108595497A (en) * 2018-03-16 2018-09-28 北京达佳互联信息技术有限公司 Data screening method, apparatus and terminal
CN109242499A (en) * 2018-09-19 2019-01-18 中国银行股份有限公司 A kind of processing method of transaction risk prediction, apparatus and system
CN109388760A (en) * 2017-08-03 2019-02-26 腾讯科技(北京)有限公司 Recommend label acquisition method, media content recommendations method, apparatus and storage medium

Also Published As

Publication number Publication date
CN110348993B (en) 2023-12-22

Similar Documents

Publication Publication Date Title
CN110119413B (en) Data fusion method and device
EP4137961A1 (en) Method and apparatus for executing automatic machine learning process, and device
JP6713238B2 (en) Electronic device, method for constructing retail store evaluation model, system and storage medium
CN107220217A (en) Characteristic coefficient training method and device that logic-based is returned
US11642783B2 (en) Automated generation of robotic computer program code
CN110348726A (en) A kind of user's amount method of adjustment, device and electronic equipment based on social networks network
CN104516730A (en) Data processing method and device
CN110363575A (en) A kind of credit user moves branch wish prediction technique, device and equipment
CN110348991A (en) Assess the method, apparatus and electronic equipment of user's accrediting amount upper limit
CN110363654A (en) A kind of favor information method for pushing, device and electronic equipment
CN116245670B (en) Method, device, medium and equipment for processing financial tax data based on double-label model
CN111598677A (en) Resource quota determining method and device and electronic equipment
CN110348852A (en) A kind of credit evaluation model modification method, device, electronic equipment
CN110309142A (en) The method and apparatus of regulation management
CN110362825A (en) A kind of text based finance data abstracting method, device and electronic equipment
CN109947811A (en) Generic features library generating method and device, storage medium, electronic equipment
CN112016792A (en) User resource quota determining method and device and electronic equipment
US8495018B2 (en) Transitioning application replication configurations in a networked computing environment
US20180359852A1 (en) Modifying a Circuit Design
CN112508692A (en) Resource recovery risk prediction method and device based on convolutional neural network and electronic equipment
CN112348658A (en) Resource allocation method and device and electronic equipment
CN111582649A (en) Risk assessment method and device based on user APP unique hot coding and electronic equipment
CN110349022A (en) A kind of automated testing method, device and the electronic equipment of the virtual credit card transaction scene based on micro services
CN110363392A (en) Line of credit method of adjustment, device and electronic equipment based on user's Wifi information
CN110348993A (en) Wind is discussed and select model workers determination method, determining device and the electronic equipment of type label

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant