CN113379528A - Wind control model establishing method and device and risk control method - Google Patents

Info

Publication number
CN113379528A
Authority
CN
China
Prior art keywords
wind control
wind
sample
user
variable
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110570387.4A
Other languages
Chinese (zh)
Inventor
程健
薛志超
吴强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Souche Data Technology Co ltd
Original Assignee
Hangzhou Souche Data Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Souche Data Technology Co ltd filed Critical Hangzhou Souche Data Technology Co ltd
Priority to CN202110570387.4A priority Critical patent/CN113379528A/en
Publication of CN113379528A publication Critical patent/CN113379528A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Artificial Intelligence (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Technology Law (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of this specification provide a wind control model establishing method and device and a risk control method. The wind control model establishing method comprises: acquiring sample characteristic data of a plurality of wind control sample users, wind control performance data of the wind control sample users in a scene to be wind controlled, and wind controlled duration information of the wind control sample users in the scene to be wind controlled; determining a wind control evaluation label of each wind control sample user according to the wind control performance data; processing the sample characteristic data of the wind control sample user according to the wind control evaluation label of the wind control sample user to obtain target characteristic data of the wind control sample user; and establishing a wind control model for the scene to be wind controlled according to the target characteristic data of the wind control sample user, the wind controlled duration information and the wind control evaluation label.

Description

Wind control model establishing method and device and risk control method
Technical Field
The present application relates to the field of computer technologies, and in particular to a method and an apparatus for establishing a wind control (that is, risk control) model, and a risk control method.
Background
In a scene to be wind controlled, establishing a traditional wind control model imposes a time restriction on the selection of wind control sample users: users with a short wind control performance time are difficult to use as the wind control sample users required for modeling. In addition, a traditional wind control model can only produce a risk control result corresponding to a preset duration, and its risk control results for durations other than the preset duration are inaccurate. As the number of users to be wind controlled grows substantially, the traditional wind control model increasingly fails to meet wind control requirements.
Disclosure of Invention
One or more embodiments of the present specification provide a wind control model building method. The wind control model establishing method comprises the following steps:
acquiring sample characteristic data of a plurality of wind control sample users, wind control performance data of the wind control sample users in a scene to be wind controlled and wind controlled duration information of the wind control sample users in the scene to be wind controlled;
determining a wind control evaluation label of the wind control sample user according to the wind control performance data;
processing the sample characteristic data of the wind control sample user according to the wind control evaluation label of the wind control sample user to obtain target characteristic data of the wind control sample user;
and establishing a wind control model for the scene to be wind controlled according to the target characteristic data of the wind control sample user, the wind controlled duration information and the wind control evaluation label.
Optionally, determining a wind control evaluation tag of the wind control sample user according to the wind control performance data includes:
extracting risk behavior data of the wind control sample user from the wind control performance data; the risk behavior data comprises risk behavior duration and/or risk behavior times;
performing rolling rate analysis on the risk behavior data to obtain a risk behavior data threshold;
and determining a wind control evaluation label of the wind control sample user according to the risk behavior data of the wind control sample user and the risk behavior data threshold value.
Optionally, the sample characteristic data includes variable values of a plurality of sample characteristic variables; processing the sample characteristic data of the wind control sample user according to the wind control evaluation label of the wind control sample user to obtain the target characteristic data of the wind control sample user, and the method comprises the following steps:
for any sample characteristic variable of the wind control sample users, performing chi-square binning on the variable values of the sample characteristic variable to obtain a plurality of bins of the sample characteristic variable;
screening the plurality of sample characteristic variables according to the bins of the sample characteristic variables to obtain target variables;
for any bin of any target variable, determining a coding value corresponding to the bin of the target variable according to the wind control evaluation labels of the wind control sample users;
and determining the coding value corresponding to the bin as the target characteristic data of each wind control sample user whose variable value of the target variable falls into the bin.
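As an illustration only, the chi-square binning step can be sketched in Python as a bottom-up merge (often called ChiMerge): every distinct variable value starts as its own bin, and the pair of adjacent bins with the smallest chi-square statistic is merged until a maximum number of bins remains. The function names, the `max_bins` stopping rule, and the encoding of the first label as 1 are assumptions of this sketch; the specification does not fix them.

```python
def chi2(a, b):
    """Chi-square statistic of two adjacent bins.

    Each bin is a pair (first_label_count, second_label_count).
    """
    total = sum(a) + sum(b)
    stat = 0.0
    for row in (a, b):
        for j in range(2):
            column_total = a[j] + b[j]
            expected = sum(row) * column_total / total
            if expected > 0:  # skip empty columns to avoid division by zero
                stat += (row[j] - expected) ** 2 / expected
    return stat


def chi_merge(values, labels, max_bins):
    """Merge adjacent bins until at most max_bins remain (hypothetical stopping rule).

    Returns a list of [upper_bound, first_label_count, second_label_count].
    Assumes label 1 is the first label and any other value is the second label.
    """
    counts = {}
    for value, label in zip(values, labels):
        pair = counts.setdefault(value, [0, 0])
        pair[0 if label == 1 else 1] += 1
    # One initial bin per distinct value, ordered by value.
    bins = [[v, counts[v][0], counts[v][1]] for v in sorted(counts)]
    target = max(max_bins, 1)
    while len(bins) > target:
        # Find the most similar pair of adjacent bins and merge it.
        i = min(range(len(bins) - 1),
                key=lambda k: chi2(bins[k][1:], bins[k + 1][1:]))
        right = bins.pop(i + 1)
        bins[i] = [right[0], bins[i][1] + right[1], bins[i][2] + right[2]]
    return bins
```

For example, `chi_merge([1, 2, 3, 4], [1, 1, 0, 0], 2)` keeps the values 1 and 2 together and the values 3 and 4 together, because merging within each pair does not change the label distribution.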
Optionally, the wind control evaluation label includes a first label and a second label; determining, for any bin of any target variable, the coding value corresponding to the bin of the target variable according to the wind control evaluation labels of the wind control sample users includes:
for any bin of any target variable, determining first users, namely the wind control sample users whose variable value of the target variable falls into the bin;
counting the number of first users having the first label to obtain a first number, and counting the number of first users having the second label to obtain a second number;
counting the number of wind control sample users having the first label to obtain a third number, and counting the number of wind control sample users having the second label to obtain a fourth number;
and calculating the coding value corresponding to the bin of the target variable according to the first number, the second number, the third number and the fourth number.
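The formula for the coding value is not given at this point. One common encoding that uses exactly these four counts is the weight of evidence (WOE), sketched below in Python. Both the choice of WOE and the sign convention (first label in the numerator) are assumptions of this sketch, and `bin_woe` is a hypothetical name.

```python
import math


def bin_woe(first_n, second_n, third_n, fourth_n):
    """Weight of evidence of one bin (assumed encoding).

    first_n / second_n: users in the bin with the first / second label.
    third_n / fourth_n: all sample users with the first / second label.
    """
    if min(first_n, second_n, third_n, fourth_n) <= 0:
        # A zero count makes the logarithm undefined; smoothing is left out here.
        raise ValueError("all four counts must be positive")
    share_first = first_n / third_n     # share of first-label users in this bin
    share_second = second_n / fourth_n  # share of second-label users in this bin
    return math.log(share_first / share_second)
```

A bin whose label mix matches the whole sample gets a coding value of 0; a bin that is relatively richer in first-label users gets a positive value under this sign convention.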
Optionally, the wind control evaluation label includes a first label and a second label; screening the plurality of sample characteristic variables according to the bins of the sample characteristic variables to obtain target variables includes:
for any bin of any sample characteristic variable, determining second users, namely the wind control sample users whose variable value of the sample characteristic variable falls into the bin;
counting the number of second users having the first label to obtain a fifth number, and counting the number of second users having the second label to obtain a sixth number;
counting the number of wind control sample users having the first label to obtain a seventh number, and counting the number of wind control sample users having the second label to obtain an eighth number;
calculating a first information quantity of the sample characteristic variable according to the fifth number of each bin of the sample characteristic variable, the sixth number of each bin of the sample characteristic variable, the seventh number and the eighth number;
and screening target variables from the sample characteristic variables according to the first information quantity of each sample characteristic variable.
Optionally, the screening of the sample characteristic variables according to the first information amount of each sample characteristic variable to obtain a target variable includes:
and taking the sample characteristic variable with the first information quantity larger than a first information quantity threshold value as the target variable in the plurality of sample characteristic variables.
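A minimal Python sketch of this screening follows, assuming the first information quantity is the information value (IV) commonly used in scorecard modeling; that reading, the function names, and the example threshold are assumptions and are not stated in the text above.

```python
import math


def information_value(bins, seventh_n, eighth_n):
    """Information value of one variable (assumed form of the first information quantity).

    bins: list of (fifth_n, sixth_n) per bin, i.e. first-label and
          second-label user counts inside each bin.
    seventh_n / eighth_n: all sample users with the first / second label.
    """
    iv = 0.0
    for fifth_n, sixth_n in bins:
        if fifth_n <= 0 or sixth_n <= 0:
            raise ValueError("each bin needs users of both labels")
        p_first = fifth_n / seventh_n
        p_second = sixth_n / eighth_n
        iv += (p_first - p_second) * math.log(p_first / p_second)
    return iv


def select_targets(iv_by_variable, iv_threshold):
    """Keep the variables whose information value exceeds the threshold."""
    return [name for name, iv in iv_by_variable.items() if iv > iv_threshold]
```

For instance, `select_targets({"age": 0.30, "gender": 0.01}, 0.02)` keeps only `"age"`; the threshold 0.02 is purely illustrative.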
Optionally, the screening of the sample characteristic variables according to the first information amount of each sample characteristic variable to obtain a target variable includes:
taking the sample characteristic variable with the first information quantity larger than a first information quantity threshold value as a candidate variable in the plurality of sample characteristic variables;
calculating a correlation coefficient between any two candidate variables according to the variable value of each candidate variable;
and deleting redundant variables from the candidate variables according to the correlation coefficient and a preset coefficient threshold value to obtain target variables.
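The redundancy removal can be sketched as follows in Python. The text above does not say which correlation coefficient is used or which variable of a correlated pair is deleted; this sketch assumes the Pearson coefficient and keeps the candidate with the larger first information quantity. All names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant variable is treated as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def drop_redundant(candidates, info, coef_threshold):
    """Return the candidate names kept after removing redundant variables.

    candidates: dict name -> list of variable values (one per sample user).
    info: dict name -> first information quantity of that candidate.
    A candidate is dropped when its absolute correlation with an already
    kept, more informative candidate exceeds coef_threshold (assumed rule).
    """
    kept = []
    for name in sorted(candidates, key=lambda k: info[k], reverse=True):
        if all(abs(pearson(candidates[name], candidates[k])) <= coef_threshold
               for k in kept):
            kept.append(name)
    return kept
```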
Optionally, obtaining sample feature data of a plurality of wind control sample users includes:
acquiring original characteristic data of a plurality of wind control sample users; the original characteristic data comprises variable values of a plurality of characteristic variables to be screened; the characteristic variables to be screened have at least one data source;
counting the source number of the data source of each characteristic variable to be screened and the total source number corresponding to a plurality of wind control sample users;
and determining the variable value of the characteristic variable to be screened, of which the ratio of the source number to the total source number is greater than a preset proportion threshold value, as the sample characteristic data of the wind control sample user.
One or more embodiments of the present specification provide a risk control method. The risk control method comprises the following steps:
acquiring user characteristic data of a user to be subjected to wind control and wind control duration information of the user to be subjected to wind control in a scene to be subjected to wind control;
processing the user characteristic data of the user to be subjected to wind control to obtain target characteristic data of the user to be subjected to wind control;
and determining the wind control evaluation label of the user to be wind controlled that corresponds to the wind control duration information, according to the target characteristic data of the user to be wind controlled, the wind control duration information, and the wind control model established by the above wind control model establishing method.
One or more embodiments of the present specification provide a wind control model establishing apparatus, including:
the data acquisition module is used for acquiring sample characteristic data of a plurality of wind control sample users, wind control performance data of the wind control sample users in a scene to be wind controlled and wind controlled duration information of the wind control sample users in the scene to be wind controlled;
the label determining module is used for determining a wind control evaluation label of the wind control sample user according to the wind control performance data;
the data processing module is used for processing the sample characteristic data of the wind control sample user according to the wind control evaluation label of the wind control sample user to obtain target characteristic data of the wind control sample user;
and the model establishing module is used for establishing a wind control model for the scene to be wind controlled according to the target characteristic data of the wind control sample user, the wind controlled duration information and the wind control evaluation label.
In the wind control model establishing method provided by one or more embodiments of the present specification, sample characteristic data of a plurality of wind control sample users, wind control performance data of the wind control sample users in a scene to be wind controlled, and wind controlled duration information of the wind control sample users in the scene to be wind controlled are acquired; a wind control evaluation label of each wind control sample user is determined according to the wind control performance data; the sample characteristic data of the wind control sample user is processed according to the wind control evaluation label to obtain target characteristic data of the wind control sample user; and a wind control model is established for the scene to be wind controlled according to the target characteristic data, the wind controlled duration information and the wind control evaluation label. Introducing the wind controlled duration information relaxes the time restriction on selecting wind control sample users, and the established wind control model can produce a risk control result corresponding to given wind controlled duration information, which improves the accuracy of risk control.
Drawings
In order to more clearly illustrate one or more embodiments of the present specification or the technical solutions in the prior art, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some of the embodiments described in the present specification, and those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a process flow diagram of a method for building a wind control model according to one or more embodiments of the present disclosure;
FIG. 2 is a process flow diagram of a risk control method provided in one or more embodiments of the present disclosure;
FIG. 3 is a process flow diagram of another method for building a wind control model according to one or more embodiments of the present disclosure;
FIG. 4 is a schematic view of a wind control model establishing apparatus according to one or more embodiments of the present disclosure;
FIG. 5 is a schematic diagram of a risk control device according to one or more embodiments of the present disclosure.
Detailed Description
In order to make those skilled in the art better understand the technical solutions in one or more embodiments of the present disclosure, the technical solutions in one or more embodiments of the present disclosure will be clearly and completely described below with reference to the drawings in one or more embodiments of the present disclosure, and it is obvious that the described embodiments are only a part of the embodiments of the present disclosure, and not all embodiments. All other embodiments that can be derived by a person skilled in the art from one or more of the embodiments described herein without making any inventive step shall fall within the scope of protection of this document.
An embodiment of the wind control model establishing method provided by this specification is as follows:
FIG. 1 is a process flow diagram of a method for building a wind control model according to one or more embodiments of the present disclosure.
Referring to FIG. 1, the method for establishing a wind control model according to the present embodiment includes steps S102 to S108.
Step S102, obtaining sample characteristic data of a plurality of wind control sample users, wind control performance data of the wind control sample users in a scene to be wind controlled, and wind controlled duration information of the wind control sample users in the scene to be wind controlled.
The scene to be controlled by the wind may be a credit wind control scene, a face recognition wind control scene, or other wind control scenes not mentioned.
Traditional credit wind control relies mainly on manual review and approval, which generally takes several weeks before a loan is released, so the efficiency is low and the process is cumbersome.
Taking the car financing lease business in the credit wind control scene as an example, as the target customer group for leasing moves down-market, the leasing company may face a higher overdue risk, so it needs to accurately evaluate customer groups that traditional risk indicators do not adequately cover. In the car financing lease business, the ownership and the right of use of the car are separated, so the financing lease company needs to judge in advance the overdue rate of each term after the loan is released to the customer, and to evaluate the expected risk loss in combination with the remaining accounts receivable and the residual value of the car. On this basis, the efficiency of risk control and approval is greatly improved, and customer acquisition becomes more precise.
Hereinafter, the credit wind control scene is mainly taken as an example. In a credit wind control scene, a wind control sample user may be a sample lending user who has successfully applied for a loan. In specific implementation, before step S102, the lending duration of each candidate lending user may be obtained, and the candidate lending users are screened according to their lending durations and a preset duration threshold to obtain the sample lending users.
The preset duration threshold may be one month or more than one month, may be a preset number of days, or may be a loan term, which is not particularly limited in the present invention.
Taking a preset duration threshold of one month as an example, the lending durations of a plurality of candidate lending users are obtained; for example, the lending duration of candidate lending user 1 is 0 months, that of candidate lending user 2 is 3 months, and that of candidate lending user 3 is 1 month. The candidate lending users whose lending duration is greater than or equal to the preset duration threshold are taken as sample lending users; screening these three candidates therefore yields candidate lending user 2 and candidate lending user 3 as sample lending users.
Here, a lending duration of 0 months may mean that the number of days from the loan date of candidate lending user 1 to the sample collection date is zero, that is, candidate lending user 1 successfully applied for the loan on the sample collection date; it may also mean that the number of days from the loan date to the sample collection date is less than one month. The lending durations of the other candidate lending users are defined in the same way and are not described again here.
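The screening in the example above can be written as a short Python sketch. The function name and the dictionary layout are hypothetical; the one-month threshold and the three candidate users come from the example.

```python
def select_sample_users(lending_months, threshold_months=1):
    """Keep candidates whose lending duration reaches the preset threshold.

    lending_months: dict mapping a candidate lending user to the lending
    duration in whole months (0 means less than one month).
    """
    return [user for user, months in lending_months.items()
            if months >= threshold_months]


candidates = {"candidate 1": 0, "candidate 2": 3, "candidate 3": 1}
sample_users = select_sample_users(candidates)  # candidates 2 and 3 remain
```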
In the credit wind control scene, external data and internal data of the wind control sample user may be obtained in advance. The internal data and the external data are defined relative to the lending institution that provides lending services to the wind control sample users.
The internal data may be personal characteristic data filled in and submitted by the wind control sample user when applying for the loan, such as age, gender, occupation, salary, home address and telephone number; it may also be the channel through which the wind control sample user applied for the loan, such as scanning a code with payment software or using a mini program; and it may also be historical credit evaluation information that the wind control sample user has left at the lending institution.
The external data may be feature data of the wind control sample user obtained from a third-party institution different from the lending institution, for example, app installation information and activity information of the wind control sample user on the electronic device, or historical credit evaluation information of the third-party institution on the wind control sample user. The third party institution may be a bank, an institution corresponding to the shopping application, or a credit investigation institution.
The external data may be analyzed and evaluated to obtain the available data sources, for example as follows: the number of variable types of the wind control sample users included in each data source is counted, and the data source whose number of variable types is less than a preset type number threshold is determined as a target data source.
For example, if the a data source includes variable values of X variables of the sample feature user, and X is smaller than the preset category threshold, the a data source is discarded, that is, the a data source includes more categories of variables which are not suitable for modeling.
In specific implementation, the internal data and/or the external data may be processed according to business requirements, for example, the age may be divided into a plurality of groups, respectively 20-25, 25-35, 35-40, etc., according to the business requirements.
And screening the processed data according to the data source of the sample characteristic data and/or according to the variance among the sample characteristic data to obtain the sample characteristic data.
Optionally, obtaining sample feature data of a plurality of wind control sample users includes: acquiring original characteristic data of a plurality of wind control sample users; the original characteristic data comprises variable values of a plurality of characteristic variables to be screened; the characteristic variables to be screened have at least one data source; counting the source number of the data source of each characteristic variable to be screened and the total source number corresponding to a plurality of wind control sample users; and determining the variable value of the characteristic variable to be screened, of which the ratio of the source number to the total source number is greater than a preset proportion threshold value, as the sample characteristic data of the wind control sample user.
The original feature data may be obtained from a plurality of data sources, or may be processed feature data obtained by processing the obtained feature data according to the business requirements. The total number of sources corresponding to the plurality of wind control sample users may be understood as the size of the union of the data sources of the plurality of wind control sample users. For example, if the data source of wind control sample users 1, 2 and 5 is source 1, the data source of wind control sample user 3 is source 2, and the data source of wind control sample user 4 is source 3, then the total number of sources corresponding to the 5 wind control sample users is 3.
The embodiment provides a first screening method for screening original characteristic data to obtain sample characteristic data of a wind control sample user, namely a screening method utilizing the number of data sources of characteristic variables to be screened.
In general, for each data source, the number of types of feature variables to be screened included in the original feature data obtained from that data source is fixed. For individual wind control sample users of the same data source, one or more of these feature variables may be missing; this case is disregarded here. Some feature variables to be screened may have multiple data sources, while others may have only one. For example, the feature variables to be screened corresponding to data source A include age, gender, telephone and home address; those corresponding to data source B include gender, telephone, occupation and credit score; and those corresponding to data source C include gender, age and telephone.
The ratio of the source number of a feature variable to be screened to the total source number is the hit rate of the data sources for that variable, also called its coverage. For example, the original feature data of 10 wind control sample users is obtained through 5 data sources and includes variable values of a plurality of feature variables to be screened. If, for a certain feature variable to be screened, its variable value can be obtained by querying 4 of the 5 data sources, the hit rate of that feature variable is 80%.
When the ratio between the source number of a feature variable to be screened and the total source number is small, the original feature data obtained from most of the data sources does not contain that feature variable, so it is not suitable for modeling. The ratio of the number of data sources of a feature variable to be screened to the number of all data sources thus reflects the coverage of that feature variable across the data sources.
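The coverage-based screening can be sketched in Python using the data sources A, B and C from the example above. The function names and the proportion threshold of 0.5 are assumptions for illustration.

```python
def source_coverage(sources):
    """Ratio of the source number of each variable to the total source number.

    sources: dict mapping a data source name to the set of feature
    variables that can be obtained from it.
    """
    total_sources = len(sources)
    source_counts = {}
    for variables in sources.values():
        for variable in variables:
            source_counts[variable] = source_counts.get(variable, 0) + 1
    return {v: n / total_sources for v, n in source_counts.items()}


def screen_by_coverage(sources, ratio_threshold):
    """Keep variables whose coverage is greater than the preset proportion threshold."""
    coverage = source_coverage(sources)
    return sorted(v for v, ratio in coverage.items() if ratio > ratio_threshold)


sources = {
    "A": {"age", "gender", "telephone", "home address"},
    "B": {"gender", "telephone", "occupation", "credit score"},
    "C": {"gender", "age", "telephone"},
}
kept = screen_by_coverage(sources, 0.5)  # age, gender and telephone remain
```

With a threshold of 0.5, home address, occupation and credit score are dropped because each appears in only one of the three sources.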
In another embodiment, a second screening method for screening the original feature data to obtain the sample feature data of the wind control sample users is provided, namely a screening method using the variance of the variable values of the feature variables to be screened. In this embodiment, the variance of the variable values of the same feature variable to be screened across different wind control sample users can be calculated, and the sample feature data of the wind control sample users can be obtained by screening according to the variance.
In probability theory and statistics, variance measures the degree of dispersion of a random variable or a set of data. In probability theory, the variance measures the degree of deviation between a random variable and its mathematical expectation. In statistics, the variance is the mean of the squared differences between each sample value and the mean of all sample values. A small variance indicates that the variable values of the feature variable to be screened are concentrated, and a large variance indicates that they are dispersed. For example, the age of the wind control sample user corresponding to data source A is 25, the age of the wind control sample user corresponding to data source B is 25, and the age of the wind control sample user corresponding to data source C is 60. It is easy to see that if the ages of the obtained wind control sample users were all around 25, this sample feature data would not help modeling, because it could not reflect the influence of age on the wind control evaluation label.
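A minimal Python sketch of the variance-based screening follows. It uses the population variance (the mean of the squared differences from the mean, as described above); the function name and the variance threshold are assumptions, since no threshold is specified.

```python
from statistics import pvariance


def screen_by_variance(values_by_variable, variance_threshold):
    """Keep feature variables whose values are sufficiently dispersed.

    values_by_variable: dict mapping a feature variable to the list of its
    variable values across the wind control sample users.
    """
    return [name for name, values in values_by_variable.items()
            if pvariance(values) > variance_threshold]


values = {
    "age": [25, 25, 60],            # dispersed, as in the example above
    "age_all_same": [25, 25, 25],   # hypothetical variable with no spread
}
kept = screen_by_variance(values, 1.0)  # only "age" remains
```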
In another embodiment, the original feature data may be further filtered according to the variance and the number of data sources, and the specific implementation is similar to the two aforementioned filtering methods, which is not described herein again.
Taking the credit wind control scene as an example, the wind control performance data of a wind control sample user in the scene to be wind controlled may be the repayment information for each month after the loan release date of the wind control sample user; for example, user 1 repays on time in the first month, repays 3 days late in the second month, and makes no repayment from the third month onward.
Taking the credit wind control scene as an example, the wind controlled duration information of a wind control sample user in the scene to be wind controlled may be the time from the release date to the sample collection date of the wind control sample user. For example, if user 1 successfully applies for a loan in month X, the release date being the day on which the loan application succeeds, and the lending institution obtains the sample characteristic data of user 1 in month X + Y, then the wind controlled duration information of user 1 is Y months. The number of months may also be replaced by the number of lending terms.
It should be noted that the duration information may also be used as a duration label of the wind control sample user.
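As a hedged illustration, the wind controlled duration in whole months can be derived from the release date and the sample collection date as follows. The rounding rule (a partial month counts as zero) and the function name are assumptions of this sketch.

```python
from datetime import date


def controlled_months(release_date, collection_date):
    """Whole months elapsed from the release date to the sample collection date."""
    months = ((collection_date.year - release_date.year) * 12
              + (collection_date.month - release_date.month))
    if collection_date.day < release_date.day:
        months -= 1  # the last month is not yet complete
    return max(months, 0)


# A loan released on 15 January and sampled on 15 April has run for 3 months.
duration_label = controlled_months(date(2021, 1, 15), date(2021, 4, 15))
```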
In another embodiment, wind controlled times information of the wind control sample user in the scene to be wind controlled can also be obtained. For example, in a face recognition wind control scene, the total number of times that user 1 performs face recognition, from the time the face recognition unlock payment function is enabled to the time the sample is collected, is obtained. The wind controlled times information can also be used as a wind control times label of the wind control sample user. The wind control times label functions similarly to the wind control duration label; the two can replace each other or exist at the same time, which is not described again here.
Step S104, determining a wind control evaluation label of the wind control sample user according to the wind control performance data.
The wind control evaluation label can be used to indicate whether the wind control sample user is at risk. In a credit wind control scene, the wind control evaluation label may be an overdue severity label that comprises a first label and a second label: the first label indicates that the wind control sample user is seriously overdue, and the second label indicates that the wind control sample user is not seriously overdue.
Optionally, determining a wind control evaluation tag of a wind control sample user according to the wind control performance data includes: extracting risk behavior data of a wind control sample user from the wind control performance data; the risk behavior data comprises risk behavior duration and/or risk behavior times; performing rolling rate analysis on the risk behavior data to obtain a risk behavior data threshold; and determining a wind control evaluation label of the wind control sample user according to the risk behavior data and the risk behavior data threshold value of the wind control sample user.
Taking a credit wind control scene as an example, the risk behavior duration may be the overdue duration of a repayment, that is, how long the user fails to repay on time. Taking a face recognition wind control scene as an example, the risk behavior times may be the number of face recognition failures.
Rolling rate analysis (also called roll rate analysis) examines how the worst state in a period before an observation point (the observation period) evolves into the worst state in a period after that observation point (the performance period). Rolling rate analysis is performed on the risk behavior duration and/or the risk behavior times to obtain a risk behavior data threshold; wind control sample users whose risk behavior data is greater than the threshold are given the first label, and wind control sample users whose risk behavior data is less than or equal to the threshold are given the second label.
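The rolling rate analysis and labeling described above can be sketched as follows. This is an illustration under stated assumptions, not the method fixed by the specification: overdue states are encoded as integers (0 for not overdue, 1, 2, 3 for increasingly long overdue), the threshold rule (the first state whose share of users that do not improve reaches `min_not_cured`) is a hypothetical choice, and the labels 1 and 0 stand in for the first and second labels.

```python
from collections import Counter, defaultdict


def roll_rate_matrix(pairs):
    """pairs: (worst state in observation period, worst state in performance period).

    Returns {from_state: {to_state: share of users making that transition}}.
    """
    counts = defaultdict(Counter)
    for before, after in pairs:
        counts[before][after] += 1
    return {
        state: {to: n / sum(row.values()) for to, n in row.items()}
        for state, row in counts.items()
    }


def risk_threshold(matrix, min_not_cured=0.8):
    """Largest state still treated as not serious (assumed selection rule)."""
    for state in sorted(matrix):
        not_cured = sum(p for to, p in matrix[state].items() if to >= state)
        if state > 0 and not_cured >= min_not_cured:
            return state - 1
    return max(matrix)  # no state qualifies, so no user exceeds the threshold


def evaluation_label(risk_value, threshold):
    """1 = first label (serious), 0 = second label (not serious)."""
    return 1 if risk_value > threshold else 0


# Hypothetical sample: most state-1 users recover, most state-2 users do not.
pairs = (
    [(0, 0)] * 90 + [(0, 1)] * 10
    + [(1, 0)] * 70 + [(1, 1)] * 20 + [(1, 2)] * 10
    + [(2, 1)] * 15 + [(2, 2)] * 25 + [(2, 3)] * 60
)
matrix = roll_rate_matrix(pairs)
threshold = risk_threshold(matrix)
```

With this sample the threshold falls at state 1, so users whose risk behavior data exceeds state 1 receive the first label.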
In one embodiment, for each wind control sample user, the wind control sample user is bound with a corresponding wind control duration label and a corresponding wind control evaluation label.
Step S106, processing the sample characteristic data of the wind control sample user according to the wind control evaluation label of the wind control sample user to obtain the target characteristic data of the wind control sample user.
According to the wind control evaluation label of the wind control sample user, chi-square binning, screening, and encoding are performed on the sample characteristic data of the wind control sample user to obtain the target characteristic data of the wind control sample user.
When the variable values of the same sample characteristic variable are close for several wind control sample users, the differences between those values have very little influence on the wind control model. The target wind control sample users whose variable values belong to a given variable value range can therefore be determined, and the variable values of all of those users converted into the coding value corresponding to that range. Obtaining the target characteristic data by chi-square binning, screening, and encoding of the sample characteristic data, and modeling with the target characteristic data, simplifies the wind control model and reduces the amount of calculation.
Optionally, the sample characteristic data includes variable values of a plurality of sample characteristic variables; processing the sample characteristic data of the wind control sample user according to the wind control evaluation label of the wind control sample user to obtain the target characteristic data of the wind control sample user includes: for any sample characteristic variable of the wind control sample users, performing chi-square binning on the variable values of the sample characteristic variable to obtain a plurality of bins of the sample characteristic variable; screening the plurality of sample characteristic variables according to the bins of each sample characteristic variable to obtain target variables; for any bin of any target variable, determining the coding value corresponding to that bin according to the wind control evaluation labels of the wind control sample users; and determining the coding value corresponding to the bin as the target characteristic data of the wind control sample users whose variable value of the target variable falls into the bin.
The method is explained below for a credit wind control scene, taking age and gender as the sample characteristic variables: chi-square binning is performed on the ages of the wind control sample users to obtain a plurality of age bins, such as 20-24, 24-30, 30-38, and 38-52, and a plurality of gender bins, such as male and female; the IV (Information Value) of age and the IV of gender are calculated from their respective bins, and the target variables are obtained by screening. The WOE (Weight of Evidence) coding value of each bin is then calculated according to the wind control evaluation label.
Determining the coding value corresponding to a bin as the target characteristic data of the wind control sample users whose variable value of the target variable falls into that bin can be understood as follows: if user 1 is 25 years old and user 2 is 27 years old, both fall into the 24-30 bin, so the coding value of the 24-30 bin is determined as the target characteristic data of both user 1 and user 2.
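The specification does not fix a particular chi-square binning algorithm. One common choice is ChiMerge, sketched below as an assumption: start with one bin per distinct value, repeatedly merge the adjacent pair of bins with the smallest chi-square statistic, and stop when the desired number of bins remains. The function names, the `max_bins` parameter, and the sample data are hypothetical; labels use 1 for the first label and 0 for the second label.

```python
def chi2_adjacent(a, b):
    """Chi-square statistic for two adjacent bins given as [first, second] label counts."""
    total = sum(a) + sum(b)
    chi2 = 0.0
    for row in (a, b):
        for col in (0, 1):
            expected = sum(row) * (a[col] + b[col]) / total
            if expected > 0:
                chi2 += (row[col] - expected) ** 2 / expected
    return chi2


def chi_merge(values, labels, max_bins=4):
    """Return (lower edges of the bins, [first, second] label counts per bin)."""
    counts = {}
    for value, label in zip(values, labels):
        counts.setdefault(value, [0, 0])[0 if label == 1 else 1] += 1
    edges = sorted(counts)
    bins = [counts[edge] for edge in edges]
    while len(bins) > max_bins:
        scores = [chi2_adjacent(bins[i], bins[i + 1]) for i in range(len(bins) - 1)]
        i = scores.index(min(scores))  # most similar adjacent pair
        bins[i] = [bins[i][0] + bins[i + 1][0], bins[i][1] + bins[i + 1][1]]
        del bins[i + 1], edges[i + 1]
    return edges, bins


# Hypothetical data: younger users all carry the second label, older users the first.
ages = [20, 20, 21, 21, 22, 22, 40, 40, 41, 41, 42, 42]
labels = [0] * 6 + [1] * 6
edges, bins = chi_merge(ages, labels, max_bins=2)
```

Each returned edge is the lower bound of a bin, so consecutive edges describe half-open intervals in the same way as the 20-24, 24-30 example above.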
Optionally, the wind control evaluation label includes a first label and a second label; for any bin of any target variable, determining the coding value corresponding to that bin according to the wind control evaluation labels of the wind control sample users includes: for any bin of any target variable, determining first users whose variable value of the target variable falls into the bin; counting the first users having the first label to obtain a first number, and counting the first users having the second label to obtain a second number; counting the wind control sample users having the first label to obtain a third number, and counting the wind control sample users having the second label to obtain a fourth number; and calculating the coding value corresponding to the bin of the target variable according to the first number, the second number, the third number, and the fourth number.
For any bin of any target variable, the first users whose variable value of the target variable falls into the bin are determined. For example, with the age bins 20-24, 24-30, 30-38, and 38-52, the first users for the 20-24 bin are the users aged between 20 and 24.
The first users having the first label are counted to obtain the first number a1, and the first users having the second label are counted to obtain the second number b1; the wind control sample users having the first label are counted to obtain the third number A, and the wind control sample users having the second label are counted to obtain the fourth number B. From the first number a1, the second number b1, the third number A, and the fourth number B, the WOE coding value corresponding to the bin of the target variable is calculated as $\mathrm{WOE} = \ln\left(\frac{a_1 B}{A\, b_1}\right)$; see Table 1.
Table 1 shows a coding value calculation method provided in one or more embodiments of the present specification.
TABLE 1
Figure BDA0003082381750000111
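The coding value ln(a1·B / (A·b1)) given in the text can be computed as in the following sketch. The helper names and sample data are hypothetical; labels use 1 for the first label and 0 for the second label, and the sketch assumes every bin contains at least one user of each label, because the logarithm is otherwise undefined.

```python
import math
from collections import Counter


def woe(a1, b1, A, B):
    """WOE coding value ln(a1 * B / (A * b1)) of one bin."""
    if min(a1, b1, A, B) <= 0:
        raise ValueError("all four counts must be positive")
    return math.log((a1 * B) / (A * b1))


def woe_by_bin(user_bins, user_labels):
    """Map each bin to its WOE coding value from per-user bins and labels."""
    first = Counter(b for b, y in zip(user_bins, user_labels) if y == 1)
    second = Counter(b for b, y in zip(user_bins, user_labels) if y == 0)
    A, B = sum(first.values()), sum(second.values())
    return {b: woe(first[b], second[b], A, B) for b in set(user_bins)}


# Hypothetical sample of ten users in two age bins.
user_bins = ["20-24"] * 4 + ["24-30"] * 6
user_labels = [1, 1, 1, 0] + [1, 0, 0, 0, 0, 0]
codes = woe_by_bin(user_bins, user_labels)
```

A positive coding value means the bin holds a larger share of first-label users than of second-label users; a bin whose label mix matches the whole sample has a coding value of zero.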
Optionally, the wind control evaluation label includes a first label and a second label; screening the plurality of sample characteristic variables according to the bins of each sample characteristic variable to obtain target variables includes: for any bin of any sample characteristic variable, determining second users whose variable value of the sample characteristic variable falls into the bin; counting the second users having the first label to obtain a fifth number, and counting the second users having the second label to obtain a sixth number; counting the wind control sample users having the first label to obtain a seventh number, and counting the wind control sample users having the second label to obtain an eighth number; calculating a first information quantity of the sample characteristic variable according to the fifth number of each bin of the sample characteristic variable, the sixth number of each bin of the sample characteristic variable, the seventh number, and the eighth number; and screening the sample characteristic variables according to the first information quantity of each sample characteristic variable to obtain the target variables.
For any bin of any sample characteristic variable, the second users whose variable value of the sample characteristic variable falls into the bin are determined. For example, with the age bins 20-24, 24-30, 30-38, and 38-52, the second users for the 20-24 bin are the users aged between 20 and 24. The second users may correspond to the same bin as the first users or to a different bin.
The fifth number corresponds to the first number, the sixth number corresponds to the second number, the seventh number corresponds to the third number, and the eighth number corresponds to the fourth number.
The second users having the first label are counted to obtain the fifth number a1, and the second users having the second label are counted to obtain the sixth number b1; the wind control sample users having the first label are counted to obtain the seventh number A, and the wind control sample users having the second label are counted to obtain the eighth number B. The first information quantity (IV value) of the sample characteristic variable is calculated from the fifth numbers a1, a2, a3 of the bins of the sample characteristic variable, the sixth numbers b1, b2, b3 of the bins of the sample characteristic variable, the seventh number A, and the eighth number B; see Table 2.
Table 2 shows a calculation manner of the first information amount IV value provided in one or more embodiments of the present specification.
Figure BDA0003082381750000121
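Table 2 is not reproduced in this text, so the sketch below assumes the conventional Information Value definition, the sum over bins of (a_i/A - b_i/B)·ln((a_i/A)/(b_i/B)), which is consistent with the WOE coding value above. The function names, the threshold of 0.02, and the sample counts are hypothetical.

```python
import math


def information_value(bin_counts, A, B):
    """bin_counts: list of (a_i, b_i) first-label and second-label counts per bin."""
    iv = 0.0
    for a_i, b_i in bin_counts:
        share_first, share_second = a_i / A, b_i / B
        iv += (share_first - share_second) * math.log(share_first / share_second)
    return iv


def select_by_iv(iv_by_variable, iv_threshold):
    """Keep the variables whose IV is greater than the threshold."""
    return sorted(name for name, iv in iv_by_variable.items() if iv > iv_threshold)


# Hypothetical counts for a variable with two bins, A = 100 and B = 400.
iv_age = information_value([(30, 70), (70, 330)], A=100, B=400)
targets = select_by_iv({"age": iv_age, "gender": 0.01}, iv_threshold=0.02)
```

A variable whose bins all have the same label mix as the whole sample has an IV of zero and is removed by any positive threshold.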
Optionally, the screening of the characteristic variables of each sample according to the first information amount of the characteristic variable of each sample to obtain the target variable includes: among the plurality of types of sample characteristic variables, a sample characteristic variable having a first information amount greater than a first information amount threshold value is taken as a target variable.
The first information quantity (IV value) is compared with a preset first information quantity threshold; if the IV value is greater than the threshold, the sample characteristic variable corresponding to that IV value is taken as a target variable.
Optionally, the screening of the characteristic variables of each sample according to the first information amount of the characteristic variable of each sample to obtain the target variable includes: taking the sample characteristic variable with the first information quantity larger than a first information quantity threshold value as a candidate variable in the multiple sample characteristic variables; calculating to obtain a correlation coefficient between any two candidate variables according to the variable value of each candidate variable; and deleting redundant variables from the candidate variables according to the correlation coefficient and a preset coefficient threshold value to obtain target variables.
The first information quantity (IV value) is compared with a preset first information quantity threshold; if the IV value is greater than the threshold, the sample characteristic variable corresponding to that IV value is taken as a candidate variable. From the variable values of the candidate variables, the Pearson correlation coefficient, the covariance, or another measure of the correlation between two variables that is not listed here can be calculated for any two candidate variables. Taking covariance as an example, the redundant variables among the candidate variables are determined according to the covariance and a preset covariance threshold, and the redundant variables are deleted from the candidate variables to obtain the target variables.
In this embodiment, the correlation coefficient reflects the correlation between two variables; when the correlation between two variables is large, only one of them needs to be retained. Deleting redundant variables reduces the sample characteristic data required by the wind control model, which further reduces the complexity of the wind control model and the workload.
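The redundancy removal can be sketched with the Pearson correlation coefficient as follows. The text does not say which variable of a highly correlated pair is retained; keeping the one with the larger IV is an assumption made here for illustration, as are the function names, the 0.8 threshold, and the sample values.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)  # undefined for constant variables


def drop_redundant(candidates, iv, coefficient_threshold=0.8):
    """Visit candidates from highest to lowest IV; keep one only if it is not
    strongly correlated with any variable already kept."""
    kept = []
    for name in sorted(candidates, key=lambda v: iv[v], reverse=True):
        if all(
            abs(pearson(candidates[name], candidates[other])) <= coefficient_threshold
            for other in kept
        ):
            kept.append(name)
    return kept


# Hypothetical candidate variables: income and salary are almost collinear.
candidates = {
    "income": [1, 2, 3, 4, 5],
    "salary": [2, 4, 6, 8, 10.5],
    "age": [30, 25, 41, 22, 35],
}
iv = {"income": 0.30, "salary": 0.20, "age": 0.10}
target_variables = drop_redundant(candidates, iv)
```

Swapping the order of the two screening steps, as in the alternative embodiment, only changes which list is passed to which function.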
In another embodiment, the sample feature variables may be first screened according to the correlation coefficient to obtain a second candidate variable, and then the second candidate variable is screened according to the IV value to obtain a target variable, that is, the execution sequence of the two screening methods is exchanged.
Step S108, establishing a wind control model for the scene to be wind controlled according to the target characteristic data of the wind control sample user, the wind-controlled duration information, and the wind control evaluation label.
The wind control model may be a COX model.
Based on the target characteristic data of the wind control sample users, the wind control duration labels obtained from their wind-controlled duration information, and their wind control evaluation labels, training set samples with their corresponding label combinations and test set samples with their corresponding label combinations are generated. A COX model is established, then trained and tested until the test requirements are met. In a credit wind control scene, after different duration information is supplied to the COX model, the wind control evaluation label corresponding to each duration can be predicted.
The COX model is introduced with a modeling label [T, label] that has two dimensions: the wind control duration label and the wind control evaluation label. Compared with the traditional LR model (whose label is only label), this has two main advantages:
(a1) When screening samples, the LR model requires vintage analysis of post-loan repayment data to determine the risk maturity period, and recently released samples whose maximum performance period is shorter than the maturity period cannot be used. Because the COX model label includes the wind control duration label, a record can serve as a modeling sample as long as it has more than one period of post-loan performance data, so the modeling label is richer.
(a2) When the COX model is used for prediction, the prediction result is more comprehensive. The overdue probabilities corresponding to different durations can be predicted from different input wind control duration labels, and the expected risk loss can be evaluated by combining the changes in remaining accounts receivable and automobile residual value over those durations, so that risk provisions can be prepared in advance.
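To illustrate advantage (a2), the sketch below assumes that the COX model is a Cox proportional hazards model, in which the probability that the event occurs within duration T is 1 - exp(-H0(T)·exp(β·x)), where H0 is the baseline cumulative hazard and β the fitted coefficients. The coefficient values, the baseline cumulative hazard table, and the feature name are hypothetical stand-ins for the output of a fitted model.

```python
import math


def overdue_probability(features, coefficients, baseline_cum_hazard, duration):
    """Probability of the event within `duration` periods under a Cox
    proportional hazards model with the given (already fitted) parameters."""
    linear = sum(coefficients[name] * value for name, value in features.items())
    survival = math.exp(-baseline_cum_hazard[duration] * math.exp(linear))
    return 1.0 - survival


# Hypothetical fitted parameters and one user's WOE-encoded features.
coefficients = {"age_woe": 0.8}
baseline_cum_hazard = {6: 0.02, 12: 0.05, 24: 0.11}  # keyed by number of periods
user = {"age_woe": 0.5}

probabilities = {
    t: overdue_probability(user, coefficients, baseline_cum_hazard, t)
    for t in baseline_cum_hazard
}
```

One set of features therefore yields a separate probability for each duration label, which is what allows the result to be combined with duration-dependent receivables and residual values.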
According to the wind control model establishing method in the embodiment shown in fig. 1, sample characteristic data of a plurality of wind control sample users, wind control performance data of the wind control sample users in a scene to be wind controlled, and wind-controlled duration information of the wind control sample users in that scene are obtained; a wind control evaluation label of each wind control sample user is determined according to the wind control performance data; the sample characteristic data of the wind control sample user is processed according to the wind control evaluation label to obtain target characteristic data of the wind control sample user; and a wind control model is established for the scene to be wind controlled according to the target characteristic data, the wind-controlled duration information, and the wind control evaluation label. By introducing the wind-controlled duration information, the time restriction on selecting wind control sample users is relaxed, the established wind control model can produce a risk control result corresponding to given wind control duration information, and the accuracy of risk control is improved.
The wind control model establishing method provided in this embodiment is further described below by taking its application in a credit wind control scene as an example. Referring to fig. 3, the wind control model establishing method applied in the credit wind control scene specifically includes steps S302 to S318.
Step S302, organizing the internal and external characteristic data sources.
Step S304, variable processing and preliminary screening.
Step S306, organizing the post-loan data.
Step S308, obtaining the wind control evaluation label by rolling rate analysis.
Step S310, determining the wind control duration label.
Step S312, matching the features with the labels.
Step S314, variable binning and encoding.
Step S316, information quantity screening and correlation screening.
Step S318, constructing a COX model according to the wind control duration label, the wind control evaluation label, and the features.
The embodiment shown in fig. 3 may implement each step in the foregoing method embodiments, and is not described here again.
An embodiment of the wind control method provided by this specification is as follows:
Referring to fig. 2, the wind control method provided in this embodiment includes steps S202 to S206.
Step S202, obtaining user characteristic data of a user to be wind controlled and to-be-wind-controlled duration information of the user in a scene to be wind controlled.
The scene to be wind controlled may be a credit wind control scene, a face recognition wind control scene, or another wind control scene not mentioned here.
Traditional credit wind control relies mainly on manual review and approval, which generally takes several weeks before funds are released; it is inefficient and the process is cumbersome.
Taking an automobile financing lease business as an example of a credit wind control scene: as the target customer group for leasing moves down-market, a leasing company may face a high overdue risk, so customer groups that traditional risk indicators do not adequately cover need to be evaluated accurately. In the automobile financing lease business, ownership of the automobile is separated from the right to use it, and the financing lease company needs to predict in advance the overdue rate for each period after the customer's loan is released and to evaluate the expected risk loss by combining the remaining accounts receivable with the residual value of the automobile. On this basis, risk control and approval efficiency are greatly improved and customer acquisition becomes more precise.
Hereinafter, the credit wind control scene is mainly used as the example. In a credit wind control scene, the user to be wind controlled may be a borrowing user who has successfully applied for a loan.
The user characteristic data may be personal characteristic data filled in and submitted by the user to be wind controlled when applying to a lending institution for a loan, such as age, gender, occupation, salary, home address, and telephone number; it may be the channel through which the user applies for the loan, such as scanning a code with payment software or using a mini program; or it may be historical credit evaluation information that the user has left at the lending institution.
The to-be-wind-controlled duration information of the user to be wind controlled in the scene to be wind controlled may be obtained as duration information input by the user.
It should be noted that the to-be-wind-controlled duration information may also be used as the to-be-wind-controlled duration label of the user to be wind controlled.
In another embodiment, information on the number of times the user to be wind controlled is to be wind controlled in the scene to be wind controlled may also be obtained. This information may also be used as a to-be-wind-controlled times label of the user. The times label functions similarly to the duration label; the two may replace each other or exist at the same time, and details are not repeated here.
Step S204, processing the user characteristic data of the user to be wind controlled to obtain target characteristic data of the user to be wind controlled.
In a specific implementation, processing the user characteristic data of the user to be wind controlled to obtain the target characteristic data of the user to be wind controlled includes: acquiring the target variables corresponding to the wind control model in the foregoing method embodiment, the bins of each target variable, and the coding value corresponding to each bin; and processing the user characteristic data of the user to be wind controlled according to the target variables, the bins of each target variable, and the coding value corresponding to each bin to obtain the target characteristic data of the user to be wind controlled.
Each wind control model has its corresponding target variables, bins of each target variable, and coding value for each bin. An input data processing rule for the wind control model can be generated from the target variables, the bins of each target variable, and the coding value of each bin. The input data processing rule is as follows:
The user characteristic data includes variable values of a plurality of user characteristic variables. The user characteristic data is screened and encoded according to the model target variables, the model bins of each model target variable, and the bin coding value of each model bin to obtain the user target characteristic data, which includes: taking each user characteristic variable that is the same as a model target variable as a user target variable; binning each user target variable according to the model bins and bin coding values of the corresponding model target variable to obtain a plurality of variable bins of the user target variable and the variable coding value of each variable bin; for any variable bin of any user target variable, determining the target user to be wind controlled whose variable value of the user target variable falls into the variable bin; and determining the variable coding value of that variable bin as the user target characteristic data of the target user to be wind controlled.
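The input data processing rule can be sketched as a lookup table keyed by model target variable. In this illustration numeric bins are half-open intervals with hypothetical coding values, categorical bins are a mapping from category to coding value, and the names `rules` and `encode_user` are assumptions introduced here.

```python
def encode_user(user_features, rules):
    """Keep only the model target variables and replace each value by the
    coding value of the model bin it falls into."""
    encoded = {}
    for variable, bins in rules.items():
        value = user_features[variable]
        if isinstance(bins, dict):  # categorical variable: category -> coding value
            encoded[variable] = bins[value]
            continue
        for low, high, code in bins:  # numeric variable: [low, high) -> coding value
            if low <= value < high:
                encoded[variable] = code
                break
        else:
            raise ValueError(f"{variable}={value!r} falls outside every model bin")
    return encoded


# Hypothetical rule derived from a trained model; the coding values are made up.
rules = {
    "age": [(20, 24, -0.40), (24, 30, 0.10), (30, 38, 0.30), (38, 52, -0.20)],
    "gender": {"male": 0.05, "female": -0.05},
}
user_1 = encode_user({"age": 25, "gender": "male", "phone": "unused"}, rules)
user_2 = encode_user({"age": 27, "gender": "female"}, rules)
```

User characteristic variables that are not model target variables, such as the telephone number here, are simply ignored, which corresponds to the screening part of the rule.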
Step S206, determining the wind control evaluation label of the user to be wind controlled corresponding to the to-be-wind-controlled duration information according to the user target characteristic data, the to-be-wind-controlled duration information, and the wind control model in the foregoing method embodiment.
The user target characteristic data and the to-be-wind-controlled duration information are input into the wind control model established in the wind control model establishing method embodiment, and the wind control model outputs a wind control evaluation label, which is the wind control evaluation label of the user to be wind controlled corresponding to the to-be-wind-controlled duration information.
In the risk control method provided in one or more embodiments of the present specification, user characteristic data of a user to be wind controlled and to-be-wind-controlled duration information of the user in a scene to be wind controlled are obtained; the user characteristic data is processed to obtain target characteristic data of the user to be wind controlled; and the wind control evaluation label of the user corresponding to the to-be-wind-controlled duration information is determined according to the user target characteristic data, the to-be-wind-controlled duration information, and the wind control model. By introducing duration information, the time restriction on selecting wind control sample users is relaxed, the wind control model established for the scene to be wind controlled can produce a risk control result corresponding to the given duration information, and the accuracy of risk control is improved.
Fig. 4 is a schematic view of a wind control model establishing apparatus according to one or more embodiments of the present disclosure.
The present embodiment provides a wind control model building apparatus, including:
a data obtaining module 402, configured to obtain sample feature data of multiple wind control sample users, wind control performance data of the wind control sample users in a scene to be wind controlled, and wind controlled duration information of the wind control sample users in the scene to be wind controlled;
a label determination module 404, configured to determine a wind control evaluation label of a wind control sample user according to the wind control performance data;
the data processing module 406 is configured to process the sample feature data of the wind control sample user according to the wind control evaluation tag of the wind control sample user to obtain target feature data of the wind control sample user;
the model establishing module 408 is configured to establish a wind control model for the scene to be wind controlled according to the target feature data of the wind control sample user, the wind-controlled duration information, and the wind control evaluation tag.
Optionally, the tag determining module 404 is specifically configured to:
extracting risk behavior data of a wind control sample user from the wind control performance data; the risk behavior data comprises risk behavior duration and/or risk behavior times;
performing rolling rate analysis on the risk behavior data to obtain a risk behavior data threshold;
and determining a wind control evaluation label of the wind control sample user according to the risk behavior data and the risk behavior data threshold value of the wind control sample user.
Optionally, the sample characteristic data includes variable values of a plurality of sample characteristic variables; a data processing module 406 comprising:
the variable binning submodule is configured to, for any sample characteristic variable of the wind control sample users, perform chi-square binning on the variable values of the sample characteristic variable to obtain a plurality of bins of the sample characteristic variable;
the variable screening submodule is configured to screen the plurality of sample characteristic variables according to the bins of each sample characteristic variable to obtain target variables;
the coding value determining submodule is configured to, for any bin of any target variable, determine the coding value corresponding to that bin according to the wind control evaluation labels of the wind control sample users;
and the data determination submodule is configured to determine the coding value corresponding to the bin as the target characteristic data of the wind control sample users whose variable value of the target variable falls into the bin.
Optionally, the wind control evaluation tag includes a first tag and a second tag; an encoding value determination submodule, configured to:
for any bin of any target variable, determining first users whose variable value of the target variable falls into the bin;
counting the first users having the first label to obtain a first number, and counting the first users having the second label to obtain a second number;
counting the wind control sample users having the first label to obtain a third number, and counting the wind control sample users having the second label to obtain a fourth number;
and calculating the coding value corresponding to the bin of the target variable according to the first number, the second number, the third number, and the fourth number.
Optionally, the wind control evaluation tag includes a first tag and a second tag; a variable screening submodule comprising:
the second user determining unit is used for determining a second user of which the variable value of any sample characteristic variable falls into any bin of the sample characteristic variable;
a first number determining unit, configured to count the number of users with the first tag in each second user to obtain a fifth number, and count the number of users with the second tag in each second user to obtain a sixth number;
the second quantity determining unit is used for counting the quantity of users with the first labels in all the wind control sample users to obtain a seventh quantity, and counting the quantity of users with the second labels in all the wind control sample users to obtain an eighth quantity;
a first information quantity determining unit, configured to calculate a first information quantity of the sample feature variable according to a fifth quantity of each bin of the sample feature variable, a sixth quantity, a seventh quantity, and an eighth quantity of each bin of the sample feature variable;
and the target variable screening unit is used for screening the target variables in the sample characteristic variables according to the first information quantity of the sample characteristic variables.
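As an illustration of the first information quantity, the sketch below assumes it is the conventional information value (IV) summed over the bins of one variable. The names `information_value` and `select_by_information`, and the choice to skip bins with an empty cell, are assumptions made for this example.

```python
import math


def information_value(bin_counts, seventh, eighth):
    """Information value of one variable (assumed form of the first information quantity).

    bin_counts -- list of (fifth, sixth) pairs, one pair per bin
    seventh    -- all sample users with the first label
    eighth     -- all sample users with the second label
    """
    iv = 0.0
    for fifth, sixth in bin_counts:
        if fifth == 0 or sixth == 0:
            continue  # simplification: skip bins where the logarithm is undefined
        share_first = fifth / seventh
        share_second = sixth / eighth
        iv += (share_first - share_second) * math.log(share_first / share_second)
    return iv


def select_by_information(iv_by_variable, threshold):
    """Keep the variables whose information quantity exceeds the threshold."""
    return [name for name, iv in iv_by_variable.items() if iv > threshold]
```

A variable whose bins all mirror the overall label mix has an information value of 0 and would be screened out by any positive threshold.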
Optionally, the target variable screening unit is specifically configured to:
among the plurality of types of sample characteristic variables, a sample characteristic variable having a first information amount greater than a first information amount threshold value is taken as a target variable.
Optionally, the target variable screening unit is specifically configured to:
taking the sample characteristic variable with the first information quantity larger than a first information quantity threshold value as a candidate variable in the multiple sample characteristic variables;
calculating to obtain a correlation coefficient between any two candidate variables according to the variable value of each candidate variable;
and deleting redundant variables from the candidate variables according to the correlation coefficient and a preset coefficient threshold value to obtain target variables.
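The redundancy check can be sketched as below. The text does not say which member of a highly correlated pair is deleted; this sketch assumes Pearson correlation and keeps the candidate with the larger first information quantity. The names `pearson` and `drop_redundant` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant variable is treated as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def drop_redundant(values, info, coef_threshold):
    """Greedy filter over candidate variables.

    values -- {variable name: list of variable values, one per sample user}
    info   -- {variable name: first information quantity}
    Candidates are visited from most to least informative; one is kept only
    if its absolute correlation with every kept variable is within the threshold.
    """
    kept = []
    for name in sorted(values, key=lambda v: info[v], reverse=True):
        if all(abs(pearson(values[name], values[k])) <= coef_threshold for k in kept):
            kept.append(name)
    return kept
```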
Optionally, the data obtaining module 402 is specifically configured to:
acquiring original characteristic data of a plurality of wind control sample users; the original characteristic data comprises variable values of a plurality of characteristic variables to be screened; the characteristic variables to be screened have at least one data source;
counting the source number of the data source of each characteristic variable to be screened and the total source number corresponding to a plurality of wind control sample users;
and determining the variable value of the characteristic variable to be screened, of which the ratio of the source number to the total source number is greater than a preset proportion threshold value, as the sample characteristic data of the wind control sample user.
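One possible reading of the source-ratio screening is sketched below: each characteristic variable to be screened is supplied by some subset of the data sources, and a variable is retained only when that subset covers more than a preset share of all sources. The data layout and the name `filter_by_source_ratio` are assumptions for illustration.

```python
def filter_by_source_ratio(variable_sources, all_sources, ratio_threshold):
    """Keep variables whose source count / total source count exceeds the threshold.

    variable_sources -- {variable name: set of data sources that supply it}
    all_sources      -- set of every data source seen across the sample users
    """
    total = len(all_sources)
    if total == 0:
        raise ValueError("at least one data source is required")
    kept = []
    for name, sources in variable_sources.items():
        source_count = len(set(sources) & set(all_sources))
        if source_count / total > ratio_threshold:
            kept.append(name)
    return kept
```

With four sources in total and a threshold of 0.5, a variable supplied by three sources is kept and a variable supplied by one source is dropped.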
The method for establishing the wind control model provided by one or more embodiments of the present specification obtains sample characteristic data of a plurality of wind control sample users, wind control performance data of the wind control sample users in a scene to be wind controlled, and wind controlled duration information of the wind control sample users in that scene; determines a wind control evaluation label for each wind control sample user according to the wind control performance data; processes the sample characteristic data of each wind control sample user according to that user's wind control evaluation label to obtain target characteristic data of the wind control sample user; and establishes a wind control model for the scene to be wind controlled according to the target characteristic data, the wind controlled duration information, and the wind control evaluation labels of the wind control sample users. Introducing the wind controlled duration information relaxes the time restriction on determining wind control sample users. A wind control model established in this way for the scene to be wind controlled can produce a risk control result that corresponds to given duration information, which improves the accuracy of risk control.
Fig. 5 is a schematic diagram of a risk control device according to one or more embodiments of the present disclosure.
The present embodiment provides a risk control device, including:
a wind control data obtaining module 502, configured to obtain user characteristic data of a user to be wind controlled and information of a duration to be wind controlled of the user to be wind controlled in a scene to be wind controlled;
the user data processing module 504 is configured to process user feature data of a user to be subjected to wind control to obtain target feature data of the user to be subjected to wind control;
and an evaluation label determining module 506, configured to determine, according to the user target feature data, the to-be-wind-controlled duration information, and the wind control model, a wind control evaluation label of the to-be-wind-controlled user, which corresponds to the to-be-wind-controlled duration information.
In the risk control method provided in one or more embodiments of the present specification, user characteristic data of a user to be wind controlled and to-be-wind-controlled duration information of that user in a scene to be wind controlled are acquired; the user characteristic data is processed to obtain target characteristic data of the user to be wind controlled; and a wind control evaluation label of the user to be wind controlled, corresponding to the to-be-wind-controlled duration information, is determined according to the user target characteristic data, the to-be-wind-controlled duration information, and the wind control model. Introducing duration information relaxes the time restriction on determining wind control sample users. Because the wind control model is established for the scene to be wind controlled, a risk control result corresponding to the to-be-wind-controlled duration information can be obtained, which improves the accuracy of risk control.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
In the 30s of the 20th century, an improvement to a technology could be clearly distinguished as either a hardware improvement (e.g., an improvement to a circuit structure such as a diode, a transistor, or a switch) or a software improvement (an improvement to a method flow). However, as technology has advanced, many of today's method flow improvements can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming an improved method flow into a hardware circuit. Thus, it cannot be said that an improvement to a method flow cannot be realized by hardware physical modules. For example, a Programmable Logic Device (PLD), such as a Field Programmable Gate Array (FPGA), is an integrated circuit whose logic functions are determined by the user programming the device. A designer "integrates" a digital system onto a single PLD by programming it, without asking a chip manufacturer to design and fabricate an application-specific integrated circuit chip. Furthermore, instead of manually making integrated circuit chips, such programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compilers used in program development; the source code to be compiled must be written in a specific programming language called a Hardware Description Language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used.
It will also be apparent to those skilled in the art that a hardware circuit implementing a logical method flow can be readily obtained simply by lightly programming the method flow in one of the hardware description languages above and programming it into an integrated circuit.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of a memory. Those skilled in the art will also appreciate that, in addition to implementing the controller as pure computer-readable program code, the same functionality can be achieved by logically programming the method steps so that the controller takes the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may therefore be regarded as a hardware component, and the means included in it for performing the various functions may be regarded as structures within the hardware component. The means for performing the functions may even be regarded both as software modules implementing the method and as structures within the hardware component.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being divided into various units by function, and are described separately. Of course, the functions of the units may be implemented in the same software and/or hardware or in multiple software and/or hardware when implementing the embodiments of the present description.
One skilled in the art will recognize that one or more embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, one or more embodiments of the present description may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the description may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The description has been presented with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the description. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include both permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
One or more embodiments of the present description may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. One or more embodiments of the specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The embodiments in the present specification are described in a progressive manner; the same and similar parts among the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, since the system embodiment is substantially similar to the method embodiment, it is described relatively briefly, and for relevant points reference may be made to the corresponding parts of the method embodiment.
The above description is only an example of this document and is not intended to limit this document. Various modifications and changes may occur to those skilled in the art from this document. Any modifications, equivalents, improvements, etc. which come within the spirit and principle of the disclosure are intended to be included within the scope of the claims of this document.

Claims (10)

1. A method for establishing a wind control model is characterized by comprising the following steps:
acquiring sample characteristic data of a plurality of wind control sample users, wind control performance data of the wind control sample users in a scene to be wind controlled and wind controlled duration information of the wind control sample users in the scene to be wind controlled;
determining a wind control evaluation label of the wind control sample user according to the wind control performance data;
processing the sample characteristic data of the wind control sample user according to the wind control evaluation label of the wind control sample user to obtain target characteristic data of the wind control sample user;
and establishing a wind control model for the scene to be wind controlled according to the target characteristic data of the wind control sample user, the wind controlled duration information and the wind control evaluation label.
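Claim 1 does not name a model family. Purely as an illustration, the sketch below assumes a logistic regression in which the wind controlled duration is supplied as an extra input next to the coded target characteristic data, so that a score can later be requested for a chosen duration. The functions `predict`, `log_loss`, and `fit`, the learning rate, and the epoch count are all hypothetical choices.

```python
import math


def predict(weights, bias, row):
    """Probability of the high-risk label for one feature row."""
    z = bias + sum(w * x for w, x in zip(weights, row))
    return 1.0 / (1.0 + math.exp(-z))


def log_loss(weights, bias, rows, labels):
    """Mean negative log-likelihood over the sample users."""
    total = 0.0
    for row, y in zip(rows, labels):
        p = predict(weights, bias, row)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(rows)


def fit(rows, labels, lr=0.1, epochs=2000):
    """Batch gradient descent; each row is (coded features..., duration)."""
    n, dim = len(rows), len(rows[0])
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for row, y in zip(rows, labels):
            err = predict(weights, bias, row) - y
            for j in range(dim):
                grad_w[j] += err * row[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias
```

Under this assumption, scoring a user at inference time amounts to calling `predict` with the user's coded features and the duration of interest appended as the last element of the row.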
2. The method of claim 1, wherein determining a wind control rating label for the wind control sample user based on the wind control performance data comprises:
extracting risk behavior data of the wind control sample user from the wind control performance data; the risk behavior data comprises risk behavior duration and/or risk behavior times;
performing rolling rate analysis on the risk behavior data to obtain a risk behavior data threshold;
and determining a wind control evaluation label of the wind control sample user according to the risk behavior data of the wind control sample user and the risk behavior data threshold value.
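A minimal sketch of the rolling rate (roll rate) analysis in claim 2 follows. It assumes the risk behavior data is an integer count, such as the number of overdue periods, observed in two consecutive windows, and that the threshold is the smallest count whose share of users rolling to a worse state reaches a chosen cutoff. The function names, the cutoff, and the 1/0 label encoding are assumptions.

```python
def roll_rates(transitions):
    """Share of users in each starting bucket who move to a worse bucket.

    transitions -- list of (bucket_before, bucket_after) integer pairs
    """
    totals, worse = {}, {}
    for before, after in transitions:
        totals[before] = totals.get(before, 0) + 1
        if after > before:
            worse[before] = worse.get(before, 0) + 1
    return {bucket: worse.get(bucket, 0) / totals[bucket] for bucket in sorted(totals)}


def risk_threshold(rates, min_roll_rate):
    """Smallest bucket whose roll-forward rate reaches the cutoff, else None."""
    for bucket in sorted(rates):
        if rates[bucket] >= min_roll_rate:
            return bucket
    return None


def evaluation_label(risk_count, threshold):
    """1 marks a user at or above the risk behavior threshold, 0 otherwise."""
    return 1 if risk_count >= threshold else 0
```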
3. The method of claim 1, wherein the sample characteristic data comprises variable values of a plurality of sample characteristic variables; processing the sample characteristic data of the wind control sample user according to the wind control evaluation label of the wind control sample user to obtain the target characteristic data of the wind control sample user, and the method comprises the following steps:
performing chi-square binning on the variable value of the sample characteristic variable aiming at any sample characteristic variable of the wind control sample user to obtain a plurality of bins of the sample characteristic variable;
screening the multiple sample characteristic variables according to the sub-boxes of the sample characteristic variables to obtain target variables;
aiming at any one of the sub-boxes of any target variable, determining a coding value corresponding to the sub-box of the target variable according to a wind control evaluation label of the wind control sample user;
and determining the coding value corresponding to the sub-box as the target characteristic data of the wind control sample user of which the variable value of the target variable falls into the sub-box.
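The chi-square binning step in claim 3 can be illustrated with a ChiMerge-style procedure: start with one bin per distinct value and repeatedly merge the adjacent pair of bins whose label distributions are most alike. Stopping at a fixed bin count (`max_bins`) is an assumption of this sketch; a chi-square significance threshold is another common stopping rule. The function names are hypothetical.

```python
def chi2_pair(a, b):
    """Chi-square statistic for two adjacent bins.

    a, b -- [count with first label, count with second label]
    """
    total = sum(a) + sum(b)
    chi2 = 0.0
    for col in (0, 1):
        col_total = a[col] + b[col]
        for row in (a, b):
            expected = sum(row) * col_total / total
            if expected > 0:
                chi2 += (row[col] - expected) ** 2 / expected
    return chi2


def chi_merge(values, labels, max_bins):
    """Return a list of ([low, high], [first-label count, second-label count])."""
    counts = {}
    for value, label in zip(values, labels):
        cell = counts.setdefault(value, [0, 0])
        cell[0 if label == 1 else 1] += 1  # label 1 is taken as the first label
    bins = [([v, v], counts[v]) for v in sorted(counts)]
    while len(bins) > max_bins:
        scores = [chi2_pair(bins[i][1], bins[i + 1][1]) for i in range(len(bins) - 1)]
        i = scores.index(min(scores))  # most similar adjacent pair
        (low, _), left = bins[i]
        (_, high), right = bins[i + 1]
        bins[i:i + 2] = [([low, high], [left[0] + right[0], left[1] + right[1]])]
    return bins
```

The label counts returned for each bin are the same per-bin quantities that the later screening and coding steps consume.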
4. The method of claim 3, wherein the wind-controlled rating label comprises a first label and a second label; for any one of the bins of any one of the target variables, determining a code value corresponding to the bin of the target variable according to the wind control evaluation tag of the wind control sample user, including:
aiming at any one of the sub-boxes of any one target variable, determining a first user of which the variable value of the target variable falls into the sub-box;
counting the number of users with the first labels in the first users to obtain a first number, and counting the number of users with the second labels in the first users to obtain a second number;
counting the number of users with the first labels in the wind control sample users to obtain a third number, and counting the number of users with the second labels in the wind control sample users to obtain a fourth number;
and calculating to obtain the coding value corresponding to the sub-box of the target variable according to the first quantity, the second quantity, the third quantity and the fourth quantity.
5. The method of claim 3, wherein the wind-controlled rating label comprises a first label and a second label; screening the multiple sample characteristic variables according to the sub-boxes of the sample characteristic variables to obtain target variables, wherein the target variables comprise:
for any one of the sub-boxes of any one of the sample characteristic variables, determining a second user of the sub-box, wherein the variable value of the sample characteristic variable falls into the sub-box;
counting the number of users with the first label in each second user to obtain a fifth number, and counting the number of users with the second label in each second user to obtain a sixth number;
counting the number of users with the first labels in the wind control sample users to obtain a seventh number, and counting the number of users with the second labels in the wind control sample users to obtain an eighth number;
calculating to obtain a first information quantity of the sample characteristic variable according to the fifth quantity of each box of the sample characteristic variable, the sixth quantity of each box of the sample characteristic variable, the seventh quantity and the eighth quantity;
and screening target variables in the sample characteristic variables according to the first information quantity of the sample characteristic variables.
6. The method of claim 5, wherein screening the sample characteristic variables to obtain target variables according to the first information amount of each sample characteristic variable comprises:
and taking the sample characteristic variable with the first information quantity larger than a first information quantity threshold value as the target variable in the plurality of sample characteristic variables.
7. The method of claim 5, wherein screening the sample characteristic variables to obtain target variables according to the first information amount of each sample characteristic variable comprises:
taking the sample characteristic variable with the first information quantity larger than a first information quantity threshold value as a candidate variable in the plurality of sample characteristic variables;
calculating a correlation coefficient between any two candidate variables according to the variable value of each candidate variable;
and deleting redundant variables from the candidate variables according to the correlation coefficient and a preset coefficient threshold value to obtain target variables.
8. The method of claim 1, wherein obtaining sample characteristic data for a plurality of wind-controlled sample users comprises:
acquiring original characteristic data of a plurality of wind control sample users; the original characteristic data comprises variable values of a plurality of characteristic variables to be screened; the characteristic variables to be screened have at least one data source;
counting the source number of the data source of each characteristic variable to be screened and the total source number corresponding to a plurality of wind control sample users;
and determining the variable value of the characteristic variable to be screened, of which the ratio of the source number to the total source number is greater than a preset proportion threshold value, as the sample characteristic data of the wind control sample user.
9. A risk control method, comprising:
acquiring user characteristic data of a user to be subjected to wind control and wind control duration information of the user to be subjected to wind control in a scene to be subjected to wind control;
processing the user characteristic data of the user to be subjected to wind control to obtain target characteristic data of the user to be subjected to wind control;
determining a wind control evaluation label of the user to be wind controlled corresponding to the time information to be wind controlled according to the user target characteristic data, the time information to be wind controlled and the wind control model of any one of claims 1 to 8.
10. A wind control model building device is characterized by comprising:
the data acquisition module is used for acquiring sample characteristic data of a plurality of wind control sample users, wind control performance data of the wind control sample users in a scene to be wind controlled and wind controlled duration information of the wind control sample users in the scene to be wind controlled;
the label determining module is used for determining a wind control evaluation label of the wind control sample user according to the wind control performance data;
the data processing module is used for processing the sample characteristic data of the wind control sample user according to the wind control evaluation label of the wind control sample user to obtain target characteristic data of the wind control sample user;
and the model establishing module is used for establishing a wind control model for the scene to be wind controlled according to the target characteristic data of the wind control sample user, the wind controlled duration information and the wind control evaluation label.
CN202110570387.4A 2021-05-25 2021-05-25 Wind control model establishing method and device and risk control method Pending CN113379528A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110570387.4A CN113379528A (en) 2021-05-25 2021-05-25 Wind control model establishing method and device and risk control method

Publications (1)

Publication Number Publication Date
CN113379528A true CN113379528A (en) 2021-09-10

Family

ID=77571841

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110570387.4A Pending CN113379528A (en) 2021-05-25 2021-05-25 Wind control model establishing method and device and risk control method

Country Status (1)

Country Link
CN (1) CN113379528A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115713399A (en) * 2022-09-28 2023-02-24 睿智合创(北京)科技有限公司 User credit assessment system combined with third-party data source
CN116937820A (en) * 2023-09-19 2023-10-24 深圳凯升联合科技有限公司 High-voltage circuit state monitoring method based on deep learning algorithm

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180299469A1 (en) * 2014-01-24 2018-10-18 National Jewish Health Methods for detection of respiratory diseases
CN111898675A (en) * 2020-07-30 2020-11-06 北京云从科技有限公司 Credit wind control model generation method and device, scoring card generation method, machine readable medium and equipment
CN112801775A (en) * 2021-01-29 2021-05-14 中国工商银行股份有限公司 Client credit evaluation method and device

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115713399A (en) * 2022-09-28 2023-02-24 睿智合创(北京)科技有限公司 User credit assessment system combined with third-party data source
CN115713399B (en) * 2022-09-28 2023-10-20 睿智合创(北京)科技有限公司 User credit evaluation system combined with third-party data source
CN116937820A (en) * 2023-09-19 2023-10-24 深圳凯升联合科技有限公司 High-voltage circuit state monitoring method based on deep learning algorithm
CN116937820B (en) * 2023-09-19 2024-01-05 深圳凯升联合科技有限公司 High-voltage circuit state monitoring method based on deep learning algorithm

Similar Documents

Publication Publication Date Title
CN110413877B (en) Resource recommendation method and device and electronic equipment
CN109858970B (en) User behavior prediction method, device and storage medium
WO2018014786A1 (en) Modeling method and device for evaluation model
CN106651057A (en) Mobile terminal user age prediction method based on installation package sequence table
CN113688313A (en) Training method of prediction model, information pushing method and device
CN110020427B (en) Policy determination method and device
CN113379528A (en) Wind control model establishing method and device and risk control method
CN112214652B (en) Message generation method, device and equipment
CN112417093B (en) Model training method and device
CN112598294A (en) Method, device, machine readable medium and equipment for establishing scoring card model on line
CN114240101A (en) Risk identification model verification method, device and equipment
CN114997472A (en) Model training method, business wind control method and business wind control device
CN114490786B (en) Data sorting method and device
CN112328869A (en) User loan willingness prediction method and device and computer system
CN116563006A (en) Service risk early warning method, device, storage medium and device
CN110033092B (en) Data label generation method, data label training device, event recognition method and event recognition device
CN109903166B (en) Data risk prediction method, device and equipment
CN113010562B (en) Information recommendation method and device
CN111259975B (en) Method and device for generating classifier and method and device for classifying text
CN116029556B (en) Service risk assessment method, device, equipment and readable storage medium
CN110738562B (en) Method, device and equipment for generating risk reminding information
CN115953248B (en) Wind control method, device, equipment and medium based on saprolitic additivity interpretation
CN111461352B (en) Model training method, service node identification device and electronic equipment
CN117035695B (en) Information early warning method and device, readable storage medium and electronic equipment
CN116823407B (en) Product information pushing method, device, electronic equipment and computer readable medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination