CA3059937A1 - User credit evaluation method and device, electronic device, storage medium - Google Patents

User credit evaluation method and device, electronic device, storage medium

Info

Publication number
CA3059937A1
CA3059937A1
Authority
CA
Canada
Prior art keywords
parameter
type
binning
target
weight
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CA3059937A
Other languages
French (fr)
Inventor
Pengcheng CHEN
Ying Ma
Jinhui Chen
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
10353744 Canada Ltd
Original Assignee
10353744 Canada Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 10353744 Canada Ltd filed Critical 10353744 Canada Ltd
Publication of CA3059937A1 publication Critical patent/CA3059937A1/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/06Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063Operations research, analysis or management
    • G06Q10/0635Risk analysis of enterprise or organisation activities
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/06Buying, selling or leasing transactions
    • G06Q30/0601Electronic shopping [e-shopping]
    • G06Q30/0609Buyer or seller confidence or verification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/06Asset management; Financial planning or analysis

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Strategic Management (AREA)
  • Human Resources & Organizations (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Game Theory and Decision Science (AREA)
  • Technology Law (AREA)
  • Tourism & Hospitality (AREA)
  • Quality & Reliability (AREA)
  • Educational Administration (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The present disclosure relates to a user credit evaluation method and device, an electronic device, and a storage medium, and relates to the field of Internet technologies. The method includes: acquiring a plurality of feature information of a target user, wherein the plurality of feature information comprises a first type of parameter and a second type of parameter; preprocessing the first type of parameter and the second type of parameter; converting the preprocessed first type of parameter to generate a target parameter; inputting the preprocessed second type of parameter and the target parameter into a machine learning model to obtain a credit evaluation result of the target user; wherein the feature information having an IV value lower than a preset threshold is the first type of parameter, and the feature information having an IV value higher than the preset threshold is the second type of parameter. The present disclosure can more accurately determine the user credit evaluation result and accurately identify a credit risk.

Description

USER CREDIT EVALUATION METHOD AND DEVICE, ELECTRONIC
DEVICE, STORAGE MEDIUM
Technical Field [0001] The present disclosure relates to the field of Internet technologies, and more specifically, relates to a user credit evaluation method and device, an electronic device, and a storage medium.
Background Art
[0002] The credit evaluation card model is the most common risk scoring model used in the financial field. This model strikes a balance between interpretability and algorithm complexity.
[0003] In the related art, parameters with strong financial attributes are typically used to predict the default probability of a user. However, the user data obtained in most cases does not have such strong financial attributes. Therefore, the number of strong financial attribute parameters that can be used is quite limited, which may result in an inaccurate credit evaluation result and a limited application scope, so that the user risk cannot be measured accurately.
[0004] The above information disclosed in the background art section is only for enhancement of understanding the background of the present disclosure. It may therefore comprise information that does not constitute prior art known to a person of ordinary skill in the art.
Summary [0005] An object of the present disclosure is to provide a user credit evaluation method and device, an electronic device, and a storage medium, which can, at least to some extent, overcome the problem of inaccurate measurement of user risks due to the limitations and deficiencies of the related art.
[0006] Other features and advantages of the present disclosure will be apparent from the following detailed description, or learned in part by the practice of implementing the disclosure.
[0007] According to one aspect of the present disclosure, a user credit evaluation method is provided, and the method comprises: obtaining a plurality of feature information of a target user, wherein the plurality of feature information comprises a first type of parameter and a second type of parameter; preprocessing the first type of parameter and the second type of parameter;
converting the preprocessed first type of parameter to generate a target parameter; inputting the preprocessed second type of parameter and the target parameter into a machine learning model to obtain a credit evaluation result of the target user; wherein the feature information having an IV
value lower than a preset threshold is the first type of parameter, and the feature information having an IV value higher than the preset threshold is the second type of parameter.
[0008] In one exemplary embodiment of the present application, the step of preprocessing the first type of parameter and the second type of parameter comprises:
separately binning the first type of parameter and the second type of parameter according to a weight of evidence, so as to obtain the first type of parameter and the second type of parameter after the binning.
[0009] In one exemplary embodiment of the present application, the step of converting the preprocessed first type of parameter to generate a target parameter comprises:
using a linear discriminant algorithm to carry out feature combination on the first type of parameters associated with each topic, so as to generate the target parameter.
[0010] In one exemplary embodiment of the present application, the method further comprises: performing second binning with the target parameter, and placing the target parameter after the second binning into a candidate variable pool; and placing the second type of parameter after the binning in the candidate variable pool.
[0011] In one exemplary embodiment of the present application, the step of inputting the preprocessed second type of parameter and the target parameter into a machine learning model comprises: excluding a multicollinearity between the second type of parameter after the binning and the target parameter after the second binning in the candidate variable pool, so as to obtain a remaining parameter; and inputting the remaining parameter into the machine learning model.
[0012] In one exemplary embodiment of the present application, the step of excluding a multicollinearity between the second type of parameter after the binning and the target parameter after the second binning in the candidate variable pool, so as to obtain a remaining parameter comprises: excluding the second type of parameter after the binning having the weight of evidence lower than the preset value and the target parameter after the second binning having the weight of evidence lower than the preset value from the candidate variable pool, so as to obtain the remaining parameter.

[0013] In one exemplary embodiment of the present application, the step of excluding the second type of parameter after the binning having the weight of evidence lower than the preset value and the target parameter after the second binning having the weight of evidence lower than the preset value from the candidate variable pool comprises: excluding the second type of parameter after the binning having the weight of evidence lower than the preset value and the target parameter after the second binning having the weight of evidence lower than the preset value according to an order of the weight of evidence from low to high;
recalculating the weight of evidence of the second type of parameter after the binning and the weight of evidence of the target parameter after the second binning after the excluding; and excluding the second type of parameter after the binning having the recalculated weight of evidence lower than the preset value and the target parameter after the second binning having the recalculated weight of evidence lower than the preset value according to the order of the weight of evidence from low to high, until every second type of parameter having the weight of evidence lower than the preset value and every target parameter having the weight of evidence lower than the preset value have been excluded.
[0014] According to another aspect of the present disclosure, a user credit evaluation device is provided, and the device comprises: a feature obtaining module, which is used for obtaining a plurality of feature information of a target user, wherein the plurality of feature information comprises a first type of parameter and a second type of parameter; a parameter preprocessing module, which is used for preprocessing the first type of parameter and the second type of parameter; a target parameter generating module, which is used for converting the preprocessed first type of parameter to generate a target parameter; and an evaluation result determining module, which is used for inputting the preprocessed second type of parameter and the target parameter into a machine learning model to obtain a credit evaluation result of the target user;
[0015] wherein the feature information having an IV value lower than a preset threshold is the first type of parameter, and the feature information having an IV value higher than the preset threshold is the second type of parameter.
[0016] According to another aspect of the present disclosure, an electronic device is provided, and the electronic device comprises a processor; and a memory for storing an executable instruction of the processor; wherein the processor is configured to execute the executable instruction so as to implement any one of the user credit evaluation methods mentioned above.

[0017] According to another aspect of the present disclosure, a computer readable medium having a computer program stored thereon is provided, wherein the computer program, when executed by a processor, implements any one of the user credit evaluation methods mentioned above.
[0018] A user credit evaluation method, a user credit evaluation device, an electronic device, and a computer readable storage medium are provided in some exemplary embodiments of the present disclosure. On the one hand, by means of converting a first type of parameter after the pre-processing to generate a target parameter, the target parameter generated by the conversion from the first type of parameter can be used for credit evaluation. In this way, it can avoid the problem that in the related art, only the second type of parameter is used for credit evaluation, and thus the amount of data is insufficient and the application range is small. Thus, the present invention can increase the data volume and application range. On the other hand, by means of inputting the preprocessed second type of parameter and the target parameter generated after the conversion into a machine learning model, the amount of data is increased. In this way, it is able to obtain accurate credit evaluation results based on the second type of parameter after the preprocessing and the target parameter generated from the conversion, and accurately measure user credit risks.
[0019] It should be understood that the above general description and the following detailed description are merely exemplary and are not intended to limit the present disclosure.
Brief Description of the Drawings [0020] The drawings herein are incorporated into the description and form a part of this description. The embodiments of the present disclosure are shown and used in conjunction with the specification to explain the principles of the present disclosure.
Apparently, the drawings in the following description are only some embodiments of the present disclosure.
For a person of ordinary skill in the art, other drawings may also be obtained from these drawings without inventive skills.
[0021] FIG. 1 is a schematic diagram showing a user credit evaluation method in an exemplary embodiment of the present disclosure.
[0022] FIG. 2 schematically shows a specific flowchart of user credit evaluation in an exemplary embodiment of the present disclosure.

[0023] FIG. 3 is a block diagram schematically showing a user credit evaluation device in an exemplary embodiment of the present disclosure.
[0024] FIG. 4 is a block diagram schematically showing an electronic device in an exemplary embodiment of the present disclosure.
[0025] FIG. 5 schematically illustrates a program product in an exemplary embodiment of the present disclosure.
Detailed Description [0026] Exemplary embodiments will now be described in detail with reference to the accompanying drawings. The exemplary embodiments can be embodied in many forms, and should not be construed as being limited to the examples set forth herein;
rather, these embodiments are provided so that this disclosure will be more complete. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details will be set forth to facilitate understanding of the embodiments of the present disclosure.
However, one skilled in the art will appreciate that one or more of the specific details may be omitted or other methods, components, devices, steps, etc. may be added. In other instances, the well-known technical features will not be described in detail as they may obscure the main aspects of the present disclosure.
[0027] In addition, the drawings are merely schematic representations of the present disclosure and are not necessarily to scale. The same reference numerals in the drawings denote the same or similar parts, and the repeated description thereof will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily have to correspond to physically or logically separate entities. These functional entities may be implemented in software, or implemented in one or more hardware modules or integrated circuits, or implemented in different network and/or processor devices and/or microcontroller devices.
[0028] In the present exemplary embodiment, a user credit evaluation method is first provided. Referring to FIG. 1, the user credit evaluation method will be described in detail.
[0029] Step S110 includes obtaining a plurality of feature information of a target user, wherein the plurality of feature information comprises a first type of parameter and a second type of parameter.

[0030] In this exemplary embodiment, the feature information refers to a data feature corresponding to the historical data of a target user. Specifically, the first type of parameter and the second type of parameter may be included in the feature information. The feature information having an IV value that is lower than a preset threshold would be the first type of parameter, and the feature information having an IV value that is higher than a preset threshold would be the second type of parameter. The feature information may be a parameter corresponding to each topic, and each topic may include a plurality of feature information. For example, the feature information includes, but is not limited to, age, income, consumption data, number of views, browsing time, and the like.
[0031] After obtaining the plurality of feature information, the obtained feature information may be classified into the first type of parameter and the second type of parameter, and the feature information may be classified according to the IV value. The IV value, which may also be referred to as the information value, is an indicator used to measure the ability of certain feature information to distinguish between good and bad customers when constructing a model through logistic regression, decision trees, or other modeling methods. All feature information can be filtered and classified using the IV value. In general, the higher the IV value, the higher the information value of the feature information. Therefore, feature information with a high IV value can be placed in a model to perform fitting training on the model.
[0032] For example, suppose that in a classification problem there are two categories: Y1 and Y2. For an individual A to be predicted, it is necessary to obtain certain information in order to determine whether the individual A belongs to Y1 or Y2. Assume that the total amount of information is I, and the information required is contained in the feature information C1, C2, C3, ..., Cn. In this case, for one piece of feature information Ci, the more information it contains, the greater its contribution to determining whether A belongs to Y1 or Y2, and the higher its information value, that is, the higher the IV value of Ci. This indicates that the feature information has better distinguishing ability. Accordingly, the feature information Ci can be used to build the model.
[0033] A first type of parameter is a weak parameter, such as a weak financial attribute parameter. A second type of parameter is a strong parameter, such as a strong financial attribute parameter. In the exemplary embodiment, the preset threshold may be set to 0.2: the feature information having an information value lower than 0.2 may be determined as a first type of parameter, while the feature information having an information value higher than 0.2 may be determined as a second type of parameter. However, the preset threshold is not limited to the above value, and may be set according to actual needs. In another case, the feature information having an information value lower than 0.05 may also be regarded as an extremely weak parameter. Because the distinguishing ability of such feature information is very poor, the extremely weak parameter may be directly filtered out to avoid its influence on the overall evaluation result.
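To make the IV-based classification concrete, the following Python sketch (an illustration only, not the claimed implementation; the helper names are assumptions, and the 0.2 and 0.05 thresholds follow the example above) computes an information value per already-binned feature and splits the features into strong, weak, and extremely weak parameters.

```python
import numpy as np
import pandas as pd

def iv_of_feature(bin_labels: pd.Series, target: pd.Series) -> float:
    """Information value of one binned feature.

    bin_labels: bin assignment of each sample for this feature.
    target: 1 for responding customers, 0 for unresponsive customers.
    """
    df = pd.DataFrame({"bin": bin_labels, "y": target})
    grouped = df.groupby("bin", observed=True)["y"].agg(["sum", "count"])
    resp = grouped["sum"]                          # #y_i per bin
    non_resp = grouped["count"] - grouped["sum"]   # #n_i per bin
    p_resp = resp / resp.sum()                     # P_{y_i}
    p_non = non_resp / non_resp.sum()              # P_{n_i}
    woe = np.log((p_resp + 1e-6) / (p_non + 1e-6))  # small constant avoids log(0)
    return float(((p_resp - p_non) * woe).sum())

def split_by_iv(binned_features: pd.DataFrame, target: pd.Series,
                strong_threshold: float = 0.2, extreme_threshold: float = 0.05):
    """Classify features into strong, weak and extremely weak by their IV value."""
    strong, weak, extremely_weak = [], [], []
    for name in binned_features.columns:
        iv = iv_of_feature(binned_features[name], target)
        if iv >= strong_threshold:
            strong.append(name)          # second type of parameter
        elif iv >= extreme_threshold:
            weak.append(name)            # first type of parameter
        else:
            extremely_weak.append(name)  # filtered out directly
    return strong, weak, extremely_weak
```

Features returned in the extremely weak list would simply be dropped before any further processing, as described above.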
[0034] Step S120 includes preprocessing the first type of parameter and the second type of parameter.
[0035] In the present exemplary embodiment, the preprocessing refers to binning processing.
The binning process is a grouping process. It refers to discretizing continuous parameters or merging discrete parameters of multiple states into discrete states with fewer states. In the exemplary embodiment, the WOE algorithm may be specifically used for the binning processing.
WOE (Weight of Evidence) refers to a form of encoding an original parameter. Before WOE encoding, the parameter is subjected to a binning process.
Specifically, the method may include equidistant binning, equal-depth binning, optimal binning, and the like. After binning, the weight of evidence value for the i-th group can be calculated by the following formula (1):
WOE_i = \ln\left(\frac{P_{y_i}}{P_{n_i}}\right) = \ln\left(\frac{\#y_i / \#y_T}{\#n_i / \#n_T}\right)    (1)
[0036] In this formula, P_{y_i} is the proportion of the responding customers in the i-th group to all responding customers in the sample, and P_{n_i} is the proportion of unresponsive customers in the i-th group to all unresponsive customers in the sample. \#y_i is the number of responding customers in the i-th group, \#n_i is the number of unresponsive customers in the i-th group, \#y_T is the number of all responding customers in the sample, and \#n_T is the number of all unresponsive customers in the sample. A responding customer herein refers to an individual in the model having a parameter value of 1.
[0037] It can be seen that the weight of evidence value WOE represents the difference between "the proportion of responding customers to all responding customers in the current group" and "the proportion of unresponsive customers to all unresponsive customers in the current group". Hence, formula (1) can be converted to the following formula (2):
WOE_i = \ln\left(\frac{P_{y_i}}{P_{n_i}}\right) = \ln\left(\frac{\#y_i / \#y_T}{\#n_i / \#n_T}\right) = \ln\left(\frac{\#y_i / \#n_i}{\#y_T / \#n_T}\right)    (2)
[0038] In the above formula, the higher the weight of evidence (WOE) value, the greater the difference, which indicates that the potential response in this group is high.
[0039] By binning the first type of parameter and the second type of parameter, random errors or abnormal values in the parameters can be avoided, thereby denoising the parameters and improving the processing speed and efficiency.
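As a minimal illustration of equal-depth binning followed by the WOE calculation in formulas (1) and (2), one possible Python sketch is shown below; the function name, the bin count, and the small constant guarding against log(0) are assumptions rather than part of the disclosure.

```python
import numpy as np
import pandas as pd

def woe_binning(values: pd.Series, target: pd.Series, n_bins: int = 5) -> pd.DataFrame:
    """Equal-depth binning of a continuous parameter followed by WOE per bin.

    values: raw parameter values for each sample.
    target: 1 for responding customers, 0 for unresponsive customers.
    """
    bins = pd.qcut(values, q=n_bins, duplicates="drop")  # equal-depth (quantile) bins
    df = pd.DataFrame({"bin": bins, "y": target})
    grouped = df.groupby("bin", observed=True)["y"].agg(["sum", "count"])
    resp = grouped["sum"]                          # #y_i
    non_resp = grouped["count"] - grouped["sum"]   # #n_i
    p_resp = resp / target.sum()                   # #y_i / #y_T
    p_non = non_resp / (len(target) - target.sum())  # #n_i / #n_T
    grouped["woe"] = np.log((p_resp + 1e-6) / (p_non + 1e-6))  # formulas (1)/(2)
    return grouped
```

Applying such a helper to each first type and second type of parameter yields the binned representations used in the subsequent steps.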
[0040] Step S130 includes converting the preprocessed first type of parameter to generate a target parameter.
[0041] In the present exemplary embodiment, since the first type of parameters cannot be directly used in a credit evaluation, part of the feature information cannot be fully utilized, and when the feature information contains no second type of parameter with a strong financial attribute, the credit evaluation cannot be performed at all. In order to avoid this problem, all of the first type of parameters can be converted, so that the original first type of parameters become a target parameter. The specific process of conversion is a feature combination process, and the target parameter herein refers to a parameter that is associated with a second type of parameter, that is, a parameter having a strong financial attribute. A parameter associated with a second type of parameter refers to a parameter of the same type as a second type of parameter; for example, the original weak parameters are combined into a strong parameter, so that the original strong parameters and the strong parameters converted from the weak parameters can both be used to perform the credit evaluation. It should be noted that, in step S110, a plurality of first type of parameters may be obtained, and each topic may include multiple first type of parameters, for example, 10, 50, and the like, which is not particularly limited in the exemplary embodiment. However, through the feature combination in step S130, only one target parameter is obtained for each topic. For example, the weak parameter 1, the weak parameter 2, the weak parameter 3, and the weak parameter 4 corresponding to the topic 1 may be combined to obtain a strong parameter 1 of the topic 1.

[0042] When performing feature combination, a linear discriminant algorithm may be used to perform feature combination on all binned first type of parameters associated with each topic so as to obtain a parameter associated with a second type of parameter. The topics may include, for example, transactions, browsing, forecasting, and the like. For each topic, the first type of parameters and the second type of parameters included may be different. In order to make all weak parameters satisfy the interpretability of the business, it is necessary to perform feature combination for all the first type of parameters associated with each topic.
For example, all the first type of parameters corresponding to the topic of transaction may be combined, and all the first type of parameters corresponding to the topic of browsing may also be combined. By combining the features of the first type of parameters of each topic, the mutual influence of parameters between different topics can be avoided, thereby improving the efficiency and accuracy of feature combination.
[0043] The linear discriminant algorithm (LDA) linearly combines a plurality of binned weak parameters into a linear expression for use in the classification process. Specifically, the plurality of weak parameters are linearly combined to obtain a linear expression including each weak parameter. In the classification process, the target parameter is rotated by different angles in the feature space where the linear expression is located, and the linear discriminant algorithm is used to find the optimal angle during the rotation, such that the target parameter has the largest classification potential at the optimal angle; a strong parameter can then be obtained from the target parameter with the largest classification potential.
In other words, an optimal linear combination may be obtained based on the optimal angle, such that the classification potential of the target parameter corresponding to the optimal linear combination is maximized, and a strong parameter can be obtained from that target parameter. In this context, the target parameter refers to any one of the weak parameters, and the classification potential refers to the potential used for classification.
[0044] For example, for the topic of transaction, if the weak parameters include the number of views x1 and the evaluation y1, the number of views x1 and the evaluation y1 can be linearly combined to obtain a linear expression Ax1+By1. Next, according to the linear discriminant algorithm, the optimal angle for maximizing the classification potential of the target parameter can be obtained in the feature space where the linear expression is located. In this way, the number of views x1 and the evaluation y1 are combined into one strong parameter associated with the number of views and the evaluation. Thus, the weak parameters under each topic can be combined into a strong parameter, so that a model can be constructed using the strong parameters obtained from the conversion. As a result, compared with the related art, it is possible to construct a highly interpretable risk scoring model using weak parameters alone, and more parameters and more types of parameters can be included.
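One way the per-topic feature combination could be realized is with scikit-learn's LinearDiscriminantAnalysis, as sketched below; the helper, the topic-to-column grouping, and the column names are illustrative assumptions, not the disclosed implementation.

```python
import pandas as pd
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def combine_topic_features(weak_woe: pd.DataFrame, target: pd.Series) -> pd.Series:
    """Linearly combine the binned (WOE-encoded) weak parameters of one topic
    into a single strong candidate parameter.

    weak_woe: WOE-encoded weak parameters belonging to a single topic,
              e.g. number of views x1 and evaluation y1 for the transaction topic.
    target:   binary response label used to find the most discriminative direction.
    """
    lda = LinearDiscriminantAnalysis(n_components=1)  # binary target -> one component
    combined = lda.fit_transform(weak_woe.values, target.values)
    return pd.Series(combined.ravel(), index=weak_woe.index, name="lda_combined")

# Example (topic assignment and column names are hypothetical):
# topics = {"transaction": ["x1_woe", "y1_woe"], "browsing": ["views_woe", "time_woe"]}
# target_params = {t: combine_topic_features(woe_table[cols], label) for t, cols in topics.items()}
```

For a binary response label, LDA yields a single discriminant direction, so each topic contributes exactly one combined target parameter, matching the one-strong-parameter-per-topic behaviour described above.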
[0045] After the target parameter is generated, the target parameter and the second type of parameter can be placed in a candidate variable pool. The parameters placed in the candidate variable pool of the model should be parameters after the binning processing. In order to meet this requirement, after converting the first type of parameters into a target parameter, the target parameter needs to be binned again, and the target parameter after the second binning is put into the candidate variable pool. The candidate variable pool may include, for example, a strong parameter 1 after binning, a strong parameter 4 composed of a weak parameter 2 and a weak parameter 3 after binning, and the like.
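A minimal sketch of the second binning of the target parameter and of assembling the candidate variable pool might look as follows; the helper name, the bin count, and the column names in the commented example are assumptions.

```python
import numpy as np
import pandas as pd

def woe_encode(values: pd.Series, target: pd.Series, n_bins: int = 5) -> pd.Series:
    """Second binning: bin a parameter (e.g. the combined target parameter) and
    replace every sample by the WOE value of its bin."""
    bins = pd.qcut(values, q=n_bins, duplicates="drop")
    df = pd.DataFrame({"bin": bins, "y": target})
    grouped = df.groupby("bin", observed=True)["y"].agg(["sum", "count"])
    resp, non_resp = grouped["sum"], grouped["count"] - grouped["sum"]
    woe = np.log((resp / target.sum() + 1e-6) /
                 (non_resp / (len(target) - target.sum()) + 1e-6))
    return bins.astype(object).map(woe.to_dict()).astype(float)

# Candidate variable pool (column names are illustrative):
# candidate_pool = pd.DataFrame({
#     "strong_param_1_woe": woe_encode(raw["strong_param_1"], label),
#     "transaction_lda_woe": woe_encode(transaction_lda_score, label),
# })
```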
[0046] Next, Step S140 includes inputting the preprocessed second type of parameter and the target parameter into a machine learning model to obtain a credit evaluation result of the target user.
[0047] In the exemplary embodiment, the machine learning model may be a trained machine learning model, such as a convolutional neural network algorithm, a deep learning algorithm, or the like. The convolutional neural network model is taken as an example in the present exemplary embodiment. A convolutional neural network model generally includes an input layer, a mapping layer, and an output layer.
[0048] In the process of inputting the preprocessed second type of parameter and the target parameter into a machine learning model, in order to ensure the accuracy of the result, the target parameter and the second type of parameter after binning in the candidate variable pool may be further filtered. That is, all the parameters in the candidate variable pool may be filtered to obtain the remaining parameters, and the remaining parameters are then input as input parameters into the machine learning model. The output of the output layer of the machine learning model may be the probability that the user's credit belongs to each level, and the user credit evaluation result is determined according to these probabilities. For example, when the probability of belonging to the level of good credit is the highest, the credit evaluation result is determined to be good.
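As a sketch of this step, the remaining WOE-encoded parameters can be fed into any trained classifier that outputs class probabilities; logistic regression is used here purely for illustration (the disclosure mentions convolutional neural networks and other models as examples), and the credit levels are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

CREDIT_LEVELS = ["poor", "fair", "good"]  # hypothetical credit levels

def evaluate_credit(model: LogisticRegression, remaining_params: np.ndarray) -> list[str]:
    """Return the credit level with the highest predicted probability per user."""
    probabilities = model.predict_proba(remaining_params)  # one row per user
    return [CREDIT_LEVELS[i] for i in probabilities.argmax(axis=1)]

# model = LogisticRegression(max_iter=1000).fit(train_remaining, train_levels)
# results = evaluate_credit(model, test_remaining)
```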
[0049] Specifically, in order to ensure the accuracy of the result obtained, the multicollinearity between the second type of parameter and the target parameter in the candidate variable pool may be excluded. Multicollinearity herein refers to the situation in which the model estimate may be distorted or difficult to estimate accurately because of precise or highly correlated relationships between parameters in a linear regression model. The weight of evidence should generally be positive. If there is a negative value in the calculated weight of evidence, it may be considered whether this is due to the influence of parameter multicollinearity. Based on this, whether there is multicollinearity between the parameters can be determined according to the relationship between the weight of evidence and the preset value.
The preset value may be, for example, 0. If the weight of evidence value is less than 0 (that is, a negative value), then the multicollinearity between the parameters should be considered.
Therefore, the parameters having the weight of evidence value of less than 0 may be sequentially removed to obtain the remaining parameters.
[0050] In the process of excluding the multicollinearity between the parameters, the following steps may be performed, excluding the second type of parameter after the binning having the weight of evidence lower than the preset value and the target parameter after the second binning having the weight of evidence lower than the preset value according to an order of the weight of evidence from low to high; recalculating the weight of evidence of the second type of parameter after the binning and the weight of evidence of the target parameter after the second binning after the excluding; excluding the second type of parameter after the binning having the recalculated weight of evidence lower than the preset value and the target parameter after the second binning having the recalculated weight of evidence lower than the preset value according to the order of the weight of evidence from low to high, until every second type of parameter having the weight of evidence lower than the preset value and every target parameter having the weight of evidence lower than the preset value have been excluded.
[0051] For example, if the weight of evidence value of parameter 1 is -3, the weight of evidence value of parameter 2 is -1, and the weight of evidence value of parameter 3 is 1, the parameter 1 with the weight of evidence value of -3 may be removed first. The parameter 2 and parameter 3 are then linearly combined and the weight of evidence value for each parameter is recalculated. Next, the parameter whose weight of evidence value is the smallest negative value will be excluded. In addition, multicollinearity can be excluded by other algorithms. By means of sequentially excluding a parameter whose weight of evidence value is lower than a preset value, and then recalculating the weight of evidence values of all remaining parameters, it is possible to more accurately exclude all parameters whose weight of evidence value is lower than the preset value, thereby obtaining more accurate remaining parameters. A more accurate credit evaluation result can be obtained by inputting the remaining parameters to a trained machine learning model, thereby accurately measuring user risks.
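The iterative exclusion described above might be sketched as follows; recompute_woe is a hypothetical callback that re-derives the weight of evidence value of each remaining parameter after every removal, since the disclosure does not fix how that recalculation is performed.

```python
from typing import Callable, Dict, List

def exclude_multicollinearity(
    parameters: List[str],
    recompute_woe: Callable[[List[str]], Dict[str, float]],
    preset_value: float = 0.0,
) -> List[str]:
    """Iteratively drop the parameter with the lowest weight of evidence below the
    preset value, recompute, and repeat until no such parameter remains."""
    remaining = list(parameters)
    while True:
        woe_values = recompute_woe(remaining)
        below = {p: w for p, w in woe_values.items() if w < preset_value}
        if not below:
            return remaining
        worst = min(below, key=below.get)  # lowest value first (low-to-high order)
        remaining.remove(worst)

# Example from the text: parameter 1 (WOE -3) is removed first, then the WOE values of
# parameters 2 and 3 are recomputed and any remaining negative ones are removed in turn.
```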
[0052] A specific flow chart for determining the result of the credit evaluation is shown in FIG. 2. In this figure:
[0053] In step S201, feature extraction is performed on the data in a modeling layer to obtain a feature summary wide table including a plurality of feature information.
[0054] In step S202, feature filtering is performed based on the modeling layer and the information value IV, the plurality of feature information is divided into strong parameters, weak parameters, and extremely weak parameters according to the IV values thereof, and the extremely weak parameters are directly filtered out.
[0055] In step S203, the strong parameters and the weak parameters are binned by a WOE
algorithm to obtain a strong parameter WOE bin and a weak parameter WOE bin.
[0056] In step S204, feature combination is performed on the weak parameters: the plurality of weak parameters corresponding to each topic are linearly combined according to the LDA linear discriminant algorithm to obtain a strong parameter corresponding to each topic, for example, an LDA combination for the browsing topic, an LDA combination for the trading topic, and an LDA combination for the credit card transaction topic. Further, the strong parameters converted from the weak parameters corresponding to each topic are binned again to obtain the WOE binning after the LDA combination.
[0057] In step S205, multicollinearity is excluded between the strong parameters after the WOE binning and the strong parameters generated by the LDA combination of the weak parameters after the WOE binning, so as to obtain the remaining parameters, and the remaining parameters are input into a machine learning model to obtain a user credit evaluation result.
[0058] Next, the results of the credit assessment can be monitored. For example, the ROC curve, the Gini coefficient (AR value), the discrimination ability index (KS value), or the Lorenz curve may be used for monitoring. In addition, PSI indicators may also be used to monitor the accuracy of the evaluation results.
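As an illustration, the discrimination ability index (KS value) and the PSI indicator mentioned above could be computed as in the sketch below; the quantile-based score bins, the bin count, and the small-value guard are assumptions.

```python
import numpy as np
from scipy.stats import ks_2samp

def ks_statistic(scores_bad: np.ndarray, scores_good: np.ndarray) -> float:
    """Discrimination ability (KS): max distance between the two score distributions."""
    return float(ks_2samp(scores_bad, scores_good).statistic)

def psi(expected: np.ndarray, actual: np.ndarray, n_bins: int = 10) -> float:
    """Population stability index between the development and monitoring score samples."""
    edges = np.quantile(expected, np.linspace(0, 1, n_bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf            # cover the full score range
    e_frac = np.histogram(expected, bins=edges)[0] / len(expected)
    a_frac = np.histogram(actual, bins=edges)[0] / len(actual)
    e_frac, a_frac = e_frac + 1e-6, a_frac + 1e-6    # avoid division by, or log of, zero
    return float(np.sum((a_frac - e_frac) * np.log(a_frac / e_frac)))
```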
[0059] Through the steps shown in FIG. 2, the weak parameter features of each topic can be combined into a strong parameter to perform a credit evaluation based on the strong parameters.
In this way, weak parameters can be fully utilized for more interpretable risk scores, and more parameters and more types of parameters can be included, making the parameters more accurate and comprehensive. As a result, the credit evaluation results will be more interpretable, so that the user risk can be measured more accurately and in a timely manner.
[0060] The present disclosure also provides a user credit evaluation device. Referring to FIG.
3, the user credit evaluation device 300 may include:
[0061] a feature obtaining module 301, which is used for obtaining a plurality of feature information of a target user;
[0062] a parameter preprocessing module 302, which is used for preprocessing the first type of parameter and the second type of parameter;
[0063] a target parameter generating module 303, which is used for converting the preprocessed first type of parameter to generate a target parameter;
[0064] an evaluation result determining module 304, which is used for inputting the preprocessed second type of parameter and the target parameter into a machine learning model to obtain a credit evaluation result of the target user;
[0065] wherein the feature information having an IV value lower than a preset threshold is the first type of parameter, and the feature information having an IV value higher than the preset threshold is the second type of parameter.
[0066] In an exemplary embodiment of the present disclosure, the parameter preprocessing module may include a binning processing module, which is used for separately binning the first type of parameter and the second type of parameter according to a weight of evidence, so as to obtain the first type of parameter and the second type of parameter after the binning.
[0067] In an exemplary embodiment of the present disclosure, the target parameter generating module may include a feature combining module, which is used for using a linear discriminant algorithm to carry out feature combination on the first type of parameters associated with each topic, so as to generate the target parameter.

[0068] In an exemplary embodiment of the present disclosure, the device further includes a first storing module, which is used for performing second binning with the target parameter, and placing the target parameter after the second binning into a candidate variable pool; and a second storing module, which is used for placing the second type of parameter after the binning in the candidate variable pool.
[0069] In an exemplary embodiment of the present disclosure, the evaluation result determining module includes a parameter excluding module, which is used for excluding a multicollinearity between the second type of parameter after the binning and the target parameter after the second binning in the candidate variable pool, so as to obtain a remaining parameter;
and an input controlling module, which is used for inputting the remaining parameter into the machine learning model.
[0070] In an exemplary embodiment of the present disclosure, the parameter excluding module includes an exclusion controlling module, which is used for excluding the second type of parameter after the binning having the weight of evidence lower than the preset value and the target parameter after the second binning having the weight of evidence lower than the preset value from the candidate variable pool, so as to obtain the remaining parameter.
[0071] In an exemplary embodiment of the present disclosure, the exclusion controlling module includes a first excluding module, which is used for excluding the second type of parameter after the binning having the weight of evidence lower than the preset value and the target parameter after the second binning having the weight of evidence lower than the preset value according to an order of the weight of evidence from low to high; a weight of evidence value recalculating module, which is used for recalculating the weight of evidence of the second type of parameter after the binning and the weight of evidence of the target parameter after the second binning after the excluding; and a second excluding module, which is used for excluding the second type of parameter after the binning having the recalculated weight of evidence lower than the preset value and the target parameter after the second binning having the recalculated weight of evidence lower than the preset value according to the order of the weight of evidence from low to high, until every second type of parameter having the weight of evidence lower than the preset value and every target parameter having the weight of evidence lower than the preset value have been excluded.

[0072] It should be noted that the specific details of each module in the foregoing user credit evaluation device have been described in detail in the corresponding user credit evaluation method, and therefore will not be described herein again.
[0073] It should be noted that although several modules or units of equipment for action execution are mentioned in the detailed description above, such division is not mandatory.
Indeed, in accordance with embodiments of the present disclosure, the features and functions of two or more modules or units described above may be embodied in one module or unit. Rather, the features and functions of one of the modules or units described above may be further divided into multiple modules or units.
[0074] In addition, although the various steps of the method of the present disclosure are described in a particular order in the drawings, this does not require or imply that the steps must be performed in that specific order, or that all the steps shown must be performed to achieve the desired result. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step, and/or one step may be decomposed into multiple steps, and the like.
[0075] In an exemplary embodiment of the present disclosure, an electronic device capable of implementing the above method is also provided.
[0076] A person skilled in the art will appreciate that various aspects of the present invention can be implemented as a system, method, or program product. Therefore, various aspects of the present invention may be embodied in the form of a complete hardware implementation, a complete software implementation (including firmware, microcode, etc.), or a combination of hardware and software, which may be collectively referred to herein as "circuit," "module," or "system."
[0077] An electronic device 400 in accordance with such an embodiment of the present invention will be described below with reference to FIG. 4. The electronic device 400 shown in FIG. 4 is merely an example and should not impose any limitation on the function and scope of use of the embodiments of the present invention.
[0078] As shown in FIG. 4, the electronic device 400 is embodied in the form of a general purpose computing device. The components of the electronic device 400 may include, but are not limited to, at least one processing unit 410, at least one storage unit 420, and a bus 430 that connects different system components (including the storage unit 420 and the processing unit 410).
[0079] The storage unit stores program code that can be executed by the processing unit 410 such that the processing unit 410 performs the steps of various exemplary embodiments in accordance with the present invention as described in the "Exemplary Methods"
section of the present specification. For example, the processing unit 410 can perform the steps as shown in FIG.1.
[0080] The storage unit 420 may include a readable medium in the form of a volatile storage unit, such as a random access storage unit (RAM) 4201 and/or a cache storage unit 4202, and may further include a read only storage unit (ROM) 4203.
[0081] The storage unit 420 can also include a program/utility 4204 having a set (at least one) of the program modules 4205. Such program modules 4205 include, but are not limited to, an operating system, one or more applications, other program modules, and program data, each of which may include an implementation of a network environment.
[0082] The bus 430 can represent one or more of several types of bus structures, including a memory unit bus or a memory unit controller, a peripheral bus, a graphics acceleration port, and a processing unit, or a local bus using any of a variety of bus structures.
[0083] The display unit 440 may be a display having a display function, which displays the processing result obtained by the processing unit 410 performing the method in the present exemplary embodiment. The display may include, but is not limited to, a liquid crystal display or other types of displays.
[0084] The electronic device 400 can also communicate with one or more external devices 600 (for example, a keyboard, a pointing device, a Bluetooth device, etc.), and can also communicate with one or more devices that enable a user to interact with the electronic device 400, and/or the electronic device 400 is enabled to communicate with any device (for example, a router, a modem, etc.) that is in communication with one or more other computing devices. This communication can take place via an input/output (I/O) interface 450. Also, the electronic device 400 can communicate with one or more networks (for example, a local area network (LAN), a wide area network (WAN), and/or a public network, such as the Internet) through a network adapter 460. As shown, the network adapter 460 communicates with other modules of the electronic device 400 via the bus 430. It should be understood that although not shown in the figures, other hardware and/or software modules may be utilized in conjunction with the electronic device 400, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, etc.
[0085] In an exemplary embodiment of the present disclosure, a computer readable storage medium is also provided, the computer readable storage medium has a program product stored thereon capable of implementing the above method of the present specification.
In some possible implementations, aspects of the invention may also be implemented in the form of a program product, including program code. When the program product is run on a terminal device, the program code is for causing the terminal device to perform the steps according to various exemplary embodiments of the present invention described in the "Exemplary Method" section of the present specification.
[0086] Referring to FIG. 5, a program product 500 for implementing the above method in accordance with an embodiment of the present invention is illustrated. The program product may employ a portable compact disk read only memory (CD-ROM), include program code, and run on a terminal device, for example a personal computer. However, the program product of the present invention is not limited thereto, and in the present document, the readable storage medium may be any tangible medium containing or storing a program that can be used by or in connection with an instruction execution system, apparatus or device.
[0087] The program product can employ any combination of one or more readable media.
The readable medium can be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (non-exhaustive lists) of readable storage media include:
electrical connections with one or more wires, portable disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read-only memory (EPROM
or flash memory), optical fibers, portable compact disk read only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
[0088] The computer readable signal medium may include a data signal that is propagated in the baseband or as part of a carrier, carrying readable program code. Such propagated data signals can take a variety of forms including, but not limited to, electromagnetic signals, optical signals, or any suitable combination of the foregoing. The readable signal medium can also be any readable medium other than a readable storage medium that can transmit, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
[0089] Program code embodied on a readable medium can be transmitted using any suitable medium, including but not limited to wireless, wireline, optical cable, RF, etc., or any suitable combination of the foregoing.
[0090] Program code for performing the operations of the present disclosure can be written in any combination of one or more programming languages. The programming language includes object oriented programming languages such as Java, C++, etc., as well as conventional procedural programming languages such as the "C" language or similar programming languages.
The program code can be executed entirely on a user computing device, partially on a user device as a stand-alone software package, partially on a user computing device and partially on a remote computing device, or entirely on a remote computing device or server. In the case of a remote computing device, the remote computing device can be connected to a user computing device via any kind of network, including a local area network (LAN) or a wide area network (WAN), or can be connected to an external computing device (for example, via the Internet through an Internet service provider).
[0091] In addition, the above-described drawings are merely illustrative of the processes included in the method according to the exemplary embodiments of the present invention, and are not intended to be limiting. It is easy to understand that the processing shown in the above figures does not indicate or limit the chronological order of these processes.
Further, it is also easy to understand that these processes may be performed synchronously or asynchronously, for example, in a plurality of modules.
[0092] A person skilled in the art, after considering the specification and practicing the invention disclosed herein, can readily obtain other embodiments of the present disclosure. This application is intended to cover any variations, uses, or adaptations of the disclosure. These variations, uses, or adaptations are subject to the general principles of the present disclosure and include common general knowledge or conventional technical means in the art that are not disclosed in the present disclosure. The description and examples are to be regarded as illustrative only. The true scope and spirit of the disclosure is pointed out by the claims.

Claims (10)

Claims:
1. A user credit evaluation method, characterized in that the method comprises:
obtaining a plurality of feature information of a target user, wherein the plurality of feature information comprises a first type of parameter and a second type of parameter;
preprocessing the first type of parameter and the second type of parameter;
converting the preprocessed first type of parameter to generate a target parameter;
inputting the preprocessed second type of parameter and the target parameter into a machine learning model to obtain a credit evaluation result of the target user;
wherein the feature information having an IV value lower than a preset threshold is the first type of parameter, and the feature information having an IV value higher than the preset threshold is the second type of parameter.
2. The user credit evaluation method according to claim 1, characterized in that the preprocessing of the first type of parameter and the second type of parameter comprises:
separately binning the first type of parameter and the second type of parameter according to a weight of evidence, so as to obtain the first type of parameter and the second type of parameter after the binning.
3. The user credit evaluation method according to claim 2, characterized in that the converting the preprocessed first type of parameter to generate a target parameter comprises:
using a linear discriminant algorithm to carry out feature combination on the first type of parameters associated with each topic, so as to generate the target parameter.
4. The user credit evaluation method according to claim 3, characterized in that the method further comprises:
performing second binning with the target parameter, and placing the target parameter after the second binning into a candidate variable pool;
placing the second type of parameter after the binning in the candidate variable pool.
5. The user credit evaluation method according to claim 4, characterized in that the inputting the preprocessed second type of parameter and the target parameter into a machine learning model comprises:
excluding a multicollinearity between the second type of parameter after the binning and the target parameter after the second binning in the candidate variable pool, so as to obtain a remaining parameter;
inputting the remaining parameter into the machine learning model.
6. The user credit evaluation method according to claim 5, characterized in that the excluding a multicollinearity between the second type of parameter after the binning and the target parameter after the second binning in the candidate variable pool, so as to obtain a remaining parameter comprises:
excluding the second type of parameter after the binning having the weight of evidence lower than the preset value and the target parameter after the second binning having the weight of evidence lower than the preset value from the candidate variable pool, so as to obtain the remaining parameter.
7. The user credit evaluation method according to claim 6, characterized in that the excluding the second type of parameter after the binning having the weight of evidence lower than the preset value and the target parameter after the second binning having the weight of evidence lower than the preset value from the candidate variable pool comprises:
excluding the second type of parameter after the binning having the weight of evidence lower than the preset value and the target parameter after the second binning having the weight of evidence lower than the preset value according to an order of the weight of evidence from low to high;
recalculating the weight of evidence of the second type of parameter after the binning and the weight of evidence of the target parameter after the second binning after the excluding;
excluding the second type of parameter after the binning having the recalculated weight of evidence lower than the preset value and the target parameter after the second binning having the recalculated weight of evidence lower than the preset value according to the order of the weight of evidence from low to high, until every second type of parameter having the weight of evidence lower than the preset value and every target parameter having the weight of evidence lower than the preset value have been excluded.
8. A user credit evaluation device, characterized in that the device comprises:
a feature obtaining module, which is used for obtaining a plurality of feature information of a target user, wherein the plurality of feature information comprises a first type of parameter and a second type of parameter;
a parameter preprocessing module, which is used for preprocessing the first type of parameter and the second type of parameter;
a target parameter generating module, which is used for converting the preprocessed first type of parameter to generate a target parameter;
an evaluation result determining module, which is used for inputting the preprocessed second type of parameter and the target parameter into a machine learning model to obtain a credit evaluation result of the target user;
wherein the feature information having an IV value lower than a preset threshold is the first type of parameter, and the feature information having an IV value higher than the preset threshold is the second type of parameter.
9. An electronic device, characterized in that the electronic device comprises:
a processor; and a memory for storing instructions executable by the processor;
wherein the processor is configured to execute the instructions so as to implement the user credit evaluation method according to any one of claims 1 to 7.
10. A computer readable medium having a computer program stored thereon, characterized in that the computer program, when executed by a processor, implements the user credit evaluation method according to any one of claims 1 to 7.
CA3059937A 2018-10-26 2019-10-24 User credit evaluation method and device, electronic device, storage medium Pending CA3059937A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811260889.1A CN109447461B (en) 2018-10-26 2018-10-26 User credit evaluation method and device, electronic equipment and storage medium
CN201811260889.1 2018-10-26

Publications (1)

Publication Number Publication Date
CA3059937A1 (en)

Family

ID=65548550

Family Applications (1)

Application Number Title Priority Date Filing Date
CA3059937A Pending CA3059937A1 (en) 2018-10-26 2019-10-24 User credit evaluation method and device, electronic device, storage medium

Country Status (2)

Country Link
CN (1) CN109447461B (en)
CA (1) CA3059937A1 (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110727510A (en) * 2019-09-25 2020-01-24 浙江大搜车软件技术有限公司 User data processing method and device, computer equipment and storage medium
CN110782349A (en) * 2019-10-25 2020-02-11 支付宝(杭州)信息技术有限公司 Model training method and system
CN110866696B (en) * 2019-11-15 2023-05-26 成都数联铭品科技有限公司 Training method and device for risk assessment model of shop drop
CN111709826A (en) * 2020-06-11 2020-09-25 中国建设银行股份有限公司 Target information determination method and device
CN111950889A (en) * 2020-08-10 2020-11-17 中国平安人寿保险股份有限公司 Client risk assessment method and device, readable storage medium and terminal equipment
CN112734433A (en) * 2020-12-10 2021-04-30 深圳市欢太科技有限公司 Abnormal user detection method and device, electronic equipment and storage medium
CN112529477A (en) * 2020-12-29 2021-03-19 平安普惠企业管理有限公司 Credit evaluation variable screening method, device, computer equipment and storage medium
CN112734568B (en) * 2021-01-29 2024-01-12 深圳前海微众银行股份有限公司 Credit scoring card model construction method, device, equipment and readable storage medium
CN113570066B (en) * 2021-07-23 2024-03-29 中国恩菲工程技术有限公司 Data processing method, system, electronic device and storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010266983A (en) * 2009-05-13 2010-11-25 Sony Corp Information processing apparatus and method, learning device and method, program, and information processing system
US20150178754A1 (en) * 2013-12-19 2015-06-25 Microsoft Corporation Incentive system for interactive content consumption
CN104866484B (en) * 2014-02-21 2018-12-07 阿里巴巴集团控股有限公司 A kind of data processing method and device
US11521106B2 (en) * 2014-10-24 2022-12-06 National Ict Australia Limited Learning with transformed data
CN108230067A (en) * 2016-12-14 2018-06-29 阿里巴巴集团控股有限公司 The appraisal procedure and device of user credit
CN108399255A (en) * 2018-03-06 2018-08-14 中国银行股份有限公司 A kind of input data processing method and device of Classification Data Mining model

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111815457A (en) * 2020-07-01 2020-10-23 北京金堤征信服务有限公司 Target object evaluation method and device
CN115880053A (en) * 2022-12-05 2023-03-31 中电金信软件有限公司 Training method and device for grading card model
CN115880053B (en) * 2022-12-05 2024-05-31 中电金信软件有限公司 Training method and device for scoring card model

Also Published As

Publication number Publication date
CN109447461B (en) 2022-05-03
CN109447461A (en) 2019-03-08


Legal Events

Date Code Title Description
EEER Examination request

Effective date: 20220916
