CN112581254A - Method and device for measuring financial risk of small and micro enterprises - Google Patents

Info

Publication number
CN112581254A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011475707.XA
Other languages
Chinese (zh)
Inventor
何泾沙
夏新宇
朱娜斐
张宇晗
宜裕紫
陈宝存
薛瑞昕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Application filed by Beijing University of Technology
Priority to CN202011475707.XA
Publication of CN112581254A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The invention discloses a method, a device, electronic equipment and a storage medium for measuring the financial risk of small and micro enterprises, belonging to the technical field of financial industry risk. The method comprises: screening the small and micro enterprises by using their business data, credit investigation data and peer-shield data, and then their in-line settlement data, blacklist data and historical data in a bank, as rejection strategies; performing data preprocessing on the invoice data of the small and micro enterprises that pass the screening to obtain a data feature set; calculating the data feature set with an LR model and an XGBoost model respectively, and comparing the average value of the two resulting probability mapping scores with a threshold value, wherein the enterprise is judged to be a good customer if the average value is greater than the threshold value and a bad customer otherwise; and classifying the bad customers into bad customers with different risk labels by using an LR softmax multi-classification model. The method thus adopts two-level classification prediction: the first-level classification divides small and micro enterprises into good customers and bad customers, and the second-level classification divides the bad customers into different risk categories, so that a more accurate prediction method is provided for financial industries such as banking.

Description

Method and device for measuring financial risk of small and micro enterprises
Technical Field
The invention belongs to the technical field of financial industry risks, and particularly relates to a method and a device for measuring financial risks of small and micro enterprises, electronic equipment and a storage medium.
Background
The traditional financial risk measurement method focuses on account data such as the operating funds, total assets and total liabilities of an enterprise. Such account data is usually confidential and private to the company and is kept under tight security, so it is difficult to obtain when risk is evaluated and the risk is therefore hard to assess; moreover, the objects targeted are usually large companies. In addition, existing financial risk measurement methods cannot divide risks into types.
Disclosure of Invention
In view of the above problems, the present invention provides a method, an apparatus, an electronic device and a storage medium for measuring the financial risk of small and micro enterprises.
A method for measuring financial risk of a small micro-enterprise comprises the following steps:
screening the small micro-enterprises by taking the business data, credit investigation data and peer-shield data of the small micro-enterprises as rejection strategies;
screening the small micro-enterprise again by taking in-line settlement data, blacklist data and historical data of the small micro-enterprise in a bank as rejection strategies;
performing data preprocessing on the invoice data of the small micro-enterprise obtained by screening again, and screening to obtain a data characteristic set;
respectively calculating the data feature set by using an LR model and an XGBoost model, and respectively obtaining an output probability mapping score of the LR model and an output probability mapping score of the XGBoost model;
comparing the average value of the output probability mapping score of the LR model and the output probability mapping score of the XGBoost model with a threshold value, and if the average value is greater than the threshold value, judging the small and micro enterprise as a good client; if the average value is smaller than the threshold value, judging the small and micro enterprise as a bad client;
and classifying the bad clients by using an LR softmax multi-classification model, and classifying the bad clients into bad clients with different risk labels.
Preferably, the steps of screening the small and micro enterprises by using their business data, credit investigation data and peer-shield data as rejection strategies, and then screening the resulting small and micro enterprises again by using their in-line settlement data, blacklist data and historical data in a bank as rejection strategies, comprise:
setting a threshold value, and continuing if the business data, credit investigation data and peer-shield data of the small and micro enterprise are all larger than the threshold value; otherwise, ending;
setting a threshold value, and continuing if the in-line settlement data, the blacklist data and the historical data of the small and micro enterprise in the bank are all larger than the threshold value; otherwise, ending.
Preferably, the step of performing data preprocessing on the invoice data of the small micro-enterprise obtained by re-screening, and the step of obtaining a data feature set by screening includes:
selecting different missing value methods for processing variable characteristics in the invoice data;
based on whether the distribution of the fixed data in the invoice data is normal, identifying, by means of a box plot and MAD, the fixed data lying outside a specific distribution area or range, and replacing it with an average value;
adopting forward selection and backward elimination to screen out the data with the best attributes from the invoice data with multiple attributes;
based on the correlation between the univariate data in the invoice data and the predicted variable data, deleting the univariate data with low predictive capability by a method combining the Pearson correlation coefficient, the chi-square test and a tree model;
and calculating the characteristic data which belong to the same class and have similarity in the invoice data according to respective weight to obtain new characteristic data.
Preferably, the step of calculating the set of data features using an LR model comprises:
dividing the small and micro enterprises into positive samples and negative samples, letting $E_\theta(X) = 0$ be the boundary, with the data feature set $X = \{1, x_1, x_2, x_3, \ldots, x_n\}$;
letting $E_\theta(X) = X^T\theta$,
wherein $T$ denotes the transpose;
converting the formula $E_\theta(X)$ into the function $h_\theta(X) = \mathrm{sigmoid}(E_\theta(X))$,
so that $h_\theta(X) = 0.5$ is the boundary;
and carrying out iterative computation on the function to solve for $\theta$, thereby obtaining the output probability mapping score.
Preferably, the LR softmax multi-classification model establishing step includes:
carrying out unsupervised clustering calculation on the data feature sets of the bad clients, and aggregating the bad clients into a plurality of category sets according to risk types;
calculating the category sets by adopting a risk index model respectively to obtain judgment results of the category sets, and adding the judgment results into the category sets;
and establishing the LR softmax multi-classification model according to the classification set of the judgment result.
Preferably, the step of establishing the risk indicator model includes:
the feature set of the category set is $U = \{u_1, u_2, u_3, \ldots, u_n\}$ and the set of risk categories is $V = \{v_1, v_2, v_3, \ldots, v_m\}$;
carrying out fuzzy judgment on each feature in the feature set $U$ according to the risk categories $V$ to obtain an evaluation matrix:
$$R = (r_{ij})_{n \times m} = \begin{pmatrix} r_{11} & \cdots & r_{1m} \\ \vdots & \ddots & \vdots \\ r_{n1} & \cdots & r_{nm} \end{pmatrix}$$
wherein $r_{ij}$ represents the degree of membership of $u_i$ with respect to $v_j$;
determining the importance weight of each feature in the feature set $U$ according to the analytic hierarchy process, wherein $A = \{a_1, a_2, a_3, \ldots, a_n\}$ and
Figure BDA0002835278970000032
multiplying the weight $A$ by the matrix $R$ to obtain the judgment result $B = A \cdot R = \{b_1, b_2, b_3, \ldots, b_m\}$.
Preferably, the step of establishing the LR softmax multi-classification model according to the class set obtained by the determination result includes:
for m judgment results, performing regression calculation on one judgment result and the rest m-1 judgment results;
and establishing an LR softmax multi-classification model for the judgment result probability by adopting a linear predictor and a normalization factor.
The device for realizing the method provided by the embodiment of the invention comprises:
a rejection strategy module, used for screening the small and micro enterprises by taking their business data, credit investigation data and peer-shield data as rejection strategies, and then screening the resulting small and micro enterprises again by taking their in-line settlement data, blacklist data and historical data in a bank as rejection strategies;
a first-level classification module, used for performing data preprocessing on the invoice data of the small and micro enterprises obtained by the second screening to obtain a data feature set, calculating the data feature set by using an LR model and an XGBoost model respectively to obtain an output probability mapping score of the LR model and an output probability mapping score of the XGBoost model, and comparing the average value of the two scores with a threshold value; if the average value is greater than the threshold value, the small and micro enterprise is judged to be a good client; if the average value is smaller than the threshold value, the small and micro enterprise is judged to be a bad client;
and the secondary classification module is used for classifying the bad clients by using an LR softmax multi-classification model and classifying the bad clients into bad clients with different risk labels.
An embodiment of the present invention provides an electronic device, which includes at least one processing unit and at least one storage unit, where the storage unit stores a computer program, and when the program is executed by the processing unit, the processing unit is caused to execute the method described above.
A storage medium storing a computer program executable by an electronic device according to an embodiment of the present invention is configured to, when the program runs on the electronic device, cause the electronic device to execute the method described above.
Compared with the prior art, the invention has the beneficial effects that:
the method carries out two-level classification prediction for small and micro enterprises: the first-level classification divides the small and micro enterprises into good customers and bad customers, and the second-level classification divides the bad customers into different risk categories, so that a more accurate prediction method is provided for financial industries such as banking.
Drawings
FIG. 1 is a schematic structural diagram of a financial risk measuring device for small micro-enterprises in the present invention;
FIG. 2 is a flow chart of the method for measuring financial risk of small micro-enterprise according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
In the description of the present invention, it should be noted that the terms "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", etc., indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings, and are only for convenience of description and simplicity of description, but do not indicate or imply that the device or element being referred to must have a particular orientation, be constructed and operated in a particular orientation, and thus, should not be construed as limiting the present invention. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
In the description of the present invention, it should also be noted that, unless otherwise explicitly specified or limited, the terms "mounted," "connected," and "connected" are to be construed broadly, e.g., as meaning either a fixed connection, a removable connection, or an integral connection; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood in specific cases to those skilled in the art.
The invention provides a method and a device for measuring financial risk of a small and micro enterprise, electronic equipment and a storage medium.
Referring to fig. 1, a schematic structural diagram of a small micro-enterprise financial risk measurement apparatus according to an embodiment of the present application is shown, which includes:
the rejection strategy module 1 is used for screening the small and micro enterprises by taking their business data, credit investigation data and peer-shield data as rejection strategies, and then screening the resulting small and micro enterprises again by taking their in-line settlement data, blacklist data and historical data in a bank as rejection strategies;
specifically, setting a threshold, and continuing if the business data, credit investigation data and peer-shield data of the small and micro enterprise are all larger than the threshold; otherwise, ending;
setting a threshold, and continuing if the in-line settlement data, the blacklist data and the historical data of the small and micro enterprise in the bank are all larger than the threshold; otherwise, ending.
The first-level classification module 2 is used for performing data preprocessing on the invoice data of the small and micro enterprises obtained by the second screening to obtain a data feature set, calculating the data feature set by using an LR model and an XGBoost model respectively to obtain an output probability mapping score of the LR model and an output probability mapping score of the XGBoost model, and comparing the average value of the two scores with a threshold value; if the average value is greater than the threshold value, the small and micro enterprise is judged to be a good customer; if the average value is smaller than the threshold value, the small and micro enterprise is judged to be a bad customer;
and the secondary classification module 3 is used for classifying the bad clients by using an LR softmax multi-classification model and classifying the bad clients into bad clients with different risk labels.
Specifically, unsupervised clustering calculation is carried out on a data feature set of the bad clients, and the bad clients are gathered into a plurality of category sets according to risk types;
calculating the category sets by adopting a risk index model respectively to obtain judgment results of the category sets, and adding the judgment results into the category sets;
the feature set of the category set is $U = \{u_1, u_2, u_3, \ldots, u_n\}$ and the set of risk categories is $V = \{v_1, v_2, v_3, \ldots, v_m\}$.
Further, the risk indicator model calculation method comprises the following steps:
carrying out fuzzy judgment on each feature in the feature set $U$ according to the risk categories $V$ to obtain an evaluation matrix:
$$R = (r_{ij})_{n \times m} = \begin{pmatrix} r_{11} & \cdots & r_{1m} \\ \vdots & \ddots & \vdots \\ r_{n1} & \cdots & r_{nm} \end{pmatrix}$$
wherein $r_{ij}$ represents the degree of membership of $u_i$ with respect to $v_j$;
determining the importance weight of each feature in the feature set $U$ according to the analytic hierarchy process, wherein $A = \{a_1, a_2, a_3, \ldots, a_n\}$ and
Figure BDA0002835278970000062
multiplying the weight $A$ by the matrix $R$ to obtain the judgment result $B = A \cdot R = \{b_1, b_2, b_3, \ldots, b_m\}$.
For m judgment results, performing regression calculation on one judgment result and the rest m-1 judgment results;
and establishing an LR softmax multi-classification model for the judgment result probability by adopting a linear predictor and a normalization factor.
As shown in fig. 2, the present embodiment further provides a method for measuring financial risk of a small micro-enterprise, including:
screening the small and micro enterprises by taking their business data, credit investigation data and peer-shield data as rejection strategies;
specifically, setting a threshold, and continuing if the business data, credit investigation data and peer-shield data of the small and micro enterprise are all larger than the threshold; otherwise, ending.
Screening the small and micro enterprises again by taking their in-line settlement data, blacklist data and historical data in the bank as rejection strategies;
specifically, setting a threshold, and continuing if the in-line settlement data, the blacklist data and the historical data of the small and micro enterprise in the bank are all larger than the threshold; otherwise, ending.
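The two rejection stages above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes each data source has already been reduced to a single numeric score, and the names `Enterprise`, `passes`, `two_stage_screen` and the dictionary keys are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Enterprise:
    # Hypothetical container: one numeric score per data source.
    name: str
    external_scores: dict = field(default_factory=dict)  # business, credit, peer-shield
    internal_scores: dict = field(default_factory=dict)  # settlement, blacklist, history


def passes(scores, thresholds):
    """Return True only if every score is strictly larger than its threshold."""
    return all(scores.get(key, float("-inf")) > limit
               for key, limit in thresholds.items())


def two_stage_screen(enterprises, external_thresholds, internal_thresholds):
    """Apply the first rejection strategy, then the second one to the survivors."""
    survivors = [e for e in enterprises
                 if passes(e.external_scores, external_thresholds)]
    return [e for e in survivors
            if passes(e.internal_scores, internal_thresholds)]
```

An enterprise that fails any single check in either stage is dropped, which mirrors the "continue if all are larger than the threshold, otherwise end" rule.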
Performing data preprocessing on the invoice data of the small and micro enterprises obtained by the second screening to obtain a data feature set, which comprises the following steps:
selecting different missing value methods for processing variable characteristics in the invoice data;
Specifically, missing values are filled in a customized manner according to the missing-value ratio, the importance of the variable feature, and whether the variable feature is continuous. For example, variables such as the share of transaction amount of the top three customers in the last 12 months, the coefficient of variation over the last 24 months, and the number of newly added sales commodities in the last 12 months have a high missing-value ratio; if their importance is low, they are deleted. Variables such as the ratio of the value-added tax special invoice amount in the last 12 months to the total invoiced amount (excluding voided invoices), and the voided-invoice amount in the last 6 months / (the effective amount in the last 6 months + the voided-invoice amount), have a low missing-value ratio; if their importance is low, they are filled with the median. Variables such as the comparison of the effective invoiced amount in the last 6 months and the ratio of voided invoices to total invoices in the last 12 months have too many missing values; if their importance is high, they are filled with values predicted by a random forest. Variables such as the red-invoice amount in the last 12 months / the blue-invoice amount in the last 12 months are discrete with few distinct values, and are converted into dummy variables for filling.
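The decision rule described above might be sketched as a small dispatcher. All cut-off values (`high_missing`, `high_importance`, `few_values`) and the function names are assumptions for illustration; the text gives no numeric thresholds. The case of low missing ratio with high importance is not covered by the text, so the sketch reports it as unspecified rather than guessing.

```python
from statistics import median


def choose_missing_strategy(missing_ratio, importance, is_discrete, n_distinct,
                            high_missing=0.5, high_importance=0.5, few_values=5):
    """Pick a missing-value treatment; thresholds are illustrative assumptions."""
    if is_discrete and n_distinct <= few_values:
        return "dummy"            # discrete with few distinct values
    if missing_ratio >= high_missing:
        # many missing values: predict if important, otherwise delete the variable
        return "model_fill" if importance >= high_importance else "drop"
    if importance < high_importance:
        return "median"           # few missing values, low importance
    return "unspecified"          # case not described in the text


def fill_median(values):
    """Replace None entries with the median of the observed entries."""
    observed = [v for v in values if v is not None]
    mid = median(observed)
    return [mid if v is None else v for v in values]
```

The `"model_fill"` branch stands for the random-forest prediction mentioned in the text; training that model is outside the scope of this sketch.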
Based on whether the distribution of the fixed data in the invoice data is normal, the fixed data lying outside a specific distribution area or range is identified by means of a box plot and MAD and is replaced with an average value;
the fixed data is, for example, the ratio of the voided-invoice amount in the last 12 months to the invoiced amount, the turnover share of the largest commodity in the last 12 months, and other such data.
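A minimal sketch of the outlier step follows. It assumes that the box-plot rule means the usual 1.5 × IQR fences, that MAD denotes the median absolute deviation, and that the replacement "average value" is the mean of the in-range values; none of these details is stated in the text, and the multipliers `k` are illustrative.

```python
from statistics import mean, median, quantiles


def iqr_bounds(values, k=1.5):
    """Box-plot fences: [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread


def mad_bounds(values, k=3.0):
    """Fences at median +/- k * median absolute deviation."""
    centre = median(values)
    mad = median([abs(v - centre) for v in values])
    return centre - k * mad, centre + k * mad


def replace_outliers(values, bounds):
    """Replace values outside the bounds with the mean of the in-range values."""
    low, high = bounds
    inliers = [v for v in values if low <= v <= high]
    average = mean(inliers)
    return [v if low <= v <= high else average for v in values]
```

One plausible reading of "based on the distribution normality" is to use one rule for roughly normal variables and the other for skewed ones; which rule goes with which case is not specified in the text.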
Adopting forward selection and backward elimination to screen out the data with the best attributes from the invoice data with multiple attributes;
the data with multiple attributes are, for example: the effective invoiced amount of month L1 / the average effective invoiced amount of months L2 to L13, the effective invoiced amount of month L2 / the average effective invoiced amount of months L3 to L14, the effective invoiced amount of month L3 / the average effective invoiced amount of months L4 to L15, and the like.
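As an illustration of the forward-selection half of this step, the sketch below greedily adds whichever candidate feature most improves a caller-supplied score. The scoring function, the `min_gain` stopping rule and the function name are assumptions; the text does not say which criterion is used. Backward elimination would run the same loop in reverse, starting from all features and removing one at a time.

```python
def forward_select(candidates, score, min_gain=1e-6):
    """Greedy forward selection.

    candidates: list of feature names.
    score: callable taking a list of feature names and returning a number
           (higher is better), e.g. a validation metric. Hypothetical hook.
    """
    selected = []
    best = score(selected)
    remaining = list(candidates)
    while remaining:
        # Evaluate every remaining feature when added to the current set.
        trials = [(score(selected + [name]), name) for name in remaining]
        top_score, top_name = max(trials, key=lambda pair: pair[0])
        if top_score - best <= min_gain:
            break                      # no candidate improves the score enough
        selected.append(top_name)
        remaining.remove(top_name)
        best = top_score
    return selected
```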
Based on the correlation between the univariate data in the invoice data and the predicted variable data, the univariate data with low predictive capability is deleted by a method combining the Pearson correlation coefficient, the chi-square test and a tree model;
the univariate data is, for example, the growth rate of the effective invoiced amount in the last 12 months.
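Only the Pearson part of this combined filter is sketched below; the chi-square test and the tree-model importance are omitted. The cut-off `min_abs_corr` and the function names are illustrative assumptions.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)


def drop_weak_features(features, target, min_abs_corr=0.1):
    """Keep only features whose |correlation| with the target reaches the cut-off."""
    return {name: column for name, column in features.items()
            if abs(pearson(column, target)) >= min_abs_corr}
```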
Calculating the feature data in the invoice data that belong to the same category and are similar to one another according to their respective weights to obtain new feature data.
The feature data that belong to the same category and are similar are, for example, the valid invoice count of months L1 to L3 / the average of months L4 to L15, the valid invoice count of month L1 / the average of months L2 to L13, the valid invoice count of month L2 / the average of months L3 to L14, and the valid invoice count of month L3 / the average of months L4 to L15.
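One way to read "calculating according to respective weight" is a normalised weighted sum of the similar features. The sketch below assumes exactly that; the combination rule, the weights and the name `combine_features` are hypothetical.

```python
def combine_features(rows, weights):
    """Merge similar features into one new feature per row.

    rows: list of dicts mapping feature name to value.
    weights: dict mapping feature name to a non-negative weight.
    Returns the weighted average of the named features for each row.
    """
    total = sum(weights.values())
    return [sum(row[name] * w for name, w in weights.items()) / total
            for row in rows]
```

With equal weights this reduces to a plain average of the similar ratios.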
Respectively calculating the data feature set by using an LR model and an XGBoost model, and respectively obtaining an output probability mapping score of the LR model and an output probability mapping score of the XGBoost model;
specifically, the steps of calculating the data feature set by using the LR model are as follows:
dividing the small and micro enterprises into positive samples and negative samples, letting $E_\theta(X) = 0$ be the boundary, with the data feature set $X = \{1, x_1, x_2, x_3, \ldots, x_n\}$;
letting $E_\theta(X) = X^T\theta$,
wherein $T$ denotes the transpose;
converting the formula $E_\theta(X)$ into the function $h_\theta(X) = \mathrm{sigmoid}(E_\theta(X))$,
so that $h_\theta(X) = 0.5$ is the boundary;
and carrying out iterative computation on the function to solve for $\theta$, thereby obtaining the output probability mapping score.
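The LR steps above can be sketched with plain batch gradient descent. The optimiser, learning rate and epoch count are assumptions; the text only says the function is computed iteratively. The leading 1 in the feature vector plays the role of the intercept, as in $X = \{1, x_1, \ldots, x_n\}$.

```python
from math import exp


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    e = exp(z)
    return e / (1.0 + e)


def linear(x, theta):
    """E_theta(X) = X^T theta, with x already extended by the leading 1."""
    return sum(xi * ti for xi, ti in zip(x, theta))


def fit_lr(rows, labels, learning_rate=0.1, epochs=2000):
    """Estimate theta by batch gradient descent on the log-loss."""
    data = [[1.0] + list(row) for row in rows]
    theta = [0.0] * len(data[0])
    for _ in range(epochs):
        grad = [0.0] * len(theta)
        for x, y in zip(data, labels):
            error = sigmoid(linear(x, theta)) - y
            for j, xj in enumerate(x):
                grad[j] += error * xj
        theta = [t - learning_rate * g / len(data) for t, g in zip(theta, grad)]
    return theta


def lr_score(row, theta):
    """Output probability h_theta(X) for one enterprise."""
    return sigmoid(linear([1.0] + list(row), theta))
```

A score of exactly 0.5 corresponds to $E_\theta(X) = 0$, the decision boundary named in the text.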
Calculating the data feature set with the XGBoost model is conventional and is not described in detail here.
Comparing the average value of the output probability mapping scores of the LR model and the XGBoost model with a threshold value: if the average value is greater than the threshold value, the small and micro enterprise is judged to be a good customer; if the average value is smaller than the threshold value, the small and micro enterprise is judged to be a bad customer. This is the first-level classification, which roughly divides the small and micro enterprises into good customers and bad customers.
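The first-level decision reduces to averaging two scores and comparing with a threshold, as in the sketch below. The default threshold of 0.5 is an assumption, and so is the handling of an average exactly equal to the threshold (treated here as bad), since the text does not address that case.

```python
def first_level_label(lr_score, xgb_score, threshold=0.5):
    """Average the two model scores and compare with the threshold."""
    average = (lr_score + xgb_score) / 2
    return "good" if average > threshold else "bad"
```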
And classifying the bad clients by using an LR softmax multi-classification model, and classifying the bad clients into bad clients with different risk labels.
Specifically, unsupervised clustering calculation is carried out on a data feature set of the bad clients, and the bad clients are gathered into a plurality of category sets according to risk types;
calculating the category sets by adopting a risk index model respectively to obtain judgment results of the category sets, and adding the judgment results into the category sets;
the feature set of the category set is $U = \{u_1, u_2, u_3, \ldots, u_n\}$ and the set of risk categories is $V = \{v_1, v_2, v_3, \ldots, v_m\}$.
Further, the risk indicator model calculation method comprises the following steps:
carrying out fuzzy judgment on each feature in the feature set $U$ according to the risk categories $V$ to obtain an evaluation matrix:
$$R = (r_{ij})_{n \times m} = \begin{pmatrix} r_{11} & \cdots & r_{1m} \\ \vdots & \ddots & \vdots \\ r_{n1} & \cdots & r_{nm} \end{pmatrix}$$
wherein $r_{ij}$ represents the degree of membership of $u_i$ with respect to $v_j$;
determining the importance weight of each feature in the feature set $U$ according to the analytic hierarchy process, wherein $A = \{a_1, a_2, a_3, \ldots, a_n\}$ and
Figure BDA0002835278970000091
multiplying the weight $A$ by the matrix $R$ to obtain the judgment result $B = A \cdot R = \{b_1, b_2, b_3, \ldots, b_m\}$.
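The risk indicator calculation amounts to a vector-matrix product. The sketch below takes the weights and the membership matrix as given; how the memberships are judged and how the analytic hierarchy process produces the weights is outside its scope. The function name is hypothetical.

```python
def fuzzy_evaluate(weights, membership):
    """Compute B = A . R.

    weights: list of n feature weights (a_1 .. a_n).
    membership: n x m matrix, membership[i][j] = degree of feature i in risk category j.
    Returns the list (b_1 .. b_m), one aggregated degree per risk category.
    """
    n_features = len(membership)
    n_categories = len(membership[0])
    if len(weights) != n_features:
        raise ValueError("one weight is required per feature")
    return [sum(weights[i] * membership[i][j] for i in range(n_features))
            for j in range(n_categories)]
```

The category with the largest $b_j$ would be the natural label for a category set, although the text does not state the selection rule explicitly.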
For m judgment results, performing regression calculation on one judgment result and the rest m-1 judgment results;
and establishing an LR softmax multi-classification model for the judgment result probability by adopting a linear predictor and a normalization factor.
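A softmax classifier built from a linear predictor per class and a shared normalization factor might look like the sketch below. The parameter layout (one vector per risk label, intercept first) and the function name are assumptions, and training the parameters is not shown.

```python
from math import exp


def softmax_probs(row, thetas):
    """Class probabilities from per-class linear predictors.

    row: feature values x_1 .. x_n for one bad client.
    thetas: one parameter vector per risk label, each of length n + 1 (intercept first).
    """
    x = [1.0] + list(row)
    scores = [sum(t * xi for t, xi in zip(theta, x)) for theta in thetas]
    top = max(scores)                     # shift for numerical stability
    exps = [exp(s - top) for s in scores]
    normaliser = sum(exps)                # the normalization factor
    return [e / normaliser for e in exps]
```

The predicted risk label is the index of the largest probability, for example `max(range(len(p)), key=p.__getitem__)` for a probability list `p`.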
An embodiment of the present invention provides an electronic device, which includes at least one processing unit and at least one storage unit, where the storage unit stores a computer program, and when the program is executed by the processing unit, the processing unit is caused to execute the method described above.
An embodiment of the present invention provides a storage medium, which stores a computer program executable by an electronic device, and when the program runs on the electronic device, the electronic device is caused to execute the method described above.
The above is only a preferred embodiment of the present invention, and is not intended to limit the present invention, and various modifications and changes will occur to those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A method for measuring financial risk of a small micro-enterprise is characterized by comprising the following steps:
screening the small micro-enterprises by taking the business data, credit investigation data and peer-shield data of the small micro-enterprises as rejection strategies;
screening the small micro-enterprise again by taking in-line settlement data, blacklist data and historical data of the small micro-enterprise in a bank as rejection strategies;
performing data preprocessing on the invoice data of the small micro-enterprise obtained by screening again, and screening to obtain a data characteristic set;
respectively calculating the data feature set by using an LR model and an XGBoost model, and respectively obtaining an output probability mapping score of the LR model and an output probability mapping score of the XGBoost model;
comparing the average value of the output probability mapping score of the LR model and the output probability mapping score of the XGBoost model with a threshold value, and if the average value is greater than the threshold value, judging the small and micro enterprise as a good client; if the average value is smaller than the threshold value, judging the small and micro enterprise as a bad client;
and classifying the bad clients by using an LR softmax multi-classification model, and classifying the bad clients into bad clients with different risk labels.
2. The method for measuring financial risk of small and micro enterprises as claimed in claim 1, wherein the steps of screening the small and micro enterprises by using their business data, credit investigation data and peer-shield data as rejection strategies, and then screening the resulting small and micro enterprises again by using their in-line settlement data, blacklist data and historical data in the bank as rejection strategies, comprise:
setting a threshold value, and continuing if the business data, credit investigation data and peer-shield data of the small and micro enterprise are all larger than the threshold value; otherwise, ending;
setting a threshold value, and continuing if the in-line settlement data, the blacklist data and the historical data of the small and micro enterprise in the bank are all larger than the threshold value; otherwise, ending.
3. The method for measuring financial risk of small micro-enterprise as claimed in claim 1, wherein the step of performing data preprocessing on the invoice data of the small micro-enterprise obtained by re-screening, and the step of screening the data feature set comprises:
selecting different missing value methods for processing variable characteristics in the invoice data;
based on whether the distribution of the fixed data in the invoice data is normal, identifying, by means of a box plot and MAD, the fixed data lying outside a specific distribution area or range, and replacing it with an average value;
adopting forward selection and backward elimination to screen out the data with the best attributes from the invoice data with multiple attributes;
based on the correlation between the univariate data in the invoice data and the predicted variable data, deleting the univariate data with low predictive capability by a method combining the Pearson correlation coefficient, the chi-square test and a tree model;
and calculating the characteristic data which belong to the same class and have similarity in the invoice data according to respective weight to obtain new characteristic data.
4. The method for measuring the financial risk of small and micro enterprises according to claim 1, wherein the step of computing the data feature set using an LR model comprises:
dividing the small and micro enterprises into positive samples and negative samples, taking $E_\theta(X) = 0$ as the boundary, with the data feature set $X = \{1, x_1, x_2, x_3, \ldots, x_n\}$;
letting $E_\theta(X) = X^T \theta$, wherein $T$ denotes the transpose;
converting $E_\theta(X)$ into the function $h_\theta(X) = \mathrm{sigmoid}(E_\theta(X))$, so that the boundary becomes $h_\theta(X) = 0.5$;
and iterating on the function to obtain the output probability mapping score $\theta$.
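The LR computation of claim 4 can be sketched in plain Python. The claim only says that the function is computed iteratively, so batch gradient descent on the log-loss is an assumption, as are the learning rate, the iteration count, the function names and the toy data.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function; sigmoid(0) is the 0.5 boundary."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def linear_score(x, theta):
    """E_theta(X) = X^T theta for one augmented feature vector X."""
    return sum(xi * ti for xi, ti in zip(x, theta))


def fit_logistic(rows, labels, learning_rate=0.1, epochs=2000):
    """Estimate theta by batch gradient descent (an assumed optimiser)."""
    xs = [[1.0, *row] for row in rows]  # prepend the constant 1 as in X
    theta = [0.0] * len(xs[0])
    for _ in range(epochs):
        grad = [0.0] * len(theta)
        for x, y in zip(xs, labels):
            error = sigmoid(linear_score(x, theta)) - y
            for j, xj in enumerate(x):
                grad[j] += error * xj
        theta = [t - learning_rate * g / len(xs) for t, g in zip(theta, grad)]
    return theta


def predict_proba(row, theta):
    """h_theta(X): probability that the sample is positive."""
    return sigmoid(linear_score([1.0, *row], theta))


# Toy one-feature data: small values are negative, large values positive.
theta = fit_logistic([[0.0], [1.0], [3.0], [4.0]], [0, 0, 1, 1])
```

A sample is placed on the positive side when `predict_proba` returns more than 0.5, which is the same as `linear_score` being greater than 0.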
5. The method for measuring the financial risk of small and micro enterprises according to claim 1, wherein the step of building the LR softmax multi-classification model comprises:
performing unsupervised clustering on the data feature sets of the bad clients, grouping the bad clients into a plurality of category sets by risk type;
evaluating each category set with a risk indicator model to obtain a judgment result for that category set, and adding the judgment result to the category set;
and building the LR softmax multi-classification model from the category sets to which the judgment results have been added.
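Claim 5 does not name the clustering algorithm. As one possible reading, the sketch below groups bad clients with a minimal k-means. The choice of k-means, the deterministic initialisation from the first k points, and the toy feature vectors are all assumptions made for illustration.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def kmeans(points, k, iterations=20):
    """Cluster points into k groups; returns (assignment, centroids)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Assumed initialisation: the first k points serve as starting centroids.
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


# Hypothetical feature vectors of four bad clients.
bad_clients = [[0.0, 0.0], [10.0, 10.0], [0.0, 1.0], [10.0, 11.0]]
assignment, centroids = kmeans(bad_clients, k=2)
```

Each resulting cluster plays the role of one category set, which would then be passed to the risk indicator model.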
6. The method for measuring the financial risk of small and micro enterprises according to claim 5, wherein the step of establishing the risk indicator model comprises:
letting the feature set of the category set be $U = \{u_1, u_2, u_3, \ldots, u_n\}$ and the risk categories be $V = \{v_1, v_2, v_3, \ldots, v_m\}$;
performing fuzzy evaluation of each feature in the feature set $U$ against the risk categories $V$ to obtain the evaluation matrix:
$$R = \begin{pmatrix} r_{11} & r_{12} & \cdots & r_{1m} \\ r_{21} & r_{22} & \cdots & r_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ r_{n1} & r_{n2} & \cdots & r_{nm} \end{pmatrix}$$
wherein $r_{ij}$ represents the degree of membership of $u_i$ in $v_j$;
determining the importance weight of each feature in the feature set $U$ by the analytic hierarchy process, giving $A = \{a_1, a_2, a_3, \ldots, a_n\}$, together with the expression in
Figure FDA0002835278960000032
multiplying the weight $A$ by the matrix $R$ to obtain the judgment result $B = \{b_1, b_2, b_3, \ldots, b_m\}$.
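The final step of claim 6, multiplying the weight vector A by the evaluation matrix R, can be sketched as an ordinary vector-matrix product. Treating the multiplication as a weighted average (rather than another fuzzy composition operator such as max-min) is an assumption, and the weights and membership degrees below are made-up numbers.

```python
def fuzzy_evaluate(weights, membership):
    """Return B = A * R, where R[i][j] is the membership of u_i in v_j."""
    n = len(membership)
    if n == 0 or len(weights) != n:
        raise ValueError("weights must have one entry per feature (row of R)")
    m = len(membership[0])
    return [
        sum(weights[i] * membership[i][j] for i in range(n))
        for j in range(m)
    ]


def most_likely_category(judgment):
    """Index of the risk category with the largest aggregated membership."""
    return max(range(len(judgment)), key=lambda j: judgment[j])


# Hypothetical example with n = 2 features and m = 3 risk categories.
A = [0.25, 0.75]
R = [
    [0.8, 0.1, 0.1],
    [0.2, 0.2, 0.6],
]
B = fuzzy_evaluate(A, R)
label = most_likely_category(B)
```

With these numbers B is approximately [0.35, 0.175, 0.475], so the category set would be labelled with the third risk category (index 2).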
7. The method of claim 6, wherein the step of building the LR softmax multi-classification model from the category sets to which the judgment results have been added comprises:
for the m judgment results, performing a regression calculation between one judgment result and the remaining m-1 judgment results;
and building the LR softmax multi-classification model for the probabilities of the judgment results using a linear predictor and a normalization factor.
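The combination of a linear predictor and a normalization factor in claim 7 corresponds to the usual softmax form, in which each class score is exponentiated and divided by the sum over all classes. The sketch below shows only the prediction step, with hypothetical parameter vectors; how the parameters are fitted is left open.

```python
import math


def softmax_probabilities(features, thetas):
    """Class probabilities from one linear predictor per class.

    `thetas` holds one parameter vector per risk class; each vector starts
    with an intercept, matching the augmented feature vector [1, x_1, ...].
    """
    x = [1.0, *features]
    scores = [sum(t * xi for t, xi in zip(theta, x)) for theta in thetas]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    normaliser = sum(exps)  # the normalization factor
    return [e / normaliser for e in exps]


# Hypothetical parameters for m = 3 risk classes and one feature.
thetas = [[0.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
probs = softmax_probabilities([2.0], thetas)
predicted_class = max(range(len(probs)), key=lambda k: probs[k])
```

The probabilities always sum to 1, and the class with the largest linear score receives the largest probability.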
8. An apparatus for implementing the method of any one of claims 1 to 7, comprising:
a rejection strategy module, configured to screen the small and micro enterprises using their business data, credit investigation data and Tongdun data as a rejection strategy, and then to re-screen the enterprises that pass using their in-bank settlement data, blacklist data and historical data at the bank as a rejection strategy;
a first-level classification module, configured to preprocess the invoice data of the re-screened small and micro enterprises and screen out a data feature set; to compute the data feature set with an LR model and an XGBoost model respectively, obtaining an output probability mapping score from each model; and to compare the average of the two scores with a threshold, judging the small and micro enterprise to be a good client if the average is greater than the threshold and a bad client if it is smaller;
and a second-level classification module, configured to classify the bad clients with an LR softmax multi-classification model into bad clients carrying different risk labels.
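The decision rule of the first-level classification module can be sketched in a few lines. The two scores are assumed to come from already trained LR and XGBoost models and to be on the same scale. The claim does not say what happens when the average equals the threshold, so treating that case as a bad client is an assumption.

```python
def classify_client(lr_score: float, xgb_score: float, threshold: float) -> str:
    """Average the two model scores and compare the average with the threshold."""
    average = (lr_score + xgb_score) / 2.0
    # Assumption: an average exactly equal to the threshold counts as "bad".
    return "good" if average > threshold else "bad"


# Hypothetical scores for two enterprises.
first = classify_client(0.9, 0.7, threshold=0.6)
second = classify_client(0.2, 0.4, threshold=0.6)
```

Only enterprises labelled `"bad"` would be handed on to the second-level classification module.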
9. An electronic device, comprising at least one processing unit and at least one memory unit, wherein the memory unit stores a computer program that, when executed by the processing unit, causes the processing unit to perform the method of any of claims 1 to 7.
10. A storage medium storing a computer program executable by an electronic device, the program, when run on the electronic device, causing the electronic device to perform the method of any one of claims 1 to 7.
CN202011475707.XA 2020-12-14 2020-12-14 Method and device for measuring financial risk of small and micro enterprises Pending CN112581254A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011475707.XA CN112581254A (en) 2020-12-14 2020-12-14 Method and device for measuring financial risk of small and micro enterprises

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011475707.XA CN112581254A (en) 2020-12-14 2020-12-14 Method and device for measuring financial risk of small and micro enterprises

Publications (1)

Publication Number Publication Date
CN112581254A true CN112581254A (en) 2021-03-30

Family

ID=75135011

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011475707.XA Pending CN112581254A (en) 2020-12-14 2020-12-14 Method and device for measuring financial risk of small and micro enterprises

Country Status (1)

Country Link
CN (1) CN112581254A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109978680A (en) * 2019-03-18 2019-07-05 杭州绿度信息技术有限公司 Risk control method and system for differentiated risk-control pricing of credit business by customer segment
CN111489095A (en) * 2020-04-15 2020-08-04 腾讯科技(深圳)有限公司 Risk user management method and device, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
Maldonado et al. Credit scoring using three-way decisions with probabilistic rough sets
CN110837931B (en) Customer churn prediction method, device and storage medium
Amani et al. Data mining applications in accounting: A review of the literature and organizing framework
Ravisankar et al. Detection of financial statement fraud and feature selection using data mining techniques
CN104321794B A system and method for determining the future commercial viability of an entity using multidimensional rating
US20050222929A1 (en) Systems and methods for investigation of financial reporting information
Van Thiel et al. Artificial intelligence credit risk prediction: An empirical study of analytical artificial intelligence tools for credit risk prediction in a digital era
CN107633030B (en) Credit evaluation method and device based on data model
US20130117278A1 (en) Methods, computer-accessible medium and systems for construction of and interference with networked data, for example, in a financial setting
CN111401600A (en) Enterprise credit risk evaluation method and system based on incidence relation
Aphale et al. Predict loan approval in banking system machine learning approach for cooperative banks loan approval
Fan et al. Improved ML‐based technique for credit card scoring in Internet financial risk control
CN111738843B Quantitative risk evaluation system and method using transaction flow data
Van Thiel et al. Artificial intelligent credit risk prediction: An empirical study of analytical artificial intelligence tools for credit risk prediction in a digital era
CN112862585A (en) Personal loan type bad asset risk rating method based on LightGBM decision tree algorithm
CN112419030A (en) Method, system and equipment for evaluating financial fraud risk
Hasheminejad et al. Clustering of bank customers based on lifetime value using data mining methods
CN112581254A (en) Method and device for measuring financial risk of small and micro enterprises
KR102499182B1 Loan regular auditing system using artificial intelligence
CN107977804B (en) Guarantee warehouse business risk assessment method
CN112215689A (en) Financial fraud risk assessment method and device based on evidence theory
CN113256351A (en) User service demand identification method and device and computer readable storage medium
Kotsiantis et al. Financial Application of Neural Networks: two case studies in Greece
Takahashi et al. Towards early detections of the bad debt customers among the mail order industry
CN111461863A (en) Data processing method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination