CN111738870A - Method and platform for identifying insurance risk of engineering performance guarantee based on characteristic engineering - Google Patents

Info

Publication number
CN111738870A
Authority
CN
China
Prior art keywords
feature
engineering
data
features
accuracy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010739603.9A
Other languages
Chinese (zh)
Other versions
CN111738870B (en)
Inventor
曾雪强
谢仑辰
徐学武
史清江
陈海军
化允
陈华龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Gongbao Technology Zhejiang Co ltd
Original Assignee
Gongbao Technology Zhejiang Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Gongbao Technology Zhejiang Co ltd
Priority to CN202010739603.9A
Publication of CN111738870A
Application granted
Publication of CN111738870B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08 Insurance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/067 Enterprise or organisation modelling

Abstract

The invention discloses a feature-engineering-based method and platform for identifying risk in engineering performance guarantee insurance. First, engineering service data are preprocessed and an initial training data set is constructed from the preprocessed data. Next, an XGBoost model is trained on the initial training data set to obtain a reference risk assessment model. Then, feature screening is performed on the initial training data set using a maximal information coefficient feature selection strategy together with the reference risk assessment model, yielding a screened training data set on which an XGBoost model is trained to obtain the final risk assessment model. Finally, the final risk assessment model is used to assess the risk of the project to be evaluated. The method can pick out the key features from a large volume of redundant engineering project data and reduces model complexity while preserving the model's predictive performance.

Description

Method and platform for identifying insurance risk of engineering performance guarantee based on characteristic engineering
Technical Field
The invention relates to the technical fields of engineering insurance and machine learning, and in particular to a feature-engineering-based method and platform for identifying risk in engineering performance guarantee insurance.
Background
Construction projects involve complex processes and workflows, many participants, long durations, and wide geographic scope, and a default by the construction contractor can cause losses on many fronts. Introducing a guarantee-insurance risk-control mechanism into construction projects is therefore particularly important: it effectively relieves construction enterprises of the funding pressure of cash deposits and lightens their burden. For the insurance industry, the main difficulties in offering construction engineering guarantee insurance are data and risk control. Insurers lack specialist knowledge of and techniques for construction projects, which makes it hard to assess the risk of the policyholder, the insured project, and the insured. Non-financing guarantee insurance also requires fast approval, which leaves no room for a comprehensive review of the policyholder, the engineering project, and the insured.
Risk factors that lead to engineering default are diverse, widespread, objective, and contingent, so performance risk factors are numerous and strongly correlated with one another. Current engineering guarantee insurance relies mainly on manual judgment, which is time-consuming and does not exploit the extensive project data available; these are the shortcomings of current risk assessment methods. The invention uses a large amount of data together with an intelligent algorithm model to integrate and analyze the risk factors of the policyholder, the engineering project, and the insured, so that the default risk of a construction project can be identified quickly and the insurer is assisted in reducing underwriting risk.
Disclosure of Invention
The invention aims to address the shortcomings of the prior art by providing a feature-engineering-based method and platform for identifying risk in engineering performance guarantee insurance.
The purpose of the invention is achieved by the following technical scheme: a feature-engineering-based method for identifying risk in engineering performance guarantee insurance, comprising the following steps:
S1: Preprocess the engineering project information data to obtain engineering service data, and construct an initial training data set from the engineering service data;
S2: Train an XGBoost model on the initial training data set to obtain a reference risk assessment model, and record the discrimination accuracy $A_0$ of the reference risk assessment model;
S3: Perform feature screening on the initial training data set by feature engineering to obtain a screened training data set. The feature engineering is a feature selection strategy that combines maximum relevance and minimum redundancy with the maximal information coefficient, denoted MR-MIC, and it is applied together with the reference risk assessment model and its discrimination accuracy $A_0$. Specifically: first, compute the maximal information coefficient between each pair of features, and between each feature and its class label, in the engineering service data; then construct the feature index sets and record the discrimination accuracy $A_t$ of each feature index set; select the feature index set with the highest accuracy, record that highest discrimination accuracy $A_{\max}$, and compare it with the discrimination accuracy $A_0$ of the reference risk assessment model. If $A_{\max} \geq A_0$, the selected feature index set is taken as the finally selected feature index set. If $A_{\max} < A_0$, traverse the feature index sets in descending order of the number of features they contain and find a feature index set whose discrimination accuracy $A_t$ exceeds an accuracy threshold, chosen according to $A_0$ and the required precision, and for which the number of features screened out exceeds a feature-count threshold. Feature screening based on the feature index set so found yields the screened training data set;
S4: Train an XGBoost model on the screened training data set to obtain the final risk assessment model;
S5: Apply the data preprocessing of step S1 and the MR-MIC feature screening to the information data of the engineering project to be evaluated, and input the preprocessed and feature-screened engineering service data into the final risk assessment model obtained in step S4 to obtain the risk assessment result for the project to be evaluated.
Further, the preprocessing operation in step S1 specifically comprises:
applying one-hot encoding to the categorical features described in text form in the engineering service data to obtain discrete numerical features, and filling the missing values of the features described in numerical form with the median of that feature, which completes the data preprocessing.
Further, the feature screening strategy in step S3 specifically comprises:
S31: Set a grid-partition size parameter $B$ and generate all pairs of positive integers $(m, n)$ satisfying $mn < B$, where $m$ and $n$ are the numbers of horizontal and vertical grid divisions;
S32: For each pair of features $X$ and $Y$ in the engineering service data, traverse each pair $(m, n)$: divide the value space of $X$ evenly into $m$ parts and use dynamic programming to find the partition of $Y$ that maximizes the mutual information between $X$ and $Y$; then fix that partition of $Y$ and use dynamic programming to find the partition of $X$ that maximizes the mutual information; then fix that partition of $X$ and use dynamic programming to find the partition of $Y$ that maximizes the mutual information; finally, output the maximum mutual information value $I_{mn}(X, Y)$ for each pair $(m, n)$;
S33: Compute the maximal information coefficient $MIC(X, Y)$ of each pair $X$ and $Y$ by the following formula:
$$MIC(X, Y) = \max_{mn < B} \frac{I_{mn}(X, Y)}{\log_2 \min(m, n)}$$
The maximal information coefficient between each feature in the engineering service data and its class label is computed in the same way as that between each pair of features $X$ and $Y$;
S34: Construct the feature index set $S_1$:
$$S_1 = \left\{ \arg\max_{k} MIC(f_k, c) \right\}$$
where $f_k$ is the $k$-th feature in the engineering service data, $c$ is the class label, and $MIC(f_k, c)$ is the maximal information coefficient between feature $f_k$ and its class label $c$, computed according to steps S32 and S33;
S35: Generate the remaining feature index sets $S_{t+1}$ by the following formula:
$$S_{t+1} = S_t \cup \left\{ \arg\max_{j \in \bar{S}_t} \left[ MIC(f_j, c) - \frac{1}{t} \sum_{i \in S_t} MIC(f_i, f_j) \right] \right\}, \quad t = 1, \dots, T - 1$$
where $T$ is the total number of features in the engineering service data, $f_i$ is the feature with index $i$ in the feature index set $S_t$, and $f_j$ is the feature with index $j$ in the unselected feature index set $\bar{S}_t$;
S36: Input the data set corresponding to each feature index set $S_t$ into the XGBoost model and record its discrimination accuracy $A_t$; select the feature index set $S_{\max}$ with the highest accuracy and record that highest discrimination accuracy $A_{\max}$;
S37: Compare $A_{\max}$ with the discrimination accuracy $A_0$ of the reference risk assessment model from step S2. If $A_{\max} \geq A_0$, $S_{\max}$ is taken as the finally selected feature index set. If $A_{\max} < A_0$, traverse $t$ from large to small and find a $t$ whose discrimination accuracy $A_t$ exceeds the accuracy threshold and for which the number of features screened out exceeds the feature-count threshold, and take $S_t$ as the finally selected feature index set, where the two thresholds are determined by the parameters $a$ and $b$, which are set as required;
S38: Perform feature screening based on the finally selected feature index set to obtain the screened training data set.
A feature-engineering-based platform for identifying risk in engineering performance guarantee insurance comprises a data input module, a data processing module, a feature calculation and screening module, a model training module, and a risk assessment module:
the data input module receives the engineering project information data on which risk identification is to be performed, namely engineering project information data for model training or engineering project information data to be evaluated;
the data processing module preprocesses the engineering project information data to obtain engineering service data, and either generates the initial training data set or preprocesses the engineering project information data to be evaluated;
the feature calculation and screening module performs feature screening, by feature engineering, on the data preprocessed by the data processing module to obtain a screened training data set. The feature engineering is a feature selection strategy that combines maximum relevance and minimum redundancy with the maximal information coefficient, denoted MR-MIC, and it is applied together with the reference risk discrimination model obtained by the model training module and its discrimination accuracy $A_0$. Specifically: first, compute the maximal information coefficient between each pair of features, and between each feature and its class label, in the engineering service data; then construct the feature index sets and record the discrimination accuracy $A_t$ of each feature index set; select the feature index set with the highest accuracy, record that highest discrimination accuracy $A_{\max}$, and compare it with the discrimination accuracy $A_0$ of the reference risk assessment model. If $A_{\max} \geq A_0$, the selected feature index set is taken as the finally selected feature index set. If $A_{\max} < A_0$, traverse the feature index sets in descending order of the number of features they contain and find a feature index set whose discrimination accuracy $A_t$ exceeds an accuracy threshold, chosen according to $A_0$ and the required precision, and for which the number of features screened out exceeds a feature-count threshold. Feature screening based on the feature index set so found yields the screened training data set;
the model training module trains an XGBoost model on the data preprocessed by the data processing module to obtain the reference risk discrimination model and records its discrimination accuracy $A_0$; or it trains an XGBoost model on the screened training data set generated by the feature calculation and screening module to obtain the final risk discrimination model;
the risk assessment module gives, according to the final risk assessment model, the risk discrimination result for the information data of the engineering project to be evaluated that was input through the data input module.
Further, the data input module receives externally supplied data in a unified format and stores it in a database.
Further, the data processing module comprises a text feature processing module and a numerical feature processing module;
the text feature processing module applies one-hot encoding to the categorical features described in text form in the engineering service data to obtain discrete numerical features;
the numerical feature processing module fills the missing values of the features described in numerical form in the engineering service data with the median of that feature.
Further, the feature calculation and screening module comprises a maximal information coefficient calculation module, a feature index set generation module, and a feature screening module;
the maximal information coefficient calculation module computes the maximal information coefficient between each pair of features $X$ and $Y$, or between each feature and its class label, in the engineering service data obtained by the data processing module, specifically as follows:
a. Set a grid-partition size parameter $B$ and generate all pairs of positive integers $(m, n)$ satisfying $mn < B$, where $m$ and $n$ are the numbers of horizontal and vertical grid divisions;
b. For each pair of features $X$ and $Y$ in the engineering service data, traverse each pair $(m, n)$: divide the value space of $X$ evenly into $m$ parts and use dynamic programming to find the partition of $Y$ that maximizes the mutual information between $X$ and $Y$; then fix that partition of $Y$ and use dynamic programming to find the partition of $X$ that maximizes the mutual information; then fix that partition of $X$ and use dynamic programming to find the partition of $Y$ that maximizes the mutual information; finally, output the maximum mutual information value $I_{mn}(X, Y)$ for each pair $(m, n)$;
c. Compute the maximal information coefficient $MIC(X, Y)$ of each pair $X$ and $Y$ by the following formula:
$$MIC(X, Y) = \max_{mn < B} \frac{I_{mn}(X, Y)}{\log_2 \min(m, n)}$$
The maximal information coefficient between each feature in the engineering service data and its class label is computed in the same way as that between each pair of features $X$ and $Y$;
the feature index set generation module applies the MR-MIC feature selection strategy to the data preprocessed by the data processing module, using the maximal information coefficients computed by the maximal information coefficient calculation module between each pair of features and between each feature and its class label, and generates all feature index sets, specifically as follows:
a. Construct the feature index set $S_1$:
$$S_1 = \left\{ \arg\max_{k} MIC(f_k, c) \right\}$$
where $f_k$ is the $k$-th feature in the engineering service data, $c$ is the class label, and $MIC(f_k, c)$ is the maximal information coefficient between feature $f_k$ and its class label $c$, obtained from the maximal information coefficient calculation module;
b. Generate the remaining feature index sets $S_{t+1}$ by the following formula:
$$S_{t+1} = S_t \cup \left\{ \arg\max_{j \in \bar{S}_t} \left[ MIC(f_j, c) - \frac{1}{t} \sum_{i \in S_t} MIC(f_i, f_j) \right] \right\}, \quad t = 1, \dots, T - 1$$
where $T$ is the total number of features in the engineering service data, $f_i$ is the feature with index $i$ in the feature index set $S_t$, and $f_j$ is the feature with index $j$ in the unselected feature index set $\bar{S}_t$;
the feature screening module selects, from all the feature index sets obtained by the feature index set generation module, the feature index set $S_{\max}$ with the highest accuracy, records that highest discrimination accuracy $A_{\max}$, and compares it with the discrimination accuracy $A_0$ of the reference risk assessment model. If $A_{\max} \geq A_0$, $S_{\max}$ is taken as the finally selected feature index set. If $A_{\max} < A_0$, $t$ is traversed from large to small to find a $t$ whose discrimination accuracy $A_t$ exceeds the accuracy threshold and for which the number of features screened out exceeds the feature-count threshold, and $S_t$ is taken as the finally selected feature index set, where the two thresholds are determined by the parameters $a$ and $b$, which are set as required. Feature screening based on the finally selected feature index set yields the screened training data set.
The beneficial effects of the invention are as follows. Using the MR-MIC feature selection strategy, the method finds the features most relevant to the class labels in a large volume of engineering project data while keeping the redundancy among the selected features low, which reduces model complexity while preserving the model's predictive performance. Building the model with the XGBoost algorithm supports the accuracy of the proposed risk identification method.
Drawings
FIG. 1 is a flow chart of the feature-engineering-based method for identifying risk in engineering performance guarantee insurance provided by the invention;
FIG. 2 is a schematic structural diagram of the feature-engineering-based platform for identifying risk in engineering performance guarantee insurance provided by the invention;
FIG. 3 is a feature structure diagram for the field of engineering performance guarantee insurance.
Detailed Description
The invention is further described below with reference to the accompanying drawings and specific embodiments, which are intended to aid understanding of the invention and do not limit it in any way.
The invention provides a feature-engineering-based method for identifying risk in engineering performance guarantee insurance. Its main flow is shown in FIG. 1 and comprises the following steps:
1. Preprocess the engineering service data and construct an initial training data set from the preprocessed data.
The XGBoost algorithm used in the invention cannot process text-valued categorical features directly, so such features must be encoded. The feature structure of the engineering performance guarantee insurance field handled by the invention is shown in FIG. 3. This embodiment uses one-hot encoding, in which N states are stored in an N-bit register: each state has its own register bit, and exactly one bit is active at any time. For example, as shown in Table 1, the "construction difficulty" feature takes three values and is therefore expanded into three features. In the three-bit code produced from the original feature, only the bit corresponding to the value is 1 and the rest are 0. For instance, the value "simple" is converted into a code in which "construction difficulty_simple", "construction difficulty_general", and "construction difficulty_complex" are 1, 0, and 0 respectively.
TABLE 1 Schematic of text-valued feature encoding
| Construction difficulty | Construction difficulty_simple | Construction difficulty_general | Construction difficulty_complex |
| --- | --- | --- | --- |
| Simple | 1 | 0 | 0 |
| General | 0 | 1 | 0 |
| Complex | 0 | 0 | 1 |
In addition, the input engineering service information contains some missing values. Considering the practical meaning of the data and the requirements of deploying the algorithm, each missing value can be filled with the median of the same feature dimension, which avoids distorting the data distribution or its practical meaning too much.
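The two preprocessing operations described above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the helper names `one_hot` and `fill_median`, the column names `difficulty` and `contract_amount`, and the sample values are all hypothetical and do not come from the patent.

```python
from statistics import median

def one_hot(rows, column):
    """Expand a text-valued column into one 0/1 indicator column per category."""
    categories = sorted({r[column] for r in rows})
    encoded = []
    for r in rows:
        new_row = {k: v for k, v in r.items() if k != column}
        for cat in categories:
            new_row[f"{column}_{cat}"] = 1 if r[column] == cat else 0
        encoded.append(new_row)
    return encoded

def fill_median(rows, column):
    """Replace missing values (None) in a numeric column with the column median."""
    observed = [r[column] for r in rows if r[column] is not None]
    med = median(observed)
    return [{**r, column: med if r[column] is None else r[column]} for r in rows]

# Hypothetical project records; None marks a missing numeric value.
projects = [
    {"difficulty": "simple", "contract_amount": 120.0},
    {"difficulty": "general", "contract_amount": None},
    {"difficulty": "complex", "contract_amount": 300.0},
    {"difficulty": "simple", "contract_amount": 200.0},
]
prepared = fill_median(one_hot(projects, "difficulty"), "contract_amount")
```

Each record in `prepared` now carries three indicator columns in place of `difficulty`, and the missing amount is replaced by the median of the observed amounts.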
2. Train an XGBoost model on the initial training data set to obtain a reference risk assessment model, and record the discrimination accuracy $A_0$ of the reference risk assessment model.
XGBoost (eXtreme Gradient Boosting) is an efficient implementation of the gradient boosting (GB) method. It is a learning model for regression and classification problems that is resistant to overfitting, flexible, fast to converge, and accurate, so using it gives better risk assessment performance. In this embodiment, an XGBoost model with default parameters is trained directly on the training data set obtained in step 1 to obtain the reference risk assessment model, and the model's discrimination accuracy $A_0$ is recorded for later use. When the model's results are examined, each prediction falls into one of the following four cases:
a. True positive ($TP$): the true class of the sample is positive, and the model also predicts positive;
b. True negative ($TN$): the true class of the sample is negative, and the model also predicts negative;
c. False positive ($FP$): the true class of the sample is negative, but the model predicts positive;
d. False negative ($FN$): the true class of the sample is positive, but the model predicts negative.
The data in the invention are classification data with two classes, "insurable" and "non-insurable". Because "non-insurable" samples are scarce and misclassifying them causes the company heavy losses, models are compared mainly on their discrimination metrics for the "non-insurable" class. With "non-insurable" data defined as the positive class (Positive) and "insurable" data as the negative class (Negative), the Precision, Recall, and F1-Score of the "non-insurable" class are computed as follows:
a. Precision
$$\text{Precision} = \frac{TP}{TP + FP}$$
The proportion of samples predicted positive whose true class is positive, i.e., the model's accuracy on its positive predictions;
b. Recall
$$\text{Recall} = \frac{TP}{TP + FN}$$
The proportion of truly positive samples that are predicted positive;
c. F1-Score
$$\text{F1} = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$
F1-Score is the harmonic mean of precision and recall.
In addition, the proportion of all samples that are classified correctly, i.e., the overall accuracy, is also compared:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
Taken together, this embodiment uses as the discrimination accuracy $A_t$ the sum of the Recall of the "non-insurable" class and the overall model accuracy $\text{Accuracy}$, so that the class that poses the greater threat to the business is emphasized while overall accuracy is still taken into account.
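The metrics above are simple functions of the four confusion counts. As a minimal sketch, the snippet below computes them for a toy label vector and forms the combined score (Recall of the positive class plus overall accuracy) that the embodiment uses as its discrimination accuracy. The function names and the toy labels are illustrative assumptions; label 1 stands for the "non-insurable" (positive) class.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP, FN for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp, tn, fp, fn

def discrimination_metrics(y_true, y_pred, positive=1):
    """Precision, Recall, F1 of the positive class, overall accuracy, and their combined score."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Combined score: Recall of the positive class + overall accuracy.
    return {"precision": precision, "recall": recall, "f1": f1,
            "accuracy": accuracy, "score": recall + accuracy}

# Toy labels: 1 = "non-insurable" (positive), 0 = "insurable" (negative).
y_true = [1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 1, 0, 1, 0]
metrics = discrimination_metrics(y_true, y_pred)
```

Because the score is a sum of two quantities in [0, 1], it ranges over [0, 2] rather than [0, 1], which matters when a threshold is later expressed relative to the reference score.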
3. Perform feature screening on the initial training data set using the MR-MIC feature selection strategy and the reference risk assessment model to obtain a screened training data set.
A. Generating grid partitions
In practice, a grid-partition parameter $B$ must be set, and all pairs of positive integers $(m, n)$ satisfying $mn < B$ are generated. If $B$ is too large, the number of grid partitions grows and the computation becomes expensive; if it is too small, the partition patterns are too simple. $B$ is therefore generally set to an empirical value (given in the original as a formula image).
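To make the grid enumeration concrete, here is a small sketch that lists every pair $(m, n)$ with $mn < B$. Two things in it are assumptions rather than values taken from the patent text: the choice $B = N^{0.6}$ for $N$ samples, which is a setting commonly used in the MIC literature, and the restriction to $m, n \geq 2$, which avoids a zero normalizer $\log_2 \min(m, n)$ later on. The function name `grid_pairs` is hypothetical.

```python
def grid_pairs(B, min_bins=2):
    """Return all (m, n) with m, n >= min_bins and m * n < B."""
    pairs = []
    m = min_bins
    while m * min_bins < B:
        n = min_bins
        while m * n < B:
            pairs.append((m, n))
            n += 1
        m += 1
    return pairs

n_samples = 100
B = n_samples ** 0.6  # assumed empirical setting, roughly 15.85 for 100 samples
pairs = grid_pairs(B)
```

With 100 samples this yields 16 candidate grids, from 2x2 up to 2x7 and 7x2, which shows how quickly the search space grows with $B$.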
B. Determining the maximum mutual information value
For each pair of features $X$ and $Y$ in the engineering service data, traverse each pair $(m, n)$: divide the value space of $X$ evenly into $m$ parts and use dynamic programming to find the partition of $Y$ that maximizes the mutual information between $X$ and $Y$; then fix that partition of $Y$ and use dynamic programming to find the partition of $X$ that maximizes the mutual information; then fix that partition of $X$ and use dynamic programming to find the partition of $Y$ that maximizes the mutual information; finally, output the maximum mutual information value $I_{mn}(X, Y)$ for each pair $(m, n)$.
C. Determining the maximal information coefficient
Compute the maximal information coefficient $MIC(X, Y)$ of each pair $X$ and $Y$ by the following formula:
$$MIC(X, Y) = \max_{mn < B} \frac{I_{mn}(X, Y)}{\log_2 \min(m, n)}$$
The maximal information coefficient between each feature in the engineering service data and its class label is computed in the same way as that between each pair of features $X$ and $Y$.
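The following sketch illustrates the structure of the computation in steps B and C, with one deliberate simplification: instead of optimizing the partitions by dynamic programming, it uses fixed equal-frequency bins on both axes. It therefore gives only a lower-bound approximation of the coefficient, not the procedure described in the text. The function names, the setting $B = N^{0.6}$, and the restriction to $m, n \geq 2$ are assumptions, and the rank-based binning is only meaningful for features with mostly distinct values.

```python
import math
from collections import Counter

def equal_frequency_bins(values, k):
    """Assign each value to one of k bins of nearly equal size, by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = min(rank * k // len(values), k - 1)
    return bins

def mutual_information(bx, by):
    """Mutual information (in bits) between two discretized variables."""
    n = len(bx)
    joint, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum((c / n) * math.log2(c * n / (px[a] * py[b]))
               for (a, b), c in joint.items())

def approx_mic(x, y, B):
    """Max over grids with m * n < B of I_mn / log2(min(m, n)), using fixed bins."""
    best = 0.0
    m = 2
    while m * 2 < B:
        n = 2
        while m * n < B:
            i_mn = mutual_information(equal_frequency_bins(x, m),
                                      equal_frequency_bins(y, n))
            best = max(best, i_mn / math.log2(min(m, n)))
            n += 1
        m += 1
    return best

x = list(range(50))
y_linear = [2 * v + 1 for v in x]            # perfectly dependent on x
y_scrambled = [(v * 37) % 50 for v in x]     # a permutation with no simple trend
B = len(x) ** 0.6                            # assumed empirical setting
mic_linear = approx_mic(x, y_linear, B)
mic_scrambled = approx_mic(x, y_scrambled, B)
```

A perfectly monotone relationship reaches the maximum normalized value, while the scrambled sequence scores lower. A full implementation would replace `equal_frequency_bins` with the alternating dynamic-programming search over partitions.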
D. Constructing the initial feature index set
Initially, all features $f_k$ are traversed and the one whose maximal information coefficient with the class label $c$ is largest is selected, which gives the initial feature index set $S_1$:
$$S_1 = \left\{ \arg\max_{k} MIC(f_k, c) \right\}$$
where $f_k$ is the $k$-th feature in the engineering service data, $c$ is the class label, and $MIC(f_k, c)$ is the maximal information coefficient between feature $f_k$ and its class label $c$, computed as in steps B and C;
E. Constructing all feature index sets
After the initial feature index set is obtained, each time a feature is added, the index selected is that of the feature with the highest correlation with the category label c and the lowest correlation with the already selected features; the remaining feature index sets are generated by the following formula:
Figure 036867DEST_PATH_IMAGE012
Figure 233755DEST_PATH_IMAGE046
where T represents the total number of features in the engineering service data;
Figure 006539DEST_PATH_IMAGE047
is the feature with index i in the feature index set S_t, and
Figure 078401DEST_PATH_IMAGE015
is the feature with index j in the unselected feature index set
Figure 987451DEST_PATH_IMAGE048
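The incremental construction in steps D and E can be sketched in Python as follows. The scoring rule here is an assumption: it uses the common "difference" form of maximum relevance minimum redundancy (relevance to the label minus the mean coefficient with the already selected features), which may differ from the formula in the original drawings. The names `relevance`, `redundancy` and `build_index_sets` are hypothetical.

```python
def build_index_sets(relevance, redundancy):
    """Return nested index lists S_1, ..., S_T.

    relevance[k]     : coefficient between feature k and the category label
    redundancy[i][j] : coefficient between feature i and feature j
    """
    total = len(relevance)
    # S_1 holds the single feature most related to the label.
    first = max(range(total), key=lambda k: relevance[k])
    selected = [first]
    index_sets = [list(selected)]
    remaining = set(range(total)) - {first}

    while remaining:
        def score(j):
            # Assumed criterion: high relevance, low mean redundancy.
            mean_redundancy = sum(redundancy[i][j] for i in selected) / len(selected)
            return relevance[j] - mean_redundancy

        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
        index_sets.append(list(selected))
    return index_sets
```

Because each set extends the previous one by a single index, S_t always contains exactly t features, which is what makes the later traversal over t from large to small meaningful.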
F. Model discrimination and result recording
After all feature index sets are generated, the data set corresponding to each feature index set S_t is input into the XGBoost model, and the discrimination accuracy
Figure 322617DEST_PATH_IMAGE003
is recorded; the feature index set with the highest discrimination accuracy
Figure 317118DEST_PATH_IMAGE049
is selected, and the highest discrimination accuracy
Figure 192670DEST_PATH_IMAGE004
is recorded.
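A minimal Python sketch of this evaluation loop is given below. To stay self-contained it does not call XGBoost: `evaluate` is a hypothetical callable that stands in for "train an XGBoost model on these columns and return its discrimination accuracy", and `majority_rule_accuracy` is a toy stand-in used only to make the sketch runnable.

```python
def score_index_sets(rows, labels, index_sets, evaluate):
    """Score every feature index set and report the best one.

    Returns (accuracies, best_position), where accuracies[p] belongs to
    index_sets[p]. On ties the earliest (smallest) set wins.
    """
    accuracies = []
    for indices in index_sets:
        # Keep only the columns named by this index set.
        subset = [[row[i] for i in indices] for row in rows]
        accuracies.append(evaluate(subset, labels))
    best_position = max(range(len(index_sets)), key=lambda p: accuracies[p])
    return accuracies, best_position


def majority_rule_accuracy(subset, labels):
    """Toy stand-in for a trained model: predict 1 when most columns are 1."""
    predictions = [1 if sum(row) * 2 > len(row) else 0 for row in subset]
    hits = sum(p == y for p, y in zip(predictions, labels))
    return hits / len(labels)
```

In practice `evaluate` would wrap model training and a held-out accuracy measurement; how the data is split for that measurement is not specified in the text and is left to the caller.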
G. Feature index set selection
Compare
Figure 487385DEST_PATH_IMAGE004
with the discrimination accuracy
Figure 524611DEST_PATH_IMAGE002
of the reference risk assessment model in step 2; if
Figure 761734DEST_PATH_IMAGE050
then determine
Figure 175397DEST_PATH_IMAGE017
as the finally selected feature index set. In this embodiment, the screening of the optimal feature index set can be completed by this criterion. In addition, when
Figure 262302DEST_PATH_IMAGE051
that is, when the results show varying degrees of loss after screening, t needs to be traversed from large to small; in this embodiment, the setting is to find a feature index set S_t that satisfies
Figure 470430DEST_PATH_IMAGE052
and whose number of screened-out features satisfies
Figure 439523DEST_PATH_IMAGE053
that is, the accuracy does not decrease by more than 5% and more than 20% of the features are screened out; this set is determined to be the finally selected feature index set. The selection criterion aims to delete as many features as possible while preserving data performance.
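The selection rule of step G can be sketched in Python as below. Several details are assumptions because the inequalities appear only in the drawings: the first comparison is taken as "greater than or equal to", the accuracy tolerance a is treated as an absolute drop of 0.05, the feature threshold b as a fraction 0.20 of the total feature count T, and S_t is assumed to contain exactly t features. The function name is hypothetical.

```python
def choose_index_set(accuracies, baseline, total_features, a=0.05, b=0.20):
    """Return the chosen t (1-based), or None if no set qualifies.

    accuracies[t - 1] is the discrimination accuracy obtained with S_t.
    """
    best_t = max(range(1, len(accuracies) + 1), key=lambda t: accuracies[t - 1])
    if accuracies[best_t - 1] >= baseline:
        # Screening did not hurt accuracy: keep the best-scoring set.
        return best_t

    # Otherwise traverse t from large to small and accept the first set that
    # loses little accuracy while removing enough features.
    for t in range(len(accuracies), 0, -1):
        keeps_accuracy = accuracies[t - 1] > baseline - a
        removes_enough = (total_features - t) > b * total_features
        if keeps_accuracy and removes_enough:
            return t
    return None
```

Traversing from large t first means the rule prefers the largest qualifying set, which keeps the most information among the sets that still remove more than the required share of features.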
H. Obtaining the screened data set
The engineering service data is screened using the finally selected feature index set to obtain the screened training data set.
4. For the screened training data set, train with an XGBoost model to obtain the final risk assessment model.
In this embodiment, after the final feature index set and the screened data set are determined, the model is retrained on the screened data; Table 2 compares the model indexes of the "non-insurable" class and the overall accuracy before and after feature screening:
Table 2. Model indexes of the "non-insurable" class and overall accuracy before and after feature screening
                  Precision  Recall  F1-Score  Accuracy
Before screening  0.67       0.55    0.61      0.86
After screening   0.71       0.56    0.63      0.87
After feature screening, every model index of the "non-insurable" class improves (precision from 0.67 to 0.71, recall from 0.55 to 0.56, F1-score from 0.61 to 0.63), and the overall accuracy rises from 0.86 to 0.87, which indicates that the MR-MIC feature screening method is effective.
5. Apply the data preprocessing of step 1 and the feature screening to the data of the project to be evaluated, then input the preprocessed and screened data into the final risk assessment model obtained in step 4 to obtain the risk identification result of the project to be evaluated.
As shown in FIG. 2, the invention also provides a feature engineering-based engineering performance guarantee insurance risk identification platform, which comprises a data input module, a data processing module, a feature calculation and screening module, a model training module and a risk assessment module:
The data input module is used for receiving engineering project information data needing risk identification, and the data input module comprises engineering project information data input for model training or engineering project information data to be evaluated;
the data processing module is used for executing preprocessing operation on the engineering project information data to obtain engineering service data, and generating an initial training data set or preprocessing the engineering project information data to be evaluated;
the feature calculation and screening module is used for performing feature screening on the data preprocessed by the data processing module by using feature engineering, wherein the feature engineering is a feature selection strategy that combines maximum relevance minimum redundancy with the maximum mutual information coefficient, denoted MR-MIC, and is combined with the reference risk discrimination model obtained by the model training module and its discrimination accuracy
Figure 922457DEST_PATH_IMAGE001
Specifically: first, the maximum mutual information coefficients between each pair of features, and between each feature and its corresponding category label, in the engineering service data are calculated; then the feature index sets are constructed, and the discrimination accuracy
Figure 660605DEST_PATH_IMAGE003
of each feature index set is recorded; the feature index set with the highest accuracy is selected, and the highest discrimination accuracy
Figure 305213DEST_PATH_IMAGE004
is recorded and compared with the discrimination accuracy
Figure 761602DEST_PATH_IMAGE034
of the reference risk assessment model; if
Figure 018534DEST_PATH_IMAGE054
then the selected feature index set is determined to be the finally selected feature index set; if
Figure 876768DEST_PATH_IMAGE006
then the feature index sets are sorted by the number of features they contain and traversed from large to small to find a feature index set whose discrimination accuracy
Figure 692278DEST_PATH_IMAGE003
is greater than the accuracy threshold, that is,
Figure 370384DEST_PATH_IMAGE052
where the accuracy threshold is chosen according to the discrimination accuracy
Figure 132803DEST_PATH_IMAGE055
and the required precision, and for which the number of screened-out features is greater than the feature number threshold, that is,
Figure 845544DEST_PATH_IMAGE023
Feature screening is then performed based on the found feature index set to obtain the screened training data set;
the model training module is used for training on the data preprocessed by the data processing module with an XGBoost model to obtain the reference risk discrimination model and recording its discrimination accuracy
Figure 831955DEST_PATH_IMAGE056
or for training on the screened training data set generated by the feature calculation and screening module with an XGBoost model to obtain the final risk discrimination model;
and the risk assessment module is used for giving, according to the final risk assessment model, a risk discrimination result for the engineering project information data to be evaluated that is input through the data input module.
The present invention is not limited to the above-described embodiments, and those skilled in the art can implement the invention in various other embodiments based on its disclosure. Therefore, any design that makes simple changes or modifications based on the design structure and concept of the invention falls within the scope of protection of the invention.

Claims (7)

1. A feature engineering-based method for identifying engineering performance guarantee insurance risk, characterized by comprising the following steps:
S1: performing a preprocessing operation on the engineering project information data to obtain engineering service data, and constructing an initial training data set from the engineering service data;
S2: training an XGBoost model on the initial training data set to obtain a reference risk assessment model, and recording the discrimination accuracy of the reference risk assessment model
Figure 575634DEST_PATH_IMAGE001
S3: for the initial training data set, performing feature screening by using feature engineering, wherein the feature engineering is a feature selection strategy that combines maximum relevance minimum redundancy with the maximum mutual information coefficient, denoted MR-MIC, and is combined with the reference risk assessment model and its discrimination accuracy
Figure 023933DEST_PATH_IMAGE002
to obtain a screened training data set; specifically: first, the maximum mutual information coefficients between each pair of features, and between each feature and its corresponding category label, in the engineering service data are calculated; then the feature index sets are constructed, and the discrimination accuracy
Figure 334829DEST_PATH_IMAGE003
of each feature index set is recorded; the feature index set with the highest accuracy is selected, and the highest discrimination accuracy
Figure 526776DEST_PATH_IMAGE004
is recorded and compared with the discrimination accuracy
Figure 872306DEST_PATH_IMAGE002
of the reference risk assessment model; if
Figure 727392DEST_PATH_IMAGE005
then the selected feature index set is determined to be the finally selected feature index set; if
Figure 525584DEST_PATH_IMAGE006
then the feature index sets are sorted by the number of features they contain and traversed from large to small to find a feature index set whose discrimination accuracy
Figure 255642DEST_PATH_IMAGE003
is greater than the accuracy threshold, the accuracy threshold being chosen according to the discrimination accuracy
Figure 455680DEST_PATH_IMAGE002
and the required precision, and for which the number of screened-out features is greater than the feature number threshold; feature screening is performed based on the found feature index set to obtain the screened training data set;
S4: for the screened training data set, training with an XGBoost model to obtain a final risk assessment model;
S5: performing the data preprocessing of step S1 and the MR-MIC feature screening on the engineering project information data to be evaluated, and inputting the preprocessed and feature-screened engineering service data into the final risk assessment model obtained in step S4 to obtain a risk assessment result of the project to be evaluated.
2. The method as claimed in claim 1, wherein the preprocessing operation in step S1 includes:
performing one-hot encoding on the categorical features described in text form in the engineering service data to obtain discrete numerical features, and filling missing values of the features described in numerical form in the engineering service data by the median filling method, thereby completing the data preprocessing.
3. The method as claimed in claim 1, wherein the feature screening strategy in step S3 specifically includes:
S31: setting a grid partition size parameter B and generating all positive integer pairs (m, n) satisfying mn < B, where m and n are the numbers of horizontal and vertical grid divisions;
S32: for each pair of features X and Y in the engineering service data, traversing every pair (m, n): dividing X evenly into m parts and using dynamic programming to find the partition of Y that maximizes the mutual information between X and Y; then fixing this partition of Y and using dynamic programming to find the partition of X that maximizes the mutual information between X and Y; then fixing this partition of X and using dynamic programming to find the partition of Y that maximizes the mutual information between X and Y; and finally outputting the maximum mutual information value I_mn(X, Y) corresponding to each pair (m, n);
S33: calculating the maximum mutual information coefficient of each pair of features X and Y according to the following formula:
Figure 980202DEST_PATH_IMAGE007
Figure 265690DEST_PATH_IMAGE008
the maximum mutual information coefficient between each feature in the engineering service data and its corresponding category label being calculated in the same way as for each pair of features X and Y;
S34: constructing the feature index set S_1:
Figure 533860DEST_PATH_IMAGE009
where
Figure 791666DEST_PATH_IMAGE010
is the k-th feature in the engineering service data, and c is the category label;
Figure 752669DEST_PATH_IMAGE011
is the maximum mutual information coefficient, calculated according to step S32 and step S33, between the feature
Figure 015199DEST_PATH_IMAGE010
and its corresponding category label c;
S35: generating the remaining feature index sets by the following formula:
Figure 087060DEST_PATH_IMAGE012
Figure 261689DEST_PATH_IMAGE013
where T represents the total number of features in the engineering service data;
Figure 128014DEST_PATH_IMAGE014
is the feature with index i in the feature index set S_t, and
Figure 591357DEST_PATH_IMAGE015
is the feature with index j in the unselected feature index set
Figure 201329DEST_PATH_IMAGE016
S36: inputting the data set corresponding to each feature index set S_t into the XGBoost model, and recording the discrimination accuracy
Figure 230465DEST_PATH_IMAGE003
selecting the feature index set with the highest accuracy
Figure 533271DEST_PATH_IMAGE017
and recording the highest discrimination accuracy
Figure 280647DEST_PATH_IMAGE018
S37: comparing
Figure 195776DEST_PATH_IMAGE004
with the discrimination accuracy
Figure 079418DEST_PATH_IMAGE019
of the reference risk assessment model in step S2; if
Figure 756387DEST_PATH_IMAGE020
then determining
Figure 725480DEST_PATH_IMAGE017
as the finally selected feature index set; if
Figure 677255DEST_PATH_IMAGE021
then traversing t from large to small to find a t whose discrimination accuracy
Figure 946563DEST_PATH_IMAGE003
is greater than the accuracy threshold, i.e.
Figure 591171DEST_PATH_IMAGE022
and for which the number of screened-out features is greater than the feature number threshold, i.e.
Figure 047560DEST_PATH_IMAGE023
and determining S_t as the finally selected feature index set, where a and b are parameters set as required;
S38: performing feature screening based on the finally selected feature index set to obtain the screened training data set.
4. A feature engineering-based project performance guarantee insurance risk identification platform, characterized by comprising a data input module, a data processing module, a feature calculation and screening module, a model training module and a risk assessment module:
the data input module is used for receiving engineering project information data needing risk identification, and the data input module comprises engineering project information data input for model training or engineering project information data to be evaluated;
the data processing module is used for executing preprocessing operation on the engineering project information data to obtain engineering service data, and generating an initial training data set or preprocessing the engineering project information data to be evaluated;
the feature calculation and screening module is used for performing feature screening on the data preprocessed by the data processing module by using feature engineering, wherein the feature engineering is a feature selection strategy that combines maximum relevance minimum redundancy with the maximum mutual information coefficient, denoted MR-MIC, and is combined with the reference risk discrimination model obtained by the model training module and its discrimination accuracy
Figure 006289DEST_PATH_IMAGE019
to perform feature screening and obtain a screened training data set; specifically: first, the maximum mutual information coefficients between each pair of features, and between each feature and its corresponding category label, in the engineering service data are calculated; then the feature index sets are constructed, and the discrimination accuracy
Figure 598944DEST_PATH_IMAGE003
of each feature index set is recorded; the feature index set with the highest accuracy is selected, and the highest discrimination accuracy
Figure 915918DEST_PATH_IMAGE004
is recorded and compared with the discrimination accuracy
Figure 859603DEST_PATH_IMAGE019
of the reference risk assessment model; if
Figure 684340DEST_PATH_IMAGE024
then the selected feature index set is determined to be the finally selected feature index set; if
Figure 397081DEST_PATH_IMAGE006
then the feature index sets are sorted by the number of features they contain and traversed from large to small to find a feature index set whose discrimination accuracy
Figure 117912DEST_PATH_IMAGE003
is greater than the accuracy threshold, the accuracy threshold being chosen according to the discrimination accuracy
Figure 752156DEST_PATH_IMAGE019
and the required precision, and for which the number of screened-out features is greater than the feature number threshold; feature screening is performed based on the found feature index set to obtain the screened training data set;
the model training module is used for training on the data preprocessed by the data processing module with an XGBoost model to obtain the reference risk discrimination model and recording the discrimination accuracy of the reference risk discrimination model
Figure 849425DEST_PATH_IMAGE025
or for training on the screened training data set generated by the feature calculation and screening module with an XGBoost model to obtain the final risk discrimination model;
and the risk assessment module is used for giving, according to the final risk assessment model, a risk discrimination result for the engineering project information data to be evaluated that is input through the data input module.
5. The feature engineering-based project performance guarantee insurance risk identification platform according to claim 4, wherein the data input module receives data input from outside in a unified manner and stores the data in a database.
6. The feature engineering-based project performance guarantee insurance risk identification platform according to claim 4, wherein the data processing module comprises a text feature processing module and a numerical feature processing module;
the text feature processing module is used for performing one-hot encoding on the categorical features described in text form in the engineering service data to obtain discrete numerical features;
and the numerical feature processing module is used for filling missing values, by the median filling method, of the features described in numerical form in the engineering service data.
7. The feature engineering-based project performance guarantee insurance risk identification platform of claim 4, wherein the feature calculation and screening module comprises a maximum mutual information coefficient calculation module, a feature index set generation module and a feature screening module;
the maximum mutual information coefficient calculation module is used for calculating the maximum mutual information coefficient between each pair of features X and Y, or between each feature and its corresponding category label, in the engineering service data obtained by the data processing module; the specific steps are as follows:
a. setting a grid partition size parameter B and generating all positive integer pairs (m, n) satisfying mn < B, where m and n are the numbers of horizontal and vertical grid divisions;
b. for each pair of features X and Y in the engineering service data, traversing every pair (m, n): dividing X evenly into m parts and using dynamic programming to find the partition of Y that maximizes the mutual information between X and Y; then fixing this partition of Y and using dynamic programming to find the partition of X that maximizes the mutual information between X and Y; then fixing this partition of X and using dynamic programming to find the partition of Y that maximizes the mutual information between X and Y; and finally outputting the maximum mutual information value I_mn(X, Y) corresponding to each pair (m, n);
c. calculating the maximum mutual information coefficient of each pair of features X and Y according to the following formula:
Figure 682252DEST_PATH_IMAGE007
Figure 573984DEST_PATH_IMAGE008
the maximum mutual information coefficient between each feature in the engineering service data and its corresponding category label being calculated in the same way as for each pair of features X and Y;
the feature index set generation module is used for performing feature screening on the data preprocessed by the data processing module by using the MR-MIC feature selection strategy, according to the maximum mutual information coefficients between each pair of features and between each feature and its corresponding category label calculated by the maximum mutual information coefficient calculation module, and generating all feature index sets, specifically as follows:
a. constructing the feature index set S_1:
Figure 259306DEST_PATH_IMAGE026
where
Figure 160266DEST_PATH_IMAGE010
is the k-th feature in the engineering service data, and c is the category label;
Figure 582020DEST_PATH_IMAGE011
is the maximum mutual information coefficient, obtained from the maximum mutual information coefficient calculation module, between the feature
Figure 910233DEST_PATH_IMAGE010
and its corresponding category label c;
b. generating the remaining feature index sets by the following formula:
Figure 050227DEST_PATH_IMAGE012
Figure 754878DEST_PATH_IMAGE027
where T represents the total number of features in the engineering service data;
Figure 296718DEST_PATH_IMAGE028
is the feature with index i in the feature index set S_t, and
Figure 795832DEST_PATH_IMAGE015
is the feature with index j in the unselected feature index set
Figure 626385DEST_PATH_IMAGE029
the feature screening module is used for selecting, from all the feature index sets obtained by the feature index set generation module, the feature index set with the highest accuracy
Figure 370612DEST_PATH_IMAGE017
recording the highest discrimination accuracy
Figure 766959DEST_PATH_IMAGE004
and comparing it with the discrimination accuracy
Figure 702554DEST_PATH_IMAGE002
of the reference risk assessment model; if
Figure 817140DEST_PATH_IMAGE030
then determining
Figure 598014DEST_PATH_IMAGE017
as the finally selected feature index set; if
Figure 114446DEST_PATH_IMAGE031
then traversing t from large to small to find a t whose discrimination accuracy
Figure 158626DEST_PATH_IMAGE003
is greater than the accuracy threshold, i.e.
Figure 760508DEST_PATH_IMAGE032
and for which the number of screened-out features is greater than the feature number threshold, i.e.
Figure 345073DEST_PATH_IMAGE023
and determining S_t as the finally selected feature index set, where a and b are parameters set as required; and performing feature screening based on the finally selected feature index set to obtain the screened training data set.
CN202010739603.9A 2020-07-28 2020-07-28 Method and platform for identifying insurance risk of engineering performance guarantee based on characteristic engineering Active CN111738870B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010739603.9A CN111738870B (en) 2020-07-28 2020-07-28 Method and platform for identifying insurance risk of engineering performance guarantee based on characteristic engineering

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010739603.9A CN111738870B (en) 2020-07-28 2020-07-28 Method and platform for identifying insurance risk of engineering performance guarantee based on characteristic engineering

Publications (2)

Publication Number Publication Date
CN111738870A true CN111738870A (en) 2020-10-02
CN111738870B CN111738870B (en) 2020-12-25

Family

ID=72656242

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010739603.9A Active CN111738870B (en) 2020-07-28 2020-07-28 Method and platform for identifying insurance risk of engineering performance guarantee based on characteristic engineering

Country Status (1)

Country Link
CN (1) CN111738870B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113159568A (en) * 2021-04-19 2021-07-23 福建万川信息科技股份有限公司 System and method for estimating insurance risk
WO2022121217A1 (en) * 2020-12-07 2022-06-16 平安科技(深圳)有限公司 Quota prediction method and device, and computer-readable storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170193335A1 (en) * 2015-11-13 2017-07-06 Wise Athena Inc. Method for data encoding and accurate predictions through convolutional networks for actual enterprise challenges
CN108509388A (en) * 2018-01-30 2018-09-07 天津大学 Feature selection approach based on maximal correlation minimal redundancy and sequence
CN111401914A (en) * 2020-04-02 2020-07-10 支付宝(杭州)信息技术有限公司 Risk assessment model training and risk assessment method and device


Also Published As

Publication number Publication date
CN111738870B (en) 2020-12-25


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant