CN115526697A - Bond issuer default risk identification method, device, equipment, medium and product - Google Patents

Bond issuer default risk identification method, device, equipment, medium and product

Info

Publication number
CN115526697A
Authority
CN
China
Prior art keywords
bond
target
default risk
issuer
line data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211203989.7A
Other languages
Chinese (zh)
Inventor
吴超荣
夏成扬
詹丽娟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
CCB Finetech Co Ltd
Original Assignee
China Construction Bank Corp
CCB Finetech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp, CCB Finetech Co Ltd filed Critical China Construction Bank Corp
Priority to CN202211203989.7A priority Critical patent/CN115526697A/en
Publication of CN115526697A publication Critical patent/CN115526697A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/02Banking, e.g. interest calculation or account maintenance

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • General Physics & Mathematics (AREA)
  • Finance (AREA)
  • General Business, Economics & Management (AREA)
  • Technology Law (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The embodiments of the invention disclose a method, a device, equipment, a medium and a product for identifying the default risk of bond issuers, and relate to the technical field of credit risk assessment. The method comprises: acquiring target characteristic variables of each bond issuer to be identified in a set of bond issuers to be identified and of each bond to be identified in a set of bonds to be identified, as target in-line characteristic variables and target out-of-line characteristic variables respectively; inputting the target in-line characteristic variables into at least two target machine learning models to obtain corresponding bond issuer default risk probabilities; inputting the target out-of-line characteristic variables into at least two target machine learning models to obtain corresponding bond default risk probabilities; and generating a corresponding bond issuer default risk target list according to the bond issuer default risk probabilities and the bond default risk probabilities. The embodiments can effectively identify defaults by bond issuers, improve the ability to pre-judge issuers at high risk of bond default, and reduce the risk of capital loss caused by bond default.

Description

Bond issuer default risk identification method, device, equipment, medium and product
Technical Field
The invention relates to the technical field of credit risk assessment, in particular to a method, a device, equipment, a medium and a product for identifying default risks of bond issuers.
Background
At present, the risk management of bond positions held by commercial banks relies in part on manual reporting, and its timeliness is poor. Meanwhile, position information is scattered across business segments and branches and the product structure is complex, so unified risk exposure statistics and management cannot be achieved at the client level. Excessive credit extension occurs easily, and at the group level in particular the parent bank and its subsidiaries are prone to "moving toward each other". In the prior art, research on bond default focuses on default prediction at bond granularity. In addition, because the disclosure of credit and bond market information is highly transparent, once a bond defaults or a substantial risk materializes, subsequent risk resolution and disposal are difficult and recovery is hard. Potential risk signals of a client must therefore be captured and pre-judged before a default event occurs, and the requirements on the timeliness, effectiveness and accuracy of risk early warning are far higher than those for credit business.
Disclosure of Invention
In view of this, the present invention provides a method, an apparatus, a device, a medium, and a product for identifying the default risk of a bond issuer, which can effectively identify defaults by bond issuers, can regularly generate a list of bond issuers at risk of default, and improve the ability to pre-judge issuers at risk of bond default, thereby improving the accuracy and timeliness of risk early warning and reducing the risk of capital loss due to bond default.
According to an aspect of the present invention, an embodiment of the present invention provides a method for identifying a default risk of a bond issuer, where the method includes:
respectively acquiring target characteristic variables of each bond issuer to be identified in the bond issuer set to be identified and each bond to be identified in the bond set to be identified as target in-line characteristic variables and target out-of-line characteristic variables;
inputting the target in-line characteristic variables into at least two pre-established target machine learning models respectively to obtain corresponding bond issuer default risk probabilities;
inputting the target out-of-line characteristic variables into at least two pre-established target machine learning models respectively to obtain corresponding bond default risk probabilities;
and generating a corresponding bond issuer default risk target list according to the bond issuer default risk probability and the bond default risk probability.
According to another aspect of the present invention, an embodiment of the present invention further provides a device for identifying a default risk of a bond issuer, where the device includes:
the variable acquisition module is used for respectively acquiring target characteristic variables of each bond issuer to be identified in the bond issuer set to be identified and of each bond to be identified in the bond set to be identified, as target in-line characteristic variables and target out-of-line characteristic variables;
the first probability determining module is used for respectively inputting the target in-line characteristic variables into at least two pre-established target machine learning models to obtain corresponding bond issuer default risk probabilities;
the second probability determination module is used for respectively inputting the target out-of-line characteristic variables into at least two pre-established target machine learning models to obtain corresponding bond default risk probabilities;
and the list generating module is used for generating a corresponding bond issuer default risk target list according to the bond issuer default risk probability and the bond default risk probability.
According to another aspect of the present invention, an embodiment of the present invention further provides an electronic device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor, the computer program being executed by the at least one processor to enable the at least one processor to perform the method for identifying a default risk of a bond issuer according to any embodiment of the present invention.
According to another aspect of the present invention, an embodiment of the present invention further provides a computer-readable storage medium, which stores computer instructions for causing a processor to implement the method for identifying a default risk of a bond issuer according to any embodiment of the present invention when executed.
According to another aspect of the present invention, an embodiment of the present invention further provides a computer program product, which includes a computer program, and when the computer program is executed by a processor, the computer program implements the method for identifying a default risk of a bond issuer according to any embodiment of the present invention.
According to the technical solution of the embodiments of the invention, the target in-line characteristic variables are respectively input into at least two target machine learning models to obtain the corresponding bond issuer default risk probability, and the target out-of-line characteristic variables are respectively input into at least two target machine learning models to obtain the corresponding bond default risk probability, so that the corresponding bond issuer default risk target list is generated according to the bond issuer default risk probability and the bond default risk probability. As a result, defaults by bond issuers can be effectively identified, a list of bond issuers at risk of default can be generated regularly, the ability to pre-judge issuers at risk of bond default is improved, the accuracy and timeliness of risk early warning are improved, and the risk of capital loss caused by bond default is reduced.
It should be understood that the statements in this section are not intended to identify key or critical features of the embodiments of the present invention, nor are they intended to limit the scope of the invention. Other features of the present invention will become apparent from the following description.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a method for identifying a default risk of a bond issuer according to an embodiment of the present invention;
Fig. 2 is a flowchart of another method for identifying a default risk of a bond issuer according to an embodiment of the present invention;
Fig. 3 is a flowchart of another method for identifying a default risk of a bond issuer according to an embodiment of the present invention;
Fig. 4 is a block diagram illustrating the structure of a device for identifying a default risk of a bond issuer according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It is to be understood that the terms "target" and the like in the description and claims of the present invention and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In the technical solution of the present invention, the acquisition, storage, use and processing of data all comply with the relevant national laws and regulations.
In an embodiment, fig. 1 is a flowchart of a method for identifying a default risk of a bond issuer according to an embodiment of the present invention, where the embodiment is applicable to a situation when identifying a default risk of a bond issuer, and the method may be implemented by a bond issuer default risk identification device, where the bond issuer default risk identification device may be implemented in a form of hardware and/or software, and the bond issuer default risk identification device may be configured in an electronic device.
As shown in fig. 1, the method specifically comprises the following steps:
S110, respectively acquiring target characteristic variables of each bond issuer to be identified in the bond issuer set to be identified and each bond to be identified in the bond set to be identified as target in-line characteristic variables and target out-of-line characteristic variables.
The bond issuer to be identified refers to a bond issuer whose default risk probability needs to be calculated, and may include governments, banks, companies and other bond issuers. The bond to be identified refers to a bond whose default risk probability needs to be calculated. A bond is a security issued by a government, enterprise, bank or other debtor according to legal procedures, promising creditors payment on a specified date. The bond to be identified may include, but is not limited to: government bonds, financial bonds, or corporate bonds.
In this embodiment, the to-be-identified bond issuers may constitute a to-be-identified bond issuer set, and the to-be-identified bonds may constitute a to-be-identified bond set. That is, the to-be-identified bond issuer set is composed of a plurality of to-be-identified bond issuers, and the to-be-identified bond set is composed of a plurality of to-be-identified bonds. It should be noted that each to-be-identified bond issuer in the to-be-identified bond issuer set has corresponding in-line data, which may be at customer granularity, and each item of in-line data corresponds to a target in-line characteristic variable. Each to-be-identified bond in the to-be-identified bond set has corresponding out-of-line data, which may be at bond granularity, and each item of out-of-line data corresponds to a target out-of-line characteristic variable.
In this embodiment, the target characteristic variables may include target in-line characteristic variables and target out-of-line characteristic variables. The target in-line characteristic variables may be understood as values of data internal to the bank, and the target out-of-line characteristic variables may be understood as values of external data from third-party organizations such as funds and government bodies. For example, the target out-of-line characteristic variables may include, but are not limited to: net operating cash flow / interest-bearing debt, total asset turnover rate, the profit value corresponding to net profit, and the number of days by which disclosure of the financial report is delayed. The target in-line characteristic variables may include, but are not limited to: deposit balance, number of loan products, and non-performing loan balance of corporate clients.
In this embodiment, the target characteristic variables of each to-be-identified bond issuer in the to-be-identified bond issuer set and each to-be-identified bond in the to-be-identified bond set may be respectively obtained as the target in-line characteristic variables and the target out-of-line characteristic variables. In some embodiments, the target characteristic variable corresponding to each bond issuer to be identified may be obtained from the set of bond issuers to be identified as the target in-line characteristic variable, and the target characteristic variable corresponding to each bond to be identified may be obtained from the set of bonds to be identified as the target out-of-line characteristic variable.
S120, inputting the target in-line characteristic variables into at least two pre-established target machine learning models respectively to obtain corresponding bond issuer default risk probabilities.
The target machine learning model can be understood as a machine learning model obtained by training on corresponding data. The bond issuer default risk probability can be understood as the issuer default risk probability that the target machine learning model outputs for the target in-line characteristic variables.
In some embodiments, the number of target machine learning models is two or more, and the target machine learning models may include: an XGBoost model, a random forest model, an SVM model, a neural network model, and a logistic regression model.
The XGBoost model is a gradient boosting model. The random forest model refers to a classifier model that trains on and predicts samples using a plurality of trees. The SVM model refers to a Support Vector Machine (SVM), a common discrimination method; in the field of machine learning, it is a supervised learning model generally used for pattern recognition, classification and regression analysis.
In this embodiment, two or more target machine learning models may be created in advance. The target in-line characteristic variables corresponding to each bond issuer to be identified are input into the two or more target machine learning models, which output the bond issuer default risk probability corresponding to each target in-line characteristic variable. For example, if the pre-created target machine learning models are an XGBoost model and a random forest model, the target in-line characteristic variables are input into the random forest model and the XGBoost model respectively, each model identifies them separately, and the bond issuer default risk probability is obtained.
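The scoring step described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not say how the outputs of the two or more models are combined, so the simple average used here is an assumption, and the toy logistic scorers, their weights and the feature names (`npl_balance`, `deposit_balance`) are hypothetical stand-ins for trained models such as XGBoost or a random forest.

```python
import math
from typing import Callable, Dict, List

Features = Dict[str, float]
Model = Callable[[Features], float]


def make_logistic_model(weights: Dict[str, float], bias: float) -> Model:
    """Build a toy scorer (sigmoid of a weighted sum) standing in for a trained model."""
    def predict(features: Features) -> float:
        z = bias + sum(w * features.get(name, 0.0) for name, w in weights.items())
        return 1.0 / (1.0 + math.exp(-z))
    return predict


def score_with_models(features: Features, models: List[Model]) -> float:
    """Score one issuer with every model and combine the probabilities.

    Averaging is an assumption; the source only requires at least two models.
    """
    if len(models) < 2:
        raise ValueError("at least two models are required")
    probabilities = [model(features) for model in models]
    return sum(probabilities) / len(probabilities)


# Hypothetical weights and normalized in-line features for one issuer.
model_a = make_logistic_model({"npl_balance": 1.2, "deposit_balance": -0.8}, bias=-1.0)
model_b = make_logistic_model({"npl_balance": 0.9, "deposit_balance": -0.5}, bias=-0.7)
issuer_features = {"npl_balance": 1.5, "deposit_balance": 0.3}
issuer_default_probability = score_with_models(issuer_features, [model_a, model_b])
```

Keeping each model behind a common callable interface makes it straightforward to swap in a different model type without changing the scoring code.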
In one embodiment, the creation process of the target machine learning model includes:
acquiring original in-line data and original out-of-line data;
performing data preprocessing on the original in-line data and the original out-of-line data to obtain corresponding target in-line data and target out-of-line data;
and respectively inputting the target in-line data and the target out-of-line data into the original machine learning model for training to obtain the corresponding target machine learning model.
The original in-line data may be understood as original data inside the financial institution, and the original out-of-line data may be understood as original data acquired from a third-party institution. The target in-line data may be understood as in-line data obtained by processing the original in-line data. The target out-of-line data may be understood as out-of-line data obtained by processing the original out-of-line data.
In this embodiment, the in-line data, whether original or target, may include but is not limited to: the client's credit investigation information, capital flows, deposit and loan data, ratings, product holdings and other data. Similarly, the out-of-line data, whether original or target, can be obtained along 3 dimensions, namely basic characteristic data of the bond, financial data and price data. The basic characteristic data of the bond may be the bond classification, industry classification, years since establishment, shareholder holding ratio and the like. The financial data may be financial factors such as return on equity (ROE), return on assets (ROA), gross profit margin on sales, net profit margin on sales and the like. The price data may be price factors such as the interval opening price, interval closing price, highest price, lowest price and the like, where the interval is generally one quarter. It should be noted that the original in-line data and the original out-of-line data may be preprocessed by means of data cleaning, data derivation, data integration, data transformation and the like to obtain the target in-line data and the target out-of-line data.
In this embodiment, the number of the original machine learning models corresponds to the number of the target machine learning models, the target in-line data is input into the original machine learning models for training to obtain corresponding target machine learning models, and the target out-of-line data is input into the original machine learning models for training to obtain corresponding target machine learning models.
In this embodiment, the original in-line data and the original out-of-line data may be obtained from the internal data of the bank and the external data of the third-party organization, the original in-line data and the original out-of-line data may be subjected to data preprocessing in the manners of data cleaning, data derivation, data integration, data transformation, and the like to obtain the target in-line data and the target out-of-line data, and the target in-line data and the target out-of-line data may be input into the original machine learning model respectively for training to obtain the corresponding target machine learning model.
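The training flow above, in which in-line and out-of-line data each train their own model, can be sketched as follows. This is a minimal illustration only: it uses a small hand-written logistic regression (one of the model types the text lists), the single-feature training samples are invented toy values, and all function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

LogisticModel = Tuple[List[float], float]  # (weights, bias)


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X: Sequence[Sequence[float]], y: Sequence[int],
                   lr: float = 0.1, epochs: int = 500) -> LogisticModel:
    """Fit a logistic regression with batch gradient descent."""
    n_samples, n_features = len(X), len(X[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            error = sigmoid(bias + sum(w * v for w, v in zip(weights, xi))) - yi
            for j, v in enumerate(xi):
                grad_w[j] += error * v
            grad_b += error
        weights = [w - lr * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n_samples
    return weights, bias


def predict_proba(model: LogisticModel, x: Sequence[float]) -> float:
    """Return the predicted default probability for one sample."""
    weights, bias = model
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


# Toy preprocessed data (invented): one feature per sample, label 1 = default.
in_line_X, in_line_y = [[0.0], [1.0], [2.0], [3.0]], [0, 0, 1, 1]
out_of_line_X, out_of_line_y = [[0.5], [1.0], [2.5], [3.5]], [0, 0, 1, 1]

# Issuer-level model trained on in-line data, bond-level model on out-of-line data.
issuer_model = train_logistic(in_line_X, in_line_y)
bond_model = train_logistic(out_of_line_X, out_of_line_y)
```

In practice each of the two data sources would train at least two such models; only one per source is shown here to keep the sketch short.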
In an embodiment, performing data preprocessing on the original in-line data and the original out-of-line data to obtain corresponding target in-line data and target out-of-line data includes:
sequentially carrying out data cleaning on the original in-line data and the original out-of-line data to obtain corresponding intermediate in-line data and intermediate out-of-line data;
and respectively carrying out data derivation on the intermediate in-line data and the intermediate out-of-line data according to a preset data derivation mode to obtain corresponding target in-line data and target out-of-line data.
Data cleaning can be understood as cleaning the data by filling in missing values, smoothing noisy data, identifying or deleting outliers and resolving inconsistencies. Its main aims are to standardize data formats, clean up abnormal data, correct errors and remove duplicate data. The intermediate in-line data refers to the in-line data obtained by performing data cleaning on the original in-line data. The intermediate out-of-line data refers to the out-of-line data obtained by performing data cleaning on the original out-of-line data.
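Two of the cleaning operations named above, removing duplicate records and filling in missing values, can be sketched as follows. The record layout, the field names (`issuer_id`, `roe`) and the choice of the median as the fill value are assumptions made for illustration; the text does not prescribe a specific filling rule.

```python
from statistics import median
from typing import Dict, List, Optional, Union

Record = Dict[str, Union[str, Optional[float]]]


def clean_records(records: List[Record], id_field: str,
                  numeric_fields: List[str]) -> List[Record]:
    """Drop duplicate records by ID, then fill missing numeric values."""
    # Step 1: keep only the first record seen for each ID.
    seen = set()
    unique: List[Record] = []
    for record in records:
        if record[id_field] not in seen:
            seen.add(record[id_field])
            unique.append(dict(record))  # copy so the input is not mutated

    # Step 2: fill missing values with the median of the observed values
    # (assumed rule; 0.0 is used if a field has no observed values at all).
    for field in numeric_fields:
        observed = [r[field] for r in unique if r.get(field) is not None]
        fill_value = median(observed) if observed else 0.0
        for record in unique:
            if record.get(field) is None:
                record[field] = fill_value
    return unique


# Hypothetical original data: ROE in percent, one duplicate, one missing value.
raw = [
    {"issuer_id": "A", "roe": 10.0},
    {"issuer_id": "A", "roe": 10.0},
    {"issuer_id": "B", "roe": None},
    {"issuer_id": "C", "roe": 30.0},
]
intermediate = clean_records(raw, "issuer_id", ["roe"])
```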
In this embodiment, the preset data derivation manner may be understood as a data derivation index set in advance with certain factors taken into account. Specifically, the preset data derivation manners corresponding to the intermediate in-line data and the intermediate out-of-line data may be set according to actual conditions or according to experience, which is not limited herein.
In this embodiment, the original in-line data and the original out-of-line data are sequentially subjected to data cleaning to obtain intermediate in-line data corresponding to the original in-line data and intermediate out-of-line data corresponding to the original out-of-line data, and the intermediate in-line data and the intermediate out-of-line data are subjected to data derivation according to a preset data derivation manner to obtain corresponding target in-line data and target out-of-line data. It should be noted that, when data derivation is performed on the intermediate in-line data and the intermediate out-of-line data, the corresponding data derivation indexes are considered first; for example, the data derivation indexes may be time parameters, financial parameters and other factors. Specifically, the time parameters corresponding to the intermediate in-line data and the intermediate out-of-line data are considered, so that data derivation is performed on the intermediate in-line data and the intermediate out-of-line data respectively using the time parameters and a preset data derivation manner.
In this embodiment, when data derivation is performed on the intermediate data, two categories of the intermediate data, namely financial data and price data, are generally selected for variable derivation, so as to obtain the corresponding target in-line data and target out-of-line data. It should be noted that after data derivation, the obtained target in-line data and target out-of-line data may contain fewer items than, more items than, or the same number of items as the original in-line data and the original out-of-line data.
For better understanding, consider an example in which data derivation is performed on the intermediate in-line data and the intermediate out-of-line data in a preset data derivation manner to obtain the corresponding target in-line data and target out-of-line data. Table one shows the data tables corresponding to the original in-line data and the original out-of-line data, respectively. Table two shows the data tables corresponding to the target in-line data and the target out-of-line data obtained by data derivation.
Table one: Data tables corresponding to the original in-line data and the original out-of-line data, respectively
(Table provided as images in the original publication: Figures BDA0003872806730000091 and BDA0003872806730000101)
Table two: Data tables corresponding to the derived target in-line data and target out-of-line data, respectively
(Table provided as an image in the original publication: Figure BDA0003872806730000102)
In an embodiment, performing data derivation on the intermediate in-line data and the intermediate out-of-line data according to a preset data derivation manner to obtain corresponding target in-line data and target out-of-line data includes:
acquiring time parameters corresponding to the intermediate in-line data and the intermediate out-of-line data;
and respectively carrying out data derivation on the intermediate in-line data and the intermediate out-of-line data according to a preset data derivation mode based on the time parameter to obtain corresponding target in-line data and target out-of-line data.
The time parameter can be understood as a derivative index corresponding to data derivation.
In this embodiment, time parameters corresponding to the intermediate in-line data and the intermediate out-of-line data are obtained, and data derivation is performed on the intermediate in-line data and the intermediate out-of-line data based on the time parameters and according to a preset data derivation manner, so as to obtain corresponding target in-line data and target out-of-line data.
For example, Table three illustrates data derivation performed on the intermediate in-line data and the intermediate out-of-line data using a time parameter and a preset data derivation manner. Taking the financial factor ROE as an example, the ROE values of the most recent 1, 2, 3 and 4 periods are taken as basic variables, and then, with the most recent period as the reference, a plurality of derived variables of the ROE variable are generated; see Table three for details.
Table three: Data derivation performed on the intermediate in-line data and the intermediate out-of-line data using a time parameter and a preset data derivation manner
(Table provided as images in the original publication: Figures BDA0003872806730000111 and BDA0003872806730000121)
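The time-based derivation of ROE variables can be sketched as follows. Because Table three is not reproduced in the text, the specific derived variables shown here (differences between the most recent period and each earlier period, plus the mean over all periods) and the naming scheme are assumptions chosen for illustration, not the derivations the table actually lists.

```python
from typing import Dict, Sequence


def derive_time_features(name: str, values: Sequence[float]) -> Dict[str, float]:
    """Derive variables from a factor's recent history.

    values[0] is the most recent period, values[1] the period before, and so on.
    """
    if len(values) < 2:
        raise ValueError("at least two periods are needed to derive variables")
    latest = values[0]

    # Basic variables: the factor value in each of the recent periods.
    features = {f"{name}_p{i}": v for i, v in enumerate(values, start=1)}

    # Derived variables relative to the most recent period (assumed forms).
    for i, value in enumerate(values[1:], start=2):
        features[f"{name}_diff_p1_p{i}"] = latest - value
    features[f"{name}_mean_{len(values)}p"] = sum(values) / len(values)
    return features


# Hypothetical ROE values (percent) for the most recent 4 periods.
roe_features = derive_time_features("roe", [8.0, 10.0, 12.0, 6.0])
```

The same function can be applied to price factors such as the interval closing price, since it depends only on the ordering of periods.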
S130, inputting the target out-of-line characteristic variables into at least two pre-established target machine learning models respectively to obtain corresponding bond default risk probabilities.
The bond default risk probability can be understood as the bond default risk probability that the target machine learning model outputs for the target out-of-line characteristic variables.
In this embodiment, two or more target machine learning models may be created in advance. The target out-of-line characteristic variables corresponding to each bond to be identified are input into the two or more target machine learning models, which output the bond default risk probability corresponding to each target out-of-line characteristic variable. For example, if the pre-created target machine learning models are an XGBoost model and a random forest model, the target out-of-line characteristic variables are input into the random forest model and the XGBoost model respectively, each model identifies them separately, and the bond default risk probability is obtained.
S140, generating a corresponding bond issuer default risk target list according to the bond issuer default risk probability and the bond default risk probability.
The bond issuer default risk target list can be understood as the resulting risk list of bond issuers with the highest default risk.
In this embodiment, the corresponding bond issuer default risk probability may be obtained from the target in-line characteristic variables identified by the at least two target machine learning models, and the corresponding bond default risk probability may be obtained from the target out-of-line characteristic variables identified by the at least two target machine learning models, so that the corresponding bond issuer default risk target list may be generated according to the obtained bond issuer default risk probability and bond default risk probability. Specifically, the to-be-identified bond issuers form the to-be-identified bond issuer set, and a corresponding bond issuer default risk initial list is generated according to the bond issuer default risk probability and the to-be-identified bond issuer set. The to-be-identified bonds form the to-be-identified bond set, and a corresponding bond default risk initial list is generated according to the bond default risk probability, a preset bond default risk probability threshold and the to-be-identified bond set. The corresponding bond issuer default risk target list is then generated from the bond issuer default risk initial list and the bond default risk initial list.
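The list generation described above can be sketched as follows. Several points are assumptions because this passage does not fix them: the issuer initial list is formed here with a threshold, each flagged bond is mapped back to its issuer through a hypothetical `bond_to_issuer` mapping, and the two initial lists are merged by taking their union. The default threshold values are placeholders.

```python
from typing import Dict, List


def build_target_list(issuer_probs: Dict[str, float],
                      bond_probs: Dict[str, float],
                      bond_to_issuer: Dict[str, str],
                      issuer_threshold: float = 0.5,
                      bond_threshold: float = 0.5) -> List[str]:
    """Combine issuer-level and bond-level results into one issuer risk list."""
    # Issuer default risk initial list (thresholding is an assumption).
    issuer_initial = {i for i, p in issuer_probs.items() if p >= issuer_threshold}

    # Bond default risk initial list, using the preset probability threshold.
    bond_initial = {b for b, p in bond_probs.items() if p >= bond_threshold}

    # Map flagged bonds to their issuers and merge (union is an assumption).
    issuers_from_bonds = {bond_to_issuer[b] for b in bond_initial}
    return sorted(issuer_initial | issuers_from_bonds)


# Hypothetical probabilities and bond-to-issuer mapping.
target_list = build_target_list(
    issuer_probs={"A": 0.7, "B": 0.2, "C": 0.1},
    bond_probs={"b1": 0.9, "b2": 0.3},
    bond_to_issuer={"b1": "B", "b2": "C"},
)
```

With a union, an issuer is listed if either its own in-line signal or any of its bonds' out-of-line signals is high; an intersection would be the stricter alternative.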
According to the technical scheme of the embodiment, the target in-line characteristic variables are respectively input into the at least two target machine learning models to obtain the corresponding default risk probability of the bond issuer, the target out-of-line characteristic variables are respectively input into the at least two target machine learning models to obtain the corresponding default risk probability of the bond issuer, and the corresponding target list of the bond issuer default risk is generated according to the default risk probability of the bond issuer and the bond default risk probability, so that the default condition of the bond issuer can be effectively identified, the default list of the bond issuer can be regularly generated, the prequalification of the bond default risk issuer is improved, the accuracy and timeliness of risk early warning are improved, and the risk of fund loss caused by bond default is reduced.
In an embodiment, before respectively determining the target characteristic variables of each bond issuer to be identified and each bond to be identified as the target in-line characteristic variables and the target out-of-line characteristic variables, the method further includes:
periodically reading the bond issuers to be identified and the bonds to be identified from a target bond database;
and respectively combining the bond issuers to be identified and the bonds to be identified to obtain a corresponding bond issuer set to be identified and a corresponding bond set to be identified.
The target bond database may be understood as a bond database for storing bond issuers and bonds.
In this embodiment, the bond issuers to be identified and the bonds to be identified in the target bond database may be obtained automatically at regular intervals according to a time granularity; the bond issuers to be identified are combined to obtain a corresponding bond issuer set to be identified, and the bonds to be identified are combined to obtain a corresponding bond set to be identified. It should be noted that the time granularity may be defined in days or months, which is not limited in this embodiment.
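The periodic read and the set construction can be sketched as below. The row layout `(issuer_id, bond_id)`, the function names, and the rule for deciding when a refresh is due are assumptions made for illustration; the source only states that the granularity may be days or months.

```python
from datetime import date
from typing import List, Set, Tuple


def is_refresh_due(last_run: date, today: date, granularity: str) -> bool:
    """Decide whether a new read is due for a 'day' or 'month' granularity."""
    if granularity == "day":
        return today > last_run
    if granularity == "month":
        return (today.year, today.month) > (last_run.year, last_run.month)
    raise ValueError(f"unsupported granularity: {granularity}")


def build_sets(rows: List[Tuple[str, str]]) -> Tuple[Set[str], Set[str]]:
    """Combine (issuer_id, bond_id) rows into an issuer set and a bond set."""
    issuer_set = {issuer_id for issuer_id, _ in rows}
    bond_set = {bond_id for _, bond_id in rows}
    return issuer_set, bond_set


# Invented rows standing in for a read from the target bond database.
rows = [("ISS_A", "B001"), ("ISS_A", "B002"), ("ISS_B", "B003")]
issuers, bonds = build_sets(rows)
```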
In an embodiment, the method further comprises:
determining the average influence value of the target in-line data and the target out-of-line data on the default risk probability;
determining a risk data source matched with each target machine learning model according to the average influence value; wherein, the risk data source includes: a target in-line feature variable and a target out-of-line feature variable.
In this embodiment, the average influence value may also be referred to as a SHAP value and can be used to identify high-risk factors. The average influence value indicates the degree to which each item of target in-line data and target out-of-line data influences the corresponding target machine learning model, which helps the business side quickly trace the source of risk data. It should be noted that the influence of the target in-line characteristic variables and the target out-of-line characteristic variables on the identification result can be displayed intuitively in numerical form.
In this embodiment, the average influence value of the target in-line data and the target out-of-line data on the default risk probability is determined, and the risk data source matched with each corresponding target machine learning model is determined according to the average influence value. It can be understood that the average influence value makes it possible to determine whether the risk data originates from the target in-line data or the target out-of-line data. Illustratively, the original in-line data comprises data items 1 to 10 and the original out-of-line data comprises data items 1 to 10; according to the average influence value of the target in-line data and the target out-of-line data on the default risk probability, 6 items and 5 items are selected from the original in-line data and the original out-of-line data respectively as the target in-line characteristic variables and the target out-of-line characteristic variables.
It should be noted that the greater the risk degree corresponding to the target in-line characteristic variable and the target out-of-line characteristic variable, the higher the average influence value corresponding to the target in-line characteristic variable and the target out-of-line characteristic variable, and conversely, the smaller the risk degree corresponding to the target in-line characteristic variable and the target out-of-line characteristic variable, the lower the average influence value corresponding to the target in-line characteristic variable and the target out-of-line characteristic variable.
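One way to read the selection by average influence value is sketched below. It assumes that per-sample contribution values (for example SHAP-style attributions) have already been computed for each characteristic variable, and it interprets the average influence value as the mean absolute contribution; both points are assumptions, and the feature names and numbers are invented.

```python
from typing import Dict, List, Sequence


def mean_abs_influence(
    contributions: Sequence[Sequence[float]],
    feature_names: Sequence[str],
) -> Dict[str, float]:
    """Average the absolute per-sample contribution of each feature."""
    n = len(contributions)
    if n == 0:
        raise ValueError("at least one sample is required")
    return {
        name: sum(abs(row[j]) for row in contributions) / n
        for j, name in enumerate(feature_names)
    }


def top_features(influence: Dict[str, float], k: int) -> List[str]:
    """Return the k features with the highest average influence value."""
    return sorted(influence, key=lambda name: influence[name], reverse=True)[:k]


# Invented contributions: one row per sample, one column per feature.
feature_names = ["leverage", "cash_ratio", "rating_change"]
contributions = [
    [0.30, -0.05, 0.10],
    [0.20, 0.01, -0.30],
    [-0.40, 0.03, 0.20],
]
influence = mean_abs_influence(contributions, feature_names)
selected = top_features(influence, 2)
```

With these numbers, `leverage` and `rating_change` rank highest, so they would be kept as target characteristic variables under this reading.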
In an embodiment, fig. 2 is a flowchart of another method for identifying a default risk of a bond issuer according to an embodiment of the present invention, and in this embodiment, on the basis of the foregoing embodiments, a target list of default risk of the bond issuer generated according to a bond issuer default risk probability and a bond default risk probability is further refined, as shown in fig. 2, the method for identifying a default risk of a bond issuer in this embodiment may specifically include the following steps:
S210, respectively acquiring the target characteristic variables of each bond issuer to be identified in the bond issuer set to be identified and of each bond to be identified in the bond set to be identified as target in-line characteristic variables and target out-of-line characteristic variables.
S220, respectively inputting the target in-line characteristic variables into at least two pre-created target machine learning models to obtain the corresponding bond issuer default risk probability.
S230, respectively inputting the target out-of-line characteristic variables into at least two pre-created target machine learning models to obtain the corresponding bond default risk probability.
S240, generating a corresponding default risk initial list of the bond issuer according to the default risk probability of the bond issuer and the bond issuer set to be identified.
In this embodiment, the initial list of default risks of the bond issuers refers to a list of default risks of the bond issuers preliminarily screened from the set of bond issuers to be identified.
In this embodiment, the target in-line characteristic variables are respectively input into at least two target machine learning models to obtain the corresponding bond issuer default risk probability, and the preliminarily screened bond issuer default risk initial list is generated from the obtained bond issuer default risk probability and the bond issuer set to be identified.
Specifically, the bond issuers meeting the requirement may be selected from the bond issuer set to be identified according to the bond issuer default risk probability output by each of the at least two target machine learning models, so as to generate a corresponding bond issuer default risk initial list; alternatively, the bond issuer default risk probability output by each target machine learning model may be screened according to a preset bond issuer default risk probability threshold to obtain a corresponding bond issuer default risk initial list.
In one embodiment, generating a corresponding initial list of bond issuer default risks according to the bond issuer default risk probability and the set of bond issuers to be identified includes:
selecting, according to the bond issuer default risk probability output by each target machine learning model, the bond issuers in the top first proportion of the bond issuer set to be identified, and generating a corresponding bond issuer default risk initial list.
The first proportion can be understood as the proportion of bond issuers selected from the bond issuer set to be identified. The first proportion may be set according to experience or according to requirements, which is not limited in this embodiment.
In this embodiment, the bond issuers may be selected separately for the bond issuer default risk probability output by each target machine learning model to generate a corresponding bond issuer default risk initial list; by selecting the bond issuers in the top first proportion, an initial list of bond issuers with higher default risk can be obtained, thereby improving accuracy. Illustratively, the first proportion is set to 5%, and there are 4 target machine learning models, namely 2 XGBoost models and 2 random forest models. The target in-line characteristic variables are input into one XGBoost model and one random forest model to obtain the corresponding bond issuer default risk probabilities, and the target out-of-line characteristic variables are input into the other XGBoost model and the other random forest model to obtain the bond default risk probabilities; for each model, the top 5% of bond issuers in the bond issuer set to be identified, ranked by bond issuer default risk probability, are selected to generate the corresponding bond issuer default risk initial list.
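Selecting the top first proportion for a single model's output can be sketched as follows. The function name, the issuer identifiers, and the choice to round the cut-off up are assumptions; the source only states that the top proportion (5% in the example) is taken.

```python
import math
from typing import Dict, List


def top_proportion(probs: Dict[str, float], proportion: float) -> List[str]:
    """Return the issuers in the top `proportion`, ranked by default risk probability."""
    if not 0 < proportion <= 1:
        raise ValueError("proportion must be in (0, 1]")
    # Rounding up is an assumption; the small epsilon guards against float noise.
    k = math.ceil(len(probs) * proportion - 1e-9)
    ranked = sorted(probs, key=lambda issuer: probs[issuer], reverse=True)
    return ranked[:k]


# Invented probabilities for 40 issuers output by one model.
probs = {f"ISS_{i:02d}": i / 40 for i in range(40)}
initial_list = top_proportion(probs, 0.05)  # 5% of 40 issuers = 2 issuers
```

The same function would be applied once per target machine learning model, giving one initial list per model.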
In one embodiment, generating a corresponding initial list of bond issuer default risks according to the bond issuer default risk probability and the set of bond issuers to be identified includes:
screening the bond issuer default risk probability output by each target machine learning model according to a preset bond issuer default risk probability threshold to obtain a corresponding bond issuer default risk initial list.
The preset bond issuer default risk probability threshold refers to a threshold corresponding to the bond issuer default risk probability. The threshold may be set manually or automatically according to experience and requirements, which is not limited in this embodiment. It should be noted that the preset bond issuer default risk probability threshold and the preset bond default risk probability threshold may be the same or different. Illustratively, the preset bond issuer default risk probability threshold and the preset bond default risk probability threshold may both be 0.5; alternatively, the preset bond issuer default risk probability threshold may be 0.18 and the preset bond default risk probability threshold may be 0.35, which is not limited in this embodiment.
In this embodiment, the bond issuer default risk probability output by each target machine learning model may be screened according to a preset bond issuer default risk probability threshold to obtain a corresponding bond issuer default risk initial list. Illustratively, taking preset bond issuer default risk probability thresholds of 0.35 and 0.11 as an example: for the XGBoost model (corresponding to the target in-line characteristic variables), default issuers with a bond issuer default risk probability > 0.35 are selected; for the random forest model (corresponding to the target in-line characteristic variables), default issuers with a bond issuer default risk probability > 0.11 are selected.
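The threshold-based screening with one threshold per model can be sketched as follows, using the example thresholds 0.35 and 0.11 from the text. The function name, the model keys, and the probabilities are invented for illustration.

```python
from typing import Dict, Set


def screen_by_threshold(
    probs_by_model: Dict[str, Dict[str, float]],
    threshold_by_model: Dict[str, float],
) -> Dict[str, Set[str]]:
    """Keep, per model, the issuers whose probability exceeds that model's threshold."""
    return {
        model: {
            issuer
            for issuer, p in probs.items()
            if p > threshold_by_model[model]
        }
        for model, probs in probs_by_model.items()
    }


# Invented bond issuer default risk probabilities from two models.
probs_by_model = {
    "xgboost": {"ISS_A": 0.40, "ISS_B": 0.20, "ISS_C": 0.36},
    "random_forest": {"ISS_A": 0.10, "ISS_B": 0.12, "ISS_C": 0.30},
}
thresholds = {"xgboost": 0.35, "random_forest": 0.11}
initial_lists = screen_by_threshold(probs_by_model, thresholds)
```

Each model gets its own initial list here because the two models are not calibrated to the same probability scale, which is presumably why the example uses different thresholds.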
S250, generating a corresponding bond default risk initial list according to the bond default risk probability, a preset bond default risk probability threshold and the bond set to be identified;
wherein, the initial list of bond default risks refers to a list of bond default risks preliminarily screened from the set of bonds to be identified.
In this embodiment, a corresponding bond default risk initial list may be generated according to the bond default risk probability, the preset bond default risk probability threshold, and the bond set to be identified. The preset bond default risk probability threshold in this embodiment refers to a threshold corresponding to the bond default risk probability; the threshold may be set manually or automatically according to experience and requirements, which is not limited in this embodiment. Illustratively, taking preset bond default risk probability thresholds of 0.18 and 0.4 as an example: for the XGBoost model (corresponding to the target out-of-line characteristic variables), default issuers with a bond default risk probability > 0.18 are selected; for the random forest model (corresponding to the target out-of-line characteristic variables), default issuers with a bond default risk probability > 0.4 are selected.
S260, generating a corresponding bond issuer default risk target list according to the bond issuer default risk initial list and the bond default risk initial list.
In this embodiment, a corresponding bond issuer default risk candidate list is obtained from each bond issuer default risk initial list, and similarly, a corresponding bond default risk candidate list is obtained from each bond default risk initial list, so as to generate a corresponding bond issuer default risk target list according to the bond issuer default risk candidate list and the bond default risk candidate list.
In one embodiment, generating a corresponding target list of the default risk of the bond issuer according to the initial list of the default risk of the bond issuer and the initial list of the default risk of the bond, includes:
executing union operation on the bond issuers in each bond issuer default risk initial list to obtain a corresponding bond issuer default risk candidate list;
executing union operation on the bonds in each bond default risk initial list to obtain a corresponding bond default risk candidate list;
and generating a corresponding bond issuing body default risk target list according to the bond issuing body default risk candidate list and the bond default risk candidate list.
The bond issuer default risk candidate list refers to a candidate list obtained by selecting a union of bond issuers in each bond issuer default risk initial list. The bond default risk candidate list refers to a candidate list obtained by selecting a union set consisting of bonds in each bond default risk initial list.
In this embodiment, a union operation is performed on the bond issuers in each bond issuer default risk initial list to obtain a corresponding bond issuer default risk candidate list, a union operation is performed on the bonds in each bond default risk initial list to obtain a corresponding bond default risk candidate list, and an intersection between the bond issuer default risk candidate list and the bond default risk candidate list is selected to generate a corresponding bond issuer default risk target list according to the intersection.
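The union and intersection steps can be sketched as follows. The source does not say how bond-level entries are compared with issuer-level entries, so the `issuer_of_bond` mapping used here to translate risky bonds into their issuers before intersecting is an assumption, as are all identifiers.

```python
from typing import Dict, Iterable, Set


def union_all(lists: Iterable[Iterable[str]]) -> Set[str]:
    """Union of all initial lists, giving the candidate list."""
    result: Set[str] = set()
    for items in lists:
        result |= set(items)
    return result


def build_target_list(
    issuer_initial_lists: Iterable[Iterable[str]],
    bond_initial_lists: Iterable[Iterable[str]],
    issuer_of_bond: Dict[str, str],
) -> Set[str]:
    """Intersect issuer candidates with the issuers of candidate bonds."""
    issuer_candidates = union_all(issuer_initial_lists)
    bond_candidates = union_all(bond_initial_lists)
    # Assumption: each bond is mapped to its issuer before intersecting.
    issuers_of_risky_bonds = {issuer_of_bond[b] for b in bond_candidates}
    return issuer_candidates & issuers_of_risky_bonds


# Invented initial lists, one per target machine learning model.
issuer_lists = [{"ISS_A", "ISS_C"}, {"ISS_B", "ISS_C"}]
bond_lists = [{"B001"}, {"B001", "B005"}]
issuer_of_bond = {"B001": "ISS_A", "B005": "ISS_D"}
target = build_target_list(issuer_lists, bond_lists, issuer_of_bond)
```

The union keeps any issuer or bond flagged by at least one model, while the intersection keeps only issuers flagged at both the issuer level and the bond level.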
According to the technical solution of this embodiment, the corresponding bond issuer default risk initial list is generated from the bond issuer default risk probability and the bond issuer set to be identified, the corresponding bond default risk initial list is generated according to the bond default risk probability, the preset bond default risk probability threshold and the bond set to be identified, and the corresponding bond issuer default risk target list is generated according to the bond issuer default risk initial list and the bond default risk initial list. In this way, the bond issuers with the highest default rate can be identified, and the predictability of high-risk issuers can be effectively improved.
In an embodiment, in order to better understand the method for identifying a default risk of a bond issuer, fig. 3 is a flowchart of another method for identifying a default risk of a bond issuer according to an embodiment of the present invention, and the method for identifying a default risk of a bond issuer will be further described with reference to the preferred embodiment. The XGBoost model and the random forest model in this embodiment are target machine learning models in the above embodiments.
The identification of the default risk of the bond issuer in this embodiment may specifically include the following steps:
a1, acquiring the target characteristic variables of each bond issuer to be identified in the bond issuer set to be identified and the target characteristic variables of each bond to be identified in the bond set to be identified, and respectively taking them as target in-line characteristic variables and target out-of-line characteristic variables.
a2, respectively inputting the target in-line characteristic variables into the XGBoost model and the random forest model to obtain the corresponding bond issuer default risk probability.
a3, respectively inputting the target out-of-line characteristic variables into the XGBoost model and the random forest model to obtain the corresponding bond default risk probability.
a4, generating a corresponding bond issuer default risk target list according to the bond issuer default risk probability and the bond default risk probability.
In this embodiment, each target machine learning model corresponds to different target characteristic variables. The target in-line characteristic variables are at customer granularity, so default issuers can be identified directly; the target out-of-line characteristic variables are at bond granularity and are used to identify default bonds. The bond issuers with the highest default rate are obtained from the default issuers and the default bonds.
In an embodiment, fig. 4 is a block diagram of a device for identifying a default risk of a bond issuer according to an embodiment of the present invention. The device is suitable for identifying the default risk of a bond issuer and may be implemented by hardware and/or software. The device for identifying the default risk of the bond issuer can be configured in an electronic device. As shown in fig. 4, the device includes: a variable acquisition module 410, a first probability determination module 420, a second probability determination module 430, and a list generation module 440.
The variable acquisition module 410 is configured to respectively acquire the target characteristic variables of each bond issuer to be identified in the bond issuer set to be identified and of each bond to be identified in the bond set to be identified as target in-line characteristic variables and target out-of-line characteristic variables;
the first probability determination module 420 is configured to respectively input the target in-line characteristic variables into at least two pre-created target machine learning models to obtain the corresponding bond issuer default risk probability;
the second probability determination module 430 is configured to respectively input the target out-of-line characteristic variables into at least two pre-created target machine learning models to obtain the corresponding bond default risk probability;
and the list generation module 440 is configured to generate a corresponding bond issuer default risk target list according to the bond issuer default risk probability and the bond default risk probability.
According to this embodiment of the invention, the first probability determination module respectively inputs the target in-line characteristic variables into at least two target machine learning models to obtain the corresponding bond issuer default risk probability, the second probability determination module respectively inputs the target out-of-line characteristic variables into at least two target machine learning models to obtain the corresponding bond default risk probability, and the list generation module generates the corresponding bond issuer default risk target list according to the bond issuer default risk probability and the bond default risk probability. In this way, the default condition of bond issuers can be effectively identified and a bond issuer default list can be generated periodically, which improves the ability to anticipate bond issuers at risk of default, improves the accuracy and timeliness of risk early warning, and reduces the risk of fund loss caused by bond default.
In one embodiment, the apparatus further comprises:
the information reading module is used for regularly reading and acquiring the bond issuer to be identified and the bond to be identified in the target bond database before respectively determining the target characteristic variable of each bond issuer to be identified and each bond to be identified as the target in-row characteristic variable and the target out-of-row characteristic variable;
and the set obtaining module is used for respectively combining the bond issuers to be identified and the bonds to be identified to obtain a corresponding bond issuer set to be identified and a corresponding bond set to be identified.
In one embodiment, the list generation module 440 includes:
a first list generating unit, configured to generate a corresponding bond issuer default risk initial list according to the bond issuer default risk probability and the bond issuer set to be identified;
a second list generating unit, configured to generate a corresponding initial list of bond default risks according to the bond default risk probability, a preset bond default risk probability threshold, and the to-be-identified bond set;
and the target list generating unit is used for generating a corresponding bond issuer default risk target list according to the bond issuer default risk initial list and the bond default risk initial list.
In an embodiment, the first list generating unit includes:
and the first list generation subunit is used for selecting, according to the bond issuer default risk probability output by each target machine learning model, the bond issuers in the top first proportion of the bond issuer set to be identified, and generating a corresponding bond issuer default risk initial list.
In an embodiment, the first list generating unit includes:
and the second list generation subunit is used for screening the bond issuer default risk probability output by each target machine learning model according to a preset bond issuer default risk probability threshold to obtain a corresponding bond issuer default risk initial list.
In one embodiment, the target list generating unit includes:
the first list determining subunit is used for executing union operation on the bond issuers in each bond issuer default risk initial list to obtain a corresponding bond issuer default risk candidate list;
the second list determining subunit is configured to perform union operation on the bonds in each bond default risk initial list to obtain a corresponding bond default risk candidate list;
and the target list determining subunit is used for generating a corresponding bond issuer default risk target list according to the bond issuer default risk candidate list and the bond default risk candidate list.
In one embodiment, for the creation process of the target machine learning model, the device includes:
the data acquisition unit is used for acquiring original in-line data and original out-of-line data;
a data obtaining unit, configured to perform data preprocessing on the original in-line data and the original out-of-line data to obtain corresponding target in-line data and target out-of-line data;
and the model determining unit is used for respectively inputting the target in-line data and the target out-of-line data into an original machine learning model for training to obtain a corresponding target machine learning model.
In one embodiment, the apparatus further comprises:
an influence value determination unit, configured to determine an average influence value of the target in-line data and the target out-of-line data on the default risk probability;
the data source determining unit is used for determining a risk data source matched with each target machine learning model according to the average influence value; wherein the risk data sources include: a target in-line feature variable and a target out-of-line feature variable.
In one embodiment, the data obtaining unit includes:
a first data obtaining subunit, configured to perform data cleaning on the original in-line data and the original out-of-line data in sequence to obtain corresponding intermediate in-line data and intermediate out-of-line data;
and the second data obtaining subunit is used for respectively carrying out data derivation on the intermediate in-line data and the intermediate out-of-line data according to a preset data derivation mode to obtain corresponding target in-line data and target out-of-line data.
In an embodiment, the second data obtaining subunit is specifically configured to:
acquiring time parameters corresponding to the intermediate in-line data and the intermediate out-of-line data;
and respectively carrying out data derivation on the intermediate in-line data and the intermediate out-of-line data according to a preset data derivation mode based on the time parameter to obtain corresponding target in-line data and target out-of-line data.
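Time-based data derivation can be sketched as follows. The source does not specify the preset data derivation mode, so the rolling-window mean and maximum used here, the `"YYYY-MM"` time parameter, and the field names are all assumptions chosen for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def derive_window_features(
    series: Sequence[Tuple[str, float]],
    windows: Sequence[int],
) -> Dict[str, float]:
    """Derive mean and max over the most recent `w` months for each window size."""
    if not series:
        raise ValueError("series must not be empty")
    # "YYYY-MM" strings sort chronologically, so sorting orders by time.
    ordered: List[float] = [value for _, value in sorted(series)]
    features: Dict[str, float] = {}
    for w in windows:
        recent = ordered[-w:]
        features[f"mean_last_{w}m"] = sum(recent) / len(recent)
        features[f"max_last_{w}m"] = max(recent)
    return features


# Invented intermediate data: one cleaned monthly value per time parameter.
monthly_balance = [
    ("2022-01", 100.0),
    ("2022-02", 80.0),
    ("2022-03", 60.0),
    ("2022-04", 40.0),
]
derived = derive_window_features(monthly_balance, [3])
```

Derived variables of this kind turn a raw time series into fixed-length inputs that tree-based models such as XGBoost and random forests can consume.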
In one embodiment, the target machine learning model comprises: an XGBoost model; a random forest model; an SVM model; a neural network model; and a logistic regression model.
The device for identifying the default risk of the bond issuer provided by the embodiment of the invention can execute the method for identifying the default risk of the bond issuer provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
In an embodiment, fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. The electronic device 10 is intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the inventions described and/or claimed herein.
As shown in fig. 5, the electronic device 10 includes at least one processor 11, and a memory communicatively connected to the at least one processor 11, such as a Read Only Memory (ROM) 12, a Random Access Memory (RAM) 13, and the like, wherein the memory stores a computer program executable by the at least one processor, and the processor 11 can perform various suitable actions and processes according to the computer program stored in the Read Only Memory (ROM) 12 or the computer program loaded from a storage unit 18 into the Random Access Memory (RAM) 13. In the RAM 13, various programs and data necessary for the operation of the electronic apparatus 10 can also be stored. The processor 11, the ROM 12, and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to bus 14.
A number of components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, or the like; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The processor 11 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The processor 11 performs the various methods and processes described above, such as the bond issuer default risk identification method.
In some embodiments, the bond issuer default risk identification method may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into RAM 13 and executed by processor 11, one or more steps of the bond issuer default risk identification method described above may be performed. Alternatively, in other embodiments, the processor 11 may be configured to perform the bond issuer default risk identification method in any other suitable manner (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on a Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
A computer program for implementing the methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable bond issuer default risk identification device, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be performed. A computer program can execute entirely on a machine, partly on a machine, as a stand-alone software package, partly on a machine and partly on a remote machine, or entirely on a remote machine or server.
In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. A computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the Internet.
The computing system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system and overcomes the high management difficulty and weak service scalability of traditional physical hosts and VPS services.
In an embodiment, the present invention further provides a computer program product that includes a computer program which, when executed by a processor, implements the bond issuer default risk identification method according to any embodiment of the present invention.
The computer program code of the computer program product for carrying out operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present invention may be executed in parallel, sequentially, or in different orders, and are not limited herein as long as the desired results of the technical solution of the present invention can be achieved.
The above-described embodiments should not be construed as limiting the scope of the invention. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (16)

1. A bond issuer default risk identification method is characterized by comprising the following steps:
respectively acquiring target characteristic variables of each bond issuer to be identified in the bond issuer set to be identified and each bond to be identified in the bond set to be identified as target in-line characteristic variables and target out-of-line characteristic variables;
inputting the target in-line characteristic variables into at least two pre-established target machine learning models respectively to obtain corresponding bond issuer default risk probabilities;
inputting the target out-of-line characteristic variables into at least two pre-established target machine learning models respectively to obtain corresponding bond default risk probabilities;
and generating a corresponding bond issuer default risk target list according to the bond issuer default risk probability and the bond default risk probability.
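As an illustration of the scoring steps of claim 1, the following is a minimal sketch, not the claimed implementation. It assumes each "target machine learning model" can be treated as a callable that maps a feature dictionary to a probability. The function `score_with_models`, the two stand-in models, and the feature names `debt_ratio` and `cash_ratio` are all hypothetical and do not come from the specification.

```python
from typing import Callable, Dict, Mapping

Features = Mapping[str, float]
Model = Callable[[Features], float]  # returns a default risk probability in [0, 1]


def score_with_models(
    models: Mapping[str, Model],
    features_by_id: Mapping[str, Features],
) -> Dict[str, Dict[str, float]]:
    """Run every model over every entity and collect one probability table per model."""
    if len(models) < 2:
        raise ValueError("claim 1 calls for at least two models")
    return {
        name: {entity_id: model(feats) for entity_id, feats in features_by_id.items()}
        for name, model in models.items()
    }


# Hypothetical stand-in models; real ones would be trained classifiers.
def leverage_model(f: Features) -> float:
    return min(1.0, f["debt_ratio"])


def liquidity_model(f: Features) -> float:
    return max(0.0, 1.0 - f["cash_ratio"])


# Hypothetical in-line characteristic variables for two issuers.
issuer_features = {
    "issuer_a": {"debt_ratio": 0.9, "cash_ratio": 0.2},
    "issuer_b": {"debt_ratio": 0.3, "cash_ratio": 0.7},
}
issuer_probs = score_with_models(
    {"leverage": leverage_model, "liquidity": liquidity_model}, issuer_features
)
```

The same function could be applied a second time to the out-of-line characteristic variables of each bond, giving one probability table per model for bonds as well.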
2. The method according to claim 1, further comprising, before the respectively acquiring target characteristic variables of each bond issuer to be identified and each bond to be identified as the target in-line characteristic variables and the target out-of-line characteristic variables:
periodically reading a to-be-identified bond issuer and a to-be-identified bond from a target bond database;
and respectively combining the to-be-identified bond issuer and the to-be-identified bond to obtain a corresponding to-be-identified bond issuer set and a corresponding to-be-identified bond set.
3. The method according to claim 1 or 2, wherein the generating of the corresponding target list of bond issuer default risk according to the bond issuer default risk probability and the bond default risk probability comprises:
generating a corresponding default risk initial list of the bond issuers according to the default risk probability of the bond issuers and the bond issuer set to be identified;
generating a corresponding bond default risk initial list according to the bond default risk probability, a preset bond default risk probability threshold and the bond set to be identified;
and generating a corresponding bond issuer default risk target list according to the bond issuer default risk initial list and the bond default risk initial list.
4. The method of claim 3, wherein the generating of the corresponding initial list of bond issuer default risks according to the bond issuer default risk probability and the set of bond issuers to be identified comprises:
and selecting, according to the bond issuer default risk probability output by each target machine learning model, the bond issuers that fall within a first proportion of the bond issuer set to be identified, and generating a corresponding bond issuer default risk initial list.
5. The method of claim 3, wherein generating a corresponding initial list of bond issuer default risks according to the probability of bond issuer default risk and the set of bond issuers to be identified comprises:
and screening the bond issuer default risk probability output by each target machine learning model against a preset bond issuer default risk probability threshold to obtain a corresponding bond issuer default risk initial list.
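Claims 4 and 5 describe two ways of turning one model's probabilities into an initial list. The sketch below shows one possible reading of each; it is an assumption that the "first ratio" means the highest-probability fraction of the set, and the helper names `top_ratio` and `above_threshold` are hypothetical.

```python
import math
from typing import Dict, List


def top_ratio(probs: Dict[str, float], ratio: float) -> List[str]:
    """Keep the highest-probability fraction `ratio` of entities (at least one if any exist)."""
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    ranked = sorted(probs, key=lambda k: probs[k], reverse=True)
    keep = max(1, math.ceil(len(ranked) * ratio))
    return ranked[:keep]


def above_threshold(probs: Dict[str, float], threshold: float) -> List[str]:
    """Keep every entity whose probability reaches the preset threshold, highest first."""
    ranked = sorted(probs, key=lambda k: probs[k], reverse=True)
    return [k for k in ranked if probs[k] >= threshold]


# Hypothetical output of one model for four issuers.
probs = {"a": 0.9, "b": 0.1, "c": 0.6, "d": 0.4}
by_ratio = top_ratio(probs, 0.5)
by_threshold = above_threshold(probs, 0.5)
```

A ratio cut always returns a list of predictable size, while a threshold cut can return anything from no entries to the whole set, which may be why the claims offer both.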
6. The method of claim 3, wherein the generating of the corresponding target list of bond issuer default risk according to the initial list of bond issuer default risk and the initial list of bond default risk comprises:
executing union operation on the bond issuers in each bond issuer default risk initial list to obtain a corresponding bond issuer default risk candidate list;
performing union operation on the bonds in each bond default risk initial list to obtain a corresponding bond default risk candidate list;
and generating a corresponding bond issuer default risk target list according to the bond issuer default risk candidate list and the bond default risk candidate list.
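The union steps of claim 6 can be sketched as follows. The claim does not state how the two candidate lists are combined into the target list, so `build_target_list` below is purely an assumed rule (flag an issuer if it is a candidate itself or if it issued a candidate bond), and the `bond_to_issuer` mapping is hypothetical.

```python
from typing import Dict, Iterable, List, Set


def union_lists(lists: Iterable[Iterable[str]]) -> Set[str]:
    """Union of the initial lists produced by the individual models."""
    merged: Set[str] = set()
    for one_list in lists:
        merged.update(one_list)
    return merged


def build_target_list(
    issuer_candidates: Set[str],
    bond_candidates: Set[str],
    bond_to_issuer: Dict[str, str],
) -> List[str]:
    """Assumed combination rule: candidate issuers plus issuers of candidate bonds."""
    via_bonds = {bond_to_issuer[b] for b in bond_candidates if b in bond_to_issuer}
    return sorted(issuer_candidates | via_bonds)


# Hypothetical initial lists from two models each.
issuer_candidates = union_lists([["issuer_a", "issuer_b"], ["issuer_b", "issuer_c"]])
bond_candidates = union_lists([["bond_1"], ["bond_1", "bond_2"]])
target = build_target_list(
    issuer_candidates,
    bond_candidates,
    {"bond_1": "issuer_a", "bond_2": "issuer_z"},
)
```

Taking a union rather than an intersection means an entity flagged by any single model survives into the candidate list, which favors recall over precision.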
7. The method of claim 1, wherein the creation of the target machine learning model comprises:
acquiring original in-line data and original out-of-line data;
performing data preprocessing on the original in-line data and the original out-of-line data to obtain corresponding target in-line data and target out-of-line data;
and respectively inputting the target in-line data and the target out-of-line data into an original machine learning model for training to obtain a corresponding target machine learning model.
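To make the training step of claim 7 concrete, here is a minimal sketch that fits a logistic regression model (one of the model types named in claim 11) by batch gradient descent using only the standard library. The toy data, the learning rate, the epoch count, and the function names are assumptions for illustration; a production system would more likely use an established library.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w: Sequence[float], b: float, x: Sequence[float]) -> float:
    """Probability of default for one feature vector."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logistic(
    X: Sequence[Sequence[float]],
    y: Sequence[int],
    lr: float = 0.5,
    epochs: int = 2000,
) -> Tuple[List[float], float]:
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi  # derivative of log loss w.r.t. the logit
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Hypothetical preprocessed training data: one feature, label 1 = defaulted.
X_train = [[0.1], [0.2], [0.8], [0.9]]
y_train = [0, 0, 1, 1]
w, b = train_logistic(X_train, y_train)
```

Under claim 7 the in-line data and the out-of-line data would each go through a fit of this kind, yielding separate trained models for issuers and for bonds.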
8. The method of claim 7, further comprising:
determining an average influence value of the target in-line data and the target out-of-line data on the default risk probability;
determining a risk data source matched with each target machine learning model according to the average influence value; wherein the risk data sources include: target in-line characteristic variables and target out-of-line characteristic variables.
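Claim 8 does not define how the "average influence value" is computed. One hedged reading is a mean absolute per-sample contribution per feature, aggregated by data source. The sketch below assumes the per-sample contributions have already been produced by some attribution method; the feature names, the contribution values, and both helper functions are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Mapping


def average_influence(contributions: Mapping[str, List[float]]) -> Dict[str, float]:
    """Mean absolute contribution of each feature across samples."""
    return {name: mean([abs(c) for c in vals]) for name, vals in contributions.items()}


def matched_source(
    influence: Mapping[str, float], feature_source: Mapping[str, str]
) -> str:
    """Pick the data source whose features carry the largest total influence."""
    totals: Dict[str, float] = {}
    for name, value in influence.items():
        source = feature_source[name]
        totals[source] = totals.get(source, 0.0) + value
    return max(totals, key=lambda s: totals[s])


# Hypothetical per-sample contributions for one model (two samples per feature).
contributions = {"overdue_days": [0.5, -0.25], "rating_change": [0.125, 0.125]}
feature_source = {"overdue_days": "in-line", "rating_change": "out-of-line"}
influence = average_influence(contributions)
source = matched_source(influence, feature_source)
```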
9. The method of claim 7, wherein the pre-processing the original in-line data and the original out-of-line data to obtain corresponding target in-line data and target out-of-line data comprises:
sequentially carrying out data cleaning on the original in-line data and the original out-of-line data to obtain corresponding intermediate in-line data and intermediate out-of-line data;
and respectively carrying out data derivation on the intermediate in-line data and the intermediate out-of-line data according to a preset data derivation mode to obtain corresponding target in-line data and target out-of-line data.
10. The method according to claim 9, wherein the performing data derivation on the intermediate in-line data and the intermediate out-of-line data according to a preset data derivation manner to obtain corresponding target in-line data and target out-of-line data comprises:
acquiring time parameters corresponding to the intermediate in-line data and the intermediate out-of-line data;
and respectively carrying out data derivation on the intermediate in-line data and the intermediate out-of-line data according to a preset data derivation mode based on the time parameter to obtain corresponding target in-line data and target out-of-line data.
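The cleaning and time-based derivation of claims 9 and 10 might look like the sketch below. It assumes the "time parameter" is a look-back window counted in periods and that records are `(period, value)` pairs with sortable period strings such as "2022-03". The cleaning rules and the derived feature names are illustrative choices, not taken from the specification.

```python
from statistics import mean
from typing import Dict, List, Optional, Sequence, Tuple

Record = Tuple[str, Optional[float]]  # (period such as "2022-03", observed value)


def clean(records: Sequence[Record]) -> List[Tuple[str, float]]:
    """Drop missing values, keep the first record per period, and sort by period."""
    seen: Dict[str, float] = {}
    for period, value in records:
        if value is not None and period not in seen:
            seen[period] = value
    return sorted(seen.items())


def derive(cleaned: Sequence[Tuple[str, float]], window: int) -> Dict[str, float]:
    """Derive summary features over the most recent `window` periods."""
    if window < 1 or not cleaned:
        raise ValueError("need a positive window and at least one record")
    recent = [value for _, value in cleaned[-window:]]
    return {
        f"mean_last_{window}": mean(recent),
        f"max_last_{window}": max(recent),
        f"change_last_{window}": recent[-1] - recent[0],
    }


# Hypothetical raw series with a missing value and a duplicated period.
raw = [
    ("2022-03", 4.0),
    ("2022-01", 1.0),
    ("2022-02", None),
    ("2022-03", 9.0),
    ("2022-04", 6.0),
]
cleaned = clean(raw)
features = derive(cleaned, window=2)
```

Calling `derive` with several window lengths would give a family of features per raw variable, which is one common way to expand a small set of raw fields into a wider feature set.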
11. The method of any one of claims 1-10, wherein the target machine learning model comprises: an XGBoost model; a random forest model; an SVM model; a neural network model; and a logistic regression model.
12. A bond issuer default risk identification device, comprising:
the variable acquisition module is used for respectively acquiring a target characteristic variable of each bond issuer to be identified in the bond issuer set to be identified and each bond to be identified in the bond set to be identified as a target in-line characteristic variable and a target out-of-line characteristic variable;
the first probability determination module is used for respectively inputting the target in-line characteristic variables into at least two pre-established target machine learning models to obtain corresponding bond issuer default risk probabilities;
the second probability determination module is used for respectively inputting the target out-of-line characteristic variables into at least two pre-established target machine learning models to obtain corresponding bond default risk probabilities;
and the list generating module is used for generating a corresponding bond issuer default risk target list according to the bond issuer default risk probability and the bond default risk probability.
13. The apparatus of claim 12, wherein the list generation module comprises:
a first list generating unit, configured to generate a corresponding bond issuer default risk initial list according to the bond issuer default risk probability and the bond issuer set to be identified;
a second list generating unit, configured to generate a corresponding bond default risk initial list according to the bond default risk probability, a preset bond default risk probability threshold, and the bond set to be identified;
and the target list generating unit is used for generating a corresponding bond issuer default risk target list according to the bond issuer default risk initial list and the bond default risk initial list.
14. An electronic device, characterized in that the electronic device comprises:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program executable by the at least one processor, the computer program being executed by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-11.
15. A computer-readable storage medium storing computer instructions for causing a processor to implement the method for bond issuer default risk identification of any one of claims 1-11 when executed.
16. A computer program product, characterized in that the computer program product comprises a computer program which, when executed by a processor, implements the bond issuer default risk identification method according to any one of claims 1-11.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211203989.7A CN115526697A (en) 2022-09-29 2022-09-29 Bond issuer default risk identification method, device, equipment, medium and product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211203989.7A CN115526697A (en) 2022-09-29 2022-09-29 Bond issuer default risk identification method, device, equipment, medium and product

Publications (1)

Publication Number Publication Date
CN115526697A (en) 2022-12-27

Family

ID=84699940

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211203989.7A Pending CN115526697A (en) 2022-09-29 2022-09-29 Bond issuer default risk identification method, device, equipment, medium and product

Country Status (1)

Country Link
CN (1) CN115526697A (en)

Similar Documents

Publication Publication Date Title
JP7522731B2 (en) SYSTEM AND METHOD FOR ANTI-MONEY LAUNDERING ANALYSIS - Patent application
US20150081378A1 (en) Transactional risk daily limit update alarm
US8768809B1 (en) Methods and systems for managing financial data
CN111008896A (en) Financial risk early warning method and device, electronic equipment and storage medium
US20150081524A1 (en) Analytics driven assessment of transactional risk daily limit exceptions
US20130325598A1 (en) Financial account related trigger feature for triggering offers based on financial information
US20150081542A1 (en) Analytics driven assessment of transactional risk daily limits
CN111179051A (en) Financial target customer determination method and device and electronic equipment
CN110689437A (en) Communication construction project financial risk prediction method based on random forest
US20130325698A1 (en) Financial account related trigger feature for risk mitigation
CN113034046A (en) Data risk metering method and device, electronic equipment and storage medium
CN114997975A (en) Abnormal enterprise identification method, device, equipment, medium and product
CN115545886A (en) Overdue risk identification method, overdue risk identification device, overdue risk identification equipment and storage medium
CN110930242A (en) Credibility prediction method, device, equipment and storage medium
CN114638504A (en) Enterprise risk assessment method, device, equipment, medium and product
EP4020364A1 (en) Method for calculating at least one score representative of a probable activity breakage of a merchant, system, apparatus and corresponding computer program
CN115545909A (en) Approval method, device, equipment and storage medium
CN115526697A (en) Bond issuer default risk identification method, device, equipment, medium and product
CN115759283A (en) Model interpretation method and device, electronic equipment and storage medium
EP3226192A1 (en) Security system monitoring techniques
CN111429257B (en) Transaction monitoring method and device
JP7298286B2 (en) Model providing program, model providing method and model providing apparatus
Hassan et al. Examining the contribution of fiscal policy on economic growth: Analytical insights from Pakistan
CN114971697A (en) Data processing method, device, equipment, storage medium and product
CN115797033A (en) Capital supervision method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination