CN113283583A - Method and device for predicting default rate of textile industry, storage medium and processor - Google Patents

Publication number
CN113283583A
Authority
CN
China
Prior art keywords
data
layer
default rate
structured data
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110550772.2A
Other languages
Chinese (zh)
Inventor
赵振洪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Scenic Information Technology Co ltd
Original Assignee
Guangzhou Scenic Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Scenic Information Technology Co ltd
Priority to CN202110550772.2A
Publication of CN113283583A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462 Approximate or statistical queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08 Insurance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/04 Manufacturing
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Abstract

The embodiments of the invention provide a method and a device for predicting the default rate in the textile industry, a processor, and a storage medium. The method comprises: acquiring structured data of an order to be predicted in the textile industry; inputting the structured data into a default rate prediction model; and determining the default rate of the order to be predicted through the default rate prediction model. The method makes effective use of the structured data of the textile industry, including the sparse feature data that feature engineering tends to screen out in conventional data analysis. This improves the model's prediction score, reduces the mean square error, and allows the default rate of an order to be predicted directly.

Description

Method and device for predicting default rate of textile industry, storage medium and processor
Technical Field
The invention relates to the technical field of computers, in particular to a method and a device for predicting the default rate of textile industry, a storage medium and a processor.
Background
Owing to the nature of the business, the textile industry occasionally faces the following situation: a cloth buyer lacks the funds to purchase cloth, and the cloth supplier is unwilling to bear the risk of supplying the cloth in advance and being paid only after the buyer has sold it. An integrator creditor has therefore emerged to take on this risk: upstream, it deals with the cloth supplier and pays for the cloth in advance; downstream, it deals with the cloth buyer, supplying the cloth first and collecting payment after an agreed period. However, this form of financing against goods carries the risk that the cloth buyer fails to repay on time or falls into arrears.
The conventional way to assess loan risk in the textile industry is to seek the assistance of a bank, but the resulting assessment is inaccurate because data on the cloth buyer's purchasing business, or related business data for the textile industry, is lacking.
Disclosure of Invention
The embodiment of the invention aims to provide a method and a device for predicting the default rate of the textile industry, a storage medium and a processor.
In order to achieve the above object, a first aspect of the present invention provides a method for predicting the default rate of the textile industry, including:
acquiring structured data of the order to be predicted in the textile industry;
inputting the structured data into a default rate prediction model;
and determining the default rate of the order to be predicted through the default rate prediction model.
Optionally, the default rate prediction model comprises: an input layer, a normalization layer, a plurality of fully connected layers, a dropout layer, and an output layer.
Optionally, the number of fully connected layers is at least 7, and determining the default rate of the order to be predicted through the default rate prediction model includes: after the input layer acquires the structured data, importing the structured data into the normalization layer; standardizing the structured data through the normalization layer and outputting the standardized structured data to a first fully connected layer; expanding the standardized structured data through the first fully connected layer and passing the output values to the dropout layer; reducing the degree of overfitting of the default rate prediction model through the dropout layer; expanding the structured data again through a second fully connected layer; condensing the re-expanded structured data through a third fully connected layer; re-expanding the condensed structured data through a fourth fully connected layer; expanding the re-expanded structured data further through a fifth fully connected layer; enlarging the further-expanded structured data through a sixth fully connected layer; re-condensing the enlarged structured data through a seventh fully connected layer; transmitting the re-condensed structured data to the output layer; and outputting, through the output layer, the default rate predicted from the structured data.
Optionally, the fully connected layer comprises an activation function as in the following equation (1):
Figure BDA0003070745910000021
where x is the input value of the activation function and α has a value of 1.
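The image for equation (1) is not reproduced in this text. Because the description names Elu among the activation functions and α is the only parameter mentioned, equation (1) is presumably the standard exponential linear unit. The form below is that assumption written out, not a transcription of the original figure:

```latex
f(x) =
\begin{cases}
x, & x > 0 \\
\alpha \left( e^{x} - 1 \right), & x \le 0
\end{cases}
\qquad \text{with } \alpha = 1
```

Under this reading, with α = 1 the function is continuous at x = 0 and its output approaches -1 for large negative inputs.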
Optionally, the method further comprises: acquiring historical structured data of the textile industry; cleaning the historical structured data and removing data that does not meet preset conditions; standardizing the remaining historical structured data; joining and merging the standardized historical structured data; and computing statistics over and aggregating the joined and merged historical structured data, then dividing it into a training data set and a test data set.
Optionally, the method further comprises: training the default rate prediction model using the training data set; testing the trained default rate prediction model using the test data set; and, after the test is passed, determining that training of the default rate prediction model is complete.
The second aspect of the present invention provides a device for predicting a default rate of a textile industry, comprising:
a data acquisition module for acquiring structured data of the order to be predicted in the textile industry;
a prediction module for inputting the structured data into a default rate prediction model and determining the default rate of the order to be predicted through the default rate prediction model.
A third aspect of the invention provides a machine-readable storage medium having stored thereon instructions that, when executed by a processor, cause the processor to be configured to perform the above-described textile industry default rate prediction method.
A fourth aspect of the invention provides a processor configured to perform the above-mentioned method for predicting a default rate of a textile industry.
According to the method for predicting the default rate of the textile industry, the structured data of the order to be predicted in the textile industry are obtained and input into the default rate prediction model, and the default rate of the order to be predicted is determined through the default rate prediction model.
Additional features and advantages of embodiments of the invention will be set forth in the detailed description which follows.
Drawings
The accompanying drawings, which are included to provide a further understanding of the embodiments of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the embodiments of the invention without limiting the embodiments of the invention. In the drawings:
FIG. 1 schematically illustrates a flow diagram of a method for predicting a rate of default in a textile industry according to an embodiment of the invention;
FIG. 2 is a schematic diagram illustrating the structure of a default rate prediction model according to an embodiment of the invention;
FIG. 3 schematically shows a diagram of the output of an activation function according to an embodiment of the invention;
FIG. 4 is a block diagram schematically illustrating the structure of a device for predicting a default rate of textile industry according to an embodiment of the present invention;
FIG. 5 schematically shows an internal structure diagram of a computer apparatus according to an embodiment of the present invention.
Detailed Description
The following detailed description of embodiments of the invention refers to the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating embodiments of the invention, are given by way of illustration and explanation only, not limitation.
Fig. 1 schematically shows a flow chart of a method for predicting a default rate of a textile industry according to an embodiment of the present invention. As shown in fig. 1, in an embodiment of the present invention, a method for predicting a default rate of a textile industry is provided, which includes the following steps:
Step 101, acquiring structured data of an order to be predicted in the textile industry.
Step 102, inputting the structured data into a default rate prediction model.
Step 103, determining the default rate of the order to be predicted through the default rate prediction model.
The data of the textile industry is special and is usually structured data. For an order to be predicted, structured data of the order can be obtained and input into the default rate prediction model.
In one embodiment, the structured data includes at least one of basic information on the cloth buyer, credit information on the cloth buyer, historical order information of the cloth buyer, and market financial information. The basic information includes at least one of the buyer's registered capital, date of establishment, and number of employees; the credit information includes at least one of legal information, enterprise credit information, and business operation information; the historical order information includes at least one of order amount, order type, fabric data, and repayment information; and the market financial information includes at least one of cloth market information, financial market information, and competing-product information.
The default rate of the order to be predicted is then determined through the default rate prediction model. The default rate is the probability that the cloth buyer defaults by repaying late when using the financial product.
In one embodiment, the default rate prediction model comprises: an input layer, a normalization layer, a plurality of fully connected layers, a dropout layer, and an output layer.
The default rate prediction model takes structured wide-table data as input and outputs one-dimensional data according to the prediction target. FIG. 2 provides a schematic structural diagram of the default rate prediction model; as it shows, the model contains a plurality of fully connected layers.
In one embodiment, there are at least 7 fully connected layers, and determining the default rate of the order to be predicted through the default rate prediction model comprises: after the input layer acquires the structured data, importing the structured data into the normalization layer; standardizing the structured data through the normalization layer and outputting the standardized structured data to a first fully connected layer; expanding the standardized structured data through the first fully connected layer and passing the output values to the dropout layer; reducing the degree of overfitting of the default rate prediction model through the dropout layer; expanding the structured data again through a second fully connected layer; condensing the re-expanded structured data through a third fully connected layer; re-expanding the condensed structured data through a fourth fully connected layer; expanding the re-expanded structured data further through a fifth fully connected layer; enlarging the further-expanded structured data through a sixth fully connected layer; re-condensing the enlarged structured data through a seventh fully connected layer; transmitting the re-condensed structured data to the output layer; and outputting, through the output layer, the default rate predicted from the structured data. In this embodiment, the default rate prediction model includes at least 7 fully connected layers. Specifically, taking structured input data in the form of a wide table of 120 fields as an example:
The first layer is the input layer: it imports data into the model. The data dimension here is (120,).
The second layer is the normalization layer: it standardizes the input data so that it is better suited to model training and actual prediction, converting it into data with a mean of zero and a standard deviation of approximately 1. The data dimension here is (120,).
The third layer is a fully connected layer: it connects the nodes of the preceding and following layers and computes and passes on the information. The data dimension here is about one quarter of the input data, (32,), and the purpose of this layer is to expand the combinations of the data. The Elu function is used as the activation function.
The fourth layer is the dropout layer: it connects the nodes of the preceding and following layers and randomly sets input values to zero at a rate of 10%. The data dimension here is (32,), and the purpose of this layer is to reduce the degree of model overfitting.
The fifth layer is a fully connected layer: it connects the nodes of the preceding and following layers and computes and passes on the information. The data dimension here is equal to that of the previous layer, (32,), and the purpose of this layer is to continue expanding the combinations of the data. The Elu function is used as the activation function.
The sixth layer is a fully connected layer: it connects the nodes of the preceding and following layers and computes and passes on the information. The data dimension here is about one quarter of the previous layer, (6,), and the purpose of this layer is to perform the first stage of information condensation. A Linear function can be used as the activation function.
The seventh layer is a fully connected layer: it connects the nodes of the preceding and following layers and computes and passes on the information. The data dimension here is about twice that of the layer's input data, (16,), and the purpose of this layer is to perform the second stage of expanding the data combinations. The Softplus function can be used as the activation function.
The eighth layer is a fully connected layer: it connects the nodes of the preceding and following layers and computes and passes on the information. The data dimension here is equal to that of the layer's input data, (16,), and the purpose of this layer is to continue the second stage of expanding the data combinations. The Relu function is used as the activation function.
The ninth layer is a fully connected layer: it connects the nodes of the preceding and following layers and computes and passes on the information. The data dimension here is about 16 times that of the layer's input data, (256,), and the purpose of the ninth layer is to enlarge the data combinations. The Relu function is used as the activation function.
The tenth layer is a fully connected layer: it connects the nodes of the preceding and following layers and computes and passes on the information. The data dimension here is reduced back to (16,), and the purpose of the tenth layer is to condense the data combinations before output. The Softplus function is used as the activation function.
The eleventh layer is the output layer: it outputs the model's calculation result. Taking as an example an output that predicts the number of days by which the buyer's payment is overdue, the data dimension here is (1,), and the Elu function is used as the activation function.
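The eleven-layer stack described above can be checked with a short shape trace. The sketch below is illustrative only: the layer sizes and activations are taken from the 120-field example, while the `LAYER_SPECS` tuple layout and the `trace_shapes` helper are hypothetical names introduced here, and the parameter count assumes ordinary dense layers with one bias per unit.

```python
from typing import List, Optional, Tuple

# (kind, units, activation). Sizes and activations follow the 120-field example
# in the description; the tuple layout itself is a hypothetical convention.
LAYER_SPECS: List[Tuple[str, Optional[int], Optional[str]]] = [
    ("normalization", None, None),
    ("dense", 32, "elu"),
    ("dropout", None, None),       # zeroes 10% of its inputs during training
    ("dense", 32, "elu"),
    ("dense", 6, "linear"),
    ("dense", 16, "softplus"),
    ("dense", 16, "relu"),
    ("dense", 256, "relu"),
    ("dense", 16, "softplus"),
    ("dense", 1, "elu"),           # output layer
]


def trace_shapes(input_dim: int = 120):
    """Return (rows, total) with one (kind, output_dim, params) row per layer."""
    rows = [("input", input_dim, 0)]
    width = input_dim
    total = 0
    for kind, units, _activation in LAYER_SPECS:
        params = 0
        if kind == "dense":
            params = width * units + units  # weights plus one bias per unit
            width = units
        rows.append((kind, width, params))
        total += params
    return rows, total


if __name__ == "__main__":
    rows, total = trace_shapes()
    for kind, width, params in rows:
        print(f"{kind:<14} ({width},)  params={params}")
    print("trainable parameters:", total)
```

Under these assumptions the trace lists eleven layers ending in dimension (1,), which matches the layer count given in the description.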
In one embodiment, the default rate prediction model is trained before being put into practical use. The training steps comprise: acquiring historical structured data of the textile industry; cleaning the historical structured data and removing data that does not meet preset conditions; standardizing the remaining historical structured data; joining and merging the standardized historical structured data; computing statistics over and aggregating the joined and merged historical structured data, then dividing it into a training data set and a test data set; training the default rate prediction model using the training data set; testing the trained default rate prediction model using the test data set; and, after the test is passed, determining that training of the default rate prediction model is complete.
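The text does not say what "passing the test" means. One plausible reading, given that the abstract reports results in terms of mean square error, is a threshold on the test-set MSE. The sketch below assumes that reading; `passes_test`, the `max_mse` threshold, and its default value are hypothetical and not taken from the source.

```python
from typing import Callable, Sequence


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of squared differences between targets and predictions."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def passes_test(
    model: Callable[[Sequence[float]], float],
    test_rows: Sequence[Sequence[float]],
    test_targets: Sequence[float],
    max_mse: float = 0.05,  # hypothetical acceptance threshold
) -> bool:
    """Treat the model as trained once its test-set MSE is within the threshold."""
    predictions = [model(row) for row in test_rows]
    return mean_squared_error(test_targets, predictions) <= max_mse
```

Here `model` is any callable that maps one wide-table row to a predicted value, so the check is independent of the framework used for training.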
In one embodiment, the training steps for the default rate prediction model are as follows:
1. Data acquisition: automatically acquiring database data, network data, and the like.
The data falls mainly into basic information on cloth buyers, credit information on cloth buyers, historical order information of cloth buyers, market financial information, and various other indicators. The basic information on the cloth buyer includes registered capital, date of establishment, number of employees, etc.; the credit information includes legal information, enterprise credit information, business operation information, etc.; the historical order information includes order amount, order type, fabric-related data, repayment information, etc.; the market financial information includes cloth market information, financial market information, competing-product information, etc.; and the other indicators include exchange rates, price indices, etc. Sources of data include, but are not limited to: publicly available network information, information purchased from third parties, and the integrator creditor's own historical data. When data is fed into the model, the integrated wide table can be used as the data input. After the data is acquired, it can be cleaned and preprocessed.
2. Data cleaning and preprocessing: the acquired information is sorted and cleaned, including data encoding, missing-value handling, string processing, and time processing.
Specifically, order data for which it cannot be determined whether the order is overdue, order data for which the economic loss cannot be confirmed, and data missing the order time or buyer number can be removed. For data processing, structured order data serves as the main dimension of the analysis; that is, for each order the prediction model predicts the probability that the buyer pays late and the economic loss this causes. For model output, the main target is the predicted probability of overdue payment. Time-type data can be converted into numerical data by time subtraction, and categorical data can be converted into a sparse numerical matrix by multi-dimensional one-hot encoding. Unlike existing risk-control data analysis, this approach uses a deep learning algorithm and models richer data without excessive feature screening and filtering, with the aim of retaining subtle risk features that would otherwise be easily filtered out. Accordingly, categorical data is retained in the sparse matrix as long as its deduplicated share of the original data volume does not exceed one, or its number of categories does not exceed 100.
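The two conversions mentioned above, time subtraction and one-hot encoding, can be sketched with the standard library alone. This is a minimal illustration; the function names and the example field values are hypothetical, and the retention rule for high-cardinality categories is left out because the source does not state it clearly.

```python
from datetime import date
from typing import List, Sequence, Tuple


def days_between(later: date, earlier: date) -> int:
    """Turn a pair of dates into a number by subtraction (for example, company age at order time)."""
    return (later - earlier).days


def one_hot(values: Sequence[str]) -> Tuple[List[str], List[List[int]]]:
    """Encode a categorical column as a sparse 0/1 matrix, one column per distinct value."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    matrix = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        matrix.append(row)
    return categories, matrix


if __name__ == "__main__":
    print(days_between(date(2021, 5, 20), date(2021, 5, 1)))
    print(one_hot(["knit", "woven", "knit"]))
```

Each distinct category becomes its own column, which is what makes the resulting matrix sparse when a field has many possible values.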
3. Data collection: necessary data joins, data dimension conversions, and derived-field calculations are performed to convert the raw data into input data suitable for the algorithm model.
All data from the different tables is joined to the order information table through the tables' key values to form an initial wide table, with order time and buyer code retained as the main dimensions for subsequent statistics. Statistics are then computed on the initial wide table: for each order, the buyer's statistics over the past one month, three months, half year, and one year are calculated, including order count, average purchase amount, median purchase amount, minimum purchase amount, maximum purchase amount, average loan utilization rate, average daily order amount, average daily amount of orders using loans, and so on, giving more than 80 fields. These are then combined with the converted data of the categorical dimensions described above, covering cloth type, weave type, elasticity type, and cloth quality, for example whether the fabric is solid-colored, patterned, non-woven, woven, or knitted, giving more than 40 fields. In total, more than 120 fields serve as data input.
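The per-order look-back statistics can be sketched as follows. This is an assumption-laden illustration: the record keys (`buyer_id`, `order_date`, `amount`), the window lengths in days, and the choice to exclude orders dated on or after the reference date are all hypothetical, and only a handful of the 80-plus statistics are shown.

```python
from datetime import date, timedelta
from statistics import mean, median
from typing import Dict, List

# Approximate window lengths in days; the exact definitions are not given in the source.
WINDOWS = {"1m": 30, "3m": 90, "6m": 182, "1y": 365}


def window_features(orders: List[dict], buyer_id: str, as_of: date) -> Dict[str, float]:
    """Summarize one buyer's earlier orders over each look-back window before `as_of`."""
    features: Dict[str, float] = {}
    for label, days in WINDOWS.items():
        start = as_of - timedelta(days=days)
        amounts = [
            order["amount"]
            for order in orders
            if order["buyer_id"] == buyer_id and start <= order["order_date"] < as_of
        ]
        features[f"order_count_{label}"] = len(amounts)
        features[f"avg_amount_{label}"] = mean(amounts) if amounts else 0.0
        features[f"median_amount_{label}"] = median(amounts) if amounts else 0.0
        features[f"min_amount_{label}"] = min(amounts) if amounts else 0.0
        features[f"max_amount_{label}"] = max(amounts) if amounts else 0.0
    return features
```

Excluding orders dated on or after `as_of` keeps information about the order being scored out of its own features, which is a design choice of this sketch rather than something the source specifies.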
4. Feature processing: filtering is performed according to the features selected during modeling, and derived fields are calculated.
5. Model prediction output: the trained algorithm model is called, input data conforming to the model is fed in, and the model outputs a prediction result.
The data is divided into a training set and a test set, mainly in one of two ways. In the first, time is the basis of the split: the deduplicated time dimension of the whole data set is used to divide it into training and test data at a ratio of 80 to 20. In the second, the customer is the basis of the split: the deduplicated customer dimension of the whole data set is used to divide it into training and test data at a ratio of 80 to 20.
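Both splitting modes reduce to the same operation: deduplicate one dimension, take 80% of its distinct values for training, and assign each row by its value. The sketch below assumes rows are dictionaries and that the distinct values are taken in sorted order; the field names in the test are hypothetical, and for the customer-based split a shuffled order would serve equally well.

```python
from typing import Dict, Hashable, List, Tuple

Row = Dict[str, Hashable]


def split_by_dimension(
    rows: List[Row], key: str, train_fraction: float = 0.8
) -> Tuple[List[Row], List[Row]]:
    """Split rows so that whole values of `key` fall on one side of the split."""
    distinct = sorted({row[key] for row in rows})  # deduplicated dimension
    cut = int(len(distinct) * train_fraction)
    train_values = set(distinct[:cut])
    train = [row for row in rows if row[key] in train_values]
    test = [row for row in rows if row[key] not in train_values]
    return train, test
```

With a time key, the earliest 80% of distinct dates form the training set, so the test set is strictly later; with a customer key, no customer appears in both sets.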
Specifically, the default rate prediction model comprises a normalization layer, fully connected layers, an input layer, an output layer, a dropout layer, and so on. The activation functions used include the Relu function, the Elu function, the Softplus function, the Linear function, etc. The corresponding output curves of the activation functions are shown in FIG. 3.
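For reference, the four named activation functions have the following standard scalar definitions. The sketch uses only the `math` module; the numerically stable form of Softplus is a common implementation choice and is not taken from the source.

```python
import math


def relu(x: float) -> float:
    """Rectified linear unit: zero for negative inputs, identity otherwise."""
    return max(0.0, x)


def elu(x: float, alpha: float = 1.0) -> float:
    """Exponential linear unit: identity for x > 0, alpha * (e^x - 1) otherwise."""
    return x if x > 0 else alpha * math.expm1(x)


def softplus(x: float) -> float:
    """Smooth approximation of ReLU, log(1 + e^x), written to avoid overflow."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def linear(x: float) -> float:
    """Identity activation."""
    return x
```

Elu and Softplus are smooth around zero, while Relu has a kink there; Elu with alpha = 1 is bounded below by -1.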
The default rate prediction model takes structured wide-table data as input and outputs one-dimensional data according to the prediction target. Described without batching: the wide-table data for one order is input at a time; if the number of fields is 120, the input dimension is (120,), and after computation through each layer of the model, prediction data of dimension (1,) is output, namely the predicted default rate for overdue repayment by a buyer using the financial product. Model training and prediction can also be carried out in batches.
Specifically, the first input layer: import data into the model, taking a wide table of 120 fields of structured input data as an example, where the data dimension is (120,).
Second layer normalization layer: input data are standardized and more suitable for model training, and the input data are converted into data with the average value of zero and the standard deviation of approximate 1. Taking the example of a wide table of 120 fields of structured input data, the data dimension here is (120,).
Third layer, a fully connected layer: connects the data nodes of the preceding layer with those of the following layer, and computes and passes on the information. Taking a structured wide table of 120 input fields as an example, the data dimension here is about one quarter of that of the input data, (32,), and the purpose of the third layer is to extend the combinations of the data. The ELU function is used as the activation function.
Fourth layer, a dropout layer: connects the data nodes of the preceding layer with those of the following layer, and randomly sets input values to zero at a rate of 10%. Taking a structured wide table of 120 input fields as an example, the data dimension here is (32,), and the purpose of the fourth layer is to reduce the degree of model overfitting.
Fifth layer, a fully connected layer: connects the data nodes of the preceding layer with those of the following layer, and computes and passes on the information. Taking a structured wide table of 120 input fields as an example, the data dimension here is equal to that of the previous layer, (32,), and the purpose of the fifth layer is to continue extending the combinations of the data. The ELU function is used as the activation function.
Sixth layer, a fully connected layer: connects the data nodes of the preceding layer with those of the following layer, and computes and passes on the information. Taking a structured wide table of 120 input fields as an example, the data dimension here is about one quarter of that of the previous layer, (6,), and the purpose of the sixth layer is to perform the first stage of information concentration. The Linear function is used as the activation function.
Seventh layer, a fully connected layer: connects the data nodes of the preceding layer with those of the following layer, and computes and passes on the information. Taking a structured wide table of 120 input fields as an example, the data dimension here is about twice that of the input data, (16,), and the purpose of the seventh layer is to perform the second stage of data combination extension. The Softplus function is used as the activation function.
Eighth layer, a fully connected layer: connects the data nodes of the preceding layer with those of the following layer, and computes and passes on the information. Taking a structured wide table of 120 input fields as an example, the data dimension here is equal to that of the input data, (16,), and the purpose of the eighth layer is to perform the second stage of data combination extension. The ReLU function is used as the activation function.
Ninth layer, a fully connected layer: connects the data nodes of the preceding layer with those of the following layer, and computes and passes on the information. Taking a structured wide table of 120 input fields as an example, the data dimension here is about 16 times that of the input data, (256,), and the purpose of the ninth layer is to amplify the data combinations. The ReLU function is used as the activation function.
Tenth layer, a fully connected layer: connects the data nodes of the preceding layer with those of the following layer, and computes and passes on the information. Taking a structured wide table of 120 input fields as an example, the data dimension here is approximately equal to that of the input data, (16,), and the purpose of the tenth layer is to concentrate the data combinations before output. The Softplus function is used as the activation function.
Eleventh layer, the output layer: outputs the calculation result of the model. Taking as an example structured output that predicts the number of days by which a buyer's payment is overdue, the data dimension here is (1,), and the ELU function is used as the activation function. Unlike other deep learning model architectures, this model architecture is suited to analyzing structured input data and, for structured data, achieves a lower mean square error than other algorithms when predicting buyer default rates in the textile industry.
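As a rough illustration, layers three through eleven described above can be tabulated and their trainable parameter counts derived from the stated dimensions. This is a sketch, not the patent's implementation: `LAYERS`, `INPUT_FIELDS`, and `count_parameters` are hypothetical names, and the counts assume that each fully connected layer holds one coefficient per input/output pair plus one bias per output.

```python
# Illustrative sketch only. Layer sizes and activations come from the text;
# the names LAYERS, INPUT_FIELDS and count_parameters are hypothetical.
INPUT_FIELDS = 120  # width of the example structured wide table

# (layer number, kind, output dimension, activation)
LAYERS = [
    (3, "dense", 32, "elu"),
    (4, "dropout", 32, None),       # zeroes 10% of its inputs during training
    (5, "dense", 32, "elu"),
    (6, "dense", 6, "linear"),
    (7, "dense", 16, "softplus"),
    (8, "dense", 16, "relu"),
    (9, "dense", 256, "relu"),
    (10, "dense", 16, "softplus"),
    (11, "dense", 1, "elu"),        # output layer
]


def count_parameters(input_dim, layers):
    """Return (layer number, trainable parameter count) pairs."""
    counts = []
    prev = input_dim
    for number, kind, out_dim, _activation in layers:
        if kind == "dense":
            # Assumption: one coefficient per input/output pair plus one bias
            # per output node.
            params = prev * out_dim + out_dim
        else:
            params = 0  # dropout has no trainable parameters
        counts.append((number, params))
        prev = out_dim
    return counts


if __name__ == "__main__":
    per_layer = count_parameters(INPUT_FIELDS, LAYERS)
    for number, params in per_layer:
        print(f"layer {number}: {params} parameters")
    print("total:", sum(p for _, p in per_layer))
```

Under these assumptions the stack is small, with most of its parameters sitting in the two layers adjacent to the 256-wide ninth layer.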
Regarding the input data, compare it with a deep learning model for image classification. For an image classification model whose images are 256 × 256 in length and width, with four image layers and 10,000 input records, the dimension of the input data is (10000,256, 4). For the buyer default rate prediction model, taking a wide table of 120 dimensions and 10,000 input records as an example, the dimension of the input data is (10000,120). The input data structure is one-dimensional, and the fields have no positional relationship to one another as image data does, so a convolutional neural network (CNN) is not suitable; nor is the input data a time series or a sequence, so a recurrent neural network (RNN) is not suitable either. The model architecture is therefore built mainly from multiple fully connected layers.
Specifically, in one embodiment, the ReLU function is:
$$f(x) = \max(0, x)$$
The ELU function is:
$$f(x) = \begin{cases} x, & x > 0 \\ \alpha\,(e^{x} - 1), & x \le 0 \end{cases}$$
The Softplus function is:
$$f(x) = \log_e(1 + e^{x})$$
The Linear function is:
$$f(x) = x$$
Here, x is the input value of the activation function, namely the data after the operation of the deep learning nodes of each layer, and α is a predetermined constant equal to 1.
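The four activation functions named above can be written out directly. The following is a minimal sketch using the standard textbook definitions with α = 1 as stated in the text; the function names are illustrative, and the Softplus form is an algebraically equivalent rewrite chosen to avoid overflow for large inputs.

```python
import math

ALPHA = 1.0  # the predetermined constant alpha stated in the text


def relu(x):
    """ReLU: passes positive inputs through and clips the rest to zero."""
    return max(0.0, x)


def elu(x, alpha=ALPHA):
    """ELU: identity for x > 0, alpha * (e^x - 1) otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def softplus(x):
    """Softplus: log(1 + e^x), computed in an overflow-safe form.

    max(x, 0) + log1p(e^(-|x|)) equals log(1 + e^x) for every real x.
    """
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def linear(x):
    """Linear: returns its input unchanged."""
    return x


if __name__ == "__main__":
    for value in (-2.0, 0.0, 2.0):
        print(value, relu(value), elu(value), softplus(value), linear(value))
```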
The calculation formula of each fully connected layer is as follows:
output = activation function (input matrix × coefficients + bias); the input matrix is the input data of each deep learning layer, and the coefficients and the bias are fitted by gradient descent during model training.
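The fully connected layer formula above can be sketched in a few lines of plain Python. The helper name `dense_forward`, the list-of-lists matrix layout, and the example numbers are assumptions made for illustration; the second term of the formula is treated here as a per-output bias vector.

```python
import math


def elu(x, alpha=1.0):
    """ELU activation with alpha = 1, as used by several layers in the text."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def dense_forward(inputs, coefficients, bias, activation):
    """Compute activation(inputs x coefficients + bias) row by row.

    inputs:       list of rows, each with in_dim values
    coefficients: in_dim x out_dim matrix (list of lists)
    bias:         list of out_dim values
    """
    outputs = []
    for row in inputs:
        out_row = []
        for j in range(len(bias)):
            # Weighted sum of the row with column j, plus that column's bias.
            z = sum(row[i] * coefficients[i][j] for i in range(len(row))) + bias[j]
            out_row.append(activation(z))
        outputs.append(out_row)
    return outputs


if __name__ == "__main__":
    x = [[1.0, 2.0]]                    # one record with two fields
    w = [[0.5, -1.0], [0.25, -1.0]]     # 2 inputs -> 2 outputs
    b = [0.0, 1.0]
    print(dense_forward(x, w, b, elu))
```

During training, the entries of `coefficients` and `bias` are the quantities that gradient descent would adjust.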
In addition, the loss function of the default rate prediction model is the mean square error, so the model can be trained continuously to obtain a better score. Specifically, after about 120 training batches, the default rate prediction model scores better than the best-performing XGBoost model; after 200 training batches the mean square error is still decreasing, and the decrease gradually levels off after about 280 training batches.
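For reference, the mean square error used as the loss function can be computed as below. This is a generic sketch; the function name and the example values (overdue days) are illustrative and not taken from the patent.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    actual_days_overdue = [0.0, 12.0, 30.0]      # made-up example targets
    predicted_days_overdue = [1.0, 10.0, 33.0]   # made-up example predictions
    print(mean_squared_error(actual_days_overdue, predicted_days_overdue))
```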
According to the method for predicting the default rate of the textile industry, the structured data of the order to be predicted in the textile industry are obtained and input into the default rate prediction model, and the default rate of the order to be predicted is determined through the default rate prediction model.
In one embodiment, as shown in fig. 4, there is provided a textile industry default rate prediction device, including:
the data acquisition module 401 is configured to acquire structured data of an order to be predicted in the textile industry;
a prediction module 402 for inputting the structured data into a default rate prediction model; and determining the default rate of the order to be predicted through the default rate prediction model.
In one embodiment, the default rate prediction model comprises: an input layer, a normalization layer, a plurality of fully connected layers, a dropout layer, and an output layer.
In one embodiment, the prediction module 402 is further configured to: after the input layer acquires the structured data, import the structured data from the input layer into the normalization layer; standardize the structured data through the normalization layer, and output the standardized structured data to a first fully connected layer; extend the standardized structured data through the first fully connected layer, and output the extended structured data to the dropout layer; reduce the degree of overfitting of the default rate prediction model through the dropout layer; extend the structured data again through a second fully connected layer; concentrate the extended structured data through a third fully connected layer; re-extend the concentrated structured data through a fourth fully connected layer; extend the re-extended structured data again through a fifth fully connected layer; amplify the structured data through a sixth fully connected layer; re-concentrate the amplified structured data through a seventh fully connected layer; transmit the re-concentrated structured data to the output layer; and output, through the output layer, the default rate predicted from the structured data.
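The dropout step in this flow, which zeroes inputs at random with a 10% rate to limit overfitting, can be sketched as follows. The rescaling of the surviving values by 1 / (1 - rate) and the pass-through behaviour outside training are common practice assumed here, not details stated in the text; the function name and parameters are hypothetical.

```python
import random


def dropout(values, rate=0.1, training=True, rng=None):
    """Randomly zero each value with probability `rate` during training.

    Assumption: surviving values are scaled by 1 / (1 - rate) so that the
    expected sum is unchanged, and the layer is a no-op when not training.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(values)
    if rng is None:
        rng = random.Random()
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]


if __name__ == "__main__":
    activations = [1.0] * 32  # a (32,) vector, as in the fourth layer
    dropped = dropout(activations, rate=0.1, rng=random.Random(0))
    print(sum(1 for v in dropped if v == 0.0), "of", len(dropped), "values zeroed")
```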
In one embodiment, the device for predicting the default rate of the textile industry further comprises a training module (not shown in the figure) configured to: acquire historical structured data of the textile industry; clean the historical structured data and remove data that do not meet preset conditions; standardize the historical structured data remaining after the removal; concatenate and merge the standardized historical structured data; and compile statistics on and aggregate the concatenated and merged historical structured data, and divide it into a training data set and a test data set.
In one embodiment, the training module is further configured to: train the default rate prediction model using the training data set; test the trained default rate prediction model using the test data set; and, after the test is passed, determine that training of the default rate prediction model is complete.
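The data preparation steps handled by the training module (cleaning, standardization, and division into training and test sets) might look like the sketch below. The validity predicate, the z-score form of standardization, the 80/20 split, and all function names are assumptions for illustration; the text does not specify them, and the concatenation and aggregation steps are omitted.

```python
import random
import statistics


def clean(rows, is_valid):
    """Drop rows that fail the (hypothetical) preset-condition predicate."""
    return [row for row in rows if is_valid(row)]


def standardize(rows):
    """Z-score each column; constant columns are left at zero."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    stdevs = [statistics.pstdev(col) or 1.0 for col in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stdevs)] for row in rows]


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle deterministically and return (training set, test set)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    raw = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0], [None, 5.0]]  # made-up rows
    valid = clean(raw, lambda row: all(v is not None for v in row))
    train, test = train_test_split(standardize(valid), test_fraction=0.34)
    print(len(train), "training rows,", len(test), "test rows")
```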
In one embodiment, the structured data includes at least one of cloth buyer basic information, cloth buyer credit information, cloth buyer historical order information, and market financial information; the cloth buyer basic information comprises at least one of the registered capital, establishment date, and number of staff of the cloth buyer; the cloth buyer credit information comprises at least one of legal information, enterprise credit information, and operation information; the cloth buyer historical order information comprises at least one of order amount, order type, fabric data, and repayment information; the market financial information includes at least one of cloth market information, financial market information, and contest information.
The processor comprises a kernel, and the kernel calls the corresponding program unit from the memory. One or more than one kernel can be set, and the method for predicting the default rate of the textile industry is realized by adjusting the kernel parameters.
The memory may include computer-readable media in the form of volatile memory, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM); the memory includes at least one memory chip.
An embodiment of the present invention provides a storage medium, on which a program is stored, and when the program is executed by a processor, the method for predicting the default rate of the textile industry is implemented.
The embodiment of the invention provides a processor, which is used for running a program, wherein the program executes the method for predicting the default rate of the textile industry when running.
In one embodiment, a computer device is provided, which may be a server, the internal structure of which may be as shown in fig. 5. The computer device includes a processor A01, a network interface A02, a memory (not shown), and a database (not shown) connected by a system bus. The processor A01 of the computer device provides computing and control capabilities. The memory of the computer device comprises an internal memory A03 and a non-volatile storage medium A04. The non-volatile storage medium A04 stores an operating system B01, a computer program B02, and a database (not shown in the figure). The internal memory A03 provides an environment for running the operating system B01 and the computer program B02 stored in the non-volatile storage medium A04. The database of the computer device stores structured data and the like of the textile industry. The network interface A02 of the computer device communicates with an external terminal through a network connection. The computer program B02 is executed by the processor A01 to implement a method for predicting the default rate of the textile industry.
Those skilled in the art will appreciate that the architecture shown in fig. 5 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
The embodiment of the invention provides equipment, which comprises a processor, a memory and a program stored on the memory and capable of running on the processor, wherein the processor executes the program to realize the steps of the method for predicting the default rate of the textile industry.
The present application also provides a computer program product which, when executed on a data processing device, is adapted to execute a program initialized with the steps of the method for predicting the default rate of the textile industry described above.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include computer-readable media in the form of volatile memory, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). The memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The above are merely examples of the present application and are not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (10)

1. A method for predicting a default rate of a textile industry, the method comprising:
acquiring structured data of an order to be predicted in the textile industry;
inputting the structured data into a default rate prediction model;
and determining the default rate of the order to be predicted through the default rate prediction model.
2. The method of claim 1, wherein the default rate prediction model comprises: an input layer, a normalization layer, a plurality of fully connected layers, a dropout layer, and an output layer.
3. The method of claim 2, wherein the plurality of fully connected layers comprises at least seven layers, and wherein determining the default rate of the order to be predicted through the default rate prediction model comprises:
after the input layer acquires the structured data, importing, by the input layer, the structured data into the normalization layer;
standardizing the structured data through the normalization layer, and outputting the standardized structured data to a first fully connected layer;
extending the standardized structured data through the first fully connected layer, and outputting the extended structured data to the dropout layer;
reducing the degree of overfitting of the default rate prediction model through the dropout layer;
extending the structured data again through a second fully connected layer;
concentrating the extended structured data through a third fully connected layer;
re-extending the concentrated structured data through a fourth fully connected layer;
extending the re-extended structured data again through a fifth fully connected layer;
amplifying the structured data through a sixth fully connected layer;
re-concentrating the amplified structured data through a seventh fully connected layer;
transmitting the re-concentrated structured data to the output layer;
outputting, through the output layer, the default rate predicted from the structured data.
4. The method of claim 3, wherein the fully connected layer comprises an activation function according to the following equation (1):
$$f(x) = \begin{cases} x, & x > 0 \\ \alpha\,(e^{x} - 1), & x \le 0 \end{cases} \qquad (1)$$
where x is the input value of the activation function and α has a value of 1.
5. The method of claim 1, further comprising:
acquiring historical structured data of the textile industry;
cleaning the historical structured data, and removing data that do not meet preset conditions;
standardizing the historical structured data remaining after the removal;
concatenating and merging the standardized historical structured data;
and compiling statistics on and aggregating the concatenated and merged historical structured data, and dividing it into a training data set and a test data set.
6. The method of claim 5, further comprising:
training a default rate prediction model by using the training data set;
testing the trained default rate prediction model by using the test data set;
and after the test is passed, determining that the default rate prediction model is completely trained.
7. The method of claim 1, wherein the structured data comprises at least one of cloth buyer basic information, cloth buyer credit information, cloth buyer historical order information, and market financial information; the cloth buyer basic information comprises at least one of the registered capital, establishment date, and number of staff of the cloth buyer; the cloth buyer credit information comprises at least one of legal information, enterprise credit information, and operation information; the cloth buyer historical order information comprises at least one of order amount, order type, fabric data, and repayment information; the market financial information includes at least one of cloth market information, financial market information, and contest information.
8. An apparatus for predicting a default rate of a textile industry, the apparatus comprising:
a data acquisition module for acquiring structured data of an order to be predicted in the textile industry;
a prediction module for inputting the structured data into a default rate prediction model; and determining the default rate of the order to be predicted through the default rate prediction model.
9. A machine-readable storage medium having instructions stored thereon which, when executed by a processor, cause the processor to perform the method for predicting the default rate of the textile industry according to any one of claims 1 to 7.
10. A processor configured to perform the method of textile industry default rate prediction according to any one of claims 1 to 7.
CN202110550772.2A 2021-05-18 2021-05-18 Method and device for predicting default rate of textile industry, storage medium and processor Pending CN113283583A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110550772.2A CN113283583A (en) 2021-05-18 2021-05-18 Method and device for predicting default rate of textile industry, storage medium and processor

Publications (1)

Publication Number Publication Date
CN113283583A true CN113283583A (en) 2021-08-20

Family

ID=77280266

Country Status (1)

Country Link
CN (1) CN113283583A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination