CN114119201B - Enterprise credit investigation method, equipment and medium - Google Patents
- Publication number
- CN114119201B CN202111425171.5A
- Authority
- CN
- China
- Prior art keywords
- neural network
- data
- network model
- coding neural
- characteristic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/03—Credit; Loans; Processing thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Abstract
The application discloses an enterprise credit investigation method, device and medium. The method comprises: acquiring credit investigation data of a plurality of enterprises to be evaluated; reconstructing the credit investigation data through a preset stacked self-coding neural network model to obtain characteristic data; clustering the characteristic data to divide the enterprises to be evaluated into a plurality of clusters; and generating enterprise portraits of the enterprises to be evaluated according to the cluster labels of the plurality of clusters. By reconstructing the credit investigation data with the stacked self-coding neural network model, the method makes the characteristics of the credit investigation data more distinct, so that more accurate enterprise portraits can be constructed. This helps financial institutions provide small and medium-sized enterprises and individuals with one-stop financing services that are mortgage-free, purely credit-based, transacted online and rapidly approved.
Description
Technical Field
The application relates to the field of risk prevention and control, in particular to an enterprise credit investigation method, equipment and medium.
Background
With the development of technologies such as big data, machine learning and artificial intelligence, the traditional financial operation and service model has changed greatly. Financial big data risk control is one of the important application technologies of internet finance; it generates an enterprise portrait from the financial data related to the enterprise.
Current enterprise credit investigation methods generally analyze and cluster the credit investigation data of enterprises directly. However, because the salience of the characteristics in credit investigation data is generally low, directly applying these methods reduces the accuracy of the generated enterprise portraits and thereby introduces risk into subsequent financial services.
Based on the above, there is a need for an enterprise credit investigation method with higher accuracy based on technologies such as big data, machine learning, artificial intelligence and the like.
Disclosure of Invention
In order to solve the above problems, the present application proposes an enterprise credit investigation method, including:
acquiring credit investigation data of a plurality of enterprises to be evaluated; reconstructing the credit investigation data through a preset stack type self-coding neural network model to obtain characteristic data; clustering the characteristic data to divide the enterprises to be evaluated into a plurality of clusters; according to the meaning of the characteristic field in the characteristic data, determining the preferred range of the characteristic value corresponding to the characteristic field; generating a cluster label of the cluster according to the preferred range of the characteristic values and the characteristic values respectively corresponding to the central points of the preset number of clusters; and generating enterprise portraits of the enterprises to be evaluated according to the cluster labels of the plurality of clusters.
In one example, after the clustering operation is performed on the feature data to divide the plurality of enterprises to be evaluated into a plurality of clusters, the method further includes: randomly generating the number of network layer neurons of the stacked self-coding neural network model; determining the density of the plurality of clusters corresponding to each of the stacked self-coding neural network models with different numbers of neurons; and selecting the stacked self-coding neural network model with the highest density to cluster the characteristic data.
In one example, before the credit investigation data of the enterprises to be evaluated is reconstructed through the preset stacked self-coding neural network model, the method further includes: constructing an initial stacked self-coding neural network model by stacking sparse self-encoders; using the ELU activation function as the activation function of the initial stacked self-coding neural network model; taking the mean square error as the loss function of the initial stacked self-coding neural network model, with L2 regularization as a penalty term of the loss function; and training the initial stacked self-coding neural network model with sample credit data to obtain the stacked self-coding neural network model.
In one example, the initial stacked self-coding neural network model is provided with weights and biases, and the weights and biases are trained with the sample credit data to obtain the stacked self-coding neural network model.
In one example, acquiring the credit investigation data of the plurality of enterprises to be evaluated specifically includes: acquiring financial data of the enterprises to be evaluated under different portrait dimensions, where each portrait dimension comprises a preset number of characteristic fields; and determining the missing value proportion of the characteristic values in each characteristic field, and deleting the characteristic field if its missing value proportion is larger than a preset threshold value, to obtain the credit investigation data.
In one example, the clustering operation on the feature data specifically includes: determining the K value of a K-means clustering algorithm by the silhouette coefficient method and the elbow method; and performing initial clustering on the characteristic data through a hierarchical clustering algorithm, and then randomly selecting one point from each of the K categories as an initial point of the K-means clustering algorithm.
In one example, the different portrait dimensions include at least one of enterprise background, enterprise stability, business capability, enterprise credibility, judicial risk, business risk, enterprise trust enhancement, credit risk, and technological innovation capability.
The application also provides an enterprise credit investigation device, comprising:
at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform: acquiring credit investigation data of a plurality of enterprises to be evaluated; reconstructing the credit investigation data through a preset stack type self-coding neural network model to obtain characteristic data; clustering the characteristic data to divide the enterprises to be evaluated into a plurality of clusters; and generating enterprise portraits of the enterprises to be evaluated according to the cluster labels of the plurality of clusters.
The present application also provides a non-volatile computer storage medium storing computer executable instructions, characterized in that the computer executable instructions are configured to:
acquiring credit investigation data of a plurality of enterprises to be evaluated; reconstructing the credit investigation data through a preset stack type self-coding neural network model to obtain characteristic data; clustering the characteristic data to divide the enterprises to be evaluated into a plurality of clusters; and generating enterprise portraits of the enterprises to be evaluated according to the cluster labels of the plurality of clusters.
By means of the method, the credit investigation data can be reconstructed by utilizing the stacked self-coding neural network model, so that characteristics of the credit investigation data are more obvious, more accurate enterprise portraits are built, and the financial institution is helped to provide one-stop financing service for quick approval for small and medium-sized enterprises and individuals, and satisfaction of users is improved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute an undue limitation to the application. In the drawings:
FIG. 1 is a schematic diagram of an enterprise credit investigation method in an embodiment of the present application;
FIG. 2 is a schematic diagram of an enterprise credit investigation device in an embodiment of the present application.
Detailed Description
For the purposes, technical solutions and advantages of the present application, the technical solutions of the present application will be clearly and completely described below with reference to specific embodiments of the present application and corresponding drawings. It will be apparent that the described embodiments are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
The following describes in detail the technical solutions provided by the embodiments of the present application with reference to the accompanying drawings.
FIG. 1 is a flow chart of an enterprise credit investigation method according to one or more embodiments of the present disclosure. The process may be performed by a computing device in the relevant field (e.g., a risk control server or an intelligent mobile terminal), and some input parameters or intermediate results in the process may be adjusted manually to help improve accuracy.
The method according to the embodiments of the present application may be executed by a terminal device or a server, which is not particularly limited in this application. For ease of understanding and description, the following embodiments are described in detail with a terminal device as an example.
As shown in fig. 1, an embodiment of the present application provides an enterprise credit investigation method, device and medium, where the method includes:
S101: Acquire credit investigation data of a plurality of enterprises to be evaluated.
To generate an enterprise portrait of an enterprise to be evaluated, the terminal device first needs to determine the credit investigation data of that enterprise, where the credit investigation data includes feature fields and the feature values corresponding to them. A feature field is the name of a category of data in the credit investigation data, and the feature value is the value recorded for that field; for example, the registered capital of the enterprise is a feature field, and the amount of the registered capital is its feature value. The credit investigation data may be stored in advance in the storage device of the computer device, which reads it from the storage device when the enterprise needs to be investigated. The computer device may also obtain the credit investigation data from an external device; for example, the data may be stored in the cloud and retrieved from there when needed. This embodiment does not limit the manner in which the credit investigation data is acquired.
S102: Reconstruct the credit investigation data through a preset stacked self-coding neural network model to obtain characteristic data.
After the credit investigation data is obtained, the terminal device may reconstruct it with a preset stacked self-coding neural network model so that its characteristics become more distinct and easier to distinguish. The stacked self-coding neural network model passes the input enterprise information to a hidden layer, which compresses the input; the output layer then decompresses it. Some information is inevitably lost in this process, but training keeps the loss as small as possible and retains the main characteristics of the input to the greatest extent. After the credit investigation data is reconstructed by the stacked self-coding neural network model, the characteristic data is obtained. The characteristics in the characteristic data are more distinct and easier to distinguish, which facilitates the subsequent clustering operation.
S103: Cluster the characteristic data to divide the enterprises to be evaluated into a plurality of clusters.
After the characteristic data of the enterprises is obtained, the enterprises to be evaluated are divided into a preset number of clusters by clustering the characteristic data. Because the number of enterprises to be evaluated is large, clustering groups enterprises with similar portraits together, which makes it more convenient to generate the enterprise portraits.
S104: Determine the preferred range of the characteristic value corresponding to each characteristic field according to the meaning of the characteristic field in the characteristic data.
S105: Generate cluster labels of the clusters according to the preferred ranges of the characteristic values and the characteristic values respectively corresponding to the central points of the preset number of clusters.
In one embodiment, after the characteristic data is clustered by a clustering algorithm and the enterprises to be evaluated are divided into a preset number of clusters, cluster labels are generated for the clusters. The preferred range of a characteristic value can be determined from the meaning of the corresponding characteristic field in the characteristic data. For fields such as the registered capital or operating time of the enterprise, a larger characteristic value indicates a more favorable enterprise portrait, so larger values are preferred. The cluster label of a cluster is then generated according to the preferred ranges of the characteristic values and the characteristic values corresponding to the center point of the cluster.
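Steps S104 and S105 can be illustrated with a minimal sketch. The field names, the numeric ranges, the majority rule and the label wording below are all hypothetical; the application does not specify how the preferred ranges are encoded or how a label is worded.

```python
import math

# Hypothetical preferred ranges (low, high) per characteristic field.
# An open upper bound expresses "the larger the better".
PREFERRED_RANGE = {
    "registered_capital": (1000.0, math.inf),
    "operating_years": (5.0, math.inf),
    "tax_owed": (0.0, 10.0),
}

def cluster_label(center, preferred=PREFERRED_RANGE):
    """Label a cluster by how many center values fall inside their preferred range."""
    hits = [f for f, (lo, hi) in preferred.items() if lo <= center[f] <= hi]
    share = len(hits) / len(preferred)
    # Assumed rule: a simple majority of fields decides the label.
    if share > 0.5:
        return "favorable"
    if share < 0.5:
        return "unfavorable"
    return "mixed"

# Illustrative cluster center points (made-up values).
centers = [
    {"registered_capital": 5000.0, "operating_years": 12.0, "tax_owed": 0.0},
    {"registered_capital": 300.0, "operating_years": 2.0, "tax_owed": 40.0},
]
labels = [cluster_label(c) for c in centers]
```

Every enterprise assigned to a cluster would then inherit that cluster's label as part of its portrait.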
S106: Generate enterprise portraits of the enterprises to be evaluated according to the cluster labels of the plurality of clusters.
After the enterprises to be evaluated are divided into clusters, it is only necessary to generate a label for each cluster; the label of a cluster then represents the portrait of every enterprise in that cluster. Enterprise portraits of the enterprises to be evaluated can therefore be generated from the cluster labels of the clusters.
In one embodiment, the number of neurons in the network layers of the stacked self-coding neural network model also affects the feature extraction process. Therefore, after the characteristic data is clustered to divide the enterprises to be evaluated into a plurality of clusters, the number of neurons in each layer of the stacked self-coding neural network can be adjusted according to the final clustering result. The number of network layer neurons of the stacked self-coding neural network model is first generated randomly, and the density of the clusters obtained after clustering is then determined for the stacked self-coding neural network models with different numbers of neurons. The stacked self-coding neural network model whose clusters have the highest density is selected and used for clustering the characteristic data. Changing the number of network layer neurons in this way makes the characteristics in the characteristic data more prominent, which increases the clustering density and improves the accuracy with which the enterprises to be evaluated are divided.
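The selection step can be sketched as follows. The application does not define how the density of the clusters is measured, so this sketch assumes one possible measure: the share of total variance explained by the clustering, which is dimensionless and therefore comparable across encodings of different widths. The two feature matrices are synthetic stand-ins; in practice each would come from a stacked self-coding model trained with a randomly generated neuron count.

```python
import numpy as np

def density_score(features, labels):
    """Assumed density measure: 1 - within-cluster scatter / total scatter."""
    total = ((features - features.mean(axis=0)) ** 2).sum()
    within = 0.0
    for k in np.unique(labels):
        members = features[labels == k]
        within += ((members - members.mean(axis=0)) ** 2).sum()
    return 1.0 - within / total

def select_densest(candidates):
    """candidates maps a neuron count to (feature matrix, cluster labels)."""
    return max(candidates, key=lambda n: density_score(*candidates[n]))

rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 50)
# Stand-in encodings: the 8-unit one separates the clusters, the 4-unit one does not.
tight = rng.normal(0.0, 0.1, (100, 8)) + labels[:, None] * 5.0
loose = rng.normal(0.0, 1.0, (100, 4)) + labels[:, None] * 0.5
best = select_densest({8: (tight, labels), 4: (loose, labels)})
```

Other compactness measures, such as the mean distance to the cluster center, could be substituted, provided they remain comparable between models with different layer sizes.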
In one embodiment, the stacked self-coding neural network model must be constructed and trained before the credit investigation data of the enterprises to be evaluated is reconstructed through it. The stacked self-coding neural network is constructed by stacking sparse self-encoders. A sparse self-encoder is an unsupervised machine learning algorithm: the parameters of the self-encoder are adjusted continuously according to the error between the self-encoder output and the original input until the model is trained. A self-encoder can be used to compress the input information and extract useful input features. The self-encoder is based on the idea of dimension reduction, but when there are more hidden layer nodes than input nodes it loses the ability to learn sample characteristics automatically, so a constraint must be imposed on the hidden layer nodes. As with the motivation for the noise-reduction self-encoder, high-dimensional and sparse representations are considered good, so a sparsity limit on the hidden layer nodes is introduced. The sparse self-encoder is obtained by adding such a sparsity constraint to the traditional self-encoder. The sparsity applies to the hidden layer neurons of the self-encoder, and the sparse effect is achieved by suppressing the output of most hidden layer neurons. After the stacked self-coding neural network model is constructed, part of the sample credit data is selected to train it.
Further, the stacked self-coding neural network model may use the ELU activation function. The ELU function combines the advantages of the sigmoid function and the ReLU function: it saturates softly to the left of the origin and does not saturate to the right of the origin. The linear part on the right allows the ELU to mitigate vanishing gradients, while the soft saturation on the left makes the ELU more robust to input variations or noise. Because the mean output of the ELU is close to 0, convergence is also faster. Meanwhile, the mean square error can be used as the loss function of the stacked self-coding neural network model, with L2 regularization as a penalty term of the loss function. In this way, the performance of the self-encoders in the stacked self-coding neural network is greatly improved compared with that of a linear self-encoder, and L2 regularization yields parameters with small values, which prevents overfitting.
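For illustration, the activation function and the loss described above can be written out as a short sketch. The values of `alpha` and `lam` are assumed defaults chosen here for the example; the application does not give them.

```python
import numpy as np

def elu(x, alpha=1.0):
    """ELU: identity for x > 0, alpha * (exp(x) - 1) otherwise."""
    # Clipping to the negative part before expm1 avoids overflow for large positive x.
    return np.where(x > 0, x, alpha * np.expm1(np.minimum(x, 0.0)))

def reconstruction_loss(x, x_hat, weights, lam=1e-3):
    """Mean square reconstruction error plus an L2 penalty on the weight matrices."""
    mse = np.mean((x - x_hat) ** 2)
    l2 = sum(np.sum(w ** 2) for w in weights)
    return mse + lam * l2
```

The left branch of `elu` approaches `-alpha` for very negative inputs (soft saturation), while the right branch is linear, which matches the behavior described in the paragraph above.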
Furthermore, to improve feature extraction during reconstruction of the credit investigation data, the stacked self-coding neural network model may be provided with weights and biases; that is, each sparse self-encoder layer of the stacked self-coding neural network model has its own weights and biases. The training data is used to adjust the weights and biases, which improves the feature extraction effect.
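How training adjusts the weights and biases of one layer can be sketched with plain gradient descent. This is a simplified illustration under stated assumptions: a single layer with an ELU encoder and a linear decoder, no sparsity constraint, and a hypothetical learning rate, step count and penalty weight. A stacked model would repeat this for each layer.

```python
import numpy as np

def elu(z, alpha=1.0):
    return np.where(z > 0, z, alpha * np.expm1(np.minimum(z, 0.0)))

def train_layer(x, hidden, steps=300, lr=0.2, lam=1e-4, seed=0):
    """Fit one self-encoder layer by gradient descent on MSE + L2 penalty."""
    rng = np.random.default_rng(seed)
    n = x.shape[1]
    W1, b1 = rng.normal(0.0, 0.1, (n, hidden)), np.zeros(hidden)  # encoder
    W2, b2 = rng.normal(0.0, 0.1, (hidden, n)), np.zeros(n)       # decoder
    history = []
    for _ in range(steps):
        z = x @ W1 + b1
        h = elu(z)                # compressed representation
        err = (h @ W2 + b2) - x   # reconstruction error
        history.append(np.mean(err ** 2) + lam * (np.sum(W1 ** 2) + np.sum(W2 ** 2)))
        d_out = 2.0 * err / err.size
        # ELU derivative for alpha = 1: 1 where z > 0, exp(z) = h + 1 elsewhere.
        d_z = (d_out @ W2.T) * np.where(z > 0, 1.0, h + 1.0)
        grad_W2 = h.T @ d_out + 2.0 * lam * W2
        grad_W1 = x.T @ d_z + 2.0 * lam * W1
        W2 -= lr * grad_W2
        b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * grad_W1
        b1 -= lr * d_z.sum(axis=0)
    return (W1, b1, W2, b2), history

rng = np.random.default_rng(1)
x = rng.normal(size=(200, 10))   # synthetic stand-in for sample credit data
params, history = train_layer(x, hidden=6)
```

The recorded `history` lets the reader check that the loss falls as the weights and biases are adjusted; the encoder output `elu(x @ W1 + b1)` would serve as the characteristic data for the next layer or for clustering.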
In one embodiment, to determine the credit investigation data of the enterprises to be evaluated, financial data of the enterprises under different portrait dimensions is first acquired, where each portrait dimension contains a plurality of feature fields. Because a large amount of data may be missing from the collected financial data, the financial data is preprocessed first: the proportion of missing values among the feature values of each feature field is determined, and a feature field is deleted if its missing value proportion is larger than a preset threshold. The remaining data is the credit investigation data.
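The preprocessing step can be sketched in a few lines. The record layout, the field names and the threshold of 0.5 are assumptions for illustration, and missing values are represented here as `None`.

```python
def drop_sparse_fields(records, threshold=0.5):
    """Delete every feature field whose missing-value proportion exceeds the threshold."""
    fields = sorted({f for r in records for f in r})
    kept = []
    for f in fields:
        missing = sum(1 for r in records if r.get(f) is None)
        if missing / len(records) <= threshold:
            kept.append(f)
    return [{f: r.get(f) for f in kept} for r in records]

# Illustrative financial data for three enterprises (made-up values).
records = [
    {"registered_capital": 500.0, "patent_count": None, "employees": 12},
    {"registered_capital": 80.0, "patent_count": None, "employees": None},
    {"registered_capital": 1200.0, "patent_count": 3, "employees": 45},
]
cleaned = drop_sparse_fields(records, threshold=0.5)
```

Here `patent_count` is missing for two of three enterprises and is deleted, while `employees` is missing for only one and is kept. How the remaining missing values are handled is not stated in the application.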
For example, the portrait dimensions may include enterprise background, enterprise stability, business capability, enterprise credibility, judicial risk, business risk, enterprise trust enhancement, credit risk, technological innovation capability, and the like. The enterprise background dimension may include feature fields such as registered capital, practitioners, date of establishment, business type, business category, industry category, business hours, and zip code. The enterprise stability dimension may include feature fields such as the number of enterprise changes and the number of equity changes. The business capability dimension may include feature fields such as the number of online shops, the number of enterprise branches, the number of enterprise investments, the number of external guarantees, the number of external investments, the number of bids won by the enterprise, and the number of recruitment records. The enterprise credibility dimension may include feature fields such as the enterprise participating duration, the unit price business and insurance accumulated payment amount, the accumulated payment amount of the unit's basic medical insurance for employees, the accumulated payment amount of the unit's work injury insurance, the accumulated payment amount of the unit's basic endowment insurance for urban employees, the accumulated payment amount of the unit's fertility insurance, and the pass rate of the enterprise's products in spot checks. The business risk dimension may include feature fields such as whether the enterprise is listed for business anomalies, whether it is listed for business penalties, whether it has enterprise equity records, the number of enterprise equity records, and the cumulative tax owed by the enterprise.
The enterprise trust enhancement dimension may include feature fields such as whether the enterprise holds a branded trademark, whether it holds a famous trademark, whether it is listed as a contract-abiding and trustworthy enterprise, the level of that listing, and whether it is listed as a new small and medium-sized enterprise in the city. The credit risk dimension may include feature fields such as whether the enterprise is blacklisted for breach of trust and whether it is a dishonest enterprise. The technological innovation capability dimension may include feature fields such as intellectual property, the number of enterprise software copyright registrations, the number of enterprise patent applications, and whether the enterprise owns domain-name intellectual property.
In one embodiment, a K-means clustering algorithm may be selected to cluster the feature data. The K-means clustering algorithm is an unsupervised clustering algorithm that is simple to implement and clusters well. Its overall idea is to divide a given sample set into K clusters according to the distances between samples, so that the points within a cluster are as close together as possible and the distance between clusters is as large as possible. With the K-means clustering algorithm, the feature data can be clustered simply, with fast convergence and a good clustering effect.
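The idea that points within a cluster should be as close together as possible is usually expressed as the standard K-means objective, shown below. It assumes the squared Euclidean distance, which the application does not state explicitly; the symbols $x_i$ (feature vector of enterprise $i$), $C_k$ (the $k$-th cluster) and $\mu_k$ (its center point) are introduced here for illustration.

```latex
J = \sum_{k=1}^{K} \sum_{x_i \in C_k} \lVert x_i - \mu_k \rVert^2,
\qquad
\mu_k = \frac{1}{\lvert C_k \rvert} \sum_{x_i \in C_k} x_i
```

The algorithm alternates between assigning each $x_i$ to the nearest $\mu_k$ and recomputing each $\mu_k$ as the mean of its cluster, and neither step can increase $J$.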
Further, when the K-means clustering algorithm is used to cluster the feature data, the K value is first determined by the silhouette coefficient method and the elbow method. After the K value is determined, to make the distances between the initial points as large as possible, a hierarchical clustering algorithm may be used for initial clustering: the enterprises to be evaluated are roughly divided into K categories, and one point is then randomly selected from each of the K categories as an initial point of the K-means clustering algorithm. Selecting the initial points through hierarchical clustering ensures that they are sufficiently far apart, which improves the clustering effect.
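The procedure above can be sketched end to end on synthetic data. Several choices here are assumptions, not taken from the application: the hierarchical step is a naive agglomerative clustering that merges the two groups with the closest centroids, the candidate K values are 2 to 5, and K is chosen by the silhouette coefficient alone, with the sum of squared distances recorded for an elbow plot.

```python
import numpy as np

def assign(x, centers):
    return np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2).argmin(axis=1)

def kmeans(x, centers, iters=50):
    """Lloyd's algorithm started from the given initial points."""
    for _ in range(iters):
        labels = assign(x, centers)
        centers = np.array([x[labels == j].mean(axis=0) if np.any(labels == j)
                            else centers[j] for j in range(len(centers))])
    return assign(x, centers), centers

def rough_partition(x, k):
    """Naive hierarchical clustering: merge the two closest groups until k remain."""
    groups = [[i] for i in range(len(x))]
    while len(groups) > k:
        cents = np.array([x[g].mean(axis=0) for g in groups])
        dist = np.linalg.norm(cents[:, None] - cents[None, :], axis=2)
        np.fill_diagonal(dist, np.inf)
        a, b = sorted(int(i) for i in np.unravel_index(dist.argmin(), dist.shape))
        groups[a] = groups[a] + groups[b]
        del groups[b]
    return groups

def initial_points(x, k, rng):
    """One randomly chosen point from each of the k rough categories."""
    return np.array([x[rng.choice(g)] for g in rough_partition(x, k)])

def silhouette(x, labels):
    dist = np.linalg.norm(x[:, None] - x[None, :], axis=2)
    scores = []
    for i in range(len(x)):
        same = labels == labels[i]
        if same.sum() == 1:
            scores.append(0.0)  # convention for single-point clusters
            continue
        a = dist[i, same].sum() / (same.sum() - 1)
        b = min(dist[i, labels == j].mean() for j in np.unique(labels) if j != labels[i])
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

rng = np.random.default_rng(0)
# Synthetic feature data: three well-separated groups of 20 points each.
x = np.vstack([rng.normal(c, 0.3, (20, 2)) for c in ([0, 0], [5, 0], [0, 5])])
results = {}
for k in range(2, 6):
    labels, centers = kmeans(x, initial_points(x, k, rng))
    inertia = float(((x - centers[labels]) ** 2).sum())  # value plotted for the elbow method
    results[k] = (silhouette(x, labels), inertia)
best_k = max(results, key=lambda k: results[k][0])
```

The naive merging loop is cubic in the number of samples and is only meant to show the idea; a production system would use an optimized hierarchical clustering routine.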
As shown in fig. 2, the embodiment of the present application further provides an enterprise credit investigation device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to: acquiring credit investigation data of a plurality of enterprises to be evaluated; reconstructing the credit investigation data through a preset stack type self-coding neural network model to obtain characteristic data; clustering the characteristic data to divide the enterprises to be evaluated into a plurality of clusters; and generating enterprise portraits of the enterprises to be evaluated according to the cluster labels of the plurality of clusters.
The embodiments also provide a non-volatile computer storage medium storing computer executable instructions configured to: acquiring credit investigation data of a plurality of enterprises to be evaluated; reconstructing the credit investigation data through a preset stack type self-coding neural network model to obtain characteristic data; clustering the characteristic data to divide the enterprises to be evaluated into a plurality of clusters; and generating enterprise portraits of the enterprises to be evaluated according to the cluster labels of the plurality of clusters.
All embodiments in this application are described in a progressive manner: identical or similar parts of the embodiments may be cross-referenced, and each embodiment focuses on its differences from the others. In particular, the device and medium embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, refer to the corresponding parts of the method embodiments.
The devices and media provided in the embodiments of the present application are in one-to-one correspondence with the methods, so that the devices and media also have similar beneficial technical effects as the corresponding methods, and since the beneficial technical effects of the methods have been described in detail above, the beneficial technical effects of the devices and media are not described in detail herein.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of computer-readable media.
Computer-readable media include both permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. Computer-readable media, as defined herein, do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and changes may be made to the present application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc. which are within the spirit and principles of the present application are intended to be included within the scope of the claims of the present application.
Claims (6)
1. An enterprise credit investigation method, comprising:
acquiring credit investigation data of a plurality of enterprises to be evaluated;
reconstructing the credit investigation data through a preset stack type self-coding neural network model to obtain characteristic data;
clustering the characteristic data to divide the enterprises to be evaluated into a plurality of clusters;
according to the meaning of the characteristic field in the characteristic data, determining the preferred range of the characteristic value corresponding to the characteristic field;
generating a cluster label of the cluster according to the preferred range of the characteristic values and the characteristic values respectively corresponding to the central points of the preset number of clusters;
generating enterprise portraits of the enterprises to be evaluated according to the cluster labels of the plurality of clusters;
after the clustering operation is performed on the feature data to divide the enterprises to be evaluated into a plurality of clusters, the method further includes:
randomly generating the network layer neuron number of the stack type self-coding neural network model;
determining the density degree of the plurality of clustering clusters corresponding to the stacked self-coding neural network models with different neuron numbers respectively;
selecting the stack type self-coding neural network model with the highest density degree, and clustering the characteristic data;
before the credit investigation data is reconstructed through the preset stack type self-coding neural network model, the method further comprises the following steps:
constructing an initial stack type self-coding neural network model by stacking sparse self-coders;
activating the initial stack type self-coding neural network model through an ELU activation function;
taking the mean square error as a loss function of the initial stack type self-coding neural network model, and taking L2 regularization as a penalty term of the loss function;
training the initial stack type self-coding neural network model through sample credit investigation data to obtain the stack type self-coding neural network model;
the clustering operation for the characteristic data specifically comprises the following steps:
determining a K value of a K-means clustering algorithm by a silhouette coefficient method and an elbow method;
and carrying out initial clustering on the characteristic data through a hierarchical clustering algorithm, and then randomly selecting one point from the K categories to serve as an initial point of the K-means clustering algorithm.
2. The method of claim 1, wherein the initial stacked self-encoding neural network model is provided with weights and biases;
and training the weights and the biases through the sample credit investigation data to obtain the stacked self-coding neural network model.
3. The method of claim 1, wherein the acquiring credit investigation data of the plurality of enterprises to be evaluated specifically comprises:
acquiring financial data of the enterprises to be evaluated under different portrait dimensions; the portrait dimension comprises a preset number of characteristic fields;
and determining the missing value proportion corresponding to the characteristic values in the characteristic fields, and deleting the characteristic fields to obtain the credit investigation data if the missing value proportion is larger than a preset threshold value.
4. The method of claim 1, wherein the different portrait dimensions include at least one of enterprise context, enterprise stability, business capability, enterprise credibility, judicial risk, business risk, enterprise credit enhancement, credit risk, technological innovation capability.
5. An enterprise credit investigation apparatus, comprising:
at least one processor; and a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform:
acquiring credit investigation data of a plurality of enterprises to be evaluated;
reconstructing the credit investigation data through a preset stack type self-coding neural network model to obtain characteristic data;
clustering the characteristic data to divide the enterprises to be evaluated into a plurality of clusters;
according to the meaning of the characteristic field in the characteristic data, determining the preferred range of the characteristic value corresponding to the characteristic field;
generating a cluster label of the cluster according to the preferred range of the characteristic values and the characteristic values respectively corresponding to the central points of the preset number of clusters;
generating enterprise portraits of the enterprises to be evaluated according to the cluster labels of the plurality of clusters;
after the clustering operation is performed on the feature data to divide the enterprises to be evaluated into a plurality of clusters, the instructions further include:
randomly generating the network layer neuron number of the stack type self-coding neural network model;
determining the density degree of the plurality of clustering clusters corresponding to the stacked self-coding neural network models with different neuron numbers respectively;
selecting the stack type self-coding neural network model with the highest density degree, and clustering the characteristic data;
before the credit investigation data is reconstructed through the preset stack type self-coding neural network model, the instructions further comprise:
constructing an initial stack type self-coding neural network model by stacking sparse self-coders;
activating the initial stack type self-coding neural network model through an ELU activation function;
taking the mean square error as a loss function of the initial stack type self-coding neural network model, and taking L2 regularization as a penalty term of the loss function;
training the initial stack type self-coding neural network model through sample credit investigation data to obtain the stack type self-coding neural network model;
the clustering operation for the characteristic data specifically comprises the following steps:
determining a K value of a K-means clustering algorithm by a silhouette coefficient method and an elbow method;
and carrying out initial clustering on the characteristic data through a hierarchical clustering algorithm, and then randomly selecting one point from the K categories to serve as an initial point of the K-means clustering algorithm.
6. A non-transitory computer storage medium storing computer-executable instructions, the computer-executable instructions configured to:
acquiring credit investigation data of a plurality of enterprises to be evaluated;
reconstructing the credit investigation data through a preset stack type self-coding neural network model to obtain characteristic data;
clustering the characteristic data to divide the enterprises to be evaluated into a plurality of clusters;
according to the meaning of the characteristic field in the characteristic data, determining the preferred range of the characteristic value corresponding to the characteristic field;
generating a cluster label of the cluster according to the preferred range of the characteristic values and the characteristic values respectively corresponding to the central points of the preset number of clusters;
generating enterprise portraits of the enterprises to be evaluated according to the cluster labels of the plurality of clusters;
after the clustering operation is performed on the feature data to divide the enterprises to be evaluated into a plurality of clusters, the instructions further include:
randomly generating the network layer neuron number of the stack type self-coding neural network model;
determining the density degree of the plurality of clustering clusters corresponding to the stacked self-coding neural network models with different neuron numbers respectively;
selecting the stack type self-coding neural network model with the highest density degree, and clustering the characteristic data;
before the credit investigation data is reconstructed through the preset stack type self-coding neural network model, the instructions further comprise:
constructing an initial stack type self-coding neural network model by stacking sparse self-coders;
activating the initial stack type self-coding neural network model through an ELU activation function;
taking the mean square error as a loss function of the initial stack type self-coding neural network model, and taking L2 regularization as a penalty term of the loss function;
training the initial stack type self-coding neural network model through sample credit investigation data to obtain the stack type self-coding neural network model;
the clustering operation for the characteristic data specifically comprises the following steps:
determining a K value of a K-means clustering algorithm by a silhouette coefficient method and an elbow method;
and carrying out initial clustering on the characteristic data through a hierarchical clustering algorithm, and then randomly selecting one point from the K categories to serve as an initial point of the K-means clustering algorithm.
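For reference, the training setup recited in the claims (ELU activation, mean square error loss, L2 penalty term) can be written out in standard notation as below. This is a sketch, not a formula taken from the patent: the symbols $\alpha$ (ELU scale), $\lambda$ (penalty weight), $n$ (number of samples), $x_i$ and $\hat{x}_i$ (an input record and its reconstruction), and $W^{(l)}$ (weight matrix of layer $l$) are assumptions introduced here, and any additional sparsity term used by the sparse self-coders is not specified in the claims and is omitted.

```latex
\mathrm{ELU}(z) =
\begin{cases}
z, & z > 0 \\
\alpha\,(e^{z} - 1), & z \le 0
\end{cases}
\qquad
J(W, b) = \frac{1}{n}\sum_{i=1}^{n} \lVert x_i - \hat{x}_i \rVert_2^{2}
        + \lambda \sum_{l} \lVert W^{(l)} \rVert_F^{2}
```

The first term of $J$ measures reconstruction error, and the second discourages large weights; training adjusts the weights and biases to minimize their sum.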
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111425171.5A CN114119201B (en) | 2021-11-26 | 2021-11-26 | Enterprise credit investigation method, equipment and medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114119201A | 2022-03-01 |
CN114119201B | 2024-01-23 |
Family
ID=80370634
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111425171.5A Active CN114119201B (en) | 2021-11-26 | 2021-11-26 | Enterprise credit investigation method, equipment and medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114119201B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117786346A (en) * | 2023-12-18 | 2024-03-29 | 深圳市悦融易数据科技有限公司 | Enterprise portrait generation method, device, equipment and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107240014A (en) * | 2017-04-28 | 2017-10-10 | 天合泽泰(厦门)征信服务有限公司 | A kind of credit rating method based on enterprise's reference business |
WO2021000678A1 (en) * | 2019-07-04 | 2021-01-07 | 平安科技(深圳)有限公司 | Business credit review method, apparatus, and device, and computer-readable storage medium |
CN112767136A (en) * | 2021-01-26 | 2021-05-07 | 天元大数据信用管理有限公司 | Credit anti-fraud identification method, credit anti-fraud identification device, credit anti-fraud identification equipment and credit anti-fraud identification medium based on big data |
CN112785144A (en) * | 2021-01-18 | 2021-05-11 | 深圳前海微众银行股份有限公司 | Model construction method, device and storage medium based on federal learning |
CN113362158A (en) * | 2021-05-31 | 2021-09-07 | 中国银联股份有限公司 | Credit evaluation method, device and computer readable storage medium |
Non-Patent Citations (2)
Title |
---|
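Research on credit risk evaluation and application for small and medium-sized enterprises in the context of big data; 张永春; 陈岩; 《电子商务》 (No. 12); 36-37+61 * |
On the difficulty of credit loans and innovation in credit investigation business; 刘洪来; 赵宇翔; 《东岳论丛》 (No. 3); 147-151 * |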
Also Published As
Publication number | Publication date |
---|---|
CN114119201A (en) | 2022-03-01 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||