CN112465622A - Method, system, medium and computer equipment for checking enterprise comprehensive credit information - Google Patents
Method, system, medium and computer equipment for checking enterprise comprehensive credit information
- Publication number
- CN112465622A (application CN202010984312.6A)
- Authority
- CN
- China
- Prior art keywords
- data
- enterprise
- information
- dynamic
- credit
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/03—Credit; Loans; Processing thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2465—Query processing support for facilitating data mining operations in structured databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0639—Performance analysis of employees; Performance analysis of enterprise or organisation operations
- G06Q10/06393—Score-carding, benchmarking or key performance indicator [KPI] analysis
Abstract
The invention belongs to the technical field of computer applications and discloses a method, a system, a medium and computer equipment for checking enterprise comprehensive credit information. The method comprises: collecting multi-source data and performing data fusion, classification processing and integration; dividing the integrated data into static data and dynamic data; analyzing the static data, and mining and dividing the dynamic data; and processing the static data and the dynamic data separately and generating an enterprise comprehensive credit checking report based on the processing results. Multi-source data are collected, classified, processed, integrated and stored uniformly in a data resource pool that every transaction platform can share, which overcomes the vertical, single-platform limitation of existing bulk commodity transaction platforms; the data are then analyzed comprehensively by combining dynamic and static data processing to generate the enterprise comprehensive credit checking report.
Description
Technical Field
The invention belongs to the technical field of computer application, and particularly relates to a method, a system, a medium and computer equipment for checking enterprise comprehensive credit information.
Background
Bulk commodities are physical goods that can enter circulation outside the retail link, have commodity attributes, and are bought and sold in large quantities for industrial and agricultural production and consumption. In the financial investment market, the term refers to homogeneous, tradeable goods that are widely used as basic industrial raw materials, such as crude oil, nonferrous metals, steel, agricultural products, iron ore and coal. They fall into three categories: energy commodities, basic raw materials, and agricultural by-products.
The existing trading platforms of the bulk commodity trading market are isolated from one another and lack association. On the one hand, enterprise safety is difficult to guarantee: each trading platform is vertical and single, so anomalies hidden behind the data are not easy to find. On the other hand, different bulk commodity types have different trading platforms, so massive data are difficult to share and much important information cannot be associated and mined, which inevitably wastes data. In the current era of rapidly developing big data, data mining and analysis technology can uncover the implicit information in massive data. There is therefore a need for an enterprise comprehensive credit checking method, based on bulk commodity trading platforms, that shares information across platforms, associates enterprise subject attributes, and combines dynamic and static data. The checking result is finally presented in an enterprise comprehensive credit checking report, which comprises four parts: (1) basic information, namely the basic information of an enterprise, such as the enterprise name, the enterprise organization code and the unified social credit code; (2) associated information, namely information about other enterprises related to the enterprise; (3) comprehensive credit scoring information, obtained by combining data mining and analysis technology, a clustering dimension reduction algorithm and the analytic hierarchy process (AHP); and (4) other information, namely information that strongly influences or is critical to the enterprise's credit.
From the above analysis, the problems and defects of the prior art are as follows: the existing trading platforms of the bulk commodity trading market are single and lack association, which makes them non-standard, decentralized and insecure, and leads to vertical, single operating modes between platforms, difficulty in sharing data, and a lack of association between trading subjects.
The difficulty in solving these problems and defects is that massive data are difficult to share, much important information cannot be associated and mined, and anomalies hidden behind the data are difficult to discover.
The significance of solving these problems and defects is that the platforms become more closely associated, enterprise credit data become more accurate, enterprise trading subjects become safer, and market management and the individual trading platforms can be supervised more effectively.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides an enterprise comprehensive credit information checking method, system, medium and computer equipment.
The invention is realized as follows. A method for checking enterprise comprehensive credit information comprises:
step one, collecting multi-source data, fusing the multi-source data, storing the fused data in a data pool, and performing data classification processing and integration;
step two, dividing the data integrated in the data pool into static data for generating basic information data and dynamic data for generating associated information data, comprehensive credit score data and other important data;
step three, analyzing the static data, mining the enterprise subject association attributes and related data from the dynamic data, and dividing the dynamic data into index data on the basis of the mined data;
and step four, processing the static data and the dynamic data separately, and generating an enterprise comprehensive credit checking report based on the processing results of the static data and the dynamic data.
Further, in step one, the multi-source data includes: historical transaction data, authority data, third party interface data, and other data for each platform.
Further, in step two:
the dynamic data is used for generating the associated information data, the comprehensive credit scoring data and the other important data;
the associated information data is used for generating the associated information;
the comprehensive credit scoring data is used for generating the comprehensive credit score and the proportion of each scoring dimension of each level of index data in the comprehensive credit score;
the other important data is used for generating other important information about the enterprise;
the comprehensive credit scoring data is obtained by processing the dynamic data;
the associated information data and the other important data are each generated from the third-party interface data or other source data within the dynamic data.
Further, in step three, dividing the dynamic data into index data on the basis of the mined data comprises:
determining the association between enterprise attributes, expressed as a correlation coefficient, from the mined enterprise subject association attributes; and performing clustering dimension reduction on the dynamic data according to that correlation coefficient, wherein the dynamic data is first captured to generate three-level index data, the three-level index data is clustered and reduced in dimension to generate two-level (secondary) index data, and the secondary index data is clustered and reduced in dimension to generate first-level (primary) index data.
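The level-by-level reduction described above can be sketched roughly as follows. This is a minimal illustration, not the patented procedure: the source does not say how attributes are represented for clustering or how a cluster becomes a higher-level index, so the sketch assumes each attribute is represented by its row of Pearson correlations with all attributes and that a cluster is merged by averaging its member columns (which presumes the columns are already on a common scale). The names `pearson`, `kmeans` and `reduce_level`, and the farthest-point initialization, are hypothetical choices.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # The 1/n factors of covariance and standard deviations cancel out.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # a constant attribute carries no correlation signal
    return cov / (norm_x * norm_y)


def kmeans(points, k, iters=50):
    """Plain k-means with deterministic farthest-point initialization (assumption)."""
    centers = [list(points[0])]
    while len(centers) < k:
        farthest = max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        centers.append(list(farthest))
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def reduce_level(columns, k):
    """Cluster attributes by correlation profile and merge each cluster into one index."""
    names = list(columns)
    profiles = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    labels = kmeans(profiles, k)
    groups = {}
    for name, label in zip(names, labels):
        groups.setdefault(label, []).append(name)
    merged = {}
    for _, members in sorted(groups.items()):
        rows = zip(*(columns[m] for m in members))
        merged["+".join(members)] = [sum(r) / len(r) for r in rows]
    return merged
```

Calling `reduce_level` once on three-level index columns would give candidate secondary indexes, and calling it again on that result would give candidate primary indexes; the value of `k` at each level is left to the caller, in line with the text's statement that k depends on the number of attributes at each level.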
Further, in step four, processing the static data comprises extracting the collected static data and organizing it into normalized basic information.
Further, in step four, processing the dynamic data comprises:
(1) acquiring the required massive dynamic data from the processed and integrated data pool, and integrating the dynamic data again in preparation for data structuring;
(2) capturing, sorting, reducing the dimension of, and structuring the acquired dynamic data;
(3) performing data mining and analysis on the structured data, analyzing the associated information, the proportions of the credit influence factors and other data in the secondary index data, and determining the correlation coefficients among the attributes of the secondary index data, namely the associations among the enterprise subject attributes;
(4) calculating the weight of each attribute of the secondary index data from the correlation coefficients by using the analytic hierarchy process (AHP), and performing K-means clustering dimension reduction according to the weights to generate the primary index data, which determines the scoring dimensions of the credit score;
(5) determining the correlation coefficients among the dimensions of the primary index data, determining and assigning the weight of each dimension by using the AHP algorithm, calculating the comprehensive credit score value by using the comprehensive credit score value calculation formula, and calculating the proportion of each dimension in the comprehensive credit score value to obtain the comprehensive credit score data.
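Steps (4) and (5) can be illustrated with a small sketch of AHP weighting followed by the weighted-sum score S = D1*W1 + ... + Dn*Wn and the per-dimension proportion Pi = (Di*Wi)/S used in this document. The source does not specify how correlation coefficients are turned into an AHP pairwise comparison matrix, so the matrix passed in here is purely hypothetical; the geometric-mean method is used as one common way to approximate AHP priority weights, and the function names are assumptions.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method.

    pairwise[i][j] states how much more important dimension i is than j,
    with pairwise[j][i] == 1 / pairwise[i][j] (hypothetical input).
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def composite_score(values, weights):
    """S = D1*W1 + D2*W2 + ... + Dn*Wn."""
    return sum(d * w for d, w in zip(values, weights))


def proportions(values, weights):
    """Pi = (Di*Wi) / S for every scoring dimension i."""
    s = composite_score(values, weights)
    return [d * w / s for d, w in zip(values, weights)]
```

For a perfectly consistent comparison matrix such as `[[1, 2, 4], [1/2, 1, 2], [1/4, 1/2, 1]]`, `ahp_weights` returns weights in the ratio 4:2:1, and the values returned by `proportions` always sum to 1 whenever S is non-zero.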
Further, step (2) is expanded into sub-steps (2.1) to (2.3), so that the complete procedure is:
(1) acquiring the required massive dynamic data from the processed and integrated data pool, and integrating the dynamic data again in preparation for data structuring;
(2) capturing, sorting, reducing the dimension of, and structuring the acquired dynamic data;
(2.1) capturing the attributes that are strongly associated with the enterprise to form the necessary data, cleaning and preprocessing the data, and sorting out the three-level indexes. The data preprocessing comprises data cleaning, data transformation and data reduction. Data cleaning handles abnormal values of the attributes in the data sources: some abnormal values are discarded, and missing values are replaced with a representative median, mean or other value so that the data is complete. Data transformation discretizes and binarizes the data: discretization divides the cleaned data extracted from each data source into different intervals, and binarization distinguishes two states, which can be encoded as 0 or 1 to represent true or false, in preparation for data reduction. Data reduction reduces the attributes in a data source and the amount of data for each attribute, performs a preliminary dimension reduction, deletes useless or redundant data, and yields the three-level index data;
(2.2) analyzing and calculating the three-level index data algorithmically, quantizing the data, and determining the correlation coefficients among the attributes of the quantized three-level index data, wherein quantizing the data comprises normalizing the three-level index data so that all index data are on the same scale;
normalizing the three-level index data comprises: normalizing the numerical indexes and the discrete indexes in the three-level index data directly with the normalization formula; and converting the grade data of the character-type indexes in the three-level index data into corresponding numerical values and then normalizing them with the normalization formula;
the normalization formula is:
Yi=(Xi-Xmin)/(Xmax-Xmin);
wherein Xi and Yi respectively represent index items before and after processing, Xmin represents the minimum value of the three-level index data, and Xmax represents the maximum value of the three-level index data, so that quantized three-level index data are finally obtained;
the correlation coefficient between attributes is the Pearson correlation coefficient, which reflects the direction and degree of co-variation between two subject attributes; its value range is [-1, +1], where 0 indicates that the two attributes are not linearly correlated, a positive value indicates positive correlation, a negative value indicates negative correlation, and a larger absolute value indicates a stronger correlation; the formula of the Pearson correlation coefficient is:
P(X,Y)=cov(X,Y)/(sX*sY);
where P(X,Y) is the correlation coefficient between variables X and Y, cov(X,Y) is the covariance of X and Y, and sX*sY is the product of the standard deviations of the two attribute variables;
(2.3) performing K-means clustering dimension reduction according to the correlation coefficients to generate the secondary index data, and structuring the secondary index data into a CSV file, an SQL file, a table file or a database table file for data mining and analysis; K-means clustering reduces the three-level index data to the secondary index data and the secondary index data to the primary index data, and the clustering k value is determined according to the number of attributes of each index level;
(3) performing data mining and analysis on the structured data, analyzing the associated information, the proportions of the credit influence factors and other data in the secondary index data, and determining the correlation coefficients among the attributes of the secondary index data, namely the associations among the enterprise subject attributes;
(4) calculating the weight of each attribute of the secondary index data from the correlation coefficients by using the analytic hierarchy process (AHP), and performing K-means clustering dimension reduction according to the weights to generate the primary index data, which determines the scoring dimensions of the credit score;
(5) determining the correlation coefficients among the dimensions of the primary index data, determining and assigning the weight of each dimension by using the AHP algorithm, calculating the comprehensive credit score value by using the comprehensive credit score value calculation formula, and calculating the proportion of each dimension in the comprehensive credit score value to obtain the comprehensive credit score;
the calculation formula of the comprehensive credit score value is as follows:
S=D1*W1+D2*W2+...+Dn*Wn;
wherein the subscripts 1, 2, ..., n index the scoring dimensions and n is the number of scoring dimensions; D1 is the value of the 1st scoring dimension; W1 is the weight of the 1st scoring dimension; D2 is the value of the 2nd scoring dimension; W2 is the weight of the 2nd scoring dimension; Dn is the value of the nth scoring dimension; Wn is the weight of the nth scoring dimension;
the proportion of each dimension in the comprehensive credit score value is calculated using the following formula:
Pi=(Di*Wi)/S,(i=1,2,...,n);
wherein Di is the value of the ith scoring dimension; Wi is the weight of the ith scoring dimension; S is the comprehensive credit score value; Pi is the proportion of the ith dimension in the comprehensive credit score value S;
the weight of each dimension is determined by the AHP algorithm: for dimension weight division, the correlation coefficients among the attributes replace the traditional expert evaluation data, and the AHP algorithm is applied to calculate the weight of each dimension, which provides a data-driven way of calculating the weights.
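The cleaning, normalization and correlation operations of sub-steps (2.1) and (2.2) can be sketched as below. This is an illustrative sketch only: the function names are hypothetical, missing values are assumed to be represented by `None`, only the median replacement option is shown, and the choice to map a constant column to 0.0 is an assumption made to avoid division by zero, which the source does not address.

```python
import math
import statistics


def fill_missing(values):
    """Replace missing entries (None) with the median of the present values."""
    present = [v for v in values if v is not None]
    median = statistics.median(present)
    return [median if v is None else v for v in values]


def min_max(values):
    """Yi = (Xi - Xmin) / (Xmax - Xmin), mapping an index onto [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # assumption: constant column maps to 0
    return [(v - lo) / (hi - lo) for v in values]


def pearson(x, y):
    """P(X, Y) = cov(X, Y) / (sX * sY) using population statistics."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    s_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    s_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (s_x * s_y)
```

For example, `min_max(fill_missing([10, None, 30]))` first fills the gap with the median 20 and then rescales the column to `[0.0, 0.5, 1.0]`; `pearson` raises `ZeroDivisionError` for a constant column, so such columns would need to be dropped beforehand.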
Further, in step four, the enterprise comprehensive credit checking report comprises basic information, associated information, comprehensive credit scoring information and other information;
the basic information is basic information related to enterprises and comprises enterprise names, enterprise organization codes and unified social credit codes;
the related information is information of other enterprises related to the enterprise;
the other information is information which has a large influence on the credit of an enterprise or is crucial to the credit.
Another object of the present invention is to provide an enterprise comprehensive credit information checking system for implementing the above method for checking enterprise comprehensive credit information, the system comprising:
the data acquisition module is used for acquiring multi-source data;
the data storage module is used for fusing the collected multi-source data and storing the fused data by using the data pool;
the data preprocessing module is used for carrying out data classification processing and integration on the fusion data in the data pool;
the data dividing module is used for dividing the data integrated in the data pool into static data used for generating basic information data and dynamic data used for generating associated information data, comprehensive credit scoring data and other important data;
the data analysis module is used for analyzing the static data, mining the enterprise subject association attributes and related data from the dynamic data, and dividing the dynamic data into index data on the basis of the mined data;
the data processing module is used for respectively processing the static data and the dynamic data;
and the result generation module is used for generating an enterprise comprehensive credit checking report based on the data processing result.
Further, the data processing module comprises:
a static data processing unit and a dynamic data processing unit;
the static data processing unit is used for extracting the collected static data and organizing it into normalized basic information;
the dynamic data processing unit comprises a data acquisition subunit, a structuring subunit, a data mining and analysis subunit and a comprehensive credit scoring subunit, and is used for processing the dynamic data and generating the associated information data, the comprehensive credit scoring data and the other important data.
Further, the system for checking enterprise comprehensive credit information further comprises three roles: a highest-authority administrator, a platform administrator, and a bulk commodity transaction platform;
the highest-authority administrator is used for assigning and managing platform administrators;
the platform administrator is used for managing the registration information of transaction platform users (adding, deleting, modifying and querying) and for managing and auditing each transaction platform;
the bulk commodity transaction platform registers and logs in to the system to check the comprehensive credit information of each enterprise.
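The three roles above could be modeled as a simple permission table. The sketch below is an assumption about one possible structure; the role and action names are hypothetical, and each role is given only the abilities the text explicitly attributes to it.

```python
from enum import Enum, auto


class Role(Enum):
    HIGHEST_AUTHORITY_ADMIN = auto()
    PLATFORM_ADMIN = auto()
    TRADING_PLATFORM = auto()


class Action(Enum):
    MANAGE_PLATFORM_ADMINS = auto()
    MANAGE_PLATFORM_USERS = auto()   # add, delete, modify, query registrations
    AUDIT_TRADING_PLATFORMS = auto()
    VIEW_CREDIT_REPORT = auto()


# Hypothetical mapping derived from the role descriptions in the text.
PERMISSIONS = {
    Role.HIGHEST_AUTHORITY_ADMIN: {Action.MANAGE_PLATFORM_ADMINS},
    Role.PLATFORM_ADMIN: {
        Action.MANAGE_PLATFORM_USERS,
        Action.AUDIT_TRADING_PLATFORMS,
    },
    Role.TRADING_PLATFORM: {Action.VIEW_CREDIT_REPORT},
}


def is_allowed(role, action):
    """Return True when the role's permission set contains the action."""
    return action in PERMISSIONS.get(role, set())
```

Whether administrators may also view credit reports is not stated in the source, so the sketch leaves that permission out rather than guessing.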
Combining all the above technical solutions, the advantages and positive effects of the invention are as follows. The invention provides an enterprise comprehensive credit checking method based on cross-platform information sharing among bulk commodity trading platforms, market subject attribute association, and the combination of dynamic and static data, and solves the singleness and lack of association of the trading platforms of the bulk commodity trading market. Multi-source data are collected, classified, processed, integrated and stored uniformly in a data resource pool that every transaction platform can share, which overcomes the vertical, single-platform limitation of existing bulk commodity transaction platforms; the data are then analyzed comprehensively by combining dynamic and static data processing to generate the enterprise comprehensive credit checking report.
The invention uses data mining and analysis technology to analyze the correlation of enterprise subject attributes, determines the correlation coefficients among different subject attributes, reduces dimensions with clustering technology, and mines the associations among enterprise subject attributes and the implicit information behind massive data. The invention designs a complete checking implementation flow and mechanism and realizes an enterprise comprehensive credit checking method that combines cross-platform information sharing, market subject attribute association, and dynamic and static data.
The invention integrates multi-source data, analyzes dynamic and static data, performs data mining and analysis on the dynamic data, and mines the associations among enterprise subject attributes, thereby realizing an enterprise comprehensive credit checking method and mechanism that combine cross-platform data sharing, enterprise subject attribute association, and dynamic and static data. By studying and applying this dynamic and static data processing method and mechanism, the invention realizes the enterprise comprehensive credit information checking system.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of an enterprise comprehensive credit information checking method according to an embodiment of the present invention.
Fig. 2 is a flowchart of an enterprise comprehensive credit checking method according to an embodiment of the present invention.
Fig. 3 is a schematic diagram of dynamic data processing provided by an embodiment of the present invention.
Fig. 4 is a flow chart of dynamic data processing provided by the embodiment of the invention.
Fig. 5 is a flow chart of a module core technology provided by the embodiment of the invention.
Fig. 6 is a diagram of an architecture of an enterprise integrated credit information checking system according to an embodiment of the present invention.
Fig. 7 is a flow chart of a functional interface of the enterprise integrated credit verification system according to an embodiment of the present invention.
Fig. 8 is a schematic diagram of an enterprise integrated credit checking system according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the following embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Aiming at the problems in the prior art, the invention provides an enterprise comprehensive credit information checking method, a system, a medium and computer equipment, and the invention is described in detail below with reference to the accompanying drawings.
As shown in fig. 1, the method for checking the enterprise comprehensive credit information according to the embodiment of the present invention includes the following steps:
S101, collecting multi-source data, fusing the multi-source data, storing the fused data in a data pool, and classifying and integrating the data;
S102, dividing the data integrated in the data pool into static data, used for generating basic information data, and dynamic data, used for generating associated information data, comprehensive credit score data and other important data;
S103, analyzing the static data, mining the enterprise subject association attributes and related data from the dynamic data, and dividing the dynamic data into index data on the basis of the mined data;
S104, processing the static data and the dynamic data separately, and generating an enterprise comprehensive credit checking report based on the processing results of the static data and the dynamic data.
Those skilled in the art can also implement the enterprise comprehensive credit information checking method provided by the present invention using other steps; the method shown in fig. 1 is only one specific example.
In step S101, the multi-source data provided by the embodiment of the present invention includes: historical transaction data, authority data, third party interface data, and other data for each platform.
In step S102, the dynamic data provided in the embodiment of the present invention includes:
the dynamic data is used for generating associated information data, comprehensive credit scoring data and other important data;
the associated information data is used for generating associated information;
the comprehensive credit scoring data is used for generating comprehensive credit scoring and the proportion of scoring dimensionality of each level of index data in the comprehensive credit scoring;
the other important data is used for generating relatively important information of the enterprise;
the comprehensive credit scoring data is obtained by processing dynamic data;
the associated information data and other important data are respectively generated according to third party interface data in the dynamic data or other source data.
In step S103, dividing the dynamic data into index data based on the mined data, as provided by the embodiment of the present invention, includes:
determining the correlation among enterprise attributes, namely the correlation coefficients, from the mined enterprise subject association attributes, and dividing the dynamic data by clustering dimension reduction according to those coefficients: the dynamic data are captured to generate third-level index data, the third-level index data are clustered and reduced in dimension to generate second-level index data, and the second-level index data are clustered and reduced in dimension to generate first-level index data.
In step S104, the processing of the static data provided in the embodiment of the present invention includes: extracting and organizing the collected static data into normalized basic information.
As shown in fig. 3 to fig. 5, in step S104, the method for processing dynamic data according to the embodiment of the present invention includes:
(1) acquiring the required massive dynamic data from the processed and integrated data pool, and integrating the dynamic data again in preparation for data structuring;
(2) capturing, sorting, reducing the dimensionality of, and structuring the acquired dynamic data;
(3) performing data mining and analysis on the structured data, analyzing the association information, the credit influence factor proportions and other data in the second-level index data, and determining the correlation coefficients among the attributes of the second-level index data, namely the correlation among enterprise subject attributes;
(4) calculating the weight of each attribute of the second-level index data from the correlation coefficients using the analytic hierarchy process (AHP), performing K-means clustering dimensionality reduction according to the weights, and generating first-level index data, which determine the scoring dimensions of the credit score;
(5) determining the correlation coefficients among the dimensions of the first-level index data, determining and assigning the weight of each dimension using the AHP algorithm, calculating the comprehensive credit score value using the comprehensive credit score calculation formula, and calculating the proportion of each dimension in the comprehensive credit score value to obtain the comprehensive credit score data.
The step (2) provided by the embodiment of the invention further comprises the following steps:
(2.1) capturing the attributes that are relatively strongly associated among the enterprise attributes to form the necessary data, cleaning and preprocessing the data, and sorting out the third-level indexes;
(2.2) analyzing and calculating the third-level index data algorithmically to quantize the data, and determining the correlation coefficient of each attribute of the quantized third-level index data;
(2.3) performing K-means clustering dimensionality reduction according to the correlation coefficients to generate second-level index data, and structuring the second-level index data into a CSV file, an SQL file, a table file or a database table file for use in data mining and analysis.
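The structuring in step (2.3) can be pictured as flattening the second-level index data into rows. The following is a minimal sketch, assuming a hypothetical layout in which each row holds one enterprise and each column one second-level index; the field names and values are illustrative only and do not come from the embodiment.

```python
import csv
import io


def to_csv(rows, fieldnames):
    """Serialize second-level index records (a list of dicts) as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical second-level index names and values, for illustration only.
FIELDS = ["enterprise_id", "contract_performance", "payment_behaviour"]
ROWS = [
    {"enterprise_id": "E001", "contract_performance": 0.82, "payment_behaviour": 0.5},
    {"enterprise_id": "E002", "contract_performance": 0.4, "payment_behaviour": 0.91},
]

csv_text = to_csv(ROWS, FIELDS)
```

The same records could equally be written to a database table; CSV is used here only because it needs nothing beyond the standard library.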
In step (2.1), the data preprocessing process provided by the embodiment of the present invention includes: data cleaning, data transformation and data reduction.
Data cleaning mainly handles abnormal values of the attributes in the data source: some abnormal values are discarded, and missing values are replaced with a representative median, mean or other value so that the data are complete.
Data transformation mainly discretizes and binarizes the data. Discretization divides the cleaned data extracted from each data source into different intervals. Binarization distinguishes two states, which may be defined numerically as 0 or 1, indicating true or false, in preparation for data reduction.
Data reduction mainly reduces the number of attributes in the data source and the amount of data for each attribute, preliminarily reducing the dimensionality of the data, deleting useless or redundant information, and yielding the third-level index data.
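The cleaning and transformation steps can be sketched as follows. This is a minimal illustration under stated assumptions: abnormal values are taken to be those outside a caller-supplied range, discarded values are then treated as missing and filled with the median, and the interval edges and threshold are hypothetical parameters that the embodiment does not specify.

```python
from bisect import bisect_right
from statistics import median


def clean(values, low, high):
    """Discard out-of-range values, then fill every gap with the median.

    Assumption: a value is abnormal when it lies outside [low, high];
    None marks a value that was missing in the source.
    """
    def is_valid(v):
        return v is not None and low <= v <= high

    valid = [v for v in values if is_valid(v)]
    if not valid:
        raise ValueError("no valid values to derive a fill value from")
    fill = median(valid)
    return [v if is_valid(v) else fill for v in values]


def discretize(values, edges):
    """Map each value to the index of the interval it falls into.

    `edges` is a sorted list of interval boundaries (hypothetical).
    """
    return [bisect_right(edges, v) for v in values]


def binarize(values, threshold):
    """Reduce each value to one of two states, encoded as 0 or 1."""
    return [1 if v >= threshold else 0 for v in values]


raw = [12.0, None, 15.0, 900.0, 18.0]   # one missing and one abnormal value
cleaned = clean(raw, low=0, high=100)
intervals = discretize(cleaned, edges=[13, 16])
flags = binarize(cleaned, threshold=15)
```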
In step (2.2), quantizing the third-level index data through algorithmic analysis and calculation includes: normalizing the third-level index data so that all index data share one order of magnitude;
normalizing the third-level index data comprises the following steps: applying the normalization formula directly to the numerical indexes and the discrete indexes in the third-level index data; converting the grade data of character-type indexes in the third-level index data into corresponding numerical values, and then normalizing them with the normalization formula;
the normalization formula is:
Yi=(Xi-Xmin)/(Xmax-Xmin);
wherein Xi and Yi respectively represent an index item before and after processing, Xmin represents the minimum value of the third-level index data, and Xmax represents the maximum value of the third-level index data; the quantized third-level index data are thereby obtained.
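The normalization formula above can be sketched in a few lines. The grade-to-number mapping below is a hypothetical example of converting a character-type index, and returning 0.0 for a constant column is an assumption made here to avoid division by zero; neither is stated in the embodiment.

```python
def min_max_normalize(values):
    """Apply Yi = (Xi - Xmin) / (Xmax - Xmin) to one index column."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # Assumption: a constant column carries no information, map it to 0.0.
        return [0.0 for _ in values]
    return [(x - x_min) / (x_max - x_min) for x in values]


# Hypothetical mapping from grade labels to numerical values.
GRADE_TO_NUMBER = {"C": 1, "B": 2, "A": 3}


def normalize_grades(grades):
    """Convert grade labels to numbers, then normalize them."""
    return min_max_normalize([GRADE_TO_NUMBER[g] for g in grades])


numeric = min_max_normalize([200, 500, 800])    # [0.0, 0.5, 1.0]
graded = normalize_grades(["A", "C", "B"])      # [1.0, 0.0, 0.5]
```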
In step (5), the calculation formula of the comprehensive credit score value provided by the embodiment of the invention is as follows:
S=D1*W1+D2*W2+...+Dn*Wn;
wherein 1, 2, ..., n index the scoring dimensions; D1 is the 1st scoring dimension; W1 is the weight of the 1st scoring dimension; D2 is the 2nd scoring dimension; W2 is the weight of the 2nd scoring dimension; Dn is the nth scoring dimension; Wn is the weight of the nth scoring dimension.
In step (2.2), the correlation coefficient between the attributes provided by the embodiment of the present invention is mainly the Pearson correlation coefficient, which reflects the direction and degree of the common variation trend between two subject attributes. Its value range is [-1, +1], where 0 indicates that the two attributes are not linearly related, positive values indicate positive correlation, negative values indicate negative correlation, and larger absolute values indicate stronger correlation. The formula for the Pearson correlation coefficient is as follows:
P(X,Y)=cov(X,Y)/(sX*sY),
where P(X,Y) is the correlation coefficient between variables X and Y, cov(X,Y) is the covariance of the two variables X and Y, and sX*sY is the product of the standard deviations of the two attribute variables.
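As a worked illustration of the formula, the following sketch computes the Pearson coefficient for two attribute columns. It is written against plain lists for self-containment; the sample columns are invented for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Return cov(X, Y) / (sX * sY) for two equally long columns."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two columns of equal length with at least 2 values")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # The 1/n factors of the covariance and of the two standard deviations
    # cancel, so plain sums of deviations are sufficient.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / (dev_x * dev_y)


rising = pearson([1, 2, 3], [2, 4, 6])     # perfectly positively correlated
falling = pearson([1, 2, 3], [6, 4, 2])    # perfectly negatively correlated
```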
In step (2.3), the K-means clustering provided by the embodiment of the invention mainly reduces the third-level index data to second-level index data, and reduces the second-level index data to first-level index data. The number of clusters k is determined according to the number of attributes at each index level.
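The text does not spell out how the correlation coefficients feed the clustering. One plausible reading, sketched below purely as an assumption, is that each attribute is represented by its row of correlation coefficients, the rows are clustered with K-means, and each resulting cluster becomes one higher-level index. The farthest-first initialization, the attribute names and the sample matrix are all choices made for this illustration.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equally long vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_first(points, k):
    """Deterministic initial centroids: start at the first point, then
    repeatedly add the point farthest from all centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k, iterations=50):
    """Plain K-means; returns one cluster label per point."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:      # assignments are stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:               # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def group_attributes(names, corr, k):
    """Group attribute names by the cluster of their correlation row."""
    groups = {}
    for name, label in zip(names, kmeans(corr, k)):
        groups.setdefault(label, []).append(name)
    return groups


# Hypothetical lower-level attributes and their correlation matrix.
NAMES = ["on_time_delivery", "contract_completion", "overdue_payments", "payment_disputes"]
CORR = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
]

groups = group_attributes(NAMES, CORR, k=2)
```

With this sample matrix the two delivery-related attributes fall into one group and the two payment-related attributes into the other, so four lower-level attributes are reduced to two higher-level indexes.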
In step (5), the proportion of each dimension in the comprehensive credit score value provided by the embodiment of the present invention is calculated using the following formula:
Pi=(Di*Wi)/S,(i=1,2,...,n);
wherein Di is the ith scoring dimension; Wi is the weight of the ith scoring dimension; S is the comprehensive credit score value; Pi is the proportion of the ith dimension in the comprehensive credit score value S.
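The score formula S=D1*W1+D2*W2+...+Dn*Wn and the proportion formula Pi=(Di*Wi)/S can be checked with a short sketch. The dimension values and weights below are made-up numbers chosen only to make the arithmetic easy to follow.

```python
def composite_score(dimensions, weights):
    """S = D1*W1 + D2*W2 + ... + Dn*Wn."""
    if len(dimensions) != len(weights):
        raise ValueError("each scoring dimension needs exactly one weight")
    return sum(d * w for d, w in zip(dimensions, weights))


def dimension_proportions(dimensions, weights):
    """Pi = (Di * Wi) / S for every scoring dimension i."""
    score = composite_score(dimensions, weights)
    if score == 0:
        raise ValueError("proportions are undefined when the score is zero")
    return [d * w / score for d, w in zip(dimensions, weights)]


# Hypothetical values for three scoring dimensions and their weights.
D = [80, 60, 90]
W = [0.5, 0.3, 0.2]

score = composite_score(D, W)            # 40 + 18 + 18
shares = dimension_proportions(D, W)     # the shares sum to 1
```

Because every Pi is a term of S divided by S, the proportions always sum to 1, which gives a simple sanity check on the computed report data.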
In step (5), the weight of each dimension is determined by the AHP (analytic hierarchy process) algorithm. In dividing the dimension weights, the correlation coefficients among the attributes replace the traditional expert evaluation data as the input to the AHP algorithm; this provides a numerical way of calculating the weights and reduces the uncertainty and unfairness introduced by manual scoring.
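The AHP weighting step can be sketched as follows, with two caveats. First, the text does not say how the correlation coefficients are converted into the pairwise comparison matrix, so the matrix is simply taken as an input here. Second, the row geometric-mean method used below is a common approximation of the AHP priority vector and is a choice made for this illustration, not something the embodiment specifies.

```python
from math import prod


def ahp_weights(pairwise):
    """Approximate AHP priority weights by the row geometric-mean method.

    pairwise[i][j] states how much more important dimension i is than
    dimension j; a reciprocal matrix (pairwise[j][i] == 1 / pairwise[i][j])
    is expected.
    """
    n = len(pairwise)
    if any(len(row) != n for row in pairwise):
        raise ValueError("the pairwise comparison matrix must be square")
    geometric_means = [prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geometric_means)
    return [g / total for g in geometric_means]


# Hypothetical, perfectly consistent comparison matrix for three dimensions.
PAIRWISE = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]

weights = ahp_weights(PAIRWISE)   # approximately 4/7, 2/7 and 1/7
```

The resulting weights sum to 1 and can be passed directly as W1...Wn into the comprehensive credit score formula.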
The enterprise comprehensive credit checking report provided by the embodiment of the invention includes the following:
the enterprise comprehensive credit checking report comprises basic information, associated information, comprehensive credit scoring information and other information;
the basic information is basic information about the enterprise, including the enterprise name, the enterprise organization code and the unified social credit code;
the associated information is information about other enterprises associated with the enterprise;
the other information is information that has a large influence on, or is crucial to, the credit of the enterprise.
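The four parts of the report map naturally onto a small record type. The sketch below is one possible in-memory shape, assuming hypothetical field names and placeholder values; the embodiment lists the content of the report but does not prescribe a data structure.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class BasicInfo:
    """Basic information produced from the static data."""
    enterprise_name: str
    organization_code: str
    unified_social_credit_code: str


@dataclass
class CreditCheckReport:
    """The four sections of the enterprise comprehensive credit checking report."""
    basic: BasicInfo
    associated_enterprises: list = field(default_factory=list)
    composite_score: float = 0.0
    dimension_proportions: dict = field(default_factory=dict)
    other: list = field(default_factory=list)


# Placeholder values for illustration only.
report = CreditCheckReport(
    basic=BasicInfo("Example Trading Co.", "ORG-PLACEHOLDER", "USCC-PLACEHOLDER"),
    associated_enterprises=["Example Logistics Co."],
    composite_score=76.0,
    dimension_proportions={"dimension_1": 0.5, "dimension_2": 0.5},
    other=["placeholder note on a credit-critical event"],
)

report_dict = asdict(report)   # nested dict, ready for rendering or download
```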
As shown in fig. 6 to 7, the system for checking the comprehensive credit information of the enterprise according to the embodiment of the present invention includes:
the data acquisition module is used for acquiring multi-source data;
the data storage module is used for fusing the collected multi-source data and storing the fused data by using the data pool;
the data preprocessing module is used for carrying out data classification processing and integration on the fusion data in the data pool;
the data dividing module is used for dividing the data integrated in the data pool into static data used for generating basic information data and dynamic data used for generating associated information data, comprehensive credit scoring data and other important data;
the data analysis module is used for analyzing the static data, mining the correlation attributes and the related data of the enterprise main body on the dynamic data and dividing the index data of the dynamic data on the basis of the mined data;
the data processing module is used for respectively processing the static data and the dynamic data;
and the result generation module is used for generating an enterprise comprehensive credit checking report based on the data processing result.
The data processing module provided by the embodiment of the invention comprises:
a static data processing unit and a dynamic data processing unit;
the static data processing unit is used for extracting and organizing the collected static data into normalized basic information;
the dynamic data processing unit includes: a data acquisition subunit, a structuring subunit, a data mining and analysis subunit and a comprehensive credit scoring subunit; the dynamic data processing unit is used for processing the dynamic data and generating associated information data, comprehensive credit scoring data and other important data.
The system for checking the enterprise comprehensive credit information provided by the embodiment of the invention further comprises the following roles: a highest-authority administrator, a platform administrator and a bulk commodity transaction platform;
the highest-authority administrator is used for assigning and managing platform administrators;
the platform administrator is used for managing the registration information of transaction platform users (adding, deleting, modifying and querying), and for managing and reviewing each transaction platform;
the bulk commodity transaction platform, after registering and logging in, enters the system to check the comprehensive credit information of each enterprise.
The technical solution of the present invention is further described with reference to the following specific examples.
Example 1:
The method for checking the enterprise comprehensive credit information comprises the following steps:
1) collecting multi-source data to achieve cross-platform information sharing;
2) analyzing the dynamic and static data, performing data mining on the dynamic data, and mining the enterprise subject association attributes;
3) completing the processing of the dynamic and static data;
4) forming an enterprise comprehensive credit checking report.
Historical transaction data, authority data, third-party interface data and other multi-source data of each platform are collected together, stored uniformly in a data pool, and classified and integrated.
In some illustrative embodiments, the data are classified into static data and dynamic data; the implicit information in the historical transaction data and other massive data within the dynamic data is mined, the association among enterprise subjects is analyzed, and the index data are divided according to that association.
And respectively carrying out static data processing and dynamic data processing operations on the static data and the dynamic data.
The static data processing is mainly to extract and arrange the static data into normalized basic information.
In some illustrative embodiments, as shown with reference to FIG. 3, the dynamic data processing is primarily comprised of four modules, including a data acquisition module, a structuring module, a data mining and analysis module, and a composite credit scoring module. The dynamic data ultimately forms associated information data, composite credit score data, and other important data. Wherein:
the associated information data is used for generating associated information;
the comprehensive credit scoring data is an output core of dynamic data processing and is used for determining enterprise comprehensive credit scoring and the proportion of each scoring dimension in the comprehensive credit scoring;
other important data is used to generate relatively important information for the business.
In some illustrative embodiments, the dynamic data processing detailed steps are shown with reference to FIG. 3, and include:
as can be seen from fig. 4, the data acquisition module mainly acquires a required large amount of dynamic data from a data pool in which multi-source data are collected, re-integrates the dynamic data, and prepares for data structuring;
the core technical flow chart of the data structuring module and the data mining analysis module is shown in FIG. 5;
as can be seen from fig. 4, the data structuring module mainly captures, sorts, reduces the dimension, and structures the acquired dynamic data information;
as can be seen from fig. 5, the data structuring module first captures the attributes that are relatively strongly associated among the enterprise attributes to form the necessary data, and then cleans, preprocesses and sorts the data into third-level indexes; the third-level index data are analyzed and calculated algorithmically to quantize them, the correlation coefficients of the attributes of the quantized third-level index data are determined, K-means clustering dimensionality reduction is performed according to the correlation coefficients to generate second-level index data, and the second-level index data are structured into CSV files, SQL files, table files, database table files or the like to facilitate data mining and analysis;
the third-level indexes are quantized because their raw values differ widely in magnitude, which makes it difficult to convert them into the final comprehensive credit score; the third-level index data are therefore normalized so that all index data share one order of magnitude. A numerical index can be preprocessed directly with the normalization formula; a discrete index that takes only the two values 0 and 1, for example, needs no preprocessing; a character-type index, for example grade data, is first converted into corresponding numerical values such as 1, 2 and 3 and then normalized. The formula Yi=(Xi-Xmin)/(Xmax-Xmin) is used to normalize the third-level index data so that the influence of magnitude is avoided, wherein Xi and Yi respectively represent an index item before and after processing, Xmin represents the minimum value of the third-level index data, and Xmax represents the maximum value of the third-level index data; the quantized third-level index data are thereby obtained.
As can be seen from fig. 4, the data mining and analyzing module mainly performs data mining and analysis on the structured data;
as can be seen from fig. 5, the data mining and analysis module mainly analyzes the association information, the credit influence factor proportions and the like in the second-level index data, determines the Pearson correlation coefficients between the attributes of the second-level index data (the correlation between enterprise subject attributes), calculates the weight of each attribute of the second-level index data from the correlation coefficients using the analytic hierarchy process (AHP algorithm), performs K-means clustering dimensionality reduction according to the weights, and generates first-level index data (i.e., determines the scoring dimensions of the credit score) in preparation for the final credit score;
as can be seen from fig. 4, the comprehensive credit scoring module mainly determines Pearson correlation coefficients among dimensions of the primary index data, similarly determines weights of the dimensions by using an AHP algorithm, performs weight distribution for the dimensions, and finally calculates to obtain enterprise comprehensive credit scoring values;
in some illustrative embodiments, the enterprise composite credit score is calculated by the formula: S=D1*W1+D2*W2+...+Dn*Wn; wherein 1, 2, ..., n index the scoring dimensions; D1 is the 1st scoring dimension; W1 is the weight of the 1st scoring dimension; D2 is the 2nd scoring dimension; W2 is the weight of the 2nd scoring dimension; Dn is the nth scoring dimension; Wn is the weight of the nth scoring dimension;
finally, calculating the proportion of each dimension in the comprehensive score value;
in some illustrative embodiments, the proportion of each dimension in the composite score value is calculated by the formula: Pi=(Di*Wi)/S, (i=1, 2, ..., n); wherein Di is the ith scoring dimension; Wi is the weight of the ith scoring dimension; S is the composite score value; Pi is the proportion of the ith dimension in the composite score S.
Fig. 6 is an architecture diagram of the enterprise integrated credit checking system. As shown in the architecture, the disclosed enterprise integrated credit checking system includes: each trading platform 001 registers 002; the administrator user reviews the platform information 003; the trading platform carries out system login 004; the multi-source data 005 is collected into the data pool 006 for processing; the data are classified, and static data processing and dynamic data processing 007 are performed respectively; the transaction platform enters the checking function 008; the transaction platform carries out comprehensive credit check 009 for the enterprise; the basic information is compared with the data pool for checking 010; the checked enterprise generates a comprehensive credit checking report 011; the comprehensive credit checking report is presented and downloaded 012.
In some illustrative embodiments, registration and login of the transaction platforms mainly serve to integrate multiple platforms and share multi-source data: data are collected across platforms and processed by data mining and analysis, forming a more representative enterprise comprehensive credit check.
In some illustrative embodiments, the administrator users 003 include a highest-authority administrator and platform administrators. The highest-authority administrator mainly assigns the authority of each platform administrator; the platform administrator reviews the registration information of each transaction platform and performs adding, deleting, modifying, querying and other management operations on it.
In some illustrative embodiments, the data classification processing performed by the multi-source data collection data pool is part of the application of the enterprise comprehensive credit checking method in the system.
In some illustrative embodiments, the data classification process is mainly divided into static data processing and dynamic data processing 007, and the dynamic data processing mainly has four modules including: a data acquisition module 0071; a structuring module 0072, a data mining analysis module 0073; a composite credit scoring module 0074.
In some illustrative embodiments, the checking function 008 mainly uses the whole process and mechanism provided by the method to check and output the enterprise integrated credit information, forming an enterprise integrated credit checking report.
In some illustrative embodiments, the comparing and checking 010 of the basic information with the data pool refers to a static data processing process, and mainly compares the static data with the data pool authority data to form the basic information in the comprehensive credit information.
In some illustrative embodiments, the flow chart of the functional interfaces of the enterprise integrated credit verification system is shown in fig. 7. The system has six main functional interfaces: a registration interface, a login interface, an administrator checking interface, an inspection interface, a data interface and an inspection report display interface.
In some demonstrative embodiments, the registration interface may also relate to short message authentication or mailbox authentication.
In some demonstrative embodiments, the login interface may include a transaction platform user login interface, an administrator user login interface.
In some illustrative embodiments, the platform user may use the checking function after logging in. The checking function mainly comprises a checking interface, a data interface and a checking report display interface.
In some illustrative embodiments, the administrator user will enter an administrator audit interface after logging in, and mainly performs management operations such as adding, deleting, modifying, checking and the like on the user platform.
In some illustrative embodiments, the data interface supplies the important information that determines the enterprise credit checking report; the checking report display interface allows the enterprise credit checking report to be downloaded and stored locally.
In dynamic data processing, the relevance analysis of the subject attributes, the data mining and analyzing technology and the clustering dimension reduction technology can be replaced by other technologies with the same effect.
In the dynamic data processing, the index data of each level obtained by clustering and dimensionality reduction is not limited to the setting of the content of the invention.
In the implementation of the enterprise comprehensive credit information checking system, each interface function can be replaced by other technologies with the same effect according to specific situations.
The technical solution of the present invention is further described with reference to the following specific examples.
The implementation framework of the enterprise comprehensive credit checking system realized based on the method of the invention is shown in figure 8. On the basis of connecting all the data interfaces, an enterprise comprehensive credit checking report is generated using multi-key-variable subject attribute fusion and cross-clustering data mining analysis.
According to the method and the data processing mechanism, the implementation framework of the enterprise comprehensive credit checking system consists of four parts: a dynamic and static data management basis, a normalized data information basis, the enterprise credit checking system design, and the credit scoring system research.
The dynamic and static data management basis mainly manages and processes the dynamic and static data; the normalized data information basis mainly comprises data preprocessing, data normalization and the division of index data at each level; the enterprise credit checking system design mainly designs and implements the system based on the method; the credit scoring system research is mainly based on dynamic data processing and generates the comprehensive credit score using multi-key-variable subject attribute fusion and cross-clustering data mining analysis. As shown in fig. 8, all parts are finally integrated into an enterprise comprehensive credit checking report, so that the enterprise comprehensive credit checking system can be realized based on the method.
In the enterprise comprehensive credit checking system realized based on the method, the highest-authority administrator assigns and manages the platform administrators, and each platform administrator manages and operates its platform. A platform user registers, logs in and applies for review by a platform administrator; after the review is passed, the user enters the system, selects the checking function and inputs the full name of the enterprise to be checked. Once the check is confirmed, the system outputs the result as an enterprise comprehensive credit checking report, which mainly comprises the basic information, associated information, credit scoring information and other information of the enterprise; the user can judge the legitimacy and risk of the enterprise and understand its various aspects from this information. The user can also download the comprehensive credit checking report for local storage and backup.
It should be noted that the embodiments of the present invention can be realized by hardware, software, or a combination of software and hardware. The hardware portion may be implemented using dedicated logic; the software portion may be stored in a memory and executed by a suitable instruction execution system, such as a microprocessor or specially designed hardware. Those skilled in the art will appreciate that the apparatus and methods described above may be implemented using computer-executable instructions and/or embodied in processor control code, such code being provided, for example, on a carrier medium such as a disk, CD-ROM or DVD-ROM, on a programmable memory such as read-only memory (firmware), or on a data carrier such as an optical or electronic signal carrier. The apparatus and its modules of the present invention may be implemented by hardware circuits such as very large scale integrated circuits or gate arrays, semiconductors such as logic chips and transistors, or programmable hardware devices such as field programmable gate arrays and programmable logic devices; by software executed by various types of processors; or by a combination of hardware circuits and software, e.g., firmware.
The above description only illustrates the present invention and is not to be construed as limiting its scope, which is intended to cover all modifications, equivalents and improvements within the spirit and scope of the invention as defined by the appended claims.
Claims (10)
1. An enterprise integrated credit information checking method, characterized in that the enterprise integrated credit information checking method comprises:
collecting multi-source data, fusing the multi-source data, storing the fused data in a data pool, and performing data classification processing and integration;
dividing the data integrated in the data pool into static data for generating basic information data and dynamic data for generating associated information data, comprehensive credit score data and other important data;
analyzing the static data, mining the correlation attributes and the related data of the enterprise main body on the dynamic data, and dividing index data of the dynamic data on the basis of the mined data;
and respectively processing the static data and the dynamic data, and generating an enterprise comprehensive credit checking report based on the processing result of the static data and the dynamic data.
2. The method for checking integrated credit information for an enterprise of claim 1, wherein the multi-source data comprises: historical transaction data, authority data, third party interface data, and other data for each platform.
3. The method for checking integrated credit information for an enterprise of claim 1, wherein the dynamic data comprises:
the dynamic data is used for generating associated information data, comprehensive credit scoring data and other important data;
the associated information data is used for generating associated information;
the comprehensive credit scoring data is used for generating comprehensive credit scoring and the proportion of scoring dimensionality of each level of index data in the comprehensive credit scoring;
the other important data is used for generating relatively important information of the enterprise;
the comprehensive credit scoring data is obtained by processing dynamic data;
the associated information data and other important data are respectively generated according to third party interface data in the dynamic data or other source data.
4. The method for checking integrated enterprise credit information of claim 1, wherein said dividing the dynamic data into index data on the basis of the mined data comprises:
determining the relevance among the enterprise attributes, namely the correlation coefficient, from the mined correlated attributes of the enterprise subject; performing clustering dimension-reduction division of the dynamic data according to the determined correlation coefficient; capturing the dynamic data to generate third-level index data; clustering and reducing the dimensionality of the third-level index data to generate second-level index data; and clustering and reducing the dimensionality of the second-level index data to generate first-level index data.
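The roll-up from one index level to the next can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes each attribute is described by its row of correlation coefficients with the other attributes, and it uses a plain k-means with a deterministic farthest-point initialisation. The attribute names, the correlation values and the helper names `kmeans_labels` and `group_attributes` are all hypothetical.

```python
from math import dist


def kmeans_labels(points, k, max_iter=100):
    """Lloyd's k-means; returns one cluster label per point."""
    # Farthest-point initialisation keeps this sketch deterministic.
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(dist(p, c) for c in centroids))
        centroids.append(list(farthest))

    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the previous centroid if a cluster empties
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def group_attributes(names, corr_rows, k):
    """Group lower-level attributes into k higher-level indexes."""
    groups = {}
    for name, label in zip(names, kmeans_labels(corr_rows, k)):
        groups.setdefault(label, []).append(name)
    return list(groups.values())


# Hypothetical third-level attributes and their pairwise correlation rows.
names = ["payment_delay", "default_count", "revenue", "profit"]
corr_rows = [
    [1.00, 0.90, 0.10, 0.20],
    [0.90, 1.00, 0.15, 0.10],
    [0.10, 0.15, 1.00, 0.85],
    [0.20, 0.10, 0.85, 1.00],
]
second_level = group_attributes(names, corr_rows, k=2)
print(second_level)
```

Applying the same function again to the resulting groups (with a smaller k) would give the first-level indexes; how k is chosen at each level is left to the rule stated in the claims.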
5. The method for checking integrated credit information for an enterprise of claim 1, wherein the processing of the static data comprises: extracting and arranging the collected static data into normalized basic information;
the enterprise comprehensive credit checking report comprises basic information, associated information, comprehensive credit scoring information and other information;
the basic information is basic information about the enterprise and comprises the enterprise name, the enterprise organization code and the unified social credit code;
the associated information is information about other enterprises associated with the enterprise;
the other information is information which has a large influence on the credit of the enterprise or is crucial to its credit.
6. The method for checking integrated credit information for an enterprise according to claim 1, wherein the method for processing dynamic data comprises:
(1) acquiring required massive dynamic data from a processing and integrating data pool, and integrating the dynamic data again to prepare for data structuring;
(2) the obtained dynamic data information is captured, sorted, dimension reduced and structured;
(2.1) capturing the attributes that are relatively strongly correlated with the enterprise attributes to form the necessary data, cleaning and preprocessing the data, and sorting out the third-level index data; the data preprocessing comprises data cleaning, data transformation and data reduction; the data cleaning handles the various numerical anomalies of the attributes in the data sources, discarding some abnormal values and replacing missing values with a representative median, mean or other value so that the data is complete; the data transformation discretizes and binarizes the data, wherein discretization divides the cleaned data extracted from each data source into different intervals, and binarization distinguishes two states, which can be represented numerically as 0 or 1 to denote true or false, in preparation for data reduction; the data reduction reduces the attributes in the data sources and the amount of data for each attribute, preliminarily reduces the dimensionality of the data, deletes useless or redundant information, and yields the third-level index data;
(2.2) carrying out algorithmic analysis and calculation on the third-level index data to quantize the data, and determining a correlation coefficient for each attribute of the quantized third-level index data; the quantizing comprises: normalizing the third-level index data so that all index data are on the same order of magnitude;
the normalizing the third-level index data comprises: directly applying a normalization formula to the numerical indexes and discrete indexes in the third-level index data; and converting character-type indexes that carry grade data in the third-level index data into corresponding numerical values, and then normalizing them with the normalization formula;
the normalization formula is:
Yi=(Xi-Xmin)/(Xmax-Xmin);
wherein Xi and Yi respectively represent an index item before and after processing, Xmin represents the minimum value of the third-level index data, and Xmax represents the maximum value of the third-level index data, so that quantized third-level index data are finally obtained;
the correlation coefficient between the attributes is the Pearson correlation coefficient, which reflects the direction and degree of the common variation trend of two attributes of the subject; its value range is [-1, +1], where 0 indicates that the two attributes are not linearly correlated, a positive value indicates a positive correlation, a negative value indicates a negative correlation, and a larger absolute value indicates a stronger correlation; the formula of the Pearson correlation coefficient is as follows:
P(X,Y)=cov(X,Y)/(sX*sY);
where P(X,Y) is the correlation coefficient between variables X and Y, cov(X,Y) is the covariance between the two variables X and Y, and sX*sY is the product of the standard deviations of the two attribute variables;
(2.3) carrying out K-means clustering dimensionality reduction according to the correlation coefficient to generate second-level index data, and carrying out structuring processing on the second-level index data to obtain a CSV file, an SQL file, a table file or a database table file for data mining and analysis; wherein K-means clustering reduces the third-level index data to second-level index data and reduces the second-level index data to first-level index data, and the clustering k value is determined according to the number of attributes of each level of index;
(3) carrying out data mining and analysis on the structured data, analyzing the associated information, the credit influence factor proportions and other data in the second-level index data, and determining the correlation coefficient among the attributes of the second-level index data, namely the correlation among the attributes of the enterprise subject;
(4) calculating the weight of each attribute of the second-level index data by using the analytic hierarchy process (AHP) algorithm according to the correlation coefficient, performing K-means clustering dimensionality reduction according to the weight, and generating first-level index data, namely determining the scoring dimensions of the credit score;
(5) determining the correlation coefficient among the dimensions of the first-level index data, determining the weight of each dimension by using the AHP algorithm and assigning it to that dimension, calculating a comprehensive credit score value by using the comprehensive credit score value calculation formula, and calculating the proportion of each dimension in the comprehensive credit score value to obtain the comprehensive credit score;
the calculation formula of the comprehensive credit score value is as follows:
S=D1*W1+D2*W2+...+Dn*Wn;
wherein 1, 2, ..., n index the scoring dimensions, n being the number of scoring dimensions; D1 is the 1st scoring dimension; W1 is the weight of the 1st scoring dimension; D2 is the 2nd scoring dimension; W2 is the weight of the 2nd scoring dimension; Dn is the nth scoring dimension; Wn is the weight of the nth scoring dimension;
the proportion of each dimension in the comprehensive credit score value is calculated using the following formula:
Pi=(Di*Wi)/S,(i=1,2,...,n);
wherein Di is the ith scoring dimension; Wi is the weight of the ith scoring dimension; S is the comprehensive credit score value; Pi is the proportion of the ith dimension in the comprehensive credit score value S;
wherein the weight of each dimension is determined by the AHP (analytic hierarchy process) algorithm: for the dimension weight division, the correlation coefficients among the attributes replace the traditional expert evaluation data, and the weight of each dimension is calculated by applying the AHP algorithm, thereby providing a quantitative, data-based way of calculating the weights.
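The formulas recited in claim 6 can be illustrated with a short sketch. It is only an illustration under stated assumptions: the function names are hypothetical, the standard deviations use the population form, and the AHP weights are approximated with the common geometric-mean method from a pairwise comparison matrix supplied by the caller. How that matrix is derived from the correlation coefficients is not specified in the claims and is not modelled here; the sample matrix and dimension scores are invented for the example.

```python
from math import prod, sqrt


def min_max_normalize(values):
    """Yi = (Xi - Xmin) / (Xmax - Xmin) for one index column."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # Assumption: a constant column carries no information; map it to 0.
        return [0.0] * len(values)
    return [(x - x_min) / (x_max - x_min) for x in values]


def pearson(xs, ys):
    """P(X,Y) = cov(X,Y) / (sX * sY), using population statistics."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    s_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    s_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    if s_x == 0 or s_y == 0:
        raise ValueError("correlation is undefined for a constant attribute")
    return cov / (s_x * s_y)


def ahp_weights(pairwise):
    """Geometric-mean approximation of AHP priority weights."""
    roots = [prod(row) ** (1 / len(row)) for row in pairwise]
    total = sum(roots)
    return [r / total for r in roots]


def composite_score(dims, weights):
    """S = D1*W1 + D2*W2 + ... + Dn*Wn."""
    return sum(d * w for d, w in zip(dims, weights))


def dimension_proportions(dims, weights):
    """Pi = (Di * Wi) / S for every scoring dimension."""
    s = composite_score(dims, weights)
    return [d * w / s for d, w in zip(dims, weights)]


# Hypothetical pairwise comparison matrix for three scoring dimensions.
pairwise = [
    [1, 2, 4],
    [1 / 2, 1, 2],
    [1 / 4, 1 / 2, 1],
]
weights = ahp_weights(pairwise)
dims = [80.0, 60.0, 90.0]  # hypothetical dimension scores D1..D3
score = composite_score(dims, weights)
shares = dimension_proportions(dims, weights)
print(weights, score, shares)
```

Because the weights sum to 1, the proportions Pi also sum to 1, which makes them directly usable as the per-dimension shares shown in the report.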
7. A computer device, characterized in that the computer device comprises a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to carry out the steps of:
collecting multi-source data, fusing the multi-source data, storing the fused data in a data pool, and performing data classification processing and integration;
dividing the data integrated in the data pool into static data for generating basic information data and dynamic data for generating associated information data, comprehensive credit score data and other important data;
analyzing the static data, mining the dynamic data for the correlated attributes of the enterprise subject and the related data, and dividing the dynamic data into index data on the basis of the mined data;
and respectively processing the static data and the dynamic data, and generating an enterprise comprehensive credit checking report based on the processing result of the static data and the dynamic data.
8. A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of:
collecting multi-source data, fusing the multi-source data, storing the fused data in a data pool, and performing data classification processing and integration;
dividing the data integrated in the data pool into static data for generating basic information data and dynamic data for generating associated information data, comprehensive credit score data and other important data;
analyzing the static data, mining the dynamic data for the correlated attributes of the enterprise subject and the related data, and dividing the dynamic data into index data on the basis of the mined data;
and respectively processing the static data and the dynamic data, and generating an enterprise comprehensive credit checking report based on the processing result of the static data and the dynamic data.
9. An enterprise comprehensive credit information checking system for implementing the method according to any one of claims 1 to 6, wherein the system comprises:
the data acquisition module is used for acquiring multi-source data;
the data storage module is used for fusing the collected multi-source data and storing the fused data by using the data pool;
the data preprocessing module is used for carrying out data classification processing and integration on the fusion data in the data pool;
the data dividing module is used for dividing the data integrated in the data pool into static data used for generating basic information data and dynamic data used for generating associated information data, comprehensive credit scoring data and other important data;
the data analysis module is used for analyzing the static data, mining the dynamic data for the correlated attributes of the enterprise subject and the related data, and dividing the dynamic data into index data on the basis of the mined data;
the data processing module is used for respectively processing the static data and the dynamic data;
and the result generation module is used for generating an enterprise comprehensive credit checking report based on the data processing result.
10. The integrated enterprise credit verification system of claim 9, wherein the data processing module comprises:
a static data processing unit and a dynamic data processing unit;
the static data processing unit is used for extracting and sorting the collected static data into normalized basic information;
the dynamic data processing unit includes: a data acquisition subunit, a structuring subunit, a data mining and analyzing subunit and a comprehensive credit scoring subunit; the dynamic data processing unit is used for carrying out dynamic data processing and generating associated information data, comprehensive credit scoring data and other important data;
the enterprise integrated credit information checking system further comprises the following roles: a highest authority administrator, a platform administrator and a bulk commodity transaction platform;
the highest authority administrator is used for managing and assigning platform administrators;
the platform administrator is used for managing the registration information of transaction platform users, including adding, deleting, modifying and querying it, and for managing and reviewing each transaction platform;
the bulk commodity transaction platform is used for entering the system through registration and login to check the comprehensive credit information of each enterprise.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010984312.6A CN112465622B (en) | 2020-09-16 | 2020-09-16 | Enterprise comprehensive credit information checking method, system, medium and computer equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112465622A (en) | 2021-03-09 |
CN112465622B (en) | 2024-03-05 |
Family
ID=74833740
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010984312.6A Active CN112465622B (en) | 2020-09-16 | 2020-09-16 | Enterprise comprehensive credit information checking method, system, medium and computer equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112465622B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030229580A1 (en) * | 2002-06-10 | 2003-12-11 | David Gass | Method for establishing or improving a credit score or rating for a business |
US20090096746A1 (en) * | 2007-10-12 | 2009-04-16 | Immersion Corp., A Delaware Corporation | Method and Apparatus for Wearable Remote Interface Device |
CN101576988A (en) * | 2009-06-12 | 2009-11-11 | 阿里巴巴集团控股有限公司 | Credit data interactive system and interactive method |
US20100010935A1 (en) * | 2008-06-09 | 2010-01-14 | Thomas Shelton | Systems and methods for credit worthiness scoring and loan facilitation |
CN102622552A (en) * | 2012-04-12 | 2012-08-01 | 焦点科技股份有限公司 | Detection method and detection system for fraud access to business to business (B2B) platform based on data mining |
Non-Patent Citations (2)
Title |
---|
TE-CHENG HSU: "Enhanced Recurrent Neural Network for Combining Static and Dynamic Features for Credit Card Default Prediction", IEEE * |
王红刚;王征风;陈绥阳;: "云环境下企业信用管理系统设计与实现", 计算机技术与发展, no. 01 * |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112785427A (en) * | 2021-03-15 | 2021-05-11 | 国网青海省电力公司西宁供电公司 | Enterprise credit analysis system based on electric power data |
CN112785427B (en) * | 2021-03-15 | 2024-04-26 | 国网青海省电力公司西宁供电公司 | Enterprise credit analysis system based on power data |
CN112990946A (en) * | 2021-03-31 | 2021-06-18 | 建信金融科技有限责任公司 | Enterprise default prediction method, device, medium and electronic equipment |
CN112990946B (en) * | 2021-03-31 | 2024-05-14 | 建信金融科技有限责任公司 | Enterprise default prediction method, device, medium and electronic equipment |
CN113886867A (en) * | 2021-08-27 | 2022-01-04 | 浙江数秦科技有限公司 | Loan credit granting system based on multi-source data fusion |
CN114615207A (en) * | 2022-03-10 | 2022-06-10 | 四川三思德科技有限公司 | Method and device for oriented processing of data before plug flow |
CN114615207B (en) * | 2022-03-10 | 2022-11-25 | 四川三思德科技有限公司 | Method and device for oriented processing of data before plug flow |
CN115511506A (en) * | 2022-09-30 | 2022-12-23 | 中国电子科技集团公司第十五研究所 | Enterprise credit rating method, device, terminal equipment and storage medium |
CN115827934A (en) * | 2023-02-21 | 2023-03-21 | 四川省计算机研究院 | Enterprise portrait intelligent analysis system and method based on unified social credit code |
CN115827934B (en) * | 2023-02-21 | 2023-05-09 | 四川省计算机研究院 | Enterprise portrait intelligent analysis system and method based on unified social credit code |
CN115859223A (en) * | 2023-02-27 | 2023-03-28 | 四川省计算机研究院 | Multi-source data industry fusion analysis method and system |
CN118228216A (en) * | 2024-05-27 | 2024-06-21 | 杭州易靓好车互联网科技有限公司 | Enterprise digital authority management method and system |
Also Published As
Publication number | Publication date |
---|---|
CN112465622B (en) | 2024-03-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112465622B (en) | Enterprise comprehensive credit information checking method, system, medium and computer equipment | |
CN110383319B (en) | Large scale heterogeneous data ingestion and user resolution | |
CN107633265B (en) | Data processing method and device for optimizing credit evaluation model | |
CN110852856B (en) | Invoice false invoice identification method based on dynamic network representation | |
Wang et al. | A framework for analysis of data quality research | |
CN112182246B (en) | Method, system, medium, and application for creating an enterprise representation through big data analysis | |
CN112527774A (en) | Data center building method and system and storage medium | |
CN110544035A (en) | internal control detection method, system and computer readable storage medium | |
CN111754317A (en) | Financial investment data evaluation method and system | |
You et al. | An improved FMEA quality risk assessment framework for enterprise data assets | |
Zou | Research on data cleaning in big data environment | |
Nwankwo et al. | Knowledge discovery and analytics in process reengineering: a study of port clearance processes | |
CN117035572A (en) | Intelligent audit model construction method based on big data | |
CN108549672A (en) | A kind of intelligent data analysis method and system | |
Sun | Management Research of Big Data Technology in Financial Decision-Making of Enterprise Cloud Accounting | |
Burdick et al. | Financial analytics from public data | |
Kamley et al. | An Association Rule Mining Model for Finding the Interesting Patterns in Stock Market Dataset | |
Xiuli et al. | Electronic Commerce Data Mining using Rough Set and Logistic Regression. | |
Liu et al. | [Retracted] Census and Inventory Method of Pollution Sources Based on Big Data Technology under Machine Learning | |
Wang | Study on the Impact of Cross-Border e-Commerce on the Competitiveness of Small and Medium-Sized Enterprises | |
LI | The Application of Big Data in Preventing Financial Risks of P2P Network Loan | |
Petrov | Conducting Remote Audits Using Integrated Information Analysis Systems | |
Chang et al. | The path of management of dispute cases of legal issues of webcasting bandwagon industry in the information age | |
CN117876067A (en) | Early warning tracking intelligentization method, device, equipment and storage medium for credit risk | |
Feng | Research on Building System of Audit Evidence Integration Model Driven by Computer Big Data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||