CN111967790A - Credit score algorithm model method capable of automatic calculation and terminal

Info

Publication number: CN111967790A
Application number: CN202010882083.7A
Authority: CN
Inventors: 张美跃; 刘侃; 范章华
Original assignee: Hengruitong Fujian Information Technology Co ltd
Current assignee: Hengruitong Fujian Information Technology Co ltd
Priority date: 2020-08-28
Filing date: 2020-08-28
Publication date: 2020-11-20
Anticipated expiration: 2040-08-28
Also published as: CN111967790B

Abstract

The invention relates to a method and a terminal for an automatically calculable credit score algorithm model. A credit information list of each province is obtained according to a state-level credit information directory, and original credit data of the credit information list is obtained. A first single-field weight of the original credit data is configured, a first algorithm model is determined according to the first single-field weight, and the original credit data is substituted into the first algorithm model to obtain a first credit score. An actual credit score and actual policy information matched with the original credit data are obtained, the first algorithm model is corrected according to the actual credit score, the first credit score and the actual policy information to obtain a normative algorithm model, and the normative algorithm model is saved. The original credit data is cleaned and filtered according to a preset rule to obtain cleaned credit data, the cleaned credit data is verified, and if verification succeeds the cleaned credit data is taken as normative credit data. The normative credit data is calculated according to the normative algorithm model to obtain a normative credit score.

Description

Credit score algorithm model method capable of automatic calculation and terminal

Technical Field

The invention relates to the field of computer software, in particular to a credit score algorithm model method capable of automatic calculation and a terminal.

Background

According to the construction requirements of the 'Key points of social credit system construction work of Fujian province in 2018': a sound credit legal system and standards system is to be built with government integrity, business integrity, social integrity and judicial credibility as the main contents; the sharing and disclosure functions of the public credit information platform are to be expanded and improved; a system of joint incentives for trustworthiness and joint punishment for untrustworthiness is to be fully deployed and implemented; credit construction in key industry fields is to be advanced first, and the construction of government, individual and city credit systems strengthened; credit pilot and demonstration projects in industries and regions are to be supported; the scope of credit information collection and recording and of credit product application is to be expanded, the development and use of big data promoted, and a new market supervision system with credit at its core constructed; and publicity and education on a culture of integrity are to be carried out in depth, creating a good credit environment and accelerating the construction of 'Credit Fujian'. On this basis, the public credit information platform collects and integrates the credit information of legal persons and individuals held by government departments such as industry and commerce, taxation, quality supervision, and human resources and social security, as well as by judicial authorities, organizations performing public management functions, public utilities and the like, and provides credit information inquiry services to government departments and the public. The following is an analysis of the current state of credit informatization construction in Fujian province.

Collection of legal-person credit data is an important function of a legal-person credit platform. It mainly extracts, in a targeted manner, legal-person credit data of different business types from 10 units, including the city industry and commerce bureau, city tax bureau, city quality supervision bureau, city food and drug supervision bureau, city traffic bureau, city government bureau, city public security bureau, city court, and city inspection and quarantine bureau.

Legal-person credit data acquisition relies on the public information service platforms and public basic databases of Fujian province and its cities (hereinafter referred to as 'one database'). The public information platforms are built city by city according to plan and are mainly responsible for gathering data from all committees and offices of each city; at the same time, public basic databases, public services and public service databases are built, finally forming an information resource pool for city-level sharing and coordination.

Due to various objective factors, the credit data reported by different cities often have regional characteristics: the data types collected, the data dimensions, the data frequency and so on vary with the specific conditions of each city, so the corresponding credit scores cannot be calculated with one and the same algorithm model. In addition, the public credit information of legal persons and natural persons reported from various places may be ambiguous, incomplete, or in violation of business rules, so a reasonable credit score cannot be calculated from the collected legal-person credit data.

Disclosure of Invention

(I) Technical problem to be solved

In order to solve the above problems in the prior art, the present invention provides a method for an automatically calculable credit score algorithm model, which can establish a reasonable credit score algorithm model and ensure that the calculated credit score is reasonable.

(II) Technical scheme

In order to achieve the above purpose, one technical scheme adopted by the invention is: a method for an automatically calculable credit score algorithm model, comprising:

S1, obtaining a credit information list of each province according to a state-level credit information directory, obtaining original credit data of the credit information list, configuring a first single-field weight of the original credit data, determining a first algorithm model according to the first single-field weight, substituting the original credit data into the first algorithm model to obtain a first credit score, obtaining an actual credit score and actual policy information matched with the original credit data, correcting the first algorithm model according to the actual credit score, the first credit score and the actual policy information to obtain a normative algorithm model, and storing the normative algorithm model;

S2, cleaning and filtering the original credit data according to a preset rule to obtain cleaned credit data, verifying the cleaned credit data according to an integrity rule, a uniqueness rule, a consistency rule, a legality rule and an authority rule, and taking the cleaned credit data as normative credit data if verification is successful;

S3, calculating the normative credit data according to the normative algorithm model to obtain a normative credit score.
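As a rough illustration of how S1 and S3 might fit together, the sketch below assumes that the algorithm model is a weighted sum of numeric field values and that the "correction" is a plain gradient-descent adjustment toward the actual credit scores. The source does not specify either the model form or the correction procedure, and the actual policy information is not modelled here; `score`, `correct_weights`, the field names, the learning rate and the sample data are all hypothetical.

```python
from typing import Dict, List

Record = Dict[str, float]   # one legal person's numeric credit fields (hypothetical)
Weights = Dict[str, float]  # single-field weights


def score(record: Record, weights: Weights) -> float:
    """Weighted sum of field values: the ASSUMED form of the algorithm model."""
    return sum(w * record.get(f, 0.0) for f, w in weights.items())


def correct_weights(weights: Weights, records: List[Record],
                    actual_scores: List[float],
                    lr: float = 0.01, epochs: int = 200) -> Weights:
    """Move the first credit scores toward the actual credit scores.

    Gradient descent on squared error is an assumption made for this
    sketch; the source only states that the first model is corrected.
    """
    w = dict(weights)
    for _ in range(epochs):
        for rec, actual in zip(records, actual_scores):
            err = score(rec, w) - actual
            for f in w:
                w[f] -= lr * err * rec.get(f, 0.0)
    return w


# Hypothetical sample data, with field values scaled to [0, 1].
first_weights = {"tax_compliance": 0.5, "court_record": 0.5}
records = [{"tax_compliance": 1.0, "court_record": 0.0},
           {"tax_compliance": 0.0, "court_record": 1.0}]
actual = [0.8, 0.2]

normative_weights = correct_weights(first_weights, records, actual)  # S1
normative_score = score(records[0], normative_weights)               # S3
```

In this sketch the corrected weights play the role of the saved normative algorithm model; a real deployment would need to decide how policy information enters the correction.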

Another technical scheme adopted by the invention is: a terminal of an automatically calculable credit score algorithm model, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing steps S1 to S3 above when executing the computer program.

(III) Advantageous effects

The beneficial effects of the invention are as follows: a first single-field weight is configured for the original credit data to determine a first algorithm model, and the first algorithm model is corrected in combination with the actual credit score and actual policy information to obtain a normative algorithm model, so that the accuracy and reliability of the normative algorithm model are ensured. In addition, the normative credit data is obtained only after the original credit data has been cleaned and successfully verified, which avoids the problems of ambiguity, incompleteness, violation of business rules and the like in the original credit data and guarantees the validity of the data; the reasonableness of the normative credit score obtained from the normative credit data and the normative algorithm model can therefore be guaranteed.

Drawings

FIG. 1 is a flow chart of a method of an automatically calculable credit scoring algorithm model of the present invention;

FIG. 2 is a schematic structural diagram of a terminal of an automatically calculable credit scoring algorithm model according to the present invention;

[Description of reference numerals]

1. terminal of an automatically calculable credit score algorithm model; 2. memory; 3. processor.

Detailed Description

For the purpose of better explaining the present invention and to facilitate understanding, the present invention will be described in detail by way of specific embodiments with reference to the accompanying drawings.

Referring to FIG. 1, a method for an automatically calculable credit score algorithm model includes steps S1 to S3 set out above.

From the above description, the beneficial effects of the present invention are as follows: a first single-field weight is configured for the original credit data to determine a first algorithm model, and the first algorithm model is corrected in combination with the actual credit score and actual policy information to obtain a normative algorithm model, so that the accuracy and reliability of the normative algorithm model can be ensured according to actual conditions. In addition, the normative credit data is obtained only after the original credit data has been cleaned and successfully verified, which avoids the problems of ambiguity, incompleteness, violation of business rules and the like in the original credit data and guarantees the validity of the data; the reasonableness of the normative credit score obtained from the normative credit data and the normative algorithm model can therefore be guaranteed.

Further, before S3, the method includes:

judging whether the normative credit data has missing items; if so, sending the normative credit data to a modification terminal, configuring a second single-field weight of the normative credit data, determining a second algorithm model according to the second single-field weight, calculating the normative credit data according to the second algorithm model to obtain a second credit score, sending the second credit score to an application end, receiving a modification suggestion returned by the application end, and modifying the normative algorithm model according to the modification suggestion and the second single-field weight to obtain a final normative algorithm model.

S3 then specifically includes:

S3, calculating the normative credit data according to the final normative algorithm model to obtain the normative credit score.

As can be seen from the above description, the amount of original credit data in the credit information lists differs from city to city, so part of the original credit data of legal persons in some cities is missing and cannot be calculated with a general normative algorithm model. Therefore, when the normative credit data has missing items, a second single-field weight is configured accordingly to obtain a second algorithm model, and the second algorithm model is used to obtain a second credit score for the normative credit data. The second algorithm model is then modified according to a modification opinion returned by the application end (for example, a bank in the city sends a modification opinion if the second credit score, obtained by calculating that city's incomplete normative credit data with the second algorithm model, is not within a reasonable range), so that the final normative algorithm model is obtained and the reliability of the algorithm model is ensured.
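One way to picture the second single-field weights is to drop the fields that are missing from a record and redistribute their weight over the fields that are present. This renormalization is an assumption chosen for illustration; the source does not say how the second weights are configured. The function names, field names and sample values below are hypothetical.

```python
from typing import Dict, Optional


def second_weights(weights: Dict[str, float],
                   record: Dict[str, Optional[float]]) -> Dict[str, float]:
    """Drop missing fields and rescale the remaining weights to the same total."""
    present = {f: w for f, w in weights.items() if record.get(f) is not None}
    total_present = sum(present.values())
    if total_present == 0:
        raise ValueError("no weighted field is present in the record")
    scale = sum(weights.values()) / total_present
    return {f: w * scale for f, w in present.items()}


def second_score(weights: Dict[str, float],
                 record: Dict[str, Optional[float]]) -> float:
    """Score a record with missing fields using the rescaled weights."""
    w2 = second_weights(weights, record)
    return sum(w * record[f] for f, w in w2.items())


# Hypothetical normative weights and a record with one missing field.
normative = {"tax_compliance": 0.5, "court_record": 0.3, "quality_inspection": 0.2}
record = {"tax_compliance": 0.9, "court_record": None, "quality_inspection": 0.6}

w2 = second_weights(normative, record)
s2 = second_score(normative, record)  # would be sent to the application end
```

The modification suggestion returned by the application end could then be applied as a further manual or automated adjustment of `w2`; that step is left out because the source gives no format for the suggestion.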

Further, S2 also includes:

if the verification is unsuccessful, analyzing the cleaned credit data and judging whether it is data whose errors can be corrected automatically;

if so, automatically correcting the cleaned credit data to obtain error-corrected data, and taking the error-corrected data as normative credit data;

if not, classifying and storing the cleaned credit data according to the administrative division it belongs to, and at the same time storing the error description and the erroneous data field of the cleaned credit data.

As can be seen from the above description, after verification of the cleaned credit data fails, the data is analyzed to determine whether its errors can be corrected automatically. If so, the cleaned credit data is corrected automatically and then stored, which gives a high degree of automation and reduces manual effort. If not, the cleaned credit data is classified and stored according to the administrative division it belongs to (that is, according to the specific city), and its error description and erroneous data field are stored at the same time, so that problematic cleaned credit data is categorized and staff can conveniently check it manually using the stored error description and erroneous data field.
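The branch on failed verification can be sketched as a small triage routine. The example assumes that "automatically correctable" means a registered fixer exists for the failing field and succeeds; the source does not define the criterion. `ErrorStore`, `fix_date`, `handle_failed`, the `division` key and the sample records are hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class ErrorStore:
    """Failed records grouped by administrative division, with error details."""
    by_division: Dict[str, List[dict]] = field(default_factory=dict)

    def save(self, record: dict, error_field: str, description: str) -> None:
        division = record.get("division", "unknown")
        self.by_division.setdefault(division, []).append(
            {"record": record, "error_field": error_field,
             "description": description})


def fix_date(value: str) -> Optional[str]:
    """Example fixer: '2020/08/28' or '20200828' becomes '2020-08-28'."""
    m = re.fullmatch(r"(\d{4})[/.]?(\d{2})[/.]?(\d{2})", value)
    return f"{m.group(1)}-{m.group(2)}-{m.group(3)}" if m else None


FIXERS: Dict[str, Callable[[str], Optional[str]]] = {"founded": fix_date}


def handle_failed(record: dict, error_field: str, description: str,
                  store: ErrorStore) -> Optional[dict]:
    """Return a corrected record, or store the failure and return None."""
    fixer = FIXERS.get(error_field)
    fixed = fixer(str(record.get(error_field, ""))) if fixer else None
    if fixed is not None:
        return {**record, error_field: fixed}
    store.save(record, error_field, description)
    return None


store = ErrorStore()
corrected = handle_failed({"division": "Fuzhou", "founded": "2020/08/28"},
                          "founded", "bad date format", store)
rejected = handle_failed({"division": "Xiamen", "founded": "last year"},
                         "founded", "bad date format", store)
```

Records that come back as `None` remain in `store.by_division`, keyed by city, together with the information a reviewer needs to check them by hand.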

Further, after the cleaned credit data is classified and stored according to the administrative division it belongs to and its error description and erroneous data field are stored, the method further comprises:

acquiring processed data obtained by processing the cleaned credit data according to its error description and erroneous data field, and taking the processed data as normative credit data if the processing is successful.

As can be seen from the above description, when the manually processed data is acquired, whether the processing was successful is judged, and the processed data is taken as normative credit data only after successful processing, so that data quality is ensured.

Further, the preset rules comprise basic preset rules and newly added preset rules;

the data quality and data quantity of the cleaned credit data obtained by cleaning and filtering the original credit data according to the preset rules are counted and monitored; the pre-addition data quality and data quantity and the post-addition data quality and data quantity of the cleaned credit data are obtained for preset time periods before and after the newly added preset rule is added to the basic preset rules; the pre-addition data quality and data quantity are compared with the post-addition data quality and data quantity respectively to obtain a comparison result; and whether to raise an alarm is judged according to the comparison result.

As can be seen from the above description, after a new preset rule is added to the basic preset rules, the subsequent changes in data quality and data quantity are counted to judge whether to raise an alarm, which makes it convenient to adjust the preset rules.
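The before/after comparison can be sketched as follows. The source does not say how data quality is measured or what triggers the alarm, so this example assumes quality is a pass rate between 0 and 1 and that an alarm is raised when quality or quantity falls by more than a configured threshold. `CleaningStats`, `should_alarm` and both thresholds are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CleaningStats:
    quality: float   # assumed metric: share of cleaned records passing verification
    quantity: int    # number of cleaned records in the preset time period


def should_alarm(before: CleaningStats, after: CleaningStats,
                 max_quality_drop: float = 0.05,
                 max_quantity_drop: float = 0.20) -> bool:
    """Compare stats from before and after a newly added preset rule."""
    quality_drop = before.quality - after.quality
    quantity_drop = ((before.quantity - after.quantity) / before.quantity
                     if before.quantity else 0.0)
    return quality_drop > max_quality_drop or quantity_drop > max_quantity_drop


# A new rule that discards 30% of the records triggers the alarm.
alarm_on_volume = should_alarm(CleaningStats(0.95, 10000), CleaningStats(0.96, 7000))
# A small fluctuation does not.
alarm_on_noise = should_alarm(CleaningStats(0.95, 10000), CleaningStats(0.94, 9800))
```

An alarm here is a prompt to review the newly added rule; whether to roll it back is a separate decision.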

Referring to FIG. 2, a terminal of an automatically calculable credit score algorithm model includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements steps S1 to S3 of the method described above when executing the computer program.

Further, the steps performed before S3, the specific form of S3, and the additional steps of S2 are the same as those described above for the method.

Example one

Specifically, acquiring the original credit data of the credit information list comprises: acquiring the original credit data of the credit information list and converting it to a common format, which avoids inconsistent formats and ensures that the data format is uniform.

Specifically, the following preset rules may be configured for the integrity rule, the uniqueness rule, the consistency rule, the legality rule, and the authority rule, respectively:

1. Preset rules for integrity. Where part of the original credit data is missing, it can be completed by the following preset rules:

(1) Completion from other information: for example, if the original credit data lacks the administrative division code, the code is completed automatically from the administrative division using an administrative division code comparison table; if the time format is incorrect, the error can be corrected automatically according to the time digits in the code.

(2) Completion from preceding and following data: for example, if a time series in the original credit data has a missing value, the average of the values before and after it is used; if many values are missing, smoothing can be used; data that can be ignored may be omitted without affecting the cleaning of other data.

2. Preset rules for legality, which can be handled as follows:

(1) setting a mandatory legality rule: a value outside the permitted range is forced to the maximum value, or is judged invalid and rejected;

(2) field type legality rule: the date field format is "2010-10-10";

(3) field content legality rule: the sex of the enterprise's legal person is 'male', 'female' or 'unknown'; the birth date is earlier than today;

(4) setting a warning rule: a warning is raised if a value is outside the permitted range, and the value is then processed manually;

(5) outliers are given special manual treatment, and can be found by methods such as binning, clustering and regression.

3. Preset rules for data uniqueness, that is, removing duplicate records so that only one is kept, for example:

(1) deduplication by primary key, using the 'remove duplicate records' operation in SQL or Excel;

(2) deduplication by rule: for complex cases of duplication, a corresponding series of rules is written to remove the duplicates. For example, data from different channels can be matched through the same key information and then merged to remove duplicates.

4. Preset rules for data consistency keep corresponding names in each table consistent. For example, in the same type of information form submitted by different cities, the field for the "household location" may be named differently, some using "household location" and others "family and native place"; these names are uniformly cleaned and filtered to "household location". Likewise, where the title of the same type of information filling form differs between cities, some using "xx information filling form of the province of Fujian province" and others "xx information filling form of the Fujian province", the titles are uniformly cleaned and filtered to "xx information filling form of the province of Fujian province".

5. Preset rules for data authority: original credit data come from different channels, and different authority levels are set for different channels. When original credit data input by a channel with a lower authority level and original credit data input by a channel with a higher authority level belong to the same group of data but differ, the data input by the lower-authority channel is cleaned and filtered out.
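Several of the preset rules above can be sketched as small, independent functions. The example below covers gap filling with the mean of neighbouring values, a field-content legality check, field-name unification, and resolution of conflicting records by channel authority. All record keys (`birth_date`, `sex`, `credit_code`, `channel`), the alias table and the authority levels are hypothetical, and the source does not prescribe this particular decomposition.

```python
from datetime import date, datetime
from typing import Dict, List, Optional

FIELD_ALIASES = {"family and native place": "household location"}  # consistency
ALLOWED_SEX = {"male", "female", "unknown"}                         # legality


def fill_gaps(series: List[Optional[float]]) -> List[Optional[float]]:
    """Integrity: fill an isolated missing point with the mean of its neighbours."""
    out = list(series)
    for i in range(1, len(out) - 1):
        if out[i] is None and out[i - 1] is not None and series[i + 1] is not None:
            out[i] = (out[i - 1] + series[i + 1]) / 2
    return out


def is_legal(record: dict, today: date) -> bool:
    """Legality: date must parse as YYYY-MM-DD, lie before today, sex must be allowed."""
    try:
        birth = datetime.strptime(record["birth_date"], "%Y-%m-%d").date()
    except (KeyError, TypeError, ValueError):
        return False
    return record.get("sex") in ALLOWED_SEX and birth < today


def normalize_field_names(record: dict) -> dict:
    """Consistency: map variant field names onto one agreed name."""
    return {FIELD_ALIASES.get(k, k): v for k, v in record.items()}


def resolve_by_authority(records: List[dict],
                         authority: Dict[str, int]) -> List[dict]:
    """Uniqueness and authority: per key, keep the highest-authority channel's record."""
    best: Dict[str, dict] = {}
    for rec in records:
        key = rec["credit_code"]
        if (key not in best
                or authority[rec["channel"]] > authority[best[key]["channel"]]):
            best[key] = rec
    return list(best.values())


filled = fill_gaps([1.0, None, 3.0, None, None, 6.0])
kept = resolve_by_authority(
    [{"credit_code": "A", "channel": "city", "value": 1},
     {"credit_code": "A", "channel": "province", "value": 2}],
    authority={"city": 1, "province": 2})
```

Runs of several consecutive missing values are deliberately left unfilled in `fill_gaps`, since the text treats that case separately (smoothing or omission).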

The steps performed before S3, the specific form of S3, and the additional steps of S2 are as described above.

After the cleaned credit data is classified and stored according to the administrative division it belongs to and its error description and erroneous data field are stored, the method further comprises the following steps:

acquiring processed data obtained by processing the cleaned credit data according to its error description and erroneous data field, and taking the processed data as normative credit data if the processing is successful; if the processing is unsuccessful, returning to the step of storing the error description and the erroneous data field of the cleaned credit data to wait for the next round of processing.
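The return-and-wait behaviour can be pictured as a queue that is drained once per processing round, with unsuccessful items put back for the next round. This is a minimal sketch under that assumption; `process_error_queue`, the `manual_fix` and `verify` callbacks, and the sample items are hypothetical stand-ins for the manual processing step and its success check.

```python
from collections import deque
from typing import Callable, Deque, List, Optional


def process_error_queue(queue: Deque[dict],
                        manual_fix: Callable[[dict], Optional[dict]],
                        verify: Callable[[dict], bool]) -> List[dict]:
    """Run one pass over stored failures; requeue the ones still unresolved."""
    normative: List[dict] = []
    for _ in range(len(queue)):          # visit each stored item exactly once
        item = queue.popleft()
        processed = manual_fix(item)
        if processed is not None and verify(processed):
            normative.append(processed)  # becomes normative credit data
        else:
            queue.append(item)           # waits for the next processing round
    return normative


# Hypothetical stored failures and stand-in callbacks.
pending = deque([{"id": 1, "rating": "0.5"}, {"id": 2, "rating": "n/a"}])


def manual_fix(item: dict) -> Optional[dict]:
    try:
        return {**item, "rating": float(item["rating"])}
    except ValueError:
        return None


def verify(rec: dict) -> bool:
    return 0.0 <= rec["rating"] <= 1.0


accepted = process_error_queue(pending, manual_fix, verify)
```

Iterating over the initial queue length, rather than until the queue is empty, keeps a record that can never be fixed from causing an endless loop within one round.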

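The preset rules comprise basic preset rules and newly added preset rules, and the data quality and data quantity before and after adding a new preset rule are monitored as described above.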

Example two

Referring to fig. 2, a terminal 1 of a credit score algorithm model capable of automatic calculation includes a memory 2, a processor 3 and a computer program stored in the memory 2 and capable of running on the processor 3, wherein the processor 3 implements the steps in the first embodiment when executing the computer program.

In summary, in the method and terminal for an automatically calculable credit score algorithm model provided by the invention, a first single-field weight is configured for the original credit data to determine a first algorithm model, and the first algorithm model is corrected in combination with the actual credit score and actual policy information to obtain a normative algorithm model, so that the accuracy and reliability of the normative algorithm model can be ensured according to actual conditions. In addition, the normative credit data is obtained only after the original credit data has been cleaned and successfully verified, which avoids the problems of ambiguity, incompleteness, violation of business rules and the like in the original credit data and guarantees the validity of the data; the reasonableness of the normative credit score obtained from the normative credit data and the normative algorithm model can therefore be guaranteed.

The above description is only an embodiment of the present invention and is not intended to limit the scope of the present invention; all equivalent changes made using the contents of the present specification and drawings, whether applied directly or indirectly in related technical fields, are likewise included in the scope of protection of the present invention.

Claims

1. A method for automatically calculating a credit score algorithm model, comprising:

2. The method for an automatically calculable credit score algorithm model according to claim 1, wherein before S3 the method comprises:

judging whether the normative credit data has missing items; if so, sending the normative credit data to a modification terminal, configuring a second single-field weight of the normative credit data, determining a second algorithm model according to the second single-field weight, calculating the normative credit data according to the second algorithm model to obtain a second credit score, sending the second credit score to an application end, receiving a modification suggestion returned by the application end, and modifying the normative algorithm model according to the modification suggestion and the second single-field weight to obtain a final normative algorithm model;

the S3 specifically includes:

3. The method of automatically calculable credit scoring algorithm model according to claim 1, wherein the S2 further includes:

4. The method for an automatically calculable credit score algorithm model according to claim 3, wherein, if not, after classifying and saving the cleaned credit data according to the administrative division it belongs to and simultaneously saving the error description and erroneous data field of the cleaned credit data, the method further comprises:

5. The method of automatically calculable credit scoring algorithm model according to claim 1, wherein the preset rules include basic preset rules and additional preset rules;

6. A terminal of an automatically calculable credit scoring algorithm model, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the computer program implements the steps of:

7. The terminal of an automatically calculable credit scoring algorithm model according to claim 6, wherein the S3 is preceded by:

the S3 specifically includes:

8. The terminal of an automatically calculable credit scoring algorithm model according to claim 6, wherein the S2 further includes:

9. The terminal of an automatically calculable credit score algorithm model according to claim 8, wherein, if not, after classifying and saving the cleaned credit data according to the administrative division it belongs to and simultaneously saving the error description and erroneous data field of the cleaned credit data, the steps further comprise:

10. The terminal of an automatically calculable credit scoring algorithm model according to claim 6, wherein the preset rules include basic preset rules and new preset rules;

counting and monitoring the data quality and data quantity of the cleaned credit data obtained by cleaning and filtering the original credit data according to the preset rules; obtaining the pre-addition data quality and data quantity and the post-addition data quality and data quantity of the cleaned credit data for preset time periods before and after the newly added preset rule is added to the basic preset rules; comparing the pre-addition data quality and data quantity with the post-addition data quality and data quantity respectively to obtain a comparison result; and judging whether to raise an alarm according to the comparison result.