CN114782123A - Credit assessment method and system - Google Patents

Publication number
CN114782123A
Authority
CN
China
Prior art keywords
training
credit
data
behavior data
credit score
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210272417.8A
Other languages
Chinese (zh)
Inventor
胡一闻
杨旭
张爱华
黄逸珺
彭若弘
车培荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Huizhi Xinda Technology Development Co ltd
Original Assignee
Beijing Huizhi Xinda Technology Development Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Huizhi Xinda Technology Development Co ltd
Priority to CN202210272417.8A
Publication of CN114782123A

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/06Buying, selling or leasing transactions
    • G06Q30/0601Electronic shopping [e-shopping]
    • G06Q30/0609Buyer or seller confidence or verification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/2148Generating training patterns; Bootstrap methods, e.g. bagging or boosting characterised by the process organisation or structure, e.g. boosting cascade
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201Market modelling; Market analysis; Collecting market data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201Market modelling; Market analysis; Collecting market data
    • G06Q30/0202Market predictions or forecasting for commercial activities

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Game Theory and Decision Science (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

A credit assessment method and system, comprising: acquiring training original behavior data within a preset period from the telecommunication operation business data of a training user; loading the training original behavior data into an XGBoost model or a linear regression model for training; predicting credit scores from the training original behavior data with the trained XGBoost model or linear regression model to generate initial predicted credit scores; and smoothing the predicted credit score of the current period to obtain an updated credit score for the preset period. The customer's credit level is reasonably predicted using telecom operator data and can be adjusted dynamically and smoothly, so that excessive fluctuation in the credit level does not disrupt the customer's normal use of each function.

Description

Credit assessment method and system
Technical Field
The invention relates to the technical field of credit assessment, in particular to a credit assessment method and a credit assessment system.
Background
A personal credit score is the result of a credit evaluation organization quantitatively analyzing a consumer's personal credit information with a credit scoring model and an evaluation algorithm, and expressing the outcome as a score.
Existing credit scoring techniques mainly include the following:
The FICO (Fair Isaac Corporation) credit score model uses logistic regression: weights are assigned to different indicators, and the user's credit score is obtained by accumulating the weighted values. The indicators of the FICO credit score comprise personal credit, grade, and payment capacity.
The Tencent credit score model builds on the FICO credit score model by adding credit data from the user's interaction circle for credit risk assessment. Users in the interaction circle may be social relations in real life, such as relatives, colleagues, and classmates, or friend or follow relationships in a social tool, such as WeChat friends, QQ friends, microblog friends, members of the same group, and microblog follow relationships.
The Tencent credit score model takes the customer's i-th credit score, i-th relationship credit score, and default annotation information as model inputs to calculate the customer's (i+1)-th credit score.
When i equals 1, the calculation is the same as in the FICO credit score model. The relationship credit score is calculated by the following formula:
Figure RE-GDA0003706278950000011
where score_fri_avg is the customer's i-th relationship credit score; friend_score_j is the i-th credit score of the j-th user in the customer's interaction circle; and op_j is the weight of the j-th user in the interaction circle.
The China Unicom Wo credit score model and the China Telecom heaven Long credit score model: the China Unicom Wo credit score is calculated from time on network, data usage, contract performance records, payment records, and the like, and a higher score earns more benefits; the China Telecom heaven Long credit score is calculated from identity, consumption capacity, social relations, historical credit, and behavioral preferences.
The above credit scoring schemes have the following defects:
On the one hand, from the data perspective, telecom operators have natural advantages in the credit investigation industry, such as wide data coverage and the ability to construct personal relationship networks. Their data is characterized by a large user base, wide coverage, fast updates, comprehensiveness, timeliness, irreplaceability, and high credibility. Valuable telecom big data comprises real-name user identity data, service data, interaction circle data, network operation data, and service operation data. Specifically, network operation data comprises basic resource and configuration data, signaling trace data, service identification data, performance statistics, monitoring and early-warning data, and the like; service operation data comprises basic user data, user service behavior, auxiliary user information, and the like. Non-telecom operators, however, have no telecom data among their data sources.
On the other hand, from the perspective of the evaluation algorithm model, most existing credit evaluation algorithms use a weighted index system. For telecom operators analyzing the massive data they hold, machine learning algorithms offer greater intelligence, higher accuracy, and stronger expressive power. However, China Unicom and China Telecom currently have no patent on a credit score model.
Disclosure of Invention
(I) Object of the invention
The invention aims to provide a credit assessment method and system that reasonably predict a customer's credit level using telecom operator data and adjust that level dynamically and smoothly, so that excessive fluctuation in the credit level does not disrupt normal use of each function.
(II) technical scheme
To solve the above problem, according to an aspect of the present invention, there is provided a credit evaluation method including: acquiring training original behavior data in a preset period in telecommunication operation business data of a training user; loading the training original behavior data into an XGboost model or a linear regression model for training; predicting credit scores of the training original behavior data by using the trained XGboost model or linear regression model to generate predicted credit scores of the preset period; and smoothing the predicted credit score to obtain an updated credit score of a preset period.
Further, predicting the credit score of the training original behavior data by using the trained XGBoost model further comprises: storing the prediction result and checking its stability; if the prediction result is stable, using the XGBoost model to calculate the predicted credit score; if the prediction result is unstable, rebuilding the XGBoost model and training it again.
Further, smoothing the predicted credit score to obtain an updated credit score of a preset period includes: smoothing the predicted credit score of the current preset period by using the credit score of the previous preset period to obtain the updated credit score of the current preset period; the formula of the smoothing process is:
Figure RE-GDA0003706278950000031
where
Figure RE-GDA0003706278950000032
denotes the updated credit score of the current preset period, S_{n-1} denotes the credit score of the previous preset period, S_n' denotes the predicted credit score of the current preset period, and α is the smoothing coefficient.
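The smoothing formula itself survives in this text only as an image reference, so its exact form cannot be confirmed here. The following is a minimal sketch that assumes a conventional exponential-smoothing form, in which the updated score is a convex combination of S_{n-1} and S_n'. The function name `smooth_score`, the choice of which term α weights, and the sample values are all assumptions made for illustration.

```python
def smooth_score(prev_score: float, predicted_score: float, alpha: float) -> float:
    """Blend the previous period's score with the current prediction.

    ASSUMPTION: alpha weights the previous score and (1 - alpha) weights the
    prediction; the source's own formula is not reproduced in this text.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * prev_score + (1.0 - alpha) * predicted_score


# Hypothetical values: last period's score 700, this period's prediction 600.
updated = smooth_score(prev_score=700.0, predicted_score=600.0, alpha=0.5)
```

Under this assumed form, a larger α keeps the updated score closer to the previous period's score, which is the damping behavior the text describes.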
Further, before loading the training original behavior data into the linear regression model, the method further includes: preprocessing the training original behavior data.
Further, the preprocessing comprises cleaning the training original behavior data and selecting dimensions. The cleaning comprises: filling missing values, correcting error values, eliminating repeated records, and checking the consistency of the training original behavior data. The dimension selection comprises: selecting training original behavior data based on different dimensions, wherein the dimensions comprise identity information, consumption capacity, credit records, behavior habits, and interaction circle information.
Further, filling missing values comprises filling the missing parts of the training original behavior data; correcting error values comprises replacing data that deviates from the overall statistical distribution of the training original behavior data with the boundary values of a normal interval determined by mathematical statistics; eliminating repeated records comprises grouping completely identical rows in the training original behavior data, keeping one row, and deleting the rest; checking consistency comprises checking whether the data is acceptable according to the reasonable value range of each field in the training original behavior data and the relationships between fields.
Further, filling missing values includes filling partial missing values: the missing values are replaced with the mean, maximum, minimum, median, or a probability estimate derived from the training original behavior data.
According to another aspect of the present invention, there is provided a credit evaluation system comprising: a data acquisition module for acquiring training original behavior data in a preset period in telecommunication operation business data of a training user; a training module for loading the training original behavior data into an XGBoost model or a linear regression model for training; a predicted credit score generation module for predicting the credit score of the training original behavior data by using the trained XGBoost model or linear regression model to generate a predicted credit score; and a credit score updating module for smoothing the predicted credit score to obtain an updated credit score of a preset period.
Further, the system also comprises: a preprocessing module for preprocessing the training original behavior data before it is loaded into the linear regression model.
Further, the preprocessing module comprises: the cleaning unit is used for filling missing values, correcting error values, eliminating repeated records and checking consistency of the training original behavior data; the dimensionality selection unit is used for selecting training original behavior data based on different dimensionalities, and the dimensionalities comprise: identity information, consumption capacity, credit records, behavior habits, and circle of interaction information.
(III) advantageous effects
The technical scheme of the invention has the following beneficial technical effects:
The lack of sufficient data sources in the prior art is addressed: the customer's credit level is reasonably predicted using telecom operator data.
Telecom operator data is introduced, including monthly consumption, data usage, package price, account points, suspension for arrears, business handling, interaction circle user data, and the like, so that the customer's credit level is evaluated from multiple angles. Meanwhile, the XGBoost algorithm makes the weight assigned to each indicator more reliable and the calculated credit score more accurate. Because the XGBoost model does not require missing values to be processed beforehand, it avoids the deviation that manual missing-value handling introduces in other algorithms and is more convenient in subsequent use.
No additional collection of credit scores from the customer's interaction circle is required: the credit score is updated monthly using only the telecom operator data of the customer and of the interaction circle. A weight is set for the credit score update, and the smoothed update mechanism avoids large fluctuations in the credit score, so that excessive fluctuation in the customer's credit level does not disrupt normal use of each function.
Drawings
FIG. 1 is a flow chart of the steps of a credit evaluation method provided by the present invention;
FIG. 2 is a block flow diagram of a first embodiment of a credit evaluation method provided by the present invention;
FIG. 3 is a block diagram of a second embodiment of a credit evaluation method provided by the present invention;
Table 1 is a table of the parameters required for training a model in the credit evaluation method provided by the present invention;
Table 2 is a table of the dimensions required in a standard service scenario in the credit evaluation method provided by the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail with reference to the accompanying drawings in combination with the embodiments. It is to be understood that these descriptions are only illustrative and are not intended to limit the scope of the present invention. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present invention.
The present invention will be described in detail below with reference to the accompanying drawings and examples.
Fig. 1 is a flowchart illustrating steps of a credit evaluation method according to the present invention, and referring to fig. 1, the present invention provides a credit evaluation method, which includes the following steps:
S1: acquiring training original behavior data in a preset period in telecommunication operation business data of a training user;
S2: loading the training original behavior data into an XGBoost model or a linear regression model for training;
S3: predicting credit scores of the training original behavior data by using the trained XGBoost model or linear regression model to generate predicted credit scores;
S4: smoothing the predicted credit score to obtain an updated credit score of a preset period.
Specifically, either the XGBoost model or the linear regression model can be selected according to the scenario so as to output the best credit evaluation result. Model prediction proceeds as follows: data is input into the model, and the model computes on it with the determined parameters. The whole calculation is a greedy, layer-by-layer process: the input passes through the first layer, the result serves as the input of the next layer, and so on; the output of the last layer is the prediction result.
XGBoost is a boosting algorithm that combines multiple weak classifiers into a strong classifier. It integrates multiple tree models; each tree model covers multiple features, different features correspond to different score weights, and the final result is a weighted value.
The linear regression model finds a suitable hypothesis function h_θ(x) according to the characteristics of the data, constructs a suitable loss function J(θ), and then either maximizes the likelihood or derives the least-squares function from the maximum likelihood and minimizes it.
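As an illustration of the least-squares route, the sketch below fits a single-feature hypothesis h_θ(x) = θ0 + θ1·x by minimizing the sum of squared residuals in closed form. The single feature, the function names, and the toy data are assumptions chosen to keep the example small; the source does not specify the features or the solver.

```python
from typing import Sequence, Tuple


def fit_simple_ols(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Return (theta0, theta1) minimizing J(theta) = sum((theta0 + theta1*x - y)^2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form least-squares solution for one feature.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    theta1 = sxy / sxx
    theta0 = mean_y - theta1 * mean_x
    return theta0, theta1


def hypothesis(theta: Tuple[float, float], x: float) -> float:
    """Evaluate h_theta(x) = theta0 + theta1 * x."""
    return theta[0] + theta[1] * x


# Hypothetical toy data lying exactly on y = 1 + 2x.
theta = fit_simple_ols([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
prediction = hypothesis(theta, 5.0)
```

With several features the same minimization is usually solved with matrix methods or gradient descent; the one-feature case only shows the shape of the computation.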
Specifically, step S1 comprises: acquiring the telecommunication operation business data of a training user, and screening out the training original behavior data within a preset period from that data. The service system of the telecom operator stores the user's identity information and records of service ordering, consumption, and payment behavior; the training original behavior data of the training user can be obtained by establishing a synchronous or asynchronous interface to the service system.
Preferably, the preset period of the present invention is one month, and the description below takes the preset period to be one month. For example, after the telecom operator data of the training users is obtained, the training original behavior data of the last month is screened out.
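A minimal sketch of screening one month of records, assuming each record is a dictionary with an `event_date` field; the record layout, the field names, and the function names are hypothetical and are not taken from the operator's service system.

```python
from datetime import date
from typing import Dict, List, Tuple


def previous_month(today: date) -> Tuple[int, int]:
    """Return (year, month) of the calendar month before `today`."""
    if today.month == 1:
        return today.year - 1, 12
    return today.year, today.month - 1


def screen_period(records: List[Dict], today: date) -> List[Dict]:
    """Keep only records whose 'event_date' falls in the previous calendar month."""
    target = previous_month(today)
    return [
        r for r in records
        if (r["event_date"].year, r["event_date"].month) == target
    ]


# Hypothetical behavior records for one training user.
records = [
    {"user_id": "u1", "event_date": date(2021, 12, 15), "amount": 58.0},
    {"user_id": "u1", "event_date": date(2022, 1, 3), "amount": 30.0},
]
last_month = screen_period(records, today=date(2022, 1, 10))
```

The January case is handled explicitly so that the preset period rolls back to December of the previous year.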
The credit evaluation method comprises two different embodiments, wherein the model adopted by the credit evaluation method in the first embodiment is an XGboost model, and the model adopted by the credit evaluation method in the second embodiment is a linear regression model.
Embodiment one:
Fig. 2 is a flow block diagram of the first embodiment of the credit evaluation method provided by the present invention. Referring to Fig. 2, the first embodiment includes the following steps:
Step one: screening out, from the service system of the telecom operator, the training user's telecom operator business data for the current month, namely the training original behavior data;
Step two: establishing an XGBoost model, loading the training original behavior data directly into the XGBoost model for training, and storing the trained model;
Step three: predicting credit for the training original behavior data with the trained XGBoost model to obtain a prediction result, and judging whether the prediction result is stable; if it is stable, proceeding to step four; if it is unstable, rebuilding the XGBoost model and returning to step two for retraining;
Step four: calculating the predicted credit score of the training original behavior data with the XGBoost model whose prediction result is stable;
Step five: smoothing the predicted credit score to obtain the updated credit score of the current month.
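The control flow of steps two to five can be sketched as a train, check, retrain loop. In the sketch below the model builder, the stability check, and the smoothing step are passed in as callables, and a cap on the number of rebuilds is added so the loop terminates; all names and the cap are assumptions, since the source does not say how many rebuilds are allowed.

```python
from typing import Callable, List, Sequence

Model = Callable[[Sequence], List[float]]


def run_embodiment_one(
    data: Sequence,
    build_and_train: Callable[[Sequence, int], Model],
    is_stable: Callable[[Model], bool],
    smooth: Callable[[List[float]], List[float]],
    max_rebuilds: int = 5,  # ASSUMPTION: the source gives no retry limit
) -> List[float]:
    for attempt in range(max_rebuilds):
        model = build_and_train(data, attempt)  # step two: build and train
        if is_stable(model):                    # step three: stability check
            predicted = model(data)             # step four: predicted scores
            return smooth(predicted)            # step five: smoothing
    raise RuntimeError("no stable model found within max_rebuilds attempts")


# Stub components for demonstration only.
attempts: List[int] = []


def build_stub(data: Sequence, attempt: int) -> Model:
    attempts.append(attempt)
    return lambda rows: [float(r) * (attempt + 1) for r in rows]


def stable_stub(model: Model) -> bool:
    return len(attempts) >= 2  # pretend the first model fails the check


result = run_embodiment_one([1, 2], build_stub, stable_stub, smooth=lambda s: s)
```

In this demonstration the first model is rejected and the second is accepted, so the loop returns the second model's predictions after the (identity) smoothing step.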
Specifically, the XGBoost model selected in this embodiment does not require missing-value processing of the training original behavior data before training. This is more convenient than other algorithms, avoids the deviation introduced by manually processing missing values, and yields a more accurate credit score.
Table 1 is a table of parameters required for training a model in the credit evaluation method provided in the present invention.
Parameter name      Description                                          Reference value
booster             Booster (base model) type                            'gbtree'
objective           Learning objective (linear regression)               'reg:linear'
eta                 Learning rate                                        0.1
gamma               Minimum loss reduction required to split a node      20
max_depth           Maximum depth of a tree                              4
lambda              L2 regularization weight                             10
subsample           Sample (row) sampling rate for building each tree    0.7
colsample_bytree    Feature sampling rate for each tree                  0.7
min_child_weight    Minimum sum of instance weights in a child node      10
silent              Information output switch                            0
seed                Random number seed                                   1000
eval_metric         Evaluation metric                                    'rmse'
The XGBoost model is trained with the parameters listed in Table 1.
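For reference, the values in Table 1 can be written as a parameter dictionary of the kind the XGBoost Python package accepts. The dictionary only restates the table; the commented-out training call, the name `dtrain`, and the number of boosting rounds are assumptions, since the source gives none of them.

```python
# Reference values from Table 1 as an XGBoost-style parameter dictionary.
params = {
    "booster": "gbtree",
    # Newer XGBoost releases name this objective "reg:squarederror".
    "objective": "reg:linear",
    "eta": 0.1,
    "gamma": 20,
    "max_depth": 4,
    "lambda": 10,
    "subsample": 0.7,
    "colsample_bytree": 0.7,
    "min_child_weight": 10,
    # Newer XGBoost releases replace "silent" with "verbosity".
    "silent": 0,
    "seed": 1000,
    "eval_metric": "rmse",
}

# ASSUMED usage (needs the xgboost package and a DMatrix called dtrain;
# the number of boosting rounds is not given in the source):
#   import xgboost as xgb
#   model = xgb.train(params, dtrain, num_boost_round=100)
```

Depending on the installed XGBoost version, the `objective` and `silent` keys may need to be renamed as noted in the comments.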
Preferably, in step three, repeated ten-fold cross-validation is used to judge the stability of the model's prediction result. For example, the data set is randomly divided into 10 parts; in each of 10 rounds, 1 part is held out as the test set and the other 9 parts form the training set. The mean of the prediction accuracies from the 10 rounds is the model accuracy for a single ten-fold cross-validation. The ten-fold cross-validation is carried out 5 times in total, the variance of the 5 resulting accuracies is calculated, and the model is considered stable if the variance is smaller than a given threshold.
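The stability check can be sketched with the standard library alone. Here `evaluate` is a hypothetical callable that trains on the training folds and returns an accuracy on the held-out fold; the use of population variance, the seed, and the threshold value are likewise assumptions.

```python
import random
import statistics
from typing import Callable, List, Sequence


def ten_fold_accuracy(
    data: Sequence,
    evaluate: Callable[[List, List], float],
    rng: random.Random,
) -> float:
    """Mean accuracy over one round of ten-fold cross-validation."""
    indices = list(range(len(data)))
    rng.shuffle(indices)
    folds = [indices[k::10] for k in range(10)]  # 10 near-equal random parts
    scores = []
    for k in range(10):
        test = [data[i] for i in folds[k]]
        train = [data[i] for j in range(10) if j != k for i in folds[j]]
        scores.append(evaluate(train, test))
    return statistics.mean(scores)


def is_stable(
    data: Sequence,
    evaluate: Callable[[List, List], float],
    threshold: float,
    repeats: int = 5,
    seed: int = 0,
) -> bool:
    """Repeat ten-fold CV `repeats` times and compare the variance to `threshold`."""
    rng = random.Random(seed)
    accuracies = [ten_fold_accuracy(data, evaluate, rng) for _ in range(repeats)]
    # ASSUMPTION: population variance; the source only says "variance".
    return statistics.pvariance(accuracies) < threshold


# Stub evaluator that always reports the same accuracy, so the variance is zero.
stable = is_stable(list(range(100)), lambda train, test: 0.9, threshold=1e-4)
```

A real `evaluate` would fit the model on `train` and score it on `test`; the stub only exercises the fold logic.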
In step four, the credit score is calculated as follows: after data is input into the model, the model computes with the determined parameters by a greedy algorithm; that is, the result of the first layer's calculation serves as the input of the next layer, and so on, and the output of the last layer is the prediction result.
In step five, smoothing the predicted credit score includes: smoothing the predicted credit score with the credit score of the previous month to obtain the updated credit score of the current month, which prevents the result from differing sharply from the previous month's result. If the preset period is the first month, no smoothing is required.
The formula of the smoothing process is:
Figure RE-GDA0003706278950000081
where
Figure RE-GDA0003706278950000082
denotes the smoothed updated credit score, S_{n-1} denotes the previous month's credit score, S_n' denotes the current month's predicted credit score calculated by the XGBoost model, and α is the smoothing coefficient.
The formula of the smoothing process is further explained as follows:
The initial credit score is the credit score of the first preset period (the first month) and needs no smoothing. The credit scores of all later preset periods are obtained as follows:
In the second period: the predicted credit score of the current preset period (the second month) is smoothed with the initial credit score;
In the n-th period: the predicted credit score of the current preset period (the n-th month) is smoothed with the credit score of the previous preset period (the (n-1)-th month).
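The period-by-period rule can be sketched as a recurrence over monthly predictions. Because the smoothing formula appears only as an image in this text, the sketch assumes a convex combination in which α weights the previous period's updated score; that form, the function name, and the sample values are assumptions.

```python
from typing import List


def update_scores(predicted: List[float], alpha: float) -> List[float]:
    """Return the updated credit score for each period, in order."""
    updated: List[float] = []
    for n, s_pred in enumerate(predicted):
        if n == 0:
            # First preset period: the initial score is used without smoothing.
            updated.append(s_pred)
        else:
            # ASSUMED form: alpha * previous updated score + (1 - alpha) * prediction.
            updated.append(alpha * updated[-1] + (1.0 - alpha) * s_pred)
    return updated


# Hypothetical predictions for three consecutive months.
monthly = update_scores([600.0, 700.0, 500.0], alpha=0.5)
```

Each period's result depends only on the prediction for that period and the single score carried over from the period before, so no longer history needs to be stored.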
Embodiment two:
Fig. 3 is a flow block diagram of the second embodiment of the credit evaluation method provided by the present invention. Referring to Fig. 3, the second embodiment includes the following steps:
Step one: screening out, from the service system of the telecom operator, the training user's telecom operator data for the current month, namely the training original behavior data;
Step two: first preprocessing the training original behavior data, then establishing a linear regression model and loading the preprocessed training original behavior data into the linear regression model for training;
Step three: calculating the predicted credit score of the training original behavior data with the trained linear regression model;
Step four: smoothing the predicted credit score to obtain the updated credit score of the current month.
Specifically, in the second step, the preprocessing the training original behavior data includes: and cleaning and dimension selection are carried out on the training original behavior data.
Data cleaning is the process of re-examining and verifying the original data, which may contain defects. Its purpose is to delete duplicate information, handle missing values, correct errors, and ensure the integrity and consistency of the data so that the evaluation algorithm model can be applied effectively. Cleaning comprises: filling missing values, correcting error values, eliminating repeated records, and checking the consistency of the training original behavior data.
Optionally, the cleaning is performed by a program or manually.
Table 2 is a dimension table required in a standard service scenario in the credit evaluation method provided in the present invention.
[Table 2 appears as images in the original: Figures RE-GDA0003706278950000091, RE-GDA0003706278950000101, and RE-GDA0003706278950000111]
Referring to Table 2, missing value filling means filling the missing parts of the training original behavior data by a defined method, such as the median, the mean, the mode, or a special-value label.
The missing value filling rules are given in Table 2.
Optionally, the missing value filling further includes filling partial missing values: a mean, maximum, minimum, median, or more complex probability estimate is derived from this data source or from other data sources and used to replace the missing values.
Optionally, partial missing values are filled in or cleaned manually.
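A minimal sketch of statistic-based filling for one numeric column, with missing entries represented as `None`. The function name, the strategy labels, and the sample column are hypothetical; the per-field rules of Table 2 are not reproduced in this text and are not modeled here.

```python
import statistics
from typing import List, Optional


def fill_missing(values: List[Optional[float]], strategy: str = "mean") -> List[float]:
    """Replace None entries with a statistic of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to derive a fill value from")
    fillers = {
        "mean": statistics.mean,
        "median": statistics.median,
        "mode": statistics.mode,
        "max": max,
        "min": min,
    }
    fill = fillers[strategy](observed)
    return [fill if v is None else v for v in values]


# Hypothetical monthly ARPU column with one missing entry.
arpu = [50.0, None, 70.0, 90.0]
filled_mean = fill_missing(arpu, "mean")
filled_min = fill_missing(arpu, "min")
```

Special-value labeling, which the text also mentions, would replace `None` with a sentinel instead of a statistic and is left out of this sketch.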
Correcting the error value means that data deviating from the overall statistical distribution in the training original behavior data is replaced by boundary data of a normal interval according to mathematical statistics.
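One way to realize this correction is to clip values to an interval around the mean. The sketch below assumes the normal interval is the mean plus or minus k standard deviations; the source does not say how the interval is determined, so the rule, the value of k, and the sample data are assumptions.

```python
import statistics
from typing import List


def clip_to_normal_interval(values: List[float], k: float = 3.0) -> List[float]:
    """Replace values outside mean +/- k*std with the nearest interval boundary."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    low, high = mean - k * std, mean + k * std
    return [min(max(v, low), high) for v in values]


# Hypothetical monthly fees with one extreme entry.
fees = [10.0, 11.0, 9.0, 10.0, 12.0, 8.0, 10.0, 100.0]
clipped = clip_to_normal_interval(fees, k=2.0)
```

Because the extreme value itself inflates the standard deviation, a quantile-based interval may be a better fit for small or heavily skewed samples.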
Eliminating repeated records means grouping completely identical rows in the training original behavior data, keeping one row, and deleting the rest.
Consistency checking means checking whether the data is acceptable according to the reasonable value range of each field in the training original behavior data and the relationships between fields; data found to be out of the normal range, logically unreasonable, or mutually contradictory must be corrected. This process is completed by a program or manually.
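Duplicate elimination and consistency checking can be sketched over rows stored as dictionaries. The field names and the two rules shown (a value-range rule and a cross-field rule) are hypothetical examples and do not come from the source.

```python
from typing import Dict, List


def drop_duplicates(rows: List[Dict]) -> List[Dict]:
    """Keep the first of each group of fully identical rows."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))  # order-independent fingerprint of the row
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def consistency_errors(row: Dict) -> List[str]:
    """Return a description of each rule the row violates (hypothetical rules)."""
    errors = []
    if row["monthly_fee"] < 0:
        errors.append("monthly_fee out of range")
    if row["arrears_count"] == 0 and row["arrears_amount"] > 0:
        errors.append("arrears_amount contradicts arrears_count")
    return errors


rows = [
    {"user_id": "u1", "monthly_fee": 58.0, "arrears_count": 0, "arrears_amount": 0.0},
    {"user_id": "u1", "monthly_fee": 58.0, "arrears_count": 0, "arrears_amount": 0.0},
    {"user_id": "u2", "monthly_fee": -5.0, "arrears_count": 0, "arrears_amount": 20.0},
]
unique_rows = drop_duplicates(rows)
problems = consistency_errors(unique_rows[1])
```

Rows that fail a rule are only reported here; how they are corrected (by a program or manually) is left to the surrounding process.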
The dimension selection comprises the following steps: selecting training original behavior data based on different dimensions, wherein the dimensions comprise: identity information, consumption capacity, credit records, behavior habits, and circle of interaction information.
Specifically, dimension selection is based on typical common service data relevant to the customer's credit evaluation. For telecom operator users, five main dimensions are defined: identity information (whether the user is blacklisted, whether the user is a contract user, and the like), consumption capacity (mean monthly ARPU over the last three months, and the like), credit records (number and amount of arrears in the last three months, and the like), behavior habits (places of international roaming in the last year, and the like), and interaction circle information (number of mobile phone numbers in the interaction circle).
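Dimension selection can be sketched as a mapping from the five dimensions to feature columns, followed by a projection of each record onto those columns. The column names are hypothetical labels invented for the examples named in the text; the full field list of Table 2 is not reproduced here.

```python
from typing import Dict, List, Optional

# Hypothetical column names for the examples given in the text.
DIMENSIONS: Dict[str, List[str]] = {
    "identity_information": ["is_blacklisted", "is_contract_user"],
    "consumption_capacity": ["arpu_3m_mean"],
    "credit_record": ["arrears_count_3m", "arrears_amount_3m"],
    "behavior_habits": ["intl_roaming_places_1y"],
    "interaction_circle": ["circle_phone_numbers"],
}


def select_features(
    record: Dict[str, object],
    dimensions: Optional[Dict[str, List[str]]] = None,
) -> Dict[str, object]:
    """Keep only the columns belonging to the chosen dimensions (None if absent)."""
    dims = DIMENSIONS if dimensions is None else dimensions
    columns = [col for cols in dims.values() for col in cols]
    return {col: record.get(col) for col in columns}


raw = {"user_id": "u1", "is_blacklisted": False, "arpu_3m_mean": 62.5, "tariff_code": "A7"}
features = select_features(raw)
```

Columns outside the five dimensions are dropped, and columns missing from the record come back as `None` so that a later missing-value step can handle them.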
The invention also discloses a credit evaluation system comprising the following modules: a data acquisition module, a training module, a predicted credit score generation module, and a credit score updating module.
The data acquisition module is used for acquiring training original behavior data in a preset period in telecommunication operation business data of a training user. The training module is used for loading the training original behavior data into the XGBoost model or the linear regression model for training. The predicted credit score generation module is used for predicting the credit score of the training original behavior data by using the trained XGBoost model or linear regression model to generate an initial predicted credit score. The credit score updating module is used for smoothing the predicted credit score to obtain an updated credit score of a preset period.
In one embodiment, the system further comprises a preprocessing module for preprocessing the training original behavior data before it is loaded into the linear regression model.
Specifically, the preprocessing module comprises a cleaning unit and a dimension selection unit.
The cleaning unit performs missing value filling, error value correction, duplicate record elimination, and consistency checking on the training original behavior data. The dimension selection unit selects training original behavior data based on different dimensions, the dimensions comprising: identity information, consumption capacity, credit records, behavior habits, and circle of interaction information.
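As a non-limiting illustration, the four operations of the cleaning unit can be sketched with the Python standard library. The field names, the mean-based fill, the three-standard-deviation boundary, the example consistency rules, and the choice to drop inconsistent rows are all assumptions made for this sketch rather than details fixed by the text.

```python
from statistics import mean, pstdev


def drop_duplicates(rows):
    """Keep the first of any fully identical rows."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def fill_missing(rows, field):
    """Replace None in `field` with the mean of the observed values."""
    fill = mean(r[field] for r in rows if r[field] is not None)
    return [dict(r, **{field: fill if r[field] is None else r[field]}) for r in rows]


def clip_outliers(rows, field, k=3.0):
    """Clamp values outside mean +/- k * std to the interval boundary."""
    values = [r[field] for r in rows]
    mu, sigma = mean(values), pstdev(values)
    low, high = mu - k * sigma, mu + k * sigma
    return [dict(r, **{field: min(max(r[field], low), high)}) for r in rows]


def is_consistent(row):
    """Example rules only: no negative values, no arrears amount without arrears."""
    if row["arpu_3m_mean"] < 0 or row["arrears_count_3m"] < 0:
        return False
    return not (row["arrears_count_3m"] == 0 and row["arrears_amount_3m"] > 0)


def clean(rows):
    """Run the four cleaning operations in sequence."""
    rows = drop_duplicates(rows)
    rows = fill_missing(rows, "arpu_3m_mean")
    rows = clip_outliers(rows, "arpu_3m_mean")
    # Dropping inconsistent rows is one possible handling; flagging is another.
    return [r for r in rows if is_consistent(r)]


records = [
    {"user_id": "a", "arpu_3m_mean": 50.0, "arrears_count_3m": 0, "arrears_amount_3m": 0.0},
    {"user_id": "a", "arpu_3m_mean": 50.0, "arrears_count_3m": 0, "arrears_amount_3m": 0.0},
    {"user_id": "b", "arpu_3m_mean": None, "arrears_count_3m": 1, "arrears_amount_3m": 20.0},
    {"user_id": "c", "arpu_3m_mean": 70.0, "arrears_count_3m": 0, "arrears_amount_3m": 15.0},
]
cleaned = clean(records)
```

In this example the duplicate record for user `a` is removed, the missing ARPU for user `b` is filled with the mean of the observed values, and user `c` fails the example consistency rule because an arrears amount is recorded with zero arrears events.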
The invention seeks to protect a credit assessment method and system, the method comprising: acquiring training original behavior data within a preset period from the telecom operator data of a training user; selecting a model, either an XGBoost model or a linear regression model, and preprocessing the training original behavior data according to the selected model; loading the preprocessed training original behavior data into the selected model for training, and using the trained model to predict credit scores from the training original behavior data, generating initial predicted credit scores; and smoothing the predicted credit scores to obtain updated credit scores for the preset period. Because telecom operator data is used, the credit scores of a customer's contact circle do not need to be acquired separately: scores are updated monthly using only the telecom operator data of the customer and of the customer's contact circle, which addresses the lack of sufficient data sources in the prior art. The customer's credit level is evaluated from multiple angles and predicted on a reasonable basis. In addition, the XGBoost algorithm makes the weight assigned to each indicator more reliable, and excessive fluctuation of a customer's credit level is prevented from affecting the normal use of each function.
It is to be understood that the above-described embodiments of the present invention are merely illustrative of or explaining the principles of the invention and are not to be construed as limiting the invention. Therefore, any modification, equivalent replacement, improvement and the like made without departing from the spirit and scope of the present invention should be included in the protection scope of the present invention. Further, it is intended that the appended claims cover all such variations and modifications as fall within the scope and boundaries of the appended claims or the equivalents of such scope and boundaries.

Claims (10)

1. A credit evaluation method, comprising:
acquiring training original behavior data in a preset period in telecommunication operation business data of a training user;
loading the training original behavior data into an XGBoost model or a linear regression model for training;
predicting credit scores from the training original behavior data by using the trained XGBoost model or the trained linear regression model to generate predicted credit scores of the preset period;
and smoothing the predicted credit score to obtain an updated credit score of the preset period.
2. The method of claim 1,
predicting credit scores from the training original behavior data by using the trained XGBoost model further comprises:
storing the prediction result and detecting the stability of the prediction result;
if the prediction result is stable, using the XGBoost model to calculate the predicted credit score;
and if the prediction result is unstable, re-establishing the XGBoost model and training it again.
3. The method of claim 1,
obtaining the updated credit score of the preset period after smoothing the predicted credit score comprises:
smoothing the predicted credit score of the current preset period by using the credit score of the previous preset period to obtain an updated credit score of the current preset period;
the formula of the smoothing process is as follows:
Figure FDA0003554178270000011
wherein
Figure FDA0003554178270000012
represents the updated credit score of the current preset period, S_{n-1} represents the credit score of the previous preset period, S_n' represents the predicted credit score of the current preset period, and α is a smoothing coefficient.
4. The method of claim 1,
before loading the training original behavior data into the linear regression model, the method further comprises: preprocessing the training original behavior data.
5. The method of claim 4,
the preprocessing comprises: cleaning the training original behavior data and performing dimension selection on it;
the cleaning comprises: performing missing value filling, error value correction, duplicate record elimination, and consistency checking on the training original behavior data;
the dimension selection comprises: selecting the training original behavior data based on different dimensions, wherein the dimensions comprise: identity information, consumption capacity, credit records, behavior habits, and circle of interaction information.
6. The method of claim 5,
the missing value filling comprises filling in the missing parts of the training original behavior data;
the error value correction comprises replacing, on the basis of mathematical statistics, data that deviates from the overall statistical distribution of the training original behavior data with the boundary value of the normal interval;
the duplicate record elimination comprises grouping fully identical records in the training original behavior data, retaining one row, and deleting the rest;
and the consistency checking comprises checking whether the data meets requirements according to the reasonable value range of each item in the training original behavior data and the relationships among the items.
7. The method of claim 6,
the missing value filling further comprises: for partially missing values, replacing the missing values with the mean, the maximum, the minimum, the median, or a probability estimate derived from the training original behavior data.
8. A credit evaluation system, comprising:
a data acquisition module, configured to acquire training original behavior data within a preset period from the telecommunication operation business data of a training user;
a training module, configured to load the training original behavior data into an XGBoost model or a linear regression model for training;
a credit score generation and prediction module, configured to predict credit scores from the training original behavior data by using the trained XGBoost model or the trained linear regression model to generate predicted credit scores;
and a credit score updating module, configured to smooth the predicted credit scores to obtain the updated credit scores of the preset period.
9. The system of claim 8, further comprising:
a preprocessing module, configured to preprocess the training original behavior data before it is loaded into the linear regression model.
10. The system of claim 9, wherein the pre-processing module comprises:
a cleaning unit, configured to perform missing value filling, error value correction, duplicate record elimination, and consistency checking on the training original behavior data;
a dimension selection unit, configured to select the training original behavior data based on different dimensions, where the dimensions include: identity information, consumption capacity, credit records, behavior habits, and circle of interaction information.
CN202210272417.8A 2022-03-18 2022-03-18 Credit assessment method and system Pending CN114782123A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210272417.8A CN114782123A (en) 2022-03-18 2022-03-18 Credit assessment method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210272417.8A CN114782123A (en) 2022-03-18 2022-03-18 Credit assessment method and system

Publications (1)

Publication Number Publication Date
CN114782123A (en) 2022-07-22

Family

ID=82424432

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210272417.8A Pending CN114782123A (en) 2022-03-18 2022-03-18 Credit assessment method and system

Country Status (1)

Country Link
CN (1) CN114782123A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115393056A (en) * 2022-08-31 2022-11-25 重庆大学 Big data-based user information evaluation and wind control method, device and equipment
CN117575783A (en) * 2024-01-16 2024-02-20 中国电信股份有限公司深圳分公司 Multi-dimensional user credit assessment method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination