CN115239481A - Credit card repayment reminding method and device - Google Patents

Credit card repayment reminding method and device

Info

Publication number
CN115239481A
Authority
CN
China
Prior art keywords
overdue
historical
vector
customer
sub
Prior art date
Legal status
Pending
Application number
CN202210966354.6A
Other languages
Chinese (zh)
Inventor
吴欢
林慕云
殷富成
郑安妮
Current Assignee
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date
Filing date
Publication date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/12 Computing arrangements based on biological models using genetic models
    • G06N3/126 Evolutionary algorithms, e.g. genetic algorithms or genetic programming

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Genetics & Genomics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Physiology (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The invention provides a credit card repayment reminding method and device in the field of artificial intelligence. The method comprises the following steps: determining a first overdue type of the current customer according to a decision tree and the qualitative feature vector of the current customer; constructing a corresponding fitness function according to each historical customer quantitative feature vector; performing genetic iteration based on the fitness functions, and determining a plurality of final power coefficients and final multiple coefficients corresponding to the historical customer quantitative feature vectors; determining a second overdue type of the current customer based on the quantitative feature vector of the current customer, the final power coefficients and the final multiple coefficients; and determining the future overdue rate of the current customer according to the first overdue type and the second overdue type, and reminding the current customer to repay the credit card based on the future overdue rate. The invention makes it possible to issue credit card repayment reminders in advance and improves the speed and accuracy of the reminders, thereby improving reminding efficiency and helping to increase bank income.

Description

Credit card repayment reminding method and device
Technical Field
The invention relates to the technical field of information pushing, in particular to the field of artificial intelligence, and specifically to a credit card repayment reminding method and device.
Background
The credit card loan service is one of the mainstream services of banks and contributes to bank income through the loan interest that borrowers pay together with their repayments. It is therefore necessary to remind credit card borrowers to repay on time and pay interest, so as to reduce the economic loss and reduced income that overdue payments cause the bank. In the prior art, however, credit card repayment reminding mainly relies on staff querying the overdue status of borrowing customers, manually analyzing that status, and then manually reminding the relevant customers to repay. These manual steps of querying, analyzing and reminding take a long time, so repayment reminders are slow. Moreover, because the targeting of reminders usually depends on manual experience, customers who are not overdue may receive strong reminders while overdue customers receive only weak ones, so the reminders are inaccurate, the customer experience is poor, and some overdue customers are not prompted to repay in time. Furthermore, existing methods usually issue a reminder only after a customer is overdue and still has not repaid; they cannot remind the customer in advance, based on the customer's circumstances, so that the customer has sufficient time to prepare for repayment. As a result, the speed and efficiency of credit card repayment reminding cannot be improved.
In summary, the prior art cannot issue credit card repayment reminders in advance, and its reminders are slow and inaccurate, so reminding efficiency is low, which does not help to improve bank income.
Disclosure of Invention
The invention aims to provide a credit card repayment reminding method that solves the problems of the prior art, namely that credit card repayment reminders cannot be issued in advance and that the reminders are slow and inaccurate, so that reminding efficiency is low and bank income is not improved. Another object of the present invention is to provide a credit card repayment reminding device. A further object of the invention is to provide a computer device. Yet another object of the invention is to provide a computer-readable medium.
In order to achieve the above object, an aspect of the present invention discloses a credit card repayment reminding method, including:
determining a first overdue type of a corresponding current client according to a preset decision tree and a current client qualitative characteristic vector;
constructing a corresponding fitness function according to a preset historical customer quantitative feature vector; performing genetic iteration based on the fitness function, and determining a plurality of final power coefficients and final multiple coefficients corresponding to the historical customer quantitative feature vectors;
determining a second overdue type of the corresponding current customer based on a preset current customer quantitative feature vector, the final power coefficient and a final multiple coefficient; and determining the future overdue rate of the current client according to the first overdue type and the second overdue type, and carrying out credit card repayment reminding on the current client based on the future overdue rate.
Optionally, further comprising:
before determining a first overdue type of a corresponding current client according to a preset decision tree and a current client qualitative characteristic vector,
performing data cleaning, data extraction and data standardization processing on preset initial historical client information to obtain intermediate historical client information;
performing feature vectorization processing on the intermediate historical client information to obtain a historical client feature vector;
and splitting the historical client characteristic vector according to the property of the element type of the vector element to respectively obtain a corresponding historical client qualitative characteristic vector and a corresponding historical client quantitative characteristic vector.
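The splitting step can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes a feature vector is held as a mapping from element type to element value, that numeric values count as quantitative and everything else (including booleans) as qualitative, and the element type names used in the usage note are hypothetical.

```python
from numbers import Number


def split_feature_vector(feature_vector):
    """Split one feature vector into (qualitative, quantitative) parts.

    Assumption: numeric element values are quantitative; strings,
    booleans and other values are treated as qualitative.
    """
    qualitative, quantitative = {}, {}
    for element_type, value in feature_vector.items():
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, Number) and not isinstance(value, bool):
            quantitative[element_type] = value
        else:
            qualitative[element_type] = value
    return qualitative, quantitative
```

For example, a vector with the hypothetical elements `occupation`, `has_mortgage` and `monthly_income` would place the first two in the qualitative part and the last in the quantitative part.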
Optionally, further comprising:
before determining a first overdue type of a corresponding current client according to a preset decision tree and a current client qualitative characteristic vector,
performing feature vectorization processing on preset current client information to obtain a current client feature vector;
and splitting the current customer characteristic vector according to the property of the element type of the vector element to respectively obtain the corresponding current customer qualitative characteristic vector and the current customer quantitative characteristic vector.
Optionally, further comprising:
before determining a first overdue type of a corresponding current client according to a preset decision tree and a current client qualitative characteristic vector,
and constructing a decision tree based on a plurality of historical customer qualitative feature vectors, element values of the historical customer qualitative feature vectors and corresponding historical overdue labels.
Optionally, the constructing a decision tree based on the plurality of historical customer qualitative feature vectors, the element values of the historical customer qualitative feature vectors, and the corresponding historical past due labels includes:
obtaining a complete information entropy according to all the historical client qualitative feature vectors and corresponding historical overdue labels; the value of the historical overdue label is a non-overdue label, a first-stage overdue label, a second-stage overdue label, a third-stage overdue label or a fourth-stage overdue label;
obtaining a root conditional entropy of each element type of the vector elements according to the history overdue labels corresponding to all the history client qualitative feature vectors and the element values of the vector elements;
obtaining a root information gain entropy corresponding to the element type according to the complete information entropy and the root condition entropy, and establishing a root node of a decision tree by taking the element type with the maximum root information gain entropy as a root node attribute; respectively establishing child nodes corresponding to each element value based on each element value which the root node attribute can adopt;
repeatedly executing the step of establishing the child nodes until the child nodes cannot be established so as to complete the construction of the decision tree, wherein the step of establishing the child nodes comprises the following steps:
determining a plurality of historical customer qualitative characteristic vectors with vector elements corresponding to the sub-element values as the sub-vectors of the sub-nodes according to the sub-element values corresponding to each sub-node;
respectively judging whether the historical overdue labels corresponding to the sub-vectors of each child node are the same, if so, taking the child nodes as leaf nodes; determining a plurality of historical customer qualitative feature vectors with vector elements corresponding to the leaf element values as the leaf vectors of the leaf nodes according to the leaf element values corresponding to each leaf node;
if not, obtaining the sub-conditional entropy of each element type in the child nodes according to the historical overdue labels corresponding to all the child vectors and the element values of the vector elements;
obtaining sub information gain entropies corresponding to the element types according to the complete information entropies and the sub condition entropies, and taking the element type with the maximum sub information gain entropies as a sub node attribute; and respectively establishing child nodes of the next layer of the child nodes based on each element value which the attribute of the child nodes can adopt.
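The construction steps above resemble the classic ID3 procedure, and can be sketched as below. The sketch makes several assumptions that the text does not confirm: entropy is Shannon entropy in base 2, each qualitative feature vector is a mapping from element type to element value, and a node that can no longer be split but still has mixed labels becomes a leaf carrying the majority label. It also measures gain against the entropy of the node's own vectors rather than the complete information entropy; because that base is the same constant for every candidate element type at a node, the selected attribute is the same either way.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (base 2) of a list of overdue labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def partition(vectors, labels, element_type):
    """Group vectors and labels by the value each vector takes for element_type."""
    groups = defaultdict(lambda: ([], []))
    for vector, label in zip(vectors, labels):
        sub_vectors, sub_labels = groups[vector[element_type]]
        sub_vectors.append(vector)
        sub_labels.append(label)
    return groups


def build_tree(vectors, labels, element_types):
    """Recursively build a tree of nested dicts."""
    if len(set(labels)) == 1:
        return {"leaf": labels[0]}  # all labels agree: leaf node
    if not element_types:
        # Assumption: fall back to the majority label when no attribute is left.
        return {"leaf": Counter(labels).most_common(1)[0][0]}

    def information_gain(element_type):
        groups = partition(vectors, labels, element_type)
        conditional = sum(len(sub_labels) / len(labels) * entropy(sub_labels)
                          for _, sub_labels in groups.values())
        return entropy(labels) - conditional

    best = max(element_types, key=information_gain)
    remaining = [t for t in element_types if t != best]
    children = {value: build_tree(sub_vectors, sub_labels, remaining)
                for value, (sub_vectors, sub_labels)
                in partition(vectors, labels, best).items()}
    return {"attribute": best, "children": children}
```

Only element values that actually occur in the historical vectors receive child nodes in this sketch, whereas the text creates a child for each value the attribute can adopt.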
Optionally, the obtaining a complete information entropy according to all the historical customer qualitative feature vectors and corresponding historical overdue labels includes:
obtaining a first number of the historical customer qualitative feature vectors with the historical overdue labels being non-overdue labels, a second number of the historical customer qualitative feature vectors with the historical overdue labels being first-stage overdue labels, a third number of the historical customer qualitative feature vectors with the historical overdue labels being second-stage overdue labels, a fourth number of the historical customer qualitative feature vectors with the historical overdue labels being third-stage overdue labels, and a fifth number of the historical customer qualitative feature vectors with the historical overdue labels being fourth-stage overdue labels according to all the historical customer qualitative feature vectors and corresponding historical overdue labels;
obtaining a non-overdue rate based on the first number and the total number of all historical customer qualitative feature vectors;
obtaining a first-stage overdue rate based on the second number and the total number of all historical customer qualitative feature vectors;
obtaining a second-stage overdue rate based on the third number and the total number of all historical customer qualitative feature vectors;
obtaining a third-stage overdue rate based on the fourth number and the total number of all historical customer qualitative feature vectors;
obtaining a fourth-stage overdue rate based on the fifth number and the total number of all historical customer qualitative feature vectors;
and obtaining the complete information entropy based on the non-overdue rate, the first-stage overdue rate, the second-stage overdue rate, the third-stage overdue rate and the fourth-stage overdue rate.
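As an illustration of these steps, the sketch below turns the five label counts into rates and combines them into an entropy. The text does not state the entropy formula; the standard Shannon entropy in base 2 is assumed here, with the usual convention that a rate of zero contributes nothing.

```python
import math


def complete_information_entropy(label_counts):
    """Entropy of the overdue-label distribution.

    label_counts: the first to fifth numbers from the text, i.e. counts of
    non-overdue and first- to fourth-stage overdue vectors, in that order.
    """
    total = sum(label_counts)
    rates = [count / total for count in label_counts]
    # Assumed formula: H = -sum(p * log2(p)), skipping zero rates.
    return -sum(rate * math.log2(rate) for rate in rates if rate > 0)
```

Under this assumption the entropy is 0 when every historical customer carries the same label and reaches its maximum, log2(5), when the five labels are equally frequent.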
Optionally, the obtaining, according to the historical past due labels corresponding to all the historical customer qualitative feature vectors and the element values of the vector elements, the root conditional entropy of each element type of the vector elements includes:
obtaining, according to the element values of the vector elements, the division number of vector elements taking each different element value of the element type;
obtaining a division rate according to the division number and the total number of all historical customer qualitative feature vectors;
taking the historical customer qualitative feature vectors whose vector element of the element type takes a given element value as the division vectors corresponding to that element value, and obtaining the division information entropy corresponding to each element value based on those division vectors and the historical overdue labels corresponding to them;
and obtaining the root conditional entropy of the element type based on the division rates and the division information entropies corresponding to the different element values of the element type.
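A minimal sketch of the root conditional entropy follows. It assumes the conventional definition, in which the conditional entropy is the sum over element values of the division (partition) rate multiplied by the entropy of the labels in that division; the text names these ingredients but does not spell out how they are combined. Vectors are again assumed to be mappings from element type to element value.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (base 2) of a list of overdue labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def root_conditional_entropy(vectors, labels, element_type):
    """Conditional entropy of the labels given one element type."""
    divisions = defaultdict(list)  # element value -> labels of its division vectors
    for vector, label in zip(vectors, labels):
        divisions[vector[element_type]].append(label)
    total = len(vectors)
    # division rate * division information entropy, summed over element values
    return sum((len(d) / total) * entropy(d) for d in divisions.values())
```

The information gain of an element type is then the complete information entropy minus this value, so an element type whose values separate the overdue labels perfectly has a conditional entropy of 0 and the largest possible gain.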
Optionally, the obtaining the sub-conditional entropy of each element type in the child node according to the historical overdue labels and the element values of the vector elements corresponding to all the child vectors includes:
obtaining the sub-division number of vector elements with different element values in the element types of the sub-vectors respectively according to the element values of the vector elements of the sub-vectors;
obtaining a sub-division rate according to the sub-division number and the sub-vector number of the sub-vectors;
respectively taking the corresponding sub-vectors when different element values are taken from the element types as corresponding sub-division vectors, and obtaining sub-division information entropies corresponding to the different element values based on the corresponding sub-division vectors when the different element values are taken from the element types and the historical overdue labels corresponding to the sub-division vectors;
and obtaining the sub-condition entropy of the element type based on the sub-division rate and the sub-division information entropy corresponding to different element values of the element type.
Optionally, the determining a first overdue type of the corresponding current client according to the preset decision tree and the qualitative feature vector of the current client includes:
determining a corresponding path in a decision tree according to the element values of the vector elements of the current customer qualitative feature vector;
obtaining a corresponding leaf node according to the path;
and determining a historical overdue type corresponding to the leaf node according to a historical overdue label corresponding to the leaf vector of the leaf node, and determining the corresponding historical overdue type as the first overdue type.
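Determining the first overdue type amounts to walking the tree from the root to a leaf. The sketch below assumes a hypothetical tree representation (nested dicts with an `attribute` and `children` at internal nodes and a `leaf` label at leaf nodes); the text does not prescribe a data structure, nor does it say what happens when the current customer has an element value with no matching child, so the sketch simply lets that case raise a `KeyError`.

```python
def first_overdue_type(tree, qualitative_vector):
    """Follow the path selected by the vector's element values to a leaf."""
    node = tree
    while "leaf" not in node:
        value = qualitative_vector[node["attribute"]]
        node = node["children"][value]  # KeyError if the value was never seen
    return node["leaf"]
```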
Optionally, the constructing a corresponding fitness function according to the preset historical customer quantitative feature vector includes:
respectively setting a corresponding power coefficient variable and a corresponding multiple coefficient variable for each vector element in each historical customer quantitative feature vector;
obtaining a sub-fitness parameter corresponding to each vector element based on the power coefficient variable, the multiple coefficient variable and the element value of the corresponding vector element;
and constructing a fitness function of the corresponding historical customer quantitative feature vector according to the sub-fitness parameter corresponding to each vector element.
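This section does not give the algebraic form of the sub-fitness parameter or of the fitness function. One plausible reading, assumed purely for illustration, is that each sub-fitness parameter is the multiple coefficient times the element value raised to the power coefficient, and that the fitness function is the sum of the sub-fitness parameters:

```python
def sub_fitness(value, power, multiple):
    """Assumed form: multiple * value ** power for one vector element."""
    return multiple * value ** power


def fitness(values, powers, multiples):
    """Assumed form: sum of the sub-fitness parameters of all vector elements."""
    return sum(sub_fitness(v, p, m) for v, p, m in zip(values, powers, multiples))
```

With element values (2, 3), power coefficients (2, 1) and multiple coefficients (0.5, 2), this assumed form evaluates to 0.5 * 2^2 + 2 * 3^1 = 8.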
Optionally, the performing genetic iteration based on the fitness function to determine a plurality of final power coefficients and final multiple coefficients corresponding to the historical customer quantitative feature vector includes:
randomly setting an initial power coefficient of each power coefficient variable and an initial multiple coefficient of each multiple coefficient variable in the fitness function, and repeatedly executing the step of genetic iteration until all final power coefficients and final multiple coefficients of each fitness function are determined, wherein the step of genetic iteration comprises the following steps:
based on all the fitness functions, the fitness of each corresponding historical customer quantitative feature vector is obtained; obtaining a sub-fitting value corresponding to the historical customer quantitative feature vector based on the overdue value corresponding to the historical overdue label of the corresponding historical customer quantitative feature vector and the fitness; superposing the sub-fitting values corresponding to the quantitative feature vectors of all the historical customers to obtain fitting values;
judging whether the fitting value is smaller than or equal to a preset fitting threshold value, if so, taking an initial power coefficient of each power coefficient variable in the fitness function as the final power coefficient, and taking an initial multiple coefficient of each multiple coefficient variable as the final multiple coefficient;
if not, repeatedly executing a crossover and mutation operation until all fitness functions are updated;
wherein the crossover and mutation operation comprises:
selecting one fitness function that has not been updated from all fitness functions as the current fitness function, and selecting a plurality of fitness functions from the fitness functions other than the current fitness function as crossover operator functions;
obtaining a power crossover operator for each power coefficient variable of the current fitness function according to the power coefficients of the corresponding power coefficient variables in the plurality of crossover operator functions and a preset first random number; obtaining a multiple crossover operator for each multiple coefficient variable of the current fitness function according to the multiple coefficients of the corresponding multiple coefficient variables in the plurality of crossover operator functions and the first random number;
obtaining a corresponding power mutation operator according to the power crossover operator and a preset second random number; obtaining a corresponding multiple mutation operator according to the multiple crossover operator and the second random number;
and taking the power mutation operator corresponding to each power coefficient variable of the current fitness function as the initial power coefficient of that power coefficient variable, and taking the multiple mutation operator corresponding to each multiple coefficient variable of the current fitness function as the initial multiple coefficient of that multiple coefficient variable, to finish updating the current fitness function.
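The overall loop can be sketched as below, with the caveat that most of the arithmetic is assumed. The text fixes the control flow (random initial coefficients, a fitting value compared against a threshold, every fitness function updated from other fitness functions and two random numbers) but not the formulas. The sketch assumes the fitness form multiple * value ** power summed over elements, a blend of two other coefficient sets weighted by the first random number as the crossover operator, and the addition of the second random number as the mutation operator. The cap `max_rounds`, the `seed` parameter, the initial range 0 to 2 and the mutation range are hypothetical additions so that the example terminates and is repeatable. At least three historical vectors with positive element values are assumed.

```python
import random


def fitness(values, powers, multiples):
    # Assumed fitness form: sum of multiple * value ** power.
    return sum(m * v ** p for v, p, m in zip(values, powers, multiples))


def fitting_value(population, vectors, overdue_values):
    """Sum of squared differences between overdue values and fitness values."""
    return sum(abs(y - fitness(x, ind["powers"], ind["multiples"])) ** 2
               for ind, x, y in zip(population, vectors, overdue_values))


def cross_mutate(population, index, rng, r1, r2):
    """Assumed operators: blend two other coefficient sets, then shift by r2."""
    others = [ind for i, ind in enumerate(population) if i != index]
    a, b = rng.sample(others, 2)

    def update(key):
        crossed = [r1 * x + (1 - r1) * y for x, y in zip(a[key], b[key])]
        return [c + r2 for c in crossed]

    return {"powers": update("powers"), "multiples": update("multiples")}


def genetic_iteration(vectors, overdue_values, threshold, max_rounds=1000, seed=0):
    rng = random.Random(seed)
    size = len(vectors[0])
    # One coefficient set (one fitness function) per historical vector.
    population = [{"powers": [rng.uniform(0, 2) for _ in range(size)],
                   "multiples": [rng.uniform(0, 2) for _ in range(size)]}
                  for _ in vectors]
    for _ in range(max_rounds):
        if fitting_value(population, vectors, overdue_values) <= threshold:
            break  # current coefficients become the final coefficients
        r1, r2 = rng.random(), rng.uniform(-0.1, 0.1)
        population = [cross_mutate(population, i, rng, r1, r2)
                      for i in range(len(population))]
    return population
```

In this sketch every fitness function in a round is updated from the previous round's coefficients; the text would also permit drawing crossover operator functions from functions already updated in the same round. The sketch makes no claim about convergence.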
Optionally, the obtaining a sub-fitting value corresponding to the historical customer quantitative feature vector based on the overdue value corresponding to the historical overdue label of the corresponding historical customer quantitative feature vector and the fitness includes:
subtracting the fitness from the overdue value to obtain a fitness difference value;
and taking the square of the absolute value of the fitness difference value as the sub-fitting value.
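The two steps above, together with the superposition of sub-fitting values described earlier, can be written directly. The numeric overdue value assigned to each historical overdue label is not given in this section, so the inputs here are simply assumed to be numbers.

```python
def sub_fitting_value(overdue_value, fitness_value):
    """Square of the absolute difference between overdue value and fitness."""
    return abs(overdue_value - fitness_value) ** 2


def fitting_value(overdue_values, fitness_values):
    """Superpose (sum) the sub-fitting values of all historical vectors."""
    return sum(sub_fitting_value(y, f) for y, f in zip(overdue_values, fitness_values))
```

For an overdue value of 3 and a fitness of 1.5, the sub-fitting value is (3 - 1.5)^2 = 2.25.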
Optionally, the determining a second overdue type of the current client based on the preset current client quantitative feature vector, the final power coefficient, and the final multiple coefficient includes:
correspondingly substituting element values of vector elements of the current customer quantitative feature vector into a fitness function corresponding to each historical customer quantitative feature vector, and obtaining a current customer adaptive value corresponding to the historical customer quantitative feature vector according to the final power coefficient and the final multiple coefficient of the historical customer quantitative feature vector in the fitness function;
subtracting the corresponding current client adaptive value from the overdue value corresponding to the historical overdue label of the historical client quantitative feature vector to obtain an initial proximity value, and taking an absolute value of the initial proximity value to obtain a proximity difference value;
and determining a second overdue type of the current customer according to the historical overdue label corresponding to the historical customer quantitative feature vector with the minimum proximity difference value.
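The selection of the second overdue type can be sketched as a nearest-match search. The sketch assumes the fitness form multiple * value ** power summed over elements (the text does not give the formula) and a hypothetical record layout in which each historical customer is a tuple of label, numeric overdue value, final power coefficients and final multiple coefficients.

```python
def second_overdue_type(current_values, history):
    """Return the label of the historical record with the smallest proximity difference.

    history: list of (label, overdue_value, final_powers, final_multiples).
    """
    def proximity_difference(record):
        _, overdue_value, powers, multiples = record
        # Current customer adaptive value under this record's fitness function.
        adaptive = sum(m * v ** p for v, p, m in zip(current_values, powers, multiples))
        return abs(overdue_value - adaptive)

    return min(history, key=proximity_difference)[0]
```

If several records tie for the minimum, `min` returns the first one in the list; the text does not say how ties should be resolved.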
Optionally, correspondingly substituting element values of vector elements of the current customer quantitative feature vector into a fitness function corresponding to each historical customer quantitative feature vector, where the method includes:
and replacing the element value in the sub-fitness parameter of the corresponding vector element in the fitness function by the element value of the vector element in the current customer quantitative feature vector.
Optionally, the determining the future overdue rate of the current client according to the first overdue type and the second overdue type includes:
determining a first overdue rate corresponding to the first overdue type based on the historical overdue label corresponding to the first overdue type and the corresponding relation between the historical overdue labels with different preset values and the overdue rate;
determining a second overdue rate corresponding to the second overdue type based on the historical overdue label corresponding to the second overdue type and the corresponding relationship between the historical overdue label of different preset values and the overdue rate;
determining an average of the first and second overdue rates as the future overdue rate.
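The final combination is a table lookup followed by an average. The correspondence between labels and overdue rates is preset in the method but its values are not given in this section, so the table below is entirely hypothetical.

```python
# Hypothetical correspondence between historical overdue labels and overdue rates.
OVERDUE_RATE_BY_LABEL = {
    "non-overdue": 0.0,
    "stage-1": 0.25,
    "stage-2": 0.5,
    "stage-3": 0.75,
    "stage-4": 1.0,
}


def future_overdue_rate(first_type, second_type, table=OVERDUE_RATE_BY_LABEL):
    """Average of the overdue rates mapped from the two overdue types."""
    return (table[first_type] + table[second_type]) / 2
```

With this table, a customer classified as stage-1 by the decision tree and stage-3 by the fitness comparison would receive a future overdue rate of 0.5.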
In order to achieve the above object, another aspect of the present invention discloses a credit card repayment reminding device, comprising:
the decision tree processing module is used for determining a first overdue type of the corresponding current client according to a preset decision tree and the qualitative characteristic vector of the current client; the decision tree is associated with a plurality of preset historical customer qualitative characteristic vectors and historical overdue labels corresponding to the historical customer qualitative characteristic vectors;
the fitness processing module is used for constructing a corresponding fitness function according to a preset historical customer quantitative feature vector; performing genetic iteration based on the fitness function, and determining a plurality of final power coefficients and final multiple coefficients corresponding to the historical customer quantitative feature vectors;
the repayment reminding module is used for determining a second overdue type of the corresponding current customer based on a preset current customer quantitative characteristic vector, the final power coefficient and the final multiple coefficient; and determining the future overdue rate of the current client according to the first overdue type and the second overdue type, and carrying out credit card repayment reminding on the current client based on the future overdue rate.
The invention also discloses a computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method described above when executing the program.
The invention also discloses a computer-readable medium, on which a computer program is stored which, when executed by a processor, implements a method as described above.
According to the credit card repayment reminding method and device provided by the invention, the first overdue type of the current customer is determined according to a preset decision tree and the current customer qualitative feature vector. Because a decision tree has low computational complexity and high classification accuracy and is well suited to classifying qualitative feature vectors, the first overdue type can be determined quickly and accurately, which improves the speed and accuracy of the overall repayment reminder. A fitness function is constructed according to each preset historical customer quantitative feature vector, and genetic iteration is performed based on the fitness functions to determine the final power coefficients and final multiple coefficients corresponding to the historical customer quantitative feature vectors. This improves on the genetic-iteration process of the genetic algorithm: the fitness functions are built on historical customer quantitative feature vectors that fully reflect the characteristics of historical customers, and the optimized genetic iteration solves more quickly and accurately for the optimal values of the power coefficient variables and multiple coefficient variables set for those characteristics. The final power coefficients and final multiple coefficients obtained as the optimal solution therefore conform more closely to the historical customer characteristics, which improves the accuracy and speed of determining the second overdue type of the current customer in the subsequent steps and, in turn, the speed and accuracy of the overall repayment reminder. The second overdue type of the current customer is determined based on the preset current customer quantitative feature vector, the final power coefficients and the final multiple coefficients, so that, relying on the principles of the genetic algorithm and its suitability for processing quantitative features, the second overdue type can be determined quickly and accurately with low computational complexity. The future overdue rate of the current customer is determined according to the first overdue type and the second overdue type, and a credit card repayment reminder is issued to the current customer based on the future overdue rate. Because the future overdue rate takes into account both the overdue type determined from the qualitative features and the overdue type determined from the quantitative features of the current customer, the accuracy of the future overdue rate is effectively improved. All steps of the method can be implemented by programs, algorithms, software or applications, which greatly reduces manual intervention and greatly increases the speed of the repayment reminder. In addition, the reminding steps as a whole do not depend on data or features generated after the customer takes out the credit card loan, so the future overdue rate can be determined before the customer becomes overdue, and the current customer can be reminded in advance with a reminding strength chosen according to the future overdue rate, helping the customer to repay on time and pay interest.
In summary, the credit card repayment reminding method and device provided by the invention can issue credit card repayment reminders in advance and improve the speed and accuracy of the reminders, thereby improving reminding efficiency and helping to increase bank income.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of a method for reminding a user of repayment of a credit card according to an embodiment of the present invention;
FIG. 2 illustrates a schematic diagram of an exemplary decision tree of an embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating an alternative step of determining a first overdue type according to an embodiment of the present invention;
FIG. 4 is a schematic diagram illustrating an alternative step of determining a second overdue type according to an embodiment of the present invention;
FIG. 5 is a schematic diagram illustrating an alternative step of determining a future overdue rate in accordance with embodiments of the present invention;
FIG. 6 is a schematic block diagram of a credit card repayment reminding device according to an embodiment of the present invention;
FIG. 7 illustrates a schematic diagram of a computer device suitable for use in implementing embodiments of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. All other embodiments that a person skilled in the art can derive from the embodiments given herein without creative effort fall within the protection scope of the present invention.
The terms "first," "second," "…" and the like as used herein do not denote any particular order or sequence, nor are they intended to limit the invention; they are used only to distinguish elements or operations described with the same technical term.
As used herein, the terms "comprising," "including," "having," "containing," and the like are open-ended terms that mean including, but not limited to.
As used herein, "and/or" includes any and all combinations of the described items.
It should be noted that, in the technical solution of the present invention, the acquisition, storage, use, processing, etc. of the data all meet the relevant regulations of the national laws and regulations.
The embodiment of the invention discloses a credit card repayment reminding method, which specifically comprises the following steps:
S101: determining a first overdue type of the current customer according to a preset decision tree and a current customer qualitative feature vector.
S102: constructing a corresponding fitness function according to a preset historical customer quantitative feature vector; and performing genetic iteration based on the fitness function, and determining a plurality of final power coefficients and final multiple coefficients corresponding to the historical customer quantitative feature vectors.
S103: determining a second overdue type of the corresponding current customer based on a preset current customer quantitative feature vector, the final power coefficient and a final multiple coefficient; and determining future overdue rate of the current customer according to the first overdue type and the second overdue type, and carrying out credit card repayment reminding on the current customer based on the future overdue rate.
The decision tree is associated with a plurality of preset historical customer qualitative characteristic vectors and historical overdue labels corresponding to the historical customer qualitative characteristic vectors.
Illustratively, the specific manner of issuing the credit card repayment reminder to the current customer based on the future overdue rate may be determined by those skilled in the art according to the actual situation, and the embodiment of the present invention is not limited in this respect. For example: for a current customer whose future overdue rate is in [0%, 20%), no repayment reminder is issued; for a future overdue rate in [20%, 40%), the customer is reminded by a repayment reminder message popped up in a related customer application; for a future overdue rate in [40%, 60%), the customer is reminded by short message; for a future overdue rate in [60%, 80%), the customer is reminded by telephone; and for a future overdue rate in [80%, 100%], the customer is reminded by a combination of telephone and short message. The reminder may also be issued by, but is not limited to, first determining an overdue level of the current customer from the future overdue rate and then issuing the credit card repayment reminder that corresponds to that level.
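The example reminder bands above can be sketched as a simple lookup. This is a minimal illustration only: the function name `reminder_channels` and the channel identifiers are hypothetical, the rate is assumed to be expressed as a fraction in [0, 1], and the last band is assumed to include 100%.

```python
def reminder_channels(future_overdue_rate: float) -> list[str]:
    """Map a future overdue rate in [0, 1] to the example reminder channels."""
    if not 0.0 <= future_overdue_rate <= 1.0:
        raise ValueError("future_overdue_rate must be within [0, 1]")
    if future_overdue_rate < 0.2:
        return []                      # [0%, 20%): no reminder
    if future_overdue_rate < 0.4:
        return ["in_app_popup"]        # [20%, 40%): pop-up message in the customer application
    if future_overdue_rate < 0.6:
        return ["sms"]                 # [40%, 60%): short message
    if future_overdue_rate < 0.8:
        return ["phone"]               # [60%, 80%): telephone
    return ["phone", "sms"]            # [80%, 100%]: telephone combined with short message
```

An actual deployment could use different bands or channels; the text leaves this to the implementer.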
According to the credit card repayment reminding method and device, the first overdue type of the current customer is determined according to the preset decision tree and the current customer qualitative feature vector. Because a decision tree has low computational complexity and high classification accuracy and is well suited to classifying qualitative feature vectors, the first overdue type of the current customer can be determined quickly and accurately from the current customer qualitative feature vector, which improves the speed and accuracy of the overall repayment reminding. A fitness function is constructed according to the preset historical customer quantitative feature vectors, genetic iteration is performed based on the fitness function, and a plurality of final power coefficients and final multiple coefficients corresponding to the historical customer quantitative feature vectors are determined. This improves the genetic iteration process of the genetic algorithm: the fitness function is constructed from historical customer quantitative feature vectors that fully reflect the characteristics of historical customers, so the optimized genetic iteration can solve more quickly and accurately for the optimal values of the power coefficient variables and multiple coefficient variables that the fitness function sets specifically for those characteristics. The final power coefficients and final multiple coefficients obtained as the optimal solution therefore conform more closely to the characteristics of historical customers, which improves the accuracy and speed with which subsequent steps determine the second overdue type of the current customer, and in turn the speed and accuracy of the overall repayment reminding. The second overdue type of the current customer is determined based on the preset current customer quantitative feature vector, the final power coefficients and the final multiple coefficients, so that, by virtue of the principles of the genetic algorithm and its suitability for processing quantitative features, the second overdue type can be determined quickly and accurately with low computational complexity. The future overdue rate of the current customer is determined according to the first overdue type and the second overdue type, and a credit card repayment reminder is issued to the current customer based on the future overdue rate. Because the future overdue rate takes into account both the overdue type determined from the qualitative features of the current customer and the overdue type determined from the quantitative features of the current customer, the accuracy of the future overdue rate can be effectively improved. All steps of the repayment reminding method can be implemented through related programs, algorithms, software or applications, which greatly reduces manual intervention and greatly increases the speed of credit card repayment reminding. Moreover, the reminding steps as a whole do not depend on data and features generated after the customer transacts credit card loan business, so the future overdue rate can be determined before the customer becomes overdue with a balance still unpaid. The current customer can then be reminded in advance, with a reminder intensity selected according to the future overdue rate, so that the customer repays and pays interest on time.
In summary, the credit card repayment reminding method and device provided by the invention can issue credit card repayment reminders in advance and improve their speed and accuracy, thereby improving the efficiency of credit card repayment reminding and helping to increase bank income.
In an optional embodiment, further comprising:
before determining a first overdue type of a corresponding current client according to a preset decision tree and a current client qualitative characteristic vector,
carrying out data cleaning, data extraction and data standardization processing on the preset initial historical client information to obtain intermediate historical client information;
performing feature vectorization processing on the intermediate historical client information to obtain a historical client feature vector;
and splitting the historical client characteristic vector according to the property of the element type of the vector element to respectively obtain a corresponding historical client qualitative characteristic vector and a corresponding historical client quantitative characteristic vector.
Illustratively, the source of the initial historical customer information may be, but is not limited to, a bank-related database, a system, a server, a record form, or the like.
Illustratively, the data cleaning of the preset initial historical customer information may be, but is not limited to, replacing or deleting abnormal data in the historical customer information by cleaning methods such as spline interpolation and linear regression. The data extraction may be, but is not limited to, dimension reduction of strongly correlated variables. For example, if the initial historical customer information contains two variables, credit card overdraft balance and quasi-credit card overdraft balance, then, since the embodiment of the present invention is not concerned with the quasi-credit card overdraft balance and the two variables are of the same nature, the quasi-credit card overdraft balance variable is deleted (so that the subsequent attributes and element types do not include it), which completes the dimension reduction. The data standardization processing may be, but is not limited to, converting the relevant data into suitable formats; for example, amounts are converted to a format with a granularity of two decimal places, such as converting an amount of 10000 yuan into 10000.00 yuan. It should be noted that the specific implementation of performing data cleaning, data extraction and data standardization processing on the preset initial historical customer information to obtain the intermediate historical customer information may be determined by a person skilled in the art according to the actual situation; the foregoing description is only an example and does not limit the present invention.
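As a rough sketch of the extraction and standardization examples above, the following drops the quasi-credit card overdraft balance and formats amounts to two decimal places. The field names and the helper `standardize_record` are hypothetical, and abnormal-value cleaning (for example by interpolation) is deliberately left out.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical field names; the source does not define a record schema.
AMOUNT_FIELDS = ("total_asset_balance", "credit_card_overdraft_balance")
DROPPED_FIELDS = ("quasi_credit_card_overdraft_balance",)


def standardize_record(record: dict) -> dict:
    """Drop redundant variables and format amounts with two decimal places."""
    # Data extraction: remove the variable that duplicates another in nature.
    cleaned = {k: v for k, v in record.items() if k not in DROPPED_FIELDS}
    # Data standardization: 10000 -> "10000.00".
    for field in AMOUNT_FIELDS:
        if field in cleaned:
            amount = Decimal(str(cleaned[field]))
            cleaned[field] = str(amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP))
    return cleaned
```

Using `Decimal` rather than floating point keeps the two-decimal representation exact, which is a design choice of this sketch rather than something the text prescribes.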
Illustratively, performing feature vectorization processing on the intermediate historical customer information to obtain the historical customer feature vector may be, but is not limited to, performing feature extraction on each item of attribute information in the intermediate historical customer information to obtain the vector element corresponding to that attribute information, and then splicing and integrating the vector elements to obtain the corresponding historical customer feature vector. For attribute information whose attribute value is a number, the attribute value may be used directly as the corresponding vector element, or may be normalized to obtain the corresponding vector element. For attribute information whose attribute value is not a number (for example, text or characters), the attribute information may be digitized to obtain the corresponding vector element, or the digitized result may be further normalized to obtain the corresponding vector element; the digitization may be implemented by, but is not limited to, a numerical encoding algorithm (for example, one-hot encoding) or reference to the ASCII code table. For attribute information whose attribute value is a category (for example, a liability flag, presence or absence of housing, whether the customer is a gray list customer, gender, etc.), all categories that the attribute information can take may be encoded to obtain a number for each category, and the attribute value may be replaced with the corresponding number; for example, if the attribute information is gender and the possible categories are male and female, male may be represented by the number 0 and female by the number 1. It should be noted that the specific implementation of performing feature vectorization processing on the intermediate historical customer information to obtain the historical customer feature vector may be determined by a person skilled in the art according to the actual situation; the foregoing description is only an example and does not limit the present invention.
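The vectorization described above might look like the following sketch, which encodes category attributes as numbers and optionally normalizes numeric attributes. The code tables, the value range used for min-max normalization, and the function name `vectorize` are all assumptions made for illustration.

```python
# Hypothetical category codes (gender follows the 0/1 example in the text).
CATEGORY_CODES = {
    "gender": {"male": 0, "female": 1},
    "has_housing": {"no": 0, "yes": 1},
}
# Hypothetical (min, max) ranges for attributes that should be normalized.
NUMERIC_RANGES = {"total_asset_balance": (0.0, 1_000_000.0)}


def vectorize(record: dict, fields: list[str]) -> list[float]:
    """Turn one customer record into a feature vector, one element per field."""
    vector = []
    for field in fields:
        value = record[field]
        if field in CATEGORY_CODES:
            # Category attribute: replace the category with its number.
            vector.append(float(CATEGORY_CODES[field][value]))
        elif field in NUMERIC_RANGES:
            # Numeric attribute with a known range: min-max normalization.
            low, high = NUMERIC_RANGES[field]
            vector.append((float(value) - low) / (high - low))
        else:
            # Numeric attribute used directly as the vector element.
            vector.append(float(value))
    return vector
```

The same field order must be used for every customer so that element positions stay comparable across vectors.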
Illustratively, the element types of the vector elements correspond to relevant attributes of the customer, including but not limited to gender (the individual's gender if the customer is an individual; the gender of the president or CEO if the customer is a company, etc.), age group, presence or absence of children, presence or absence of housing, marital status, whether in debt, whether a gray list customer, credit card overdraft preference, whether a cross-default customer, total asset balance, credit card overdraft balance, insurance balance, debit card consumption amount, total property price, financing amount, and the like. It should be noted that the element types of the vector elements can be determined by those skilled in the art according to the actual situation; the above description is only an example and is not limiting.
Illustratively, the acquisition and processing of the historical customer information may be implemented by, but is not limited to, a corresponding big data platform, such as, but not limited to, a Hadoop big data platform.
Illustratively, splitting the historical customer feature vector according to the property of the element type of each vector element, to obtain the corresponding historical customer qualitative feature vector and historical customer quantitative feature vector, may be, but is not limited to: identifying all vector elements in the historical customer feature vector whose element type is qualitative and all vector elements whose element type is quantitative; splicing the qualitative vector elements into the corresponding historical customer qualitative feature vector; and splicing the quantitative vector elements into the corresponding historical customer quantitative feature vector. An element type of qualitative property may be, but is not limited to, an element type whose possible element values form a limited range (such element types generally, but not necessarily, correspond to categorical attribute information); an element type of quantitative property may be, but is not limited to, an element type whose possible element values are not limited in range (such element types generally, but not necessarily, correspond to attribute information with specific numerical values). For example, element types of qualitative property include, but are not limited to, gender, age group, presence or absence of children, presence or absence of housing, marital status, whether in debt, whether a gray list customer, credit card overdraft preference, whether a cross-default customer, and the like. Element types of quantitative property include, but are not limited to, total asset balance, credit card overdraft balance, insurance balance, debit card consumption amount, total property price, financing amount, and the like.
For another example, if there is a historical customer feature vector (gender A, age group B, total asset balance C, insurance balance D), then splitting yields the historical customer qualitative feature vector (gender A, age group B) and the historical customer quantitative feature vector (total asset balance C, insurance balance D). Each historical customer feature vector corresponds to one historical customer qualitative feature vector and one historical customer quantitative feature vector. It should be noted that the specific implementation of splitting the historical customer feature vector according to the property of the element type of the vector elements may be determined by those skilled in the art according to the actual situation; the foregoing description is only an example and does not limit the present invention.
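The split in the (gender, age group, total asset balance, insurance balance) example can be sketched as follows. The set `QUALITATIVE_TYPES`, the key names, and the dictionary representation of a feature vector are assumptions for illustration.

```python
# Hypothetical set of element types treated as qualitative.
QUALITATIVE_TYPES = {"gender", "age_group"}


def split_feature_vector(feature_vector: dict) -> tuple[dict, dict]:
    """Split one feature vector into its qualitative and quantitative parts."""
    qualitative = {k: v for k, v in feature_vector.items() if k in QUALITATIVE_TYPES}
    quantitative = {k: v for k, v in feature_vector.items() if k not in QUALITATIVE_TYPES}
    return qualitative, quantitative


# Mirrors the example: (gender A, age group B, total asset balance C, insurance balance D).
qualitative, quantitative = split_feature_vector(
    {"gender": 0, "age_group": 3, "total_asset_balance": 52000.0, "insurance_balance": 1200.0}
)
```

Here `qualitative` holds the gender and age group elements and `quantitative` holds the two balances, so each input vector yields exactly one vector of each kind.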
Through the above steps, after error correction and simplification, the historical customer information can be quickly and accurately converted into a vector form that is convenient for computation and processing, and split into a quantitative feature vector and a qualitative feature vector. This simplifies the related computation and processing in subsequent steps and effectively increases the speed of the overall credit card repayment reminding. It also allows subsequent steps to build the decision tree based on the historical customer qualitative feature vectors and to determine the final power coefficients and final multiple coefficients based on the historical customer quantitative feature vectors, which lays a foundation for determining the overdue type of the current customer from the qualitative features and from the quantitative features respectively, makes it feasible to determine the future overdue rate comprehensively from multiple overdue types, and further improves the accuracy of the future overdue rate.
In an optional embodiment, further comprising:
before determining a first overdue type of a corresponding current client according to a preset decision tree and a current client qualitative characteristic vector,
performing feature vectorization processing on preset current client information to obtain a current client feature vector;
and splitting the current customer characteristic vector according to the property of the element type of the vector element to respectively obtain the corresponding current customer qualitative characteristic vector and the current customer quantitative characteristic vector.
Illustratively, for the specific implementation of performing feature vectorization processing on the preset current customer information to obtain the current customer feature vector, reference may be made to the description above of the step of performing feature vectorization processing on the intermediate historical customer information to obtain the historical customer feature vector, which is not repeated here. The element types of the vector elements of the current customer feature vector must be consistent with those of the historical customer feature vector.
Illustratively, for the specific implementation of splitting the current customer feature vector according to the property of the element type of the vector elements to obtain the corresponding current customer qualitative feature vector and current customer quantitative feature vector, reference may be made to the description above of the step of splitting the historical customer feature vector to obtain the corresponding historical customer qualitative feature vector and historical customer quantitative feature vector, which is not repeated here.
Through the above steps, the current customer information can be quickly and accurately converted into a vector form that is convenient for computation and processing, and split into a quantitative feature vector and a qualitative feature vector. This simplifies the related computation and processing in subsequent steps and effectively increases the speed of the overall credit card repayment reminding. It also lays a foundation for subsequent steps to determine the overdue type of the current customer from the qualitative features and from the quantitative features respectively, makes it feasible to determine the future overdue rate comprehensively from multiple overdue types, and further improves the accuracy of the future overdue rate.
In a preferred embodiment, before performing the feature vectorization processing on the preset current client information, data cleaning, data extraction and data standardization processing are performed on the current client information to perform error correction and simplification on the current client information, so as to improve the accuracy of the generated current client feature vector and make it more convenient to participate in the operation and processing.
In an optional embodiment, further comprising:
before determining a first overdue type of a corresponding current client according to a preset decision tree and a current client qualitative characteristic vector,
and constructing a decision tree based on a plurality of historical customer qualitative feature vectors, element values of the historical customer qualitative feature vectors and corresponding historical overdue labels.
Illustratively, the element values of the historical customer qualitative feature vectors include an element value corresponding to each vector element of each historical customer qualitative feature vector.
Through the above steps, the decision tree can be constructed by taking as input the feature vector samples required for its construction together with the corresponding decision category labels, so that the constructed decision tree is complete. Processing stalls caused by a structurally incomplete tree can thus be avoided when the decision tree is used, which helps the subsequent step of determining the first overdue type of the current customer based on the decision tree to run smoothly.
In an optional embodiment, the constructing a decision tree based on a plurality of the historical customer qualitative feature vectors, element values of the historical customer qualitative feature vectors, and corresponding historical overdue labels includes:
obtaining a complete information entropy according to all the historical client qualitative feature vectors and corresponding historical overdue labels; wherein the value of the history overdue label is a non-overdue label, a first-stage overdue label, a second-stage overdue label, a third-stage overdue label or a fourth-stage overdue label;
obtaining a root conditional entropy of each element type of the vector elements according to the history overdue labels corresponding to all the history client qualitative feature vectors and the element values of the vector elements;
obtaining a root information gain entropy corresponding to the element type according to the complete information entropy and the root conditional entropy, and establishing a root node of a decision tree by taking the element type with the maximum root information gain entropy as a root node attribute; respectively establishing child nodes corresponding to each element value based on each element value which the root node attribute can adopt;
repeatedly executing the step of establishing the child nodes until the child nodes cannot be established so as to complete the construction of the decision tree, wherein the step of establishing the child nodes comprises the following steps:
determining a plurality of historical customer qualitative characteristic vectors with vector elements corresponding to the sub-element values as the sub-vectors of the sub-nodes according to the sub-element values corresponding to each sub-node;
respectively judging whether the historical overdue labels corresponding to the sub-vectors of each child node are all the same; if so, taking the child node as a leaf node, and determining, according to the leaf element value corresponding to each leaf node, the historical customer qualitative feature vectors whose vector elements take the leaf element value as the leaf vectors of that leaf node;
if not, obtaining the sub-conditional entropy of each element type in the child node according to the historical overdue labels corresponding to all the sub-vectors and the element values of the vector elements;
obtaining the sub information gain entropy corresponding to each element type according to the complete information entropy and the sub-conditional entropy, and taking the element type with the maximum sub information gain entropy as the child node attribute; and respectively establishing child nodes of the next layer below the child node based on each element value that the child node attribute can take.
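The construction steps above can be sketched as a small recursive routine. This is an illustrative sketch under stated assumptions, not the patented implementation: feature vectors are represented as dictionaries, entropies use log base 2, child nodes are created only for element values actually observed in the data, an element type is not reused below the node that splits on it, and a majority label is used as a fallback when no element type remains. Following the text, the gain at every node is computed as the complete information entropy minus the (sub-)conditional entropy.

```python
import math
from collections import Counter


def entropy(labels):
    """Information entropy of a list of historical overdue labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def conditional_entropy(vectors, labels, element_type):
    """Conditional entropy of the labels given one element type."""
    groups = {}
    for vector, label in zip(vectors, labels):
        groups.setdefault(vector[element_type], []).append(label)
    return sum(len(g) / len(labels) * entropy(g) for g in groups.values())


def build_tree(vectors, labels, element_types, complete_entropy=None):
    """Recursively build the decision tree as nested dictionaries."""
    if complete_entropy is None:
        complete_entropy = entropy(labels)  # computed once over all vectors
    if len(set(labels)) == 1:
        return {"leaf": labels[0]}  # all labels identical: leaf node
    if not element_types:
        # Assumption: fall back to the majority label when nothing is left to split on.
        return {"leaf": Counter(labels).most_common(1)[0][0]}
    # Element type with the maximum information gain entropy becomes the node attribute.
    best = max(
        element_types,
        key=lambda t: complete_entropy - conditional_entropy(vectors, labels, t),
    )
    remaining = [t for t in element_types if t != best]
    children = {}
    for value in sorted({v[best] for v in vectors}):
        pairs = [(v, l) for v, l in zip(vectors, labels) if v[best] == value]
        children[value] = build_tree(
            [v for v, _ in pairs], [l for _, l in pairs], remaining, complete_entropy
        )
    return {"attribute": best, "children": children}
```

For example, `build_tree(vectors, labels, ["gray_list", "has_housing"])` returns a dictionary whose `"attribute"` key names the root node attribute and whose `"children"` key maps each element value to a subtree or leaf; the key names are hypothetical.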
Illustratively, the first-stage overdue label corresponds to the customer being in credit card overdue state N1, the second-stage overdue label corresponds to the customer being in credit card overdue state N2, the third-stage overdue label corresponds to the customer being in credit card overdue state N3, and the fourth-stage overdue label corresponds to the customer being in credit card overdue state N3+. It should be noted that the specific meaning of the values of the historical overdue label can be determined by those skilled in the art according to the actual situation; the above description is only an example and is not limiting.
Illustratively, obtaining the root information gain entropy corresponding to an element type according to the complete information entropy and the root conditional entropy means subtracting the root conditional entropy of that element type from the complete information entropy.
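Written out with the standard definitions, the three quantities relate as follows. The symbols are introduced here for illustration and are not the patent's own notation: $D$ is the set of all historical customer qualitative feature vectors, $D_k$ the subset carrying the $k$-th of the five historical overdue label values, $A$ an element type, and $D^{A=v}$ the subset whose element of type $A$ takes the value $v$; log base 2 is an assumption.

```latex
H(D) = -\sum_{k=1}^{5} \frac{|D_k|}{|D|} \log_2 \frac{|D_k|}{|D|}
\qquad
H(D \mid A) = \sum_{v} \frac{|D^{A=v}|}{|D|} \, H\!\left(D^{A=v}\right)
\qquad
G(D, A) = H(D) - H(D \mid A)
```

Here $H(D)$ plays the role of the complete information entropy, $H(D \mid A)$ the root conditional entropy of element type $A$, and $G(D, A)$ the root information gain entropy; terms with $|D_k| = 0$ are taken as zero.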
Illustratively, establishing the root node of the decision tree based on the root node attribute is a conventional technical means in the art and is not described in detail here. For example, as shown in FIG. 2, if the element type with the largest root information gain entropy is "whether it is a gray list customer", then the root node of the decision tree, node 1, is established with "whether it is a gray list customer" as the root node attribute.
Illustratively, respectively establishing the child node corresponding to each element value, based on each element value that the root node attribute can take, may be as follows:
As shown in FIG. 2, node 1 is the root node and its root node attribute is "whether it is a gray list customer". This element type is known to take three possible element values, 0, 1 and 2 (0 corresponds to being a gray list customer, 1 corresponds to not being a gray list customer, and 2 corresponds to having previously been a gray list customer), so three paths extend from the root node. A child node is placed at the end of each path away from the root node, and each child node corresponds to a different element value (sub-element value); for example, the sub-element value corresponding to node 2 is 0, that corresponding to node 3 is 1, and that corresponding to node 4 is 2. It should be noted that the specific implementation of respectively establishing the child node corresponding to each element value, based on each element value that the root node attribute can take, may be determined by a person skilled in the art according to the actual situation; the foregoing description is only an example and is not limiting.
Illustratively, determining, according to the sub-element value corresponding to each child node, the historical customer qualitative feature vectors whose vector elements take the sub-element value as the sub-vectors of that child node may be as follows:
For node 2, the corresponding sub-element value is 0. If, among all the historical customer qualitative feature vectors, those whose vector element of element type "whether it is a gray list customer" takes the element value 0 are vectors a, b and c, then a, b and c are determined as the sub-vectors of node 2.
For node 3, the corresponding sub-element value is 1. If, among all the historical customer qualitative feature vectors, those whose vector element of element type "whether it is a gray list customer" takes the element value 1 are vectors d, e and f, then d, e and f are determined as the sub-vectors of node 3.
For node 4, the corresponding sub-element value is 2. If, among all the historical customer qualitative feature vectors, those whose vector element of element type "whether it is a gray list customer" takes the element value 2 are vectors g, h, i and j, then g, h, i and j are determined as the sub-vectors of node 4.
It should be noted that, for a specific implementation manner of determining, according to a sub-element value corresponding to each sub-node, that a plurality of historical customer qualitative feature vectors having vector elements corresponding to the sub-element value are sub-vectors of the sub-node, the specific implementation manner may be determined by a person skilled in the art according to an actual situation, and the foregoing description is only an example, and does not limit this.
Illustratively, respectively judging whether the historical overdue labels corresponding to the sub-vectors of each child node are the same, taking the child node as a leaf node if so, and determining, according to the leaf element value corresponding to each leaf node, the historical customer qualitative feature vectors whose vector elements take the leaf element value as the leaf vectors of the leaf node may be as follows:
The sub-vectors of node 2 are a, b and c, and the historical overdue labels corresponding to a, b and c are all second-stage overdue labels. Since the historical overdue labels of all sub-vectors of node 2 are the same, node 2 is taken as a leaf node, its leaf element value is 0, and its leaf vectors are accordingly a, b and c.
It should be noted that, for respectively judging whether the historical overdue labels corresponding to the sub-vectors of each child node are the same, if yes, the child nodes are taken as leaf nodes; the specific implementation manner of determining, according to the leaf element value corresponding to each leaf node, that the plurality of historical customer qualitative feature vectors having the vector element corresponding to the leaf element value are leaf vectors of the leaf nodes may be determined by those skilled in the art according to actual situations, and the foregoing description is only an example, and does not limit this.
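The child-node split and the leaf test described above can be sketched as follows. This is only an illustrative sketch: the toy vectors, the attribute names `gray_list` and `debt`, the label strings, and the helper names `split_by_attribute` and `is_leaf` are all hypothetical and do not come from the embodiment.

```python
from collections import defaultdict

# Hypothetical toy data: vector name -> (element values by element type, historical overdue label).
# Names, attributes and labels are illustrative assumptions only.
history = {
    "a": ({"gray_list": 0, "debt": 10}, "second-stage overdue"),
    "b": ({"gray_list": 0, "debt": 11}, "second-stage overdue"),
    "c": ({"gray_list": 0, "debt": 10}, "second-stage overdue"),
    "d": ({"gray_list": 1, "debt": 10}, "first-stage overdue"),
    "e": ({"gray_list": 1, "debt": 11}, "non-overdue"),
}

def split_by_attribute(vectors, attribute):
    """Group vector names by the element value they take for `attribute`."""
    children = defaultdict(list)
    for name, (elements, _label) in vectors.items():
        children[elements[attribute]].append(name)
    return dict(children)

def is_leaf(vectors, names):
    """A child node is a leaf when all of its sub-vectors share one overdue label."""
    return len({vectors[name][1] for name in names}) == 1

# One child node per element value of the chosen attribute.
children = split_by_attribute(history, "gray_list")
# Decide for every child node whether it can already be taken as a leaf node.
leaf_flags = {value: is_leaf(history, names) for value, names in children.items()}
```

In this toy data the child node for element value 0 holds only vectors with the same label and so becomes a leaf, while the child node for element value 1 would have to be split further.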
Illustratively, the sub-conditional entropy of each element type in a child node is obtained according to the historical overdue labels and the element values of the vector elements corresponding to all the sub-vectors. That is, the conditional entropy is computed over the sub-vectors of the current node only (for example, for node 3, only the sub-vectors determined for node 3), not over all the historical customer qualitative feature vectors, so the sub-conditional entropy is generally not equal to the root conditional entropy. The principle of solving the sub-conditional entropy is the same as that of solving the root conditional entropy: both are calculated with the standard conditional entropy method and formula.
Illustratively, for the specific implementation of obtaining the sub information gain entropy corresponding to each element type according to the complete information entropy and the sub-conditional entropy, taking the element type with the maximum sub information gain entropy as the attribute of the child node, and establishing a next-layer child node for each element value that the child node attribute can take, reference may be made to the description in the embodiments of the present invention of the steps of obtaining the root information gain entropy corresponding to each element type according to the complete information entropy and the root conditional entropy, taking the element type with the maximum root information gain entropy as the root node attribute to establish the root node of the decision tree, and establishing a child node for each element value that the root node attribute can take; details are not repeated here.
For example, as shown in fig. 2, if the child node attribute of node 3 is "debt" and its possible element values are 10 and 11 (10 for having debt and 11 for not having debt), node 5 corresponding to element value 10 and node 6 corresponding to element value 11 are established respectively. Similarly, if the child node attribute of node 4 is "whether the customer owns a house" and its possible element values are 20 and 21 (20 for owning a house and 21 for not owning a house), node 7 corresponding to element value 20 and node 8 corresponding to element value 21 are established respectively. If the values corresponding to the historical customer qualitative feature vectors contained in each of nodes 5, 6, 7 and 8 are consistent (that is, under the leaf criterion described above, their historical overdue labels are the same), nodes 5, 6, 7 and 8 are determined to be leaf nodes.
It should be noted that, the specific implementation manner of each step for constructing the decision tree can be determined by those skilled in the art according to practical situations, and the above description is only an example, and is not limited thereto.
Through the above steps, the decision tree can be constructed by the standard method for building an ID3 decision tree. Because the structure and parameters of an ID3 decision tree are not complex, processing and calculation are fast when the ID3 decision tree is used to determine the first overdue type of the current customer, which speeds up the overall repayment reminding. In addition, the steps required to construct the ID3 decision tree are not complicated, so the decision tree is built quickly, which indirectly speeds up the overall repayment reminding as well.
In an optional embodiment, the obtaining a complete information entropy according to all the historical qualitative feature vectors of the customer and the corresponding historical overdue labels includes:
obtaining a first number of the historical customer qualitative feature vectors with the historical overdue labels being non-overdue labels, a second number of the historical customer qualitative feature vectors with the historical overdue labels being first-stage overdue labels, a third number of the historical customer qualitative feature vectors with the historical overdue labels being second-stage overdue labels, a fourth number of the historical customer qualitative feature vectors with the historical overdue labels being third-stage overdue labels, and a fifth number of the historical customer qualitative feature vectors with the historical overdue labels being fourth-stage overdue labels according to all the historical customer qualitative feature vectors and corresponding historical overdue labels;
obtaining a non-overdue rate based on the first number and the total number of all historical customer qualitative feature vectors;
obtaining a first-stage overdue rate based on the second number and the total number of all historical customer qualitative feature vectors;
obtaining a second-stage overdue rate based on the third number and the total number of all historical customer qualitative feature vectors;
obtaining a third-stage overdue rate based on the fourth number and the total number of all historical customer qualitative feature vectors;
obtaining a fourth-stage overdue rate based on the fifth number and the total number of all historical customer qualitative feature vectors;
and obtaining the complete information entropy based on the non-overdue rate, the first-stage overdue rate, the second-stage overdue rate, the third-stage overdue rate and the fourth-stage overdue rate.
Illustratively, obtaining the non-overdue rate based on the first number and the total number of all historical customer qualitative feature vectors means dividing the first number by the total number.
Illustratively, obtaining the first-stage overdue rate based on the second number and the total number of all historical customer qualitative feature vectors means dividing the second number by the total number.
Illustratively, obtaining the second-stage overdue rate based on the third number and the total number of all historical customer qualitative feature vectors means dividing the third number by the total number.
Illustratively, obtaining the third-stage overdue rate based on the fourth number and the total number of all historical customer qualitative feature vectors means dividing the fourth number by the total number.
Illustratively, obtaining the fourth-stage overdue rate based on the fifth number and the total number of all historical customer qualitative feature vectors means dividing the fifth number by the total number.
Illustratively, the complete information entropy obtained based on the non-overdue rate, the first-stage overdue rate, the second-stage overdue rate, the third-stage overdue rate, and the fourth-stage overdue rate may be represented by the following formula:
$$H(p) = -\sum_{i=1}^{5} p_i \log p_i$$
where H(p) represents the complete information entropy and p_i represents a rate corresponding to the historical customer qualitative feature vectors: the non-overdue rate when i = 1, the first-stage overdue rate when i = 2, the second-stage overdue rate when i = 3, the third-stage overdue rate when i = 4, and the fourth-stage overdue rate when i = 5. The specific value of the historical overdue label of each historical customer qualitative feature vector can be set manually by the relevant staff according to the relevant historical conditions. It should be noted that the specific implementation of obtaining the complete information entropy based on the non-overdue rate, the first-stage overdue rate, the second-stage overdue rate, the third-stage overdue rate, and the fourth-stage overdue rate may be determined by those skilled in the art according to the actual situation; the above description is only an example and is not limiting.
Through the steps, the complete information entropy can be determined by a standard information entropy calculation method, so that the accuracy of the obtained complete information entropy is improved, the accuracy of the decision tree constructed in the subsequent steps and the accuracy of relevant processing based on the decision tree are improved, and the accuracy of the whole repayment reminding is improved.
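As an illustration of the entropy step, the following minimal sketch counts the historical overdue labels, turns the counts into rates, and sums the entropy terms. The function name `complete_information_entropy`, the label strings, the sample counts, and the choice of a base-2 logarithm are assumptions made for the example; the embodiment only refers to a standard information entropy calculation.

```python
import math
from collections import Counter

def complete_information_entropy(overdue_labels):
    """H(p) = -sum_i p_i * log2(p_i) over the overdue labels that actually occur."""
    total = len(overdue_labels)
    # Each rate is the number of vectors with a given label divided by the total number.
    rates = [count / total for count in Counter(overdue_labels).values()]
    # Labels that never occur are simply absent, matching the 0 * log 0 = 0 convention.
    return -sum(p * math.log2(p) for p in rates)

# Hypothetical label list: 4 non-overdue, 2 first-stage, 2 second-stage vectors.
labels = ["non-overdue"] * 4 + ["first-stage"] * 2 + ["second-stage"] * 2
h = complete_information_entropy(labels)
```

With rates 0.5, 0.25 and 0.25, the three terms are 0.5, 0.5 and 0.5 bits, so `h` is 1.5 under the base-2 assumption.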
In an optional embodiment, the obtaining a root conditional entropy of each element type of a vector element according to the history overdue labels and the element values of the vector elements corresponding to all the history client qualitative feature vectors includes:
respectively obtaining, according to the element values of the vector elements, the division number of the vector elements taking each different element value in the element type;
obtaining a partition rate according to the division number and the total number of all historical customer qualitative feature vectors;
respectively taking the corresponding historical client qualitative feature vectors when different element values are taken from the element types as corresponding division vectors, and obtaining division information entropies corresponding to the different element values based on the corresponding division vectors when the different element values are taken from the element types and the historical overdue labels corresponding to the division vectors;
and obtaining the root conditional entropy of the element type based on the partition rate and the partition information entropy corresponding to different element values of the element type.
For example, the division number of the vector elements taking different element values in an element type is, for that element type, the number of historical customer qualitative feature vectors taking each element value. For example, if the element type is "whether the customer is a gray list customer", and there are 2000 historical customer qualitative feature vectors with an element value of 0, 3000 with an element value of 1, and 5000 with an element value of 2, then for this element type the division number of element value 0 is 2000, that of element value 1 is 3000, and that of element value 2 is 5000.
Illustratively, the partition rate is obtained according to the division number and the total number of all historical customer qualitative feature vectors, specifically by dividing the division number by the total number. For example, for the element type "whether the customer is a gray list customer", if the division number of element value 0 is 2000 and the total number is 10000, the partition rate is 2000/10000 = 20%.
For example, for the specific principle of obtaining the division information entropies corresponding to the different element values based on the division vectors corresponding to those element values and the historical overdue labels corresponding to the division vectors, reference may be made to the description of the step of obtaining the complete information entropy in the embodiments of the present invention, which is not repeated here. The only difference is the range used: the division information entropy is computed over all the division vectors corresponding to a certain element value, whereas the complete information entropy is computed over all the historical customer qualitative feature vectors.
For example, the root conditional entropy of the element type obtained based on the partition rate and the division information entropy corresponding to the different element values that the element type can take may be expressed by the following equation:
$$H(Y \mid X) = \sum_{i=1}^{m} p_i \, H(Y \mid X = x_i)$$
where H(Y|X) represents the root conditional entropy corresponding to the element type X; p_i represents the partition rate of the element type when it takes the element value x_i; and m represents the number of element values the element type can take. For example, if the element type is "whether the customer is a gray list customer" and its possible element values are 0, 1 and 2, then m is 3. H(Y|X = x_i) represents the division information entropy of the element type X when it takes the element value x_i. Each element type corresponds to one root conditional entropy. It should be noted that the specific implementation of obtaining the root conditional entropy of the element type based on the partition rate and the division information entropy corresponding to the different element values that the element type can take may be determined by those skilled in the art according to the actual situation; the foregoing description is only an example and is not limiting.
Through the steps, the root conditional entropy of each element type can be obtained by a standard conditional entropy obtaining method, so that the obtained root conditional entropy has higher accuracy, the accuracy of relevant processing of the decision tree constructed in the subsequent steps and based on the decision tree is improved, and the accuracy of the whole repayment reminding is improved.
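The root conditional entropy and the selection of the root node attribute by maximum information gain can be sketched as below. This is an illustrative sketch under stated assumptions: the function names, the two element types `gray_list` and `gender`, the four toy vectors, and the base-2 logarithm are hypothetical choices, not values taken from the embodiment.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Information entropy of a list of overdue labels (base-2 logarithm assumed)."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def root_conditional_entropy(values, labels):
    """H(Y|X) = sum_i p_i * H(Y|X = x_i), with p_i the partition rate of value x_i."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[value].append(label)  # division vectors, represented by their labels
    total = len(labels)
    return sum(len(group) / total * entropy(group) for group in groups.values())

def information_gain(values, labels):
    """Complete information entropy minus the conditional entropy of one element type."""
    return entropy(labels) - root_conditional_entropy(values, labels)

# Hypothetical element values of four historical vectors for two element types.
columns = {
    "gray_list": [0, 0, 1, 1],
    "gender": [0, 1, 0, 1],
}
labels = ["non-overdue", "non-overdue", "first-stage", "first-stage"]

gains = {name: information_gain(values, labels) for name, values in columns.items()}
# The element type with the largest gain becomes the root node attribute.
root_attribute = max(gains, key=gains.get)
```

Here `gray_list` separates the labels perfectly, so its conditional entropy is 0 and it is chosen as the root node attribute, while `gender` leaves the labels fully mixed and yields no gain. The same functions can be applied to the sub-vectors of a child node to obtain a sub-conditional entropy.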
In an optional embodiment, the obtaining the sub-conditional entropy of each element type in the child node according to the history overdue labels and the element values of the vector elements corresponding to all the child vectors includes:
obtaining the sub-division number of vector elements with different element values in the element types of the sub-vectors respectively according to the element values of the vector elements of the sub-vectors;
obtaining a sub-division rate according to the sub-division number and the sub-vector number of the sub-vectors;
respectively taking the corresponding sub-vectors when different element values are taken from the element types as corresponding sub-division vectors, and obtaining sub-division information entropies corresponding to the different element values based on the corresponding sub-division vectors when the different element values are taken from the element types and the historical overdue labels corresponding to the sub-division vectors;
and obtaining the sub-condition entropy of the element type based on the sub-division rate and the sub-division information entropy corresponding to different element values of the element type.
For example, for the specific principle of obtaining, according to the element values of the vector elements of the sub-vectors, the sub-division number of the vector elements taking each different element value in the element types of the sub-vectors, reference may be made to the description in the embodiments of the present invention of the step of obtaining the division number of the vector elements taking each different element value in the element type; details are not repeated here.
For example, for the specific principle of obtaining the sub-division rate according to the sub-division number and the number of sub-vectors, reference may be made to the description in the embodiments of the present invention of obtaining the partition rate according to the division number and the total number of all historical customer qualitative feature vectors; details are not repeated here.
For example, for the specific principle of taking the sub-vectors corresponding to each different element value of the element type as the corresponding sub-division vectors, and obtaining the sub-division information entropies corresponding to the different element values based on those sub-division vectors and their historical overdue labels, reference may be made to the description in the embodiments of the present invention of the step of taking the historical customer qualitative feature vectors corresponding to each different element value of the element type as the corresponding division vectors and obtaining the division information entropies based on those division vectors and their historical overdue labels; details are not repeated here.
For example, for the specific principle of obtaining the sub-conditional entropy of the element type based on the sub-division rate and the sub-division information entropy corresponding to the different element values that the element type can take, reference may be made to the description in the embodiments of the present invention of the step of obtaining the root conditional entropy of the element type based on the partition rate and the division information entropy corresponding to the different element values that the element type can take; details are not repeated here.
Through the above steps, the sub-conditional entropy of each element type in a specific child node can be obtained by the standard method for solving conditional entropy, so that the obtained sub-conditional entropy is more accurate. This improves the accuracy of the decision tree constructed in the subsequent steps and of the processing based on that decision tree, and thereby the accuracy of the overall repayment reminding.
In an alternative embodiment, as shown in fig. 3, the determining a first overdue type of the corresponding current client according to the preset decision tree and the qualitative feature vector of the current client includes the following steps:
S301: determining a corresponding path in the decision tree according to the element values of the vector elements of the current customer qualitative feature vector.
S302: obtaining the corresponding leaf node according to the path.
S303: determining a historical overdue type corresponding to the leaf node according to the historical overdue label corresponding to the leaf vectors of the leaf node, and determining the corresponding historical overdue type as the first overdue type.
Illustratively, determining a corresponding path in the decision tree according to the element values of the vector elements of the current customer qualitative feature vector may proceed as follows:
Suppose the current customer qualitative feature vector A is (0, 6, 2), where 0 is the element value of the vector element whose element type is gender, 6 is the element value of the vector element whose element type is age group, and 2 is the element value of the vector element whose element type is "whether the customer is a gray list customer". The element values of vector A are input into the decision tree. If the root node attribute of the decision tree is "whether the customer is a gray list customer", the traversal advances from the root node to a first child node along the branch with element value 2; if the child node attribute of the first child node is "gender", it advances to a second child node along the branch with element value 0; if the child node attribute of the second child node is "age group", it advances to a third child node along the branch with element value 6. If the third child node is a leaf node, the corresponding path is: root node -> first child node -> second child node -> leaf node.
It should be noted that determining the corresponding path in the decision tree according to the element values of the vector elements of the current customer qualitative feature vector is a basic way of classifying with an existing decision tree; its specific content may be determined by those skilled in the art according to the actual situation, and the above description is only an example and is not limiting.
For example, determining the historical overdue type corresponding to the leaf node according to the historical overdue label corresponding to the leaf vectors of the leaf node, and determining that historical overdue type as the first overdue type, may be, but is not limited to, the following: the leaf vectors are determined from the leaf node; the historical overdue label of the leaf vectors (within one leaf node, the historical overdue labels of all leaf vectors have the same value) is taken as the historical overdue label of the leaf node; and the historical overdue type corresponding to that label is then determined from a preset correspondence between the values of the historical overdue label and the historical overdue types. For example, this correspondence may be expressed as, but is not limited to, the following:
The non-overdue label corresponds to an extremely low overdue probability, the first-stage overdue label to a lower overdue probability, the second-stage overdue label to a medium overdue probability, the third-stage overdue label to a higher overdue probability, and the fourth-stage overdue label to an extremely high overdue probability.
It should be noted that, for a specific implementation manner that the history overdue type corresponding to the leaf node is determined according to the history overdue tag corresponding to the leaf vector of the leaf node, and the corresponding history overdue type is determined as the first overdue type, which can be determined by a person skilled in the art according to an actual situation, the above description is only an example, and does not limit this.
Through the above steps, the overdue type of the current customer corresponding to the current customer qualitative feature vector can be predicted with the decision tree by a standard method. The prediction is fast and accurate, which improves the speed and accuracy of the overall repayment reminding.
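Steps S301 to S303 can be illustrated with a small traversal over a nested-dictionary tree. The tree layout, the key names `attribute`, `children` and `label`, the function name `classify`, and the leaf label assigned in this toy tree are all hypothetical; only the attribute order (gray list, gender, age group), the vector (0, 6, 2) and the label-to-type correspondence follow the example in the text.

```python
# Correspondence between historical overdue labels and overdue types, as in the example above.
OVERDUE_TYPE_BY_LABEL = {
    "non-overdue": "extremely low overdue probability",
    "first-stage overdue": "lower overdue probability",
    "second-stage overdue": "medium overdue probability",
    "third-stage overdue": "higher overdue probability",
    "fourth-stage overdue": "extremely high overdue probability",
}

# Hypothetical decision tree; only the branch used by the example is filled in.
tree = {
    "attribute": "gray_list",
    "children": {
        2: {
            "attribute": "gender",
            "children": {
                0: {
                    "attribute": "age_group",
                    "children": {6: {"label": "third-stage overdue"}},  # assumed leaf label
                },
            },
        },
    },
}

def classify(node, vector):
    """Follow the branch matching each element value until a leaf node is reached."""
    path = []
    while "label" not in node:
        attribute = node["attribute"]
        value = vector[attribute]
        path.append((attribute, value))     # S301: record the path taken
        node = node["children"][value]      # S302: descend toward the leaf node
    return path, OVERDUE_TYPE_BY_LABEL[node["label"]]  # S303: label -> overdue type

# Current customer qualitative feature vector A = (0, 6, 2), keyed by element type.
current_customer = {"gender": 0, "age_group": 6, "gray_list": 2}
path, first_overdue_type = classify(tree, current_customer)
```

A production tree would also need a policy for element values that have no matching branch; this sketch simply raises `KeyError` in that case.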
In an optional embodiment, the constructing a corresponding fitness function according to the preset historical quantitative feature vector of the customer includes:
respectively setting a corresponding power coefficient variable and a corresponding multiple coefficient variable for each vector element in each historical customer quantitative feature vector;
obtaining a sub-fitness parameter corresponding to each vector element based on the power coefficient variable, the multiple coefficient variable and the element value of the corresponding vector element;
and constructing a fitness function of the corresponding historical customer quantitative feature vector according to the sub-fitness parameter corresponding to each vector element.
For example, obtaining the sub-fitness parameter corresponding to each vector element based on the power coefficient variable, the multiple coefficient variable and the element value of the corresponding vector element may be, but is not limited to, the following: the power coefficient variable is used as the exponent of the element value of the corresponding vector element to obtain a sub-fitness constituting parameter, and the sub-fitness constituting parameter is multiplied by the multiple coefficient variable to obtain the sub-fitness parameter corresponding to the vector element. For example, a certain historical customer quantitative feature vector B is (1000000, 3000000, 200000), where 1000000 is the element value of the vector element whose element type is total asset balance (which may indicate, but is not limited to, a total asset balance of 1000000 yuan for the customer), 3000000 is the element value of the vector element whose element type is insurance balance (which may indicate, but is not limited to, an insurance balance of 3000000 yuan), and 200000 is the element value of the vector element whose element type is financial balance (which may indicate, but is not limited to, a financial balance of 200000 yuan). For the total asset balance vector element of vector B, the power coefficient variable is denoted k_1 and the multiple coefficient variable e_1. For the insurance balance vector element, the power coefficient variable is denoted k_2 and the multiple coefficient variable e_2. For the financial balance vector element, the power coefficient variable is denoted k_3 and the multiple coefficient variable e_3.
Then, for vector B, the sub-fitness constituting parameter of the total asset balance vector element is
$$1000000^{k_1}$$
and its sub-fitness parameter is
$$e_1 \cdot 1000000^{k_1}$$
The sub-fitness constituting parameter of the insurance balance vector element is
$$3000000^{k_2}$$
and its sub-fitness parameter is
$$e_2 \cdot 3000000^{k_2}$$
The sub-fitness constituting parameter of the financial balance vector element is
$$200000^{k_3}$$
and its sub-fitness parameter is
$$e_3 \cdot 200000^{k_3}$$
It should be noted that, for a specific implementation manner of obtaining the sub-fitness parameter corresponding to each vector element based on the power coefficient variable, the multiple coefficient variable, and the element value of the corresponding vector element, the implementation manner may be determined by those skilled in the art according to actual situations, and the above description is only an example, and does not limit this.
For example, the fitness function of the corresponding historical customer quantitative feature vector is constructed according to the sub-fitness parameter corresponding to each vector element, and may be, but is not limited to, adding all the sub-fitness parameters, and additionally superimposing (or not superimposing) a constant to construct the fitness function of the corresponding historical customer quantitative feature vector. For example, corresponding to the above example, the fitness function of vector B may be, but is not limited to:
$$f(x) = e_1 \cdot 1000000^{k_1} + e_2 \cdot 3000000^{k_2} + e_3 \cdot 200000^{k_3} + C$$
where the value of the constant C may be determined by those skilled in the art according to the actual situation and is not limited in this embodiment of the present invention; for example, C may be, but is not limited to, 0, 1, 2, 3, or 50. f(x) represents the fitness obtained when the vector parameter x is set to the vector B.
The above description uses vector B only as an example. In actual processing, the number of elements of the vector need not be 3 as in vector B, but each vector element corresponds to one power coefficient variable k and one multiple coefficient variable e, and each vector element corresponds to one sub-fitness parameter. The fitness function is generally represented by the following equation:
$$f(x) = e_1 x_1^{k_1} + e_2 x_2^{k_2} + e_3 x_3^{k_3} + \dots + e_n x_n^{k_n} + C$$
where f(x) represents the fitness variable corresponding to the vector parameter x; n represents the number of elements (that is, the maximum element number) of the input historical customer quantitative feature vector; the subscripts 1, 2, 3, ..., n represent the element numbers of the vector; and x_1, x_2, x_3, ..., x_n represent the element value variables of the vector elements with the corresponding element numbers. The fitness function contains as many e, k and x terms as there are vector elements in the historical customer quantitative feature vector being processed.
It should be noted that the specific implementation manner of constructing the fitness function of the corresponding historical customer quantitative feature vector according to the sub-fitness parameter corresponding to each vector element may be determined by those skilled in the art according to actual situations, and the above description is only an example, and does not limit the present invention.
Through the above steps, the fitness function is fully associated with the features of each vector element. Setting a power coefficient variable and a multiple coefficient variable for each vector element strengthens, through both multiplication and exponentiation, the association between the features of the vector elements and the variables to be solved in the fitness function. This improves how well the final power coefficients and final multiple coefficients obtained by the subsequent iterative solution match the features of the historical customers, which in turn improves the accuracy of the second overdue type determined from those coefficients and of the subsequently determined future overdue rate.
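As a minimal sketch of the fitness function described above, the following code raises each element value to its power coefficient, scales it by its multiple coefficient, sums the sub-fitness parameters, and adds the constant. The function name `fitness` and all coefficient values are hypothetical; in the embodiment the coefficients are variables solved by the later genetic iteration.

```python
def fitness(x, k, e, c=0.0):
    """f(x) = sum_i e_i * x_i ** k_i + C, one (k_i, e_i) pair per vector element."""
    if not (len(x) == len(k) == len(e)):
        raise ValueError("x, k and e must have one entry per vector element")
    # Each term e_i * x_i ** k_i is the sub-fitness parameter of one vector element.
    return sum(e_i * x_i ** k_i for x_i, k_i, e_i in zip(x, k, e)) + c

# Vector B from the example: total asset, insurance and financial balances in yuan.
vector_b = (1000000, 3000000, 200000)
# Hypothetical coefficient values chosen only to make the call concrete.
power_coefficients = (0.5, 0.5, 0.5)
multiple_coefficients = (0.001, 0.002, 0.003)
fitness_b = fitness(vector_b, power_coefficients, multiple_coefficients, c=1.0)

# A smaller case that is easy to check by hand: 2 * 4**0.5 + 3 * 9**0.5 + 1.
small = fitness((4, 9), (0.5, 0.5), (2, 3), c=1)
```

The hand-checkable case evaluates to 2 * 2 + 3 * 3 + 1 = 14. Because raw balances span several orders of magnitude, an implementation might rescale the element values before exponentiation; the text does not say whether it does.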
In an optional embodiment, the performing genetic iteration based on the fitness function to determine a plurality of final power coefficients and final multiple coefficients corresponding to the historical quantitative feature vector of the customer includes:
randomly setting an initial power coefficient of each power coefficient variable and an initial multiple coefficient of each multiple coefficient variable in the fitness function, and repeatedly executing the step of genetic iteration until all final power coefficients and final multiple coefficients of each fitness function are determined, wherein the step of genetic iteration comprises the following steps:
based on all the fitness functions, the fitness of each corresponding historical customer quantitative feature vector is obtained; obtaining a sub-fitting value corresponding to the historical customer quantitative feature vector based on the overdue value corresponding to the historical overdue label of the corresponding historical customer quantitative feature vector and the fitness; superposing sub-fitting values corresponding to all the historical customer quantitative feature vectors to obtain fitting values;
judging whether the fitting value is smaller than or equal to a preset fitting threshold value, if so, taking an initial power coefficient of each power coefficient variable in the fitness function as the final power coefficient, and taking an initial multiple coefficient of each multiple coefficient variable as the final multiple coefficient;
if not, repeatedly executing a crossover-mutation operation until all fitness functions are updated;
wherein the crossover-mutation operation comprises:
selecting one fitness function that has not been updated from all fitness functions as a current fitness function, and selecting a plurality of fitness functions from the fitness functions other than the current fitness function as crossover operator functions;
obtaining a power crossover operator for each power coefficient variable of the current fitness function according to the power coefficients of the corresponding power coefficient variable in the plurality of crossover operator functions and a preset first random number; obtaining a multiple crossover operator for each multiple coefficient variable of the current fitness function according to the multiple coefficients of the corresponding multiple coefficient variable in the plurality of crossover operator functions and the first random number;
obtaining a corresponding power mutation operator according to the power crossover operator and a preset second random number; obtaining a corresponding multiple mutation operator according to the multiple crossover operator and the second random number;
and taking the power mutation operator corresponding to each power coefficient variable of the current fitness function as the initial power coefficient of that power coefficient variable, and taking the multiple mutation operator corresponding to each multiple coefficient variable of the current fitness function as the initial multiple coefficient of that multiple coefficient variable, so as to finish updating the current fitness function.
Illustratively, because the power coefficient variable and the multiple coefficient variable have been set to exact values, the values of the variables in the fitness function are known, so the fitness of each corresponding historical customer quantitative feature vector can be obtained directly based on all the fitness functions.
Illustratively, each historical customer quantitative feature vector corresponds to one sub-fitting value and one overdue value.
For example, the overdue value corresponding to a historical overdue tag may be determined from a preset correspondence between the values of the historical overdue tag and overdue values. This correspondence may be expressed as, but is not limited to, the following:
non-overdue tag -> 1000000, first-stage overdue tag -> 20000000, second-stage overdue tag -> 3000000, third-stage overdue tag -> 4000000, and fourth-stage overdue tag -> 5000000.
The overdue value corresponding to the history overdue tag when the history overdue tag takes different values may be determined by those skilled in the art according to actual situations, and the above description is only an example, and does not limit the present invention. However, the historical overdue tag needs to have different corresponding overdue values when taking different values.
For example, the fitting values obtained by superimposing sub-fitting values corresponding to all historical customer quantitative feature vectors may be represented by, but not limited to, the following formula:
$$E = \sum_{i=1}^{N} \left| F(x_i) - f(x_i) \right|^2$$
where $E$ denotes the fitting value, $N$ denotes the number of historical customer quantitative feature vectors (that is, the number of sub-fitting values), $i$ denotes the index of a historical customer quantitative feature vector, and $|F(x_i) - f(x_i)|^2$ denotes the sub-fitting value of the $i$-th vector. It should be noted that the specific implementation of obtaining the fitting value by superimposing the sub-fitting values corresponding to all the historical customer quantitative feature vectors may be determined by those skilled in the art according to actual situations; the above description is only an example and does not limit the present invention.
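The superposition described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the fitness of each historical customer quantitative feature vector has already been computed; the function names and the sample numbers are hypothetical and are not taken from the embodiment.

```python
def sub_fit_value(overdue_value: float, fitness: float) -> float:
    """Square of the absolute difference between the overdue value F(x_i) and the fitness f(x_i)."""
    return abs(overdue_value - fitness) ** 2


def fitting_value(overdue_values, fitnesses) -> float:
    """Sum the sub-fitting values of all historical customer quantitative feature vectors."""
    if len(overdue_values) != len(fitnesses):
        raise ValueError("each vector needs exactly one overdue value and one fitness")
    return sum(sub_fit_value(F, f) for F, f in zip(overdue_values, fitnesses))


# Hypothetical overdue values and fitnesses for two historical vectors.
E = fitting_value([1000000.0, 3000000.0], [999990.0, 3000005.0])
print(E)  # 10**2 + 5**2 = 125.0
```

The resulting value would then be compared with the preset fitting threshold to decide whether the genetic iteration can stop.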
For example, the fitting threshold may be determined by those skilled in the art according to practical situations, and the embodiment of the present invention is not limited thereto, for example, the fitting threshold may be, but is not limited to, 200, 50, 100, 1000, or 10000.
For example, the initial power coefficient of each power coefficient variable in the fitness function is used as the final power coefficient, specifically, the initial power coefficient of the power coefficient variable of each corresponding vector element in the fitness function is used as the final power coefficient of the corresponding vector element, and the same principle is applied to the initial multiple coefficient of each multiple coefficient variable as the final multiple coefficient.
For example, selecting one fitness function that has not been updated from all fitness functions as the current fitness function, and selecting a plurality of fitness functions from the fitness functions other than the current fitness function as crossover operator functions, may proceed as follows:
Suppose there are fitness functions A, B, C, D, and E, and fitness function A has already been updated in the current round of crossover-mutation, so the fitness functions that have not been updated are B, C, D, and E. Fitness function B is selected as the current fitness function. The fitness functions other than the current fitness function include both updated and non-updated ones, namely fitness functions A, C, D, and E. The number of selected crossover operator functions may be determined by those skilled in the art according to actual situations and is not limited by this embodiment of the present invention; for example, it may be, but is not limited to, 2, 3, or 4, and is preferably 2, but it cannot be less than 2. When the current fitness function is fitness function B, fitness functions A and C may be selected as the crossover operator functions; likewise, when the current fitness function is fitness function A, fitness functions B and E may be selected as the crossover operator functions.
It should be noted that the specific implementation manner of selecting one fitness function that is not updated from all fitness functions as the current fitness function and selecting a plurality of fitness functions from other fitness functions except the current fitness function as the crossover operator function may be determined by those skilled in the art according to actual situations, and the above description is only an example, and does not limit the present invention.
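The selection just described can be sketched as follows. This is only an illustration under stated assumptions: fitness functions are identified by hypothetical names, the first not-yet-updated function is taken as the current one (the text allows any of them), and the crossover operator functions are drawn at random from all the others.

```python
import random


def select_current_and_crossover(names, updated, n_cross=2, rng=random):
    """Pick a not-yet-updated fitness function and n_cross other functions to cross with it."""
    if n_cross < 2:
        raise ValueError("at least 2 crossover operator functions are required")
    not_updated = [name for name in names if name not in updated]
    if not not_updated:
        raise ValueError("all fitness functions have already been updated")
    current = not_updated[0]  # assumption: take the first one; any would do
    # Candidates include both updated and non-updated functions, but never the current one.
    others = [name for name in names if name != current]
    if len(others) < n_cross:
        raise ValueError("not enough other fitness functions to cross with")
    return current, rng.sample(others, n_cross)


# A has been updated in this round, so B becomes the current fitness function.
current, crossover = select_current_and_crossover(
    ["A", "B", "C", "D", "E"], updated={"A"}, n_cross=2, rng=random.Random(0)
)
print(current, crossover)
```

Passing a seeded `random.Random` instance makes the draw reproducible, which helps when comparing iteration runs.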
For example, the first random number may be, but is not limited to, a normally distributed random number or a random number generated based on a random number generation function.
Illustratively, obtaining the power crossover operator of each power coefficient variable of the current fitness function according to the power coefficients of the corresponding power coefficient variable in the plurality of crossover operator functions and a preset first random number may be, but is not limited to, obtaining the power crossover operator of the power coefficient variable of a given element type in the current fitness function according to the power coefficients of the power coefficient variable of the vector element of that element type in the plurality of crossover operator functions and the first random number, where the value range of the first random number is (0, 1). A specific example is as follows:
For a certain element type "total asset balance", suppose that in crossover operator function A the power coefficient F1 of the power coefficient variable $k_3$ of the element value variable $x_3$ of the corresponding vector element is 3, that in crossover operator function B the power coefficient F2 of the power coefficient variable $k_3$ of the element value variable $x_3$ of the corresponding vector element is 5, and that the value of the first random number a is 0.8. The power crossover operator of the vector element corresponding to the element type "total asset balance" in the current fitness function C can then be expressed as follows:
power crossover operator = F1 × a + F2 × (1 - a)
Substituting the values gives a power crossover operator of 3 × 0.8 + 5 × 0.2 = 3.4.
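The worked example above can be checked with a short Python sketch. It assumes exactly two crossover operator functions, as in the example; how more than two would be combined is not specified in this passage, and the function name is hypothetical.

```python
def crossover_operator(coef_1: float, coef_2: float, a: float) -> float:
    """Blend two parent coefficients with a first random number a in (0, 1)."""
    if not 0.0 < a < 1.0:
        raise ValueError("the first random number must lie in (0, 1)")
    return coef_1 * a + coef_2 * (1.0 - a)


# Values from the example: F1 = 3, F2 = 5, a = 0.8.
power_crossover = crossover_operator(3.0, 5.0, 0.8)
print(round(power_crossover, 2))  # 3.4
```

The same function would apply unchanged to multiple coefficients, since the text states that the multiple crossover operator follows the same principle.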
It should be noted that the specific implementation of obtaining the power crossover operator of each power coefficient variable of the current fitness function according to the power coefficients of the corresponding power coefficient variable in the plurality of crossover operator functions and the preset first random number may be determined by those skilled in the art according to actual situations; the above description is only an example and is not limiting.
For example, for the specific implementation of obtaining the multiple crossover operator of each multiple coefficient variable of the current fitness function according to the multiple coefficients of the corresponding multiple coefficient variable in the plurality of crossover operator functions and the first random number, reference may be made to the description of the step of obtaining the power crossover operator in this embodiment of the present invention; the principle is the same and is not repeated here.
For example, the second random number may be, but is not limited to, a normally distributed random number or a random number generated based on a random number generation function.
Illustratively, obtaining the corresponding power mutation operator according to the power crossover operator and the preset second random number may be, but is not limited to, performing a mutation operation on the power crossover operator of the power coefficient variable of a given element type in the current fitness function and the second random number, to obtain the power mutation operator of that power coefficient variable. The value range of the second random number is (-0.3, 0.3). A specific example is as follows:
For a certain element type "total asset balance", suppose the value of the second random number b is 0.1. The power mutation operator of the vector element corresponding to the element type "total asset balance" in the current fitness function C may then be expressed as follows:
power mutation operator = power crossover operator + 0.3 × b
Substituting the values gives a power mutation operator of 3.4 + 0.3 × 0.1 = 3.43. Here, 0.3 in 0.3 × b is a correction coefficient; its value need not be 0.3 and may be set to other values, as determined by those skilled in the art according to actual situations.
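The mutation step in this example can be reproduced with the following sketch. The function name and the `correction` parameter name are hypothetical; the default of 0.3 and the range check on b follow the values given in the text.

```python
def mutation_operator(crossover_value: float, b: float, correction: float = 0.3) -> float:
    """Perturb a crossover operator by correction * b, with b in (-0.3, 0.3)."""
    if not -0.3 < b < 0.3:
        raise ValueError("the second random number must lie in (-0.3, 0.3)")
    return crossover_value + correction * b


# Values from the example: power crossover operator 3.4, second random number b = 0.1.
power_mutation = mutation_operator(3.4, 0.1)
print(round(power_mutation, 2))  # 3.43
```

The result becomes the new initial power coefficient of the corresponding power coefficient variable in the current fitness function.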
It should be noted that the specific implementation of obtaining the corresponding power mutation operator according to the power crossover operator and the preset second random number may be determined by those skilled in the art according to actual situations; the above description is only an example and is not limiting.
For example, for the specific implementation of obtaining the corresponding multiple mutation operator according to the multiple crossover operator and the second random number, reference may be made to the description of the step of obtaining the power mutation operator in this embodiment of the present invention; the principle is the same and is not repeated here.
For example, assume that the current fitness function C is:
Figure BDA0003795150410000301
and the crossover operator function A is:
Figure BDA0003795150410000302
the crossover operator function B is:
Figure BDA0003795150410000303
The above updating step includes: calculating the multiple crossover operator of $e_{1c}$ based on the value of $e_{1a}$ (multiple coefficient), the value of $e_{1b}$ (multiple coefficient), and the first random number, and then obtaining the multiple mutation operator of $e_{1c}$ according to the multiple crossover operator of $e_{1c}$ and the second random number; and calculating the power crossover operator of $k_{1c}$ based on the value of $k_{1a}$ (power coefficient), the value of $k_{1b}$ (power coefficient), and the first random number, and then obtaining the power mutation operator of $k_{1c}$ according to the power crossover operator of $k_{1c}$ and the second random number. The multiple mutation operators of $e_{2c}$ and $e_{3c}$ and the power mutation operators of $k_{2c}$ and $k_{3c}$ are determined in the same way, which is not repeated here.
Through the above steps, the elite-retention principle of preferential inheritance used in genetic algorithms is fully exploited, so that the initial power coefficients and initial multiple coefficients updated in each iteration come closer to the characteristics of the corresponding vector elements. In addition, an unnecessary random selection step is omitted (for the classification scenario of this scheme, the random selection step is optional and does not improve iteration accuracy), which shortens the genetic iteration. The method therefore speeds up genetic iteration while keeping the final power coefficients and final multiple coefficients highly consistent with the characteristics of the corresponding vector elements, which improves the accuracy and speed of determining the second overdue type and, in turn, of the overall overdue reminder.
In an optional embodiment, the obtaining a sub-fitted value corresponding to the historical customer quantitative feature vector based on the overdue value corresponding to the historical overdue label of the corresponding historical customer quantitative feature vector and the fitness includes:
subtracting the fitness from the overdue value to obtain a fitness difference value;
and taking the square of the absolute value of the adaptive difference value as the sub-fitting value.
Illustratively, as stated above, $|F(x_i) - f(x_i)|^2$ denotes the sub-fitting value, where $F(x_i)$ denotes the overdue value of the corresponding historical customer quantitative feature vector, $f(x_i)$ denotes the fitness of that vector, and $F(x_i) - f(x_i)$ denotes the adaptive difference value of that vector.
Through the above steps, each sub-fitting value reflects more clearly the difference between the overdue value and the fitness of the same historical customer quantitative feature vector, and therefore the difference between the power and multiple coefficients of a given iteration and the overall characteristics of the corresponding historical customer quantitative feature vectors. This makes it possible to judge more accurately whether that difference has fallen below a specific degree, that is, whether the genetic iteration has reached its goal, which indirectly improves the accuracy of the final power coefficients and final multiple coefficients obtained subsequently and of the overall repayment reminder.
In an alternative embodiment, as shown in fig. 4, the determining a second overdue type of the current client based on the preset current client quantitative feature vector, the final power coefficient and the final multiple coefficient includes the following steps:
s401: correspondingly substituting element values of vector elements of the current customer quantitative feature vector into a fitness function corresponding to each historical customer quantitative feature vector, and obtaining a current customer adaptive value corresponding to the historical customer quantitative feature vector according to the final power coefficient and the final multiple coefficient corresponding to the historical customer quantitative feature vector in the fitness function.
S402: and subtracting the corresponding current client adaptive value from the overdue value corresponding to the historical overdue label of the historical client quantitative feature vector to obtain an initial proximity value, and taking an absolute value of the initial proximity value to obtain a proximity difference value.
S403: and determining a second overdue type of the current client according to the historical overdue label corresponding to the historical client quantitative feature vector with the minimum approach difference.
For example, the correspondingly substituting the element values of the vector elements of the current customer quantitative feature vector into the fitness function corresponding to each historical customer quantitative feature vector may be, but is not limited to, replacing the element values of the corresponding vector elements in each sub-fitness parameter in the fitness function with the element values of the vector elements of the same element type in the current customer quantitative feature vector, which is specifically exemplified as follows:
for a fitness function corresponding to a certain historical customer quantitative feature vector C:
Figure BDA0003795150410000311
Figure BDA0003795150410000312
represents the sub-fitness parameter of the vector element whose element type in the historical customer quantitative feature vector is the total asset balance, and "1000000" represents the element value of that vector element.
Figure BDA0003795150410000313
represents the sub-fitness parameter of the vector element whose element type in the historical customer quantitative feature vector is the insurance balance, and "3000000" represents the element value of that vector element.
Figure BDA0003795150410000321
represents the sub-fitness parameter of the vector element whose element type in the historical customer quantitative feature vector is the financial balance, and "200000" represents the element value of that vector element. Suppose there is a current customer quantitative feature vector Z (1100000, 3100000, 210000), where "1100000" is the element value of the vector element whose element type is the total asset balance, "3100000" is the element value of the vector element whose element type is the insurance balance, and "210000" is the element value of the vector element whose element type is the financial balance. After the replacement, the intermediate function corresponding to vector C and vector Z is obtained:
Figure BDA0003795150410000322
Since the final power coefficients and final multiple coefficients (the final values of k and e, respectively) of the sub-fitness parameters in the intermediate function are known at this point, the value of the intermediate function can be computed directly and taken as the current customer adaptive value corresponding to the historical customer quantitative feature vector C.
It should be noted that, for a specific implementation manner that element values of vector elements of the current customer quantitative feature vector are correspondingly substituted into a fitness function corresponding to each historical customer quantitative feature vector, and a current customer adaptation value corresponding to the historical customer quantitative feature vector is obtained according to the final power coefficient and the final multiple coefficient corresponding to the historical customer quantitative feature vector in the fitness function, the implementation manner may be determined by a person skilled in the art according to an actual situation, and the above description is only an example, and does not limit the description.
For example, since one proximity difference corresponds to one historical customer quantitative feature vector and one historical customer quantitative feature vector corresponds to one historical overdue label, the second overdue type of the current customer can be determined directly according to the historical overdue label corresponding to the historical customer quantitative feature vector with the smallest proximity difference. The history overdue type corresponding to the history overdue label corresponding to the history client quantitative feature vector with the minimum approach difference can be determined as the second overdue type of the current client.
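Steps S402 and S403 amount to a nearest-match lookup, which can be sketched as below. The sketch assumes the current customer adaptive values from S401 have already been computed; the tuple layout, the function name, and the sample numbers are hypothetical.

```python
def second_overdue_type(candidates):
    """Return the historical overdue label whose overdue value is closest to the adaptive value.

    candidates: iterable of (historical_label, overdue_value, current_customer_adaptive_value),
    one entry per historical customer quantitative feature vector.
    """
    candidates = list(candidates)
    if not candidates:
        raise ValueError("at least one historical vector is required")
    # Proximity difference = |overdue value - current customer adaptive value|.
    label, _, _ = min(candidates, key=lambda c: abs(c[1] - c[2]))
    return label


# Hypothetical adaptive values for three historical vectors.
history = [
    ("non-overdue", 1000000, 1750000.0),
    ("second-stage overdue", 3000000, 2900000.0),
    ("fourth-stage overdue", 5000000, 4100000.0),
]
print(second_overdue_type(history))  # second-stage overdue
```

When two historical vectors tie on the proximity difference, `min` returns the first one encountered; the text does not specify a tie-breaking rule.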
Because the value obtained by substituting the current customer quantitative feature vector into the fitness function of a historical customer quantitative feature vector reflects the characteristics of both vectors, the difference between the resulting current customer adaptive value and the overdue value is consistent with the difference between the characteristics of the two vectors. The historical customer quantitative feature vector with the smallest proximity difference is therefore the one whose characteristics are closest to those of the current customer quantitative feature vector. This improves the accuracy of the determined second overdue type. In addition, the processing and calculation in these steps are simple, with low computational complexity and short running time, which further improves the speed of determining the second overdue type and hence the accuracy and speed of the overall repayment reminder.
In an optional embodiment, the correspondingly substituting element values of vector elements of the current customer quantitative feature vector into the fitness function corresponding to each historical customer quantitative feature vector includes:
and replacing the element value in the sub-fitness parameter of the corresponding vector element in the fitness function by the element value of the vector element in the current customer quantitative feature vector.
For example, a specific implementation manner of replacing the element value in the sub-fitness parameter corresponding to the vector element in the fitness function with the element value of the vector element in the current customer quantitative feature vector may refer to the above description of the step of correspondingly substituting the element value of the vector element of the current customer quantitative feature vector into the fitness function corresponding to each historical customer quantitative feature vector in the embodiment of the present invention, and is not described herein again.
Through the above steps, the substitution covers every vector element and is less prone to error, which benefits the accuracy of the subsequent steps.
In an alternative embodiment, as shown in fig. 5, the determining the future overdue rate of the current client according to the first overdue type and the second overdue type includes the following steps:
s501: and determining a first overdue rate corresponding to the first overdue type based on the historical overdue label corresponding to the first overdue type and the corresponding relation between the historical overdue labels with different preset values and the overdue rate.
S502: and determining a second overdue rate corresponding to the second overdue type based on the historical overdue label corresponding to the second overdue type and the corresponding relation between the historical overdue labels with different preset values and the overdue rate.
S503: determining an average of the first and second exceedances as the future overdue rate.
Illustratively, the correspondence between the values of the historical overdue label and overdue rates may be determined by those skilled in the art according to actual situations, and is not limited by this embodiment of the present invention. In general, the more serious the overdue degree identified by the overdue label, the larger the overdue rate. For example, the correspondence may be, but is not limited to, the following:
non-overdue tag -> 10%, first-stage overdue tag -> 30%, second-stage overdue tag -> 50%, third-stage overdue tag -> 70%, fourth-stage overdue tag -> 90%.
For example, since the correspondence between the history overdue tags taking different values and different history overdue types is known, the history overdue tag corresponding to the first overdue type and the history overdue tag corresponding to the second overdue type can be determined.
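Steps S501 to S503 can be sketched as a table lookup followed by an average. The rates below are the example values given above; the dictionary keys and the function name are hypothetical.

```python
# Example correspondence between historical overdue labels and overdue rates.
OVERDUE_RATE_BY_LABEL = {
    "non-overdue": 0.10,
    "first-stage overdue": 0.30,
    "second-stage overdue": 0.50,
    "third-stage overdue": 0.70,
    "fourth-stage overdue": 0.90,
}


def future_overdue_rate(first_label: str, second_label: str, table=OVERDUE_RATE_BY_LABEL) -> float:
    """Average the overdue rates mapped from the first and second overdue types."""
    first_rate = table[first_label]    # S501: rate for the decision-tree result
    second_rate = table[second_label]  # S502: rate for the genetic-iteration result
    return (first_rate + second_rate) / 2  # S503


rate = future_overdue_rate("first-stage overdue", "third-stage overdue")
print(round(rate, 2))  # 0.5
```

If the two overdue types disagree, the average lands between the two mapped rates, which is the error-dampening effect the embodiment relies on.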
Through the steps, the first overdue type and the second overdue type can be comprehensively reflected by the future overdue rate in an averaging mode, the error correction and reduction effects are achieved, and when one of the first overdue type and the second overdue type is incorrect, the accuracy deviation of the obtained future overdue rate is reduced, so that the future overdue rate can have higher accuracy, and the repayment reminding accuracy can be improved.
Based on the same principle, the embodiment of the present invention discloses a credit card repayment reminding device 600, as shown in fig. 6, the credit card repayment reminding device 600 includes:
the decision tree processing module 601 is configured to determine a first overdue type of a corresponding current client according to a preset decision tree and a current client qualitative feature vector; the decision tree is associated with a plurality of preset historical customer qualitative characteristic vectors and historical overdue labels corresponding to the historical customer qualitative characteristic vectors;
a fitness processing module 602, configured to construct a corresponding fitness function according to a preset historical customer quantitative feature vector; performing genetic iteration based on the fitness function, and determining a plurality of final power coefficients and final multiple coefficients corresponding to the historical customer quantitative feature vectors;
a repayment reminding module 603, configured to determine a second overdue type of the current client based on a preset current client quantitative feature vector, the final power coefficient, and the final multiple coefficient; and determining future overdue rate of the current customer according to the first overdue type and the second overdue type, and carrying out credit card repayment reminding on the current customer based on the future overdue rate.
In an optional embodiment, the apparatus further comprises a first vectorization module configured to:
before determining a first overdue type of a corresponding current client according to a preset decision tree and a current client qualitative characteristic vector,
carrying out data cleaning, data extraction and data standardization processing on the preset initial historical client information to obtain intermediate historical client information;
performing feature vectorization processing on the intermediate historical client information to obtain a historical client feature vector;
and splitting the historical client characteristic vector according to the property of the element type of the vector element to respectively obtain a corresponding historical client qualitative characteristic vector and a corresponding historical client quantitative characteristic vector.
In an optional embodiment, the apparatus further comprises a second vectorization module configured to:
before determining a first overdue type of a corresponding current client according to a preset decision tree and a current client qualitative characteristic vector,
performing feature vectorization processing on preset current client information to obtain a current client feature vector;
and splitting the current customer characteristic vector according to the property of the element type of the vector element to respectively obtain a corresponding current customer qualitative characteristic vector and a corresponding current customer quantitative characteristic vector.
In an optional embodiment, the apparatus further comprises a decision tree construction module configured to:
before determining a first overdue type of a corresponding current client according to a preset decision tree and a current client qualitative characteristic vector,
and constructing a decision tree based on a plurality of historical customer qualitative feature vectors, element values of the historical customer qualitative feature vectors and corresponding historical overdue labels.
In an optional embodiment, the decision tree construction module is configured to:
obtaining a complete information entropy according to all the historical client qualitative feature vectors and corresponding historical overdue labels; wherein the value of the history overdue label is a non-overdue label, a first-stage overdue label, a second-stage overdue label, a third-stage overdue label or a fourth-stage overdue label;
obtaining a root conditional entropy of each element type of the vector elements according to the history overdue labels corresponding to all the history client qualitative feature vectors and the element values of the vector elements;
obtaining a root information gain entropy corresponding to the element type according to the complete information entropy and the root condition entropy, and establishing a root node of a decision tree by taking the element type with the maximum root information gain entropy as a root node attribute; respectively establishing child nodes corresponding to each element value based on each element value which the root node attribute can adopt;
repeatedly executing the step of establishing child nodes until the child nodes cannot be established so as to complete the construction of the decision tree, wherein the step of establishing child nodes comprises the following steps:
determining a plurality of historical customer qualitative characteristic vectors with vector elements corresponding to the sub-element values as the sub-vectors of the sub-nodes according to the sub-element values corresponding to the sub-nodes;
respectively judging whether the historical overdue labels corresponding to the sub-vectors of each child node are the same, if so, taking the child nodes as leaf nodes; determining a plurality of historical customer qualitative feature vectors with vector elements corresponding to the leaf element values as the leaf vectors of the leaf nodes according to the leaf element values corresponding to each leaf node;
if not, obtaining the sub-conditional entropy of each element type in the child nodes according to the historical overdue labels corresponding to all the child vectors and the element values of the vector elements;
obtaining sub information gain entropies corresponding to the element types according to the complete information entropy and the sub-conditional entropies, and taking the element type with the maximum sub information gain entropy as a child node attribute; and respectively establishing child nodes of the next layer of the child node based on each element value which the child node attribute can adopt.
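The construction steps above can be sketched as a small recursive routine. This is a minimal illustration under stated assumptions, not the applicant's implementation: entropies are taken to be base-2 Shannon entropies, each vector is a hypothetical dict keyed by element type, child nodes are created only for element values actually observed, and a node with mixed labels but no remaining element types falls back to the majority label. The names `entropy`, `conditional_entropy` and `build_tree` are illustrative.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Base-2 Shannon entropy of a list of labels (the log base is an assumption)."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def conditional_entropy(rows, labels, attr):
    """Entropy of the labels after dividing the rows by the values of one element type."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[attr]].append(label)
    return sum(len(g) / len(rows) * entropy(g) for g in groups.values())

def build_tree(rows, labels, attrs, complete_entropy=None):
    """Recursively build a tree; rows are dicts keyed by element type (hypothetical layout)."""
    if complete_entropy is None:
        complete_entropy = entropy(labels)  # computed once over all historical vectors
    # Leaf: all labels agree, or no element type is left (majority fallback is an assumption).
    if len(set(labels)) == 1 or not attrs:
        return {"leaf": Counter(labels).most_common(1)[0][0]}
    # The complete entropy is constant here, so the maximum gain is the minimum conditional entropy.
    best = max(attrs, key=lambda a: complete_entropy - conditional_entropy(rows, labels, a))
    node = {"attr": best, "children": {}}
    remaining = [a for a in attrs if a != best]
    for value in sorted({row[best] for row in rows}):
        subset = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        sub_rows, sub_labels = zip(*subset)
        node["children"][value] = build_tree(
            list(sub_rows), list(sub_labels), remaining, complete_entropy)
    return node
```

The sketch stores only the agreed label at each leaf, whereas the text above keeps the leaf vectors themselves; either representation lets the label be recovered later.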
In an optional embodiment, the decision tree construction module is configured to:
obtaining a first number of the historical customer qualitative feature vectors with the historical overdue labels being non-overdue labels, a second number of the historical customer qualitative feature vectors with the historical overdue labels being first-stage overdue labels, a third number of the historical customer qualitative feature vectors with the historical overdue labels being second-stage overdue labels, a fourth number of the historical customer qualitative feature vectors with the historical overdue labels being third-stage overdue labels, and a fifth number of the historical customer qualitative feature vectors with the historical overdue labels being fourth-stage overdue labels according to all the historical customer qualitative feature vectors and corresponding historical overdue labels;
obtaining a non-overdue rate based on the first quantity and the total quantity of all historical customer qualitative feature vectors;
obtaining a first-stage overdue rate based on the second quantity and the total quantity of all historical customer qualitative feature vectors;
obtaining a second stage overdue rate based on the third quantity and the total quantity of all historical customer qualitative feature vectors;
obtaining a third-stage overdue rate based on the fourth quantity and the total quantity of all historical customer qualitative feature vectors;
obtaining a fourth-stage overdue rate based on the fifth quantity and the total quantity of all historical customer qualitative feature vectors;
and obtaining the complete information entropy based on the non-overdue rate, the first-stage overdue rate, the second-stage overdue rate, the third-stage overdue rate and the fourth-stage overdue rate.
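As a rough sketch of the computation above, assuming the complete information entropy is the base-2 Shannon entropy of the five rates (the text does not state the formula or the log base), the five quantities and rates could be computed as follows. The label names in `LABELS` are hypothetical.

```python
import math
from collections import Counter

# Hypothetical names for the five possible values of the historical overdue label.
LABELS = ["non_overdue", "stage_1", "stage_2", "stage_3", "stage_4"]

def complete_information_entropy(labels):
    """Entropy of the label distribution over all historical qualitative vectors."""
    total = len(labels)
    counts = Counter(labels)  # first to fifth quantities
    result = 0.0
    for label in LABELS:
        rate = counts[label] / total  # non-overdue rate, first-stage overdue rate, ...
        if rate > 0:  # a rate of zero contributes nothing
            result -= rate * math.log2(rate)
    return result
```

With half of the vectors labeled non-overdue and half labeled first-stage overdue, the function returns 1.0; when every vector carries the same label, it returns 0.0.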
In an optional embodiment, the decision tree construction module is configured to:
obtaining, according to the element values of the vector elements, the division number of vector elements taking each different element value of the element type;
obtaining a division rate according to the division number and the total number of all historical customer qualitative feature vectors;
respectively taking the historical customer qualitative feature vectors corresponding to different element values of the element type as corresponding division vectors, and obtaining division information entropies corresponding to the different element values based on the division vectors corresponding to the different element values of the element type and the historical overdue labels corresponding to the division vectors;
and obtaining the root conditional entropy of the element type based on the division rate and the division information entropy corresponding to the different element values of the element type.
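The steps above can be sketched as a rate-weighted sum of per-value entropies. This is an illustration under the assumption that the conditional entropy is the usual weighted sum of base-2 Shannon entropies; the text does not spell out how the rates and entropies are combined. Vectors are modeled as tuples and `element_index` (a hypothetical parameter) selects the element type.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Base-2 Shannon entropy of a list of labels (the log base is an assumption)."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def conditional_entropy(vectors, labels, element_index):
    """Entropy of the labels after dividing the vectors by one element type."""
    divisions = defaultdict(list)  # element value -> labels of its division vectors
    for vector, label in zip(vectors, labels):
        divisions[vector[element_index]].append(label)
    total = len(vectors)
    result = 0.0
    for division_labels in divisions.values():
        division_rate = len(division_labels) / total  # division number / total number
        result += division_rate * entropy(division_labels)  # weight the division entropy
    return result
```

Passing only the sub-vectors of a child node, together with their labels, yields the sub-conditional entropy in the same way.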
In an optional embodiment, the decision tree construction module is configured to:
obtaining the sub-division number of vector elements with different element values in the element types of the sub-vectors respectively according to the element values of the vector elements of the sub-vectors;
obtaining a sub-division rate according to the sub-division number and the sub-vector number of the sub-vectors;
respectively taking the corresponding sub-vectors when different element values are taken from the element types as corresponding sub-division vectors, and obtaining sub-division information entropies corresponding to the different element values based on the corresponding sub-division vectors when the different element values are taken from the element types and the historical overdue labels corresponding to the sub-division vectors;
and obtaining the sub-conditional entropy of the element type based on the sub-division rate and the sub-division information entropy corresponding to different element values of the element type.
In an optional embodiment, the decision tree processing module 601 is configured to:
determining a corresponding path in a decision tree according to the element values of the vector elements of the current customer qualitative feature vector;
obtaining a corresponding leaf node according to the path;
and determining a historical overdue type corresponding to the leaf node according to a historical overdue label corresponding to the leaf vector of the leaf node, and determining the corresponding historical overdue type as the first overdue type.
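A minimal sketch of the lookup above, assuming the tree is stored as nested dicts in which an internal node holds an element type under `"attr"` and its children under `"children"`, and a leaf holds its label under `"leaf"`. This layout is hypothetical; the text does not prescribe a data structure.

```python
def first_overdue_type(tree, qualitative_vector):
    """Follow the path selected by the vector's element values down to a leaf."""
    node = tree
    while "leaf" not in node:
        value = qualitative_vector[node["attr"]]  # element value for this node's element type
        node = node["children"][value]            # raises KeyError for an unseen value
    return node["leaf"]

# Hypothetical two-level tree and current customer vector.
example_tree = {
    "attr": "job",
    "children": {
        "stable": {"leaf": "non_overdue"},
        "unstable": {
            "attr": "edu",
            "children": {"high": {"leaf": "stage_1"}, "low": {"leaf": "stage_2"}},
        },
    },
}
example_type = first_overdue_type(example_tree, {"job": "unstable", "edu": "low"})
```

How to handle an element value that never occurred in the historical data is not addressed in the text; the sketch simply lets the lookup fail.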
In an optional embodiment, the fitness processing module 602 is configured to:
respectively setting a corresponding power coefficient variable and a corresponding multiple coefficient variable for each vector element in each historical customer quantitative feature vector;
obtaining a sub-fitness parameter corresponding to each vector element based on the power coefficient variable, the multiple coefficient variable and the element value of the corresponding vector element;
and constructing a fitness function of the corresponding historical customer quantitative feature vector according to the sub-fitness parameter corresponding to each vector element.
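The text does not give the algebraic form of the sub-fitness parameter. One plausible reading, used here purely as an assumption, is that each sub-fitness parameter is the multiple coefficient times the element value raised to the power coefficient, and that the fitness function sums the sub-fitness parameters. The function names are illustrative.

```python
def sub_fitness(element_value, power_coeff, multiple_coeff):
    """Assumed form: multiple coefficient * element value ** power coefficient."""
    return multiple_coeff * (element_value ** power_coeff)

def fitness(element_values, power_coeffs, multiple_coeffs):
    """Assumed form: sum of the sub-fitness parameters of all vector elements."""
    return sum(
        sub_fitness(x, p, m)
        for x, p, m in zip(element_values, power_coeffs, multiple_coeffs)
    )
```

For element values `[2.0, 3.0]`, power coefficients `[2.0, 1.0]` and multiple coefficients `[0.5, 1.0]`, this gives 0.5 * 4 + 1 * 3 = 5.0.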
In an optional embodiment, the fitness processing module 602 is configured to:
randomly setting an initial power coefficient of each power coefficient variable and an initial multiple coefficient of each multiple coefficient variable in the fitness function, and repeatedly executing the step of genetic iteration until all final power coefficients and final multiple coefficients of each fitness function are determined, wherein the step of genetic iteration comprises the following steps:
based on all the fitness functions, the fitness of each corresponding historical customer quantitative feature vector is obtained; obtaining a sub-fitting value corresponding to the historical customer quantitative feature vector based on the overdue value corresponding to the historical overdue label of the corresponding historical customer quantitative feature vector and the fitness; superposing the sub-fitting values corresponding to the quantitative feature vectors of all the historical customers to obtain fitting values;
judging whether the fitting value is smaller than or equal to a preset fitting threshold value, if so, taking an initial power coefficient of each power coefficient variable in the fitness function as the final power coefficient, and taking an initial multiple coefficient of each multiple coefficient variable as the final multiple coefficient;
if not, repeatedly executing cross mutation operation until all fitness functions are updated;
wherein the cross mutation operations comprise:
selecting one fitness function which is not updated from all fitness functions as a current fitness function, and selecting a plurality of fitness functions from other fitness functions except the current fitness function as cross operator functions;
obtaining a power coefficient cross operator of each corresponding power coefficient variable of the current fitness function according to the power coefficients of the corresponding power coefficient variable in the plurality of cross operator functions and a preset first random number; obtaining a multiple coefficient cross operator of each corresponding multiple coefficient variable of the current fitness function according to the multiple coefficients of the corresponding multiple coefficient variable in the plurality of cross operator functions and the first random number;
obtaining a corresponding power coefficient mutation operator according to the power coefficient cross operator and a preset second random number; obtaining a corresponding multiple coefficient mutation operator according to the multiple coefficient cross operator and the second random number;
and taking the power coefficient mutation operator corresponding to each power coefficient variable of the current fitness function as the initial power coefficient of the power coefficient variable, and taking the multiple coefficient mutation operator corresponding to each multiple coefficient variable of the current fitness function as the initial multiple coefficient of the multiple coefficient variable, so as to finish updating the current fitness function.
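The loop described above can be sketched as follows. Many details are assumptions because the text names the inputs of each operator but not the formulas: the fitness form (multiple * value ** power, summed), the cross operator (one selected coefficient plus the first random number times the difference of two others, in the style of differential evolution), the mutation operator (the cross operator scaled by 0.5 plus the second random number), the use of three cross operator functions, the `max_rounds` cap and the clamping `limit`. Element values are assumed to be positive, and at least four historical vectors are required.

```python
import random

def fitness(values, powers, multiples):
    # Assumed form: each sub-fitness parameter is multiple * value ** power.
    return sum(m * (v ** p) for v, p, m in zip(values, powers, multiples))

def genetic_iteration(vectors, overdue_values, threshold,
                      first_random=0.5, second_random=0.3,
                      max_rounds=50, limit=5.0, seed=0):
    """Return one (power coefficients, multiple coefficients) pair per historical vector."""
    rng = random.Random(seed)
    n, dim = len(vectors), len(vectors[0])
    # Randomly set the initial coefficients of every fitness function.
    population = [([rng.uniform(0, 2) for _ in range(dim)],
                   [rng.uniform(0, 2) for _ in range(dim)])
                  for _ in range(n)]
    for _ in range(max_rounds):  # the cap is a safeguard, not part of the source
        # Fitting value: sum of squared differences between overdue value and fitness.
        fitting_value = sum(
            abs(y - fitness(x, p, m)) ** 2
            for x, y, (p, m) in zip(vectors, overdue_values, population))
        if fitting_value <= threshold:
            break  # current coefficients become the final coefficients
        updated = []
        for i in range(n):  # update every fitness function once
            others = [population[j] for j in range(n) if j != i]
            a, b, c = rng.sample(others, 3)  # cross operator functions
            new_coeffs = []
            for k in (0, 1):  # 0: power coefficients, 1: multiple coefficients
                cross = [a[k][d] + first_random * (b[k][d] - c[k][d])
                         for d in range(dim)]
                mutated = [(0.5 + second_random) * cv for cv in cross]
                # Clamp to keep value ** power numerically tame (assumption).
                new_coeffs.append([max(-limit, min(limit, mv)) for mv in mutated])
            updated.append((new_coeffs[0], new_coeffs[1]))
        population = updated
    return population
```

The sketch draws the cross operator functions from the previous round's coefficients. Whether already-updated fitness functions may serve as cross operator functions within the same round is not stated in the text.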
In an optional embodiment, the fitness processing module 602 is configured to:
subtracting the fitness from the overdue value to obtain a fitness difference value;
and taking the square of the absolute value of the fitness difference value as the sub-fitting value.
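In code, the two steps above amount to a squared error. The sketch below also sums the sub-fitting values into the fitting value used by the genetic iteration; the numeric overdue values in the example (for instance 0 for non-overdue up to 4 for fourth-stage overdue) are an assumption.

```python
def sub_fitting_value(overdue_value, fitness_value):
    """Square of the absolute difference between the overdue value and the fitness."""
    fitness_difference = overdue_value - fitness_value
    return abs(fitness_difference) ** 2

def fitting_value(overdue_values, fitness_values):
    """Superpose the sub-fitting values of all historical quantitative vectors."""
    return sum(
        sub_fitting_value(y, f) for y, f in zip(overdue_values, fitness_values)
    )
```

For overdue values `[3.0, 0.0]` and fitness values `[1.0, 1.0]`, the sub-fitting values are 4.0 and 1.0, so the fitting value is 5.0.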
In an optional embodiment, the repayment reminding module 603 is configured to:
correspondingly substituting element values of vector elements of the current customer quantitative feature vector into a fitness function corresponding to each historical customer quantitative feature vector, and obtaining a current customer adaptive value corresponding to the historical customer quantitative feature vector according to the final power coefficient and the final multiple coefficient of the historical customer quantitative feature vector in the fitness function;
subtracting the corresponding current client adaptive value from the overdue value corresponding to the historical overdue label of the historical client quantitative feature vector to obtain an initial proximity value, and taking an absolute value of the initial proximity value to obtain a proximity difference value;
and determining a second overdue type of the current client according to the historical overdue label corresponding to the historical client quantitative feature vector with the minimum proximity difference value.
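The selection above is a nearest-match search over the historical vectors. The sketch assumes the same fitness form as an illustration (multiple * value ** power, summed) and a hypothetical record layout of `(label, overdue_value, final_powers, final_multiples)` for each historical vector.

```python
def second_overdue_type(current_vector, history):
    """Return the label of the historical vector with the smallest proximity difference.

    history: list of (label, overdue_value, final_powers, final_multiples) tuples
    (a hypothetical layout).
    """
    def proximity_difference(entry):
        _, overdue_value, powers, multiples = entry
        # Current customer adaptive value: the current element values substituted into
        # this historical vector's fitness function (assumed form).
        adaptive = sum(
            m * (x ** p) for x, p, m in zip(current_vector, powers, multiples)
        )
        return abs(overdue_value - adaptive)

    return min(history, key=proximity_difference)[0]
```

If several historical vectors tie for the minimum, `min` returns the first one; the text does not say how ties should be resolved.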
In an optional embodiment, the repayment reminding module 603 is configured to:
and replacing the element value in the sub-fitness parameter of the corresponding vector element in the fitness function by the element value of the vector element in the current customer quantitative feature vector.
In an optional embodiment, the repayment reminding module 603 is configured to:
determining a first overdue rate corresponding to the first overdue type based on the historical overdue label corresponding to the first overdue type and the corresponding relation between the historical overdue labels with different preset values and the overdue rate;
determining a second overdue rate corresponding to the second overdue type based on the historical overdue label corresponding to the second overdue type and the corresponding relationship between the historical overdue label of different preset values and the overdue rate;
and determining the average of the first overdue rate and the second overdue rate as the future overdue rate.
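A minimal sketch of this final step. The preset correspondence between label values and overdue rates is not given in the text, so the numbers in `RATE_BY_LABEL` are placeholders chosen only for illustration.

```python
# Hypothetical preset correspondence between overdue labels and overdue rates.
RATE_BY_LABEL = {
    "non_overdue": 0.0,
    "stage_1": 0.25,
    "stage_2": 0.5,
    "stage_3": 0.75,
    "stage_4": 1.0,
}

def future_overdue_rate(first_overdue_type, second_overdue_type):
    """Average the overdue rates looked up for the two overdue types."""
    first_rate = RATE_BY_LABEL[first_overdue_type]
    second_rate = RATE_BY_LABEL[second_overdue_type]
    return (first_rate + second_rate) / 2
```

With these placeholder rates, a customer classified as non-overdue by the decision tree and as second-stage overdue by the genetic fit would get a future overdue rate of 0.25.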
Because the credit card repayment reminding device 600 solves the problem on a principle similar to that of the method above, its implementation can refer to the implementation of the method, and the details are not repeated here.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. A typical implementation device is a computer device, which may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
In a typical example, the computer device comprises in particular a memory, a processor and a computer program stored on the memory and executable on the processor, which when executed by the processor implements the method as described above.
Referring now to FIG. 7, shown is a schematic block diagram of a computer device 700 suitable for use in implementing embodiments of the present application.
As shown in FIG. 7, the computer device 700 includes a Central Processing Unit (CPU) 701, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 702 or a program loaded from a storage section 708 into a Random Access Memory (RAM) 703. The RAM 703 also stores various programs and data necessary for the operation of the computer device 700. The CPU 701, the ROM 702, and the RAM 703 are connected to each other via a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
The following components are connected to the I/O interface 705: an input section 706 including a keyboard, a mouse, and the like; an output section 707 including components such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and a speaker; a storage section 708 including a hard disk and the like; and a communication section 709 including a network interface card such as a LAN card, a modem, or the like. The communication section 709 performs communication processing via a network such as the Internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 710 as necessary, so that a computer program read out therefrom is installed in the storage section 708 as needed.
In particular, the processes described above with reference to the flowcharts may be implemented as a computer software program according to an embodiment of the present invention. For example, embodiments of the invention include a computer program product comprising a computer program tangibly embodied on a machine-readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 709, and/or installed from the removable medium 711.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Disks (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
For convenience of description, the above devices are described as being divided into various units by function. Of course, when implementing the present application, the functions of the units may be implemented in one or more pieces of software and/or hardware.
The present invention has been described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
All the embodiments in the present specification are described in a progressive manner; for the parts that are the same or similar among the embodiments, reference may be made from one embodiment to another, and each embodiment focuses on its differences from the other embodiments. In particular, because the system embodiment is substantially similar to the method embodiment, it is described relatively briefly, and for the relevant points reference may be made to the corresponding parts of the description of the method embodiment.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (18)

1. A credit card repayment reminding method is characterized by comprising the following steps:
determining a first overdue type of a corresponding current client according to a preset decision tree and a current client qualitative characteristic vector;
constructing a corresponding fitness function according to a preset historical customer quantitative feature vector; performing genetic iteration based on the fitness function, and determining a plurality of final power coefficients and final multiple coefficients corresponding to the historical customer quantitative feature vectors;
determining a second overdue type of the corresponding current customer based on a preset current customer quantitative feature vector, the final power coefficient and a final multiple coefficient; and determining the future overdue rate of the current client according to the first overdue type and the second overdue type, and carrying out credit card repayment reminding on the current client based on the future overdue rate.
2. The method of claim 1, further comprising:
before determining a first overdue type of a corresponding current client according to a preset decision tree and a current client qualitative characteristic vector,
carrying out data cleaning, data extraction and data standardization processing on the preset initial historical client information to obtain intermediate historical client information;
performing feature vectorization processing on the intermediate historical client information to obtain a historical client feature vector;
and splitting the historical client characteristic vector according to the property of the element type of the vector element to respectively obtain a corresponding historical client qualitative characteristic vector and a corresponding historical client quantitative characteristic vector.
3. The method of claim 1, further comprising:
before determining a first overdue type of a corresponding current client according to a preset decision tree and a current client qualitative characteristic vector,
performing feature vectorization processing on preset current client information to obtain a current client feature vector;
and splitting the current customer characteristic vector according to the property of the element type of the vector element to respectively obtain the corresponding current customer qualitative characteristic vector and the current customer quantitative characteristic vector.
4. The method of claim 1, further comprising:
before determining a first overdue type of a corresponding current client according to a preset decision tree and a current client qualitative characteristic vector,
and constructing a decision tree based on a plurality of historical customer qualitative feature vectors, element values of the historical customer qualitative feature vectors and corresponding historical overdue labels.
5. The method of claim 4, wherein constructing a decision tree based on a plurality of the historical customer qualitative feature vectors, element values of the historical customer qualitative feature vectors, and corresponding historical overdue labels comprises:
obtaining a complete information entropy according to all the historical client qualitative feature vectors and the corresponding historical overdue labels; the value of the historical overdue label is a non-overdue label, a first-stage overdue label, a second-stage overdue label, a third-stage overdue label or a fourth-stage overdue label;
obtaining a root conditional entropy of each element type of the vector elements according to the historical overdue labels corresponding to all the historical customer qualitative feature vectors and the element values of the vector elements;
obtaining a root information gain entropy corresponding to the element type according to the complete information entropy and the root conditional entropy, and establishing a root node of a decision tree by taking the element type with the maximum root information gain entropy as a root node attribute; respectively establishing child nodes corresponding to each element value based on each element value which is possible for the root node attribute;
repeatedly executing the step of establishing child nodes until the child nodes cannot be established so as to complete the construction of the decision tree, wherein the step of establishing child nodes comprises the following steps:
determining a plurality of historical customer qualitative characteristic vectors with vector elements corresponding to the sub-element values as the sub-vectors of the sub-nodes according to the sub-element values corresponding to each sub-node;
respectively judging whether the historical overdue labels corresponding to the sub-vectors of each child node are the same, if so, taking the child nodes as leaf nodes; determining a plurality of historical customer qualitative feature vectors with vector elements corresponding to the leaf element values as the leaf vectors of the leaf nodes according to the leaf element values corresponding to each leaf node;
if not, obtaining the sub-conditional entropy of each element type in the child nodes according to the historical overdue labels corresponding to all the child vectors and the element values of the vector elements;
obtaining sub information gain entropies corresponding to the element types according to the complete information entropy and the sub-conditional entropies, and taking the element type with the maximum sub information gain entropy as a child node attribute; and respectively establishing child nodes of the next layer of the child node based on each element value which the child node attribute can adopt.
6. The method according to claim 5, wherein the obtaining a complete information entropy according to all the historical customer qualitative feature vectors and the corresponding historical overdue labels comprises:
obtaining a first number of the historical customer qualitative feature vectors with the historical overdue labels being non-overdue labels, a second number of the historical customer qualitative feature vectors with the historical overdue labels being first-stage overdue labels, a third number of the historical customer qualitative feature vectors with the historical overdue labels being second-stage overdue labels, a fourth number of the historical customer qualitative feature vectors with the historical overdue labels being third-stage overdue labels, and a fifth number of the historical customer qualitative feature vectors with the historical overdue labels being fourth-stage overdue labels according to all the historical customer qualitative feature vectors and corresponding historical overdue labels;
obtaining a non-overdue rate based on the first number and the total number of all historical customer qualitative feature vectors;
obtaining a first-stage overdue rate based on the second number and the total number of all historical customer qualitative feature vectors;
obtaining a second stage overdue rate based on the third quantity and the total quantity of all historical customer qualitative feature vectors;
obtaining a third-stage overdue rate based on the fourth quantity and the total quantity of all historical customer qualitative feature vectors;
obtaining a fourth-stage overdue rate based on the fifth number and the total number of all historical customer qualitative feature vectors;
and obtaining the complete information entropy based on the non-overdue rate, the first-stage overdue rate, the second-stage overdue rate, the third-stage overdue rate and the fourth-stage overdue rate.
7. The method according to claim 5, wherein the obtaining the root conditional entropy for each element type of a vector element according to the historical overdue labels corresponding to all the historical customer qualitative feature vectors and the element values of the vector element comprises:
obtaining, according to the element values of the vector elements, the division number of vector elements taking each different element value of the element type;
obtaining a division rate according to the division number and the total number of all historical customer qualitative feature vectors;
respectively taking the historical customer qualitative feature vectors corresponding to different element values of the element type as corresponding division vectors, and obtaining division information entropies corresponding to the different element values based on the division vectors corresponding to the different element values of the element type and the historical overdue labels corresponding to the division vectors;
and obtaining the root conditional entropy of the element type based on the division rate and the division information entropy corresponding to the different element values of the element type.
8. The method according to claim 5, wherein the obtaining the sub-conditional entropy of each element type in the child nodes according to the historical overdue labels corresponding to all the child vectors and the element values of the vector elements comprises:
obtaining the sub-division number of vector elements with different element values in the element types of the sub-vectors respectively according to the element values of the vector elements of the sub-vectors;
obtaining a sub-division rate according to the sub-division number and the sub-vector number of the sub-vectors;
respectively taking the corresponding sub-vectors when different element values are taken from the element types as corresponding sub-division vectors, and obtaining sub-division information entropies corresponding to the different element values based on the corresponding sub-division vectors when the different element values are taken from the element types and historical overdue labels corresponding to the sub-division vectors;
and obtaining the sub-conditional entropy of the element type based on the sub-division rate and the sub-division information entropy corresponding to different element values of the element type.
9. The method of claim 1, wherein determining a first overdue type of the corresponding current client according to the preset decision tree and the current client qualitative feature vector comprises:
determining a corresponding path in a decision tree according to the element values of the vector elements of the current customer qualitative feature vector;
obtaining a corresponding leaf node according to the path;
determining a historical overdue type corresponding to the leaf node according to a historical overdue label corresponding to the leaf vector of the leaf node, and determining the corresponding historical overdue type as the first overdue type.
10. The method according to claim 1, wherein the constructing a corresponding fitness function according to the preset historical customer quantitative feature vector comprises:
respectively setting a corresponding power coefficient variable and a corresponding multiple coefficient variable for each vector element in each historical customer quantitative feature vector;
obtaining a sub-fitness parameter corresponding to each vector element based on the power coefficient variable, the multiple coefficient variable and the element value of the corresponding vector element;
and constructing a fitness function of the corresponding historical customer quantitative feature vector according to the sub-fitness parameter corresponding to each vector element.
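As a rough illustration of claim 10, the sketch below assumes each sub-fitness parameter has the form multiple × value^power and that the fitness function is the sum of the sub-fitness parameters. The claim only says the sub-fitness parameter is "based on" these three quantities, so this form and the function name are assumptions.

```python
def fitness(element_values, power_coeffs, multiple_coeffs):
    """Sum of assumed sub-fitness parameters: multiple * value ** power."""
    return sum(m * (x ** p)
               for x, p, m in zip(element_values, power_coeffs, multiple_coeffs))


# Hypothetical quantitative features, e.g. scaled income and card usage.
print(fitness([2.0, 3.0], power_coeffs=[2.0, 1.0], multiple_coeffs=[0.5, 2.0]))
```

With these inputs the two sub-fitness parameters are 0.5 × 2² = 2 and 2 × 3¹ = 6. Non-integer powers require positive element values to stay real-valued.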
11. The method of claim 10, wherein performing genetic iterations based on the fitness function to determine a plurality of final power coefficients and final multiple coefficients for corresponding historical customer quantitative feature vectors comprises:
randomly setting an initial power coefficient of each power coefficient variable and an initial multiple coefficient of each multiple coefficient variable in the fitness function, and repeatedly executing the step of genetic iteration until all final power coefficients and final multiple coefficients of each fitness function are determined, wherein the step of genetic iteration comprises the following steps:
obtaining the fitness of each corresponding historical customer quantitative feature vector based on all the fitness functions; obtaining a sub-fitting value corresponding to the historical customer quantitative feature vector based on the overdue value corresponding to the historical overdue label of the corresponding historical customer quantitative feature vector and the fitness; summing the sub-fitting values corresponding to all the historical customer quantitative feature vectors to obtain a fitting value;
judging whether the fitting value is smaller than or equal to a preset fitting threshold value, if so, taking an initial power coefficient of each power coefficient variable in the fitness function as the final power coefficient, and taking an initial multiple coefficient of each multiple coefficient variable as the final multiple coefficient;
if not, repeatedly executing a crossover-mutation operation until all fitness functions are updated;
wherein the crossover-mutation operation comprises:
selecting one fitness function which has not been updated from all fitness functions as a current fitness function, and selecting a plurality of fitness functions from the fitness functions other than the current fitness function as crossover operator functions;
obtaining a power crossover operator of each corresponding power coefficient variable of the current fitness function according to the power coefficient of the corresponding power coefficient variable in the plurality of crossover operator functions and a preset first random number; obtaining a multiple crossover operator of each corresponding multiple coefficient variable of the current fitness function according to the multiple coefficient of the corresponding multiple coefficient variable in the plurality of crossover operator functions and the first random number;
obtaining a corresponding power mutation operator according to the power crossover operator and a preset second random number; obtaining a corresponding multiple mutation operator according to the multiple crossover operator and the second random number;
and taking the power mutation operator corresponding to each power coefficient variable of the current fitness function as an initial power coefficient of the power coefficient variable, and taking the multiple mutation operator corresponding to each multiple coefficient variable of the current fitness function as an initial multiple coefficient of the multiple coefficient variable, so as to finish updating the current fitness function.
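The genetic iteration of claim 11 can be sketched as below. The claim does not give the crossover or mutation formulas, so several pieces are assumptions: the fitness form (sum of multiple × value^power), a crossover that blends two other functions' coefficients with weight `r1` (the first random number), a mutation that adds a perturbation scaled by `r2` (the second random number), and a `max_iterations` cap to guarantee termination. The sketch also updates all fitness functions from the previous generation at once rather than one by one.

```python
import random


def fitness(values, powers, multiples):
    # Assumed form: sum of multiple * value ** power (values must be > 0).
    return sum(m * (v ** p) for v, p, m in zip(values, powers, multiples))


def fitting_value(vectors, overdue_values, population):
    # Sum of the sub-fitting values: squared (overdue value - fitness).
    return sum(
        (y - fitness(x, c["power"], c["multiple"])) ** 2
        for x, y, c in zip(vectors, overdue_values, population)
    )


def crossover_mutate(population, index, r1, r2, rng):
    # Pick two other fitness functions as crossover operator functions.
    others = [c for i, c in enumerate(population) if i != index]
    a, b = rng.sample(others, 2)
    updated = {}
    for key in ("power", "multiple"):
        # Assumed crossover: blend of the two parents weighted by r1.
        crossed = [r1 * u + (1 - r1) * v for u, v in zip(a[key], b[key])]
        # Assumed mutation: small additive perturbation scaled by r2.
        updated[key] = [c + r2 * rng.uniform(-1, 1) for c in crossed]
    return updated


def genetic_iteration(vectors, overdue_values, threshold,
                      r1=0.5, r2=0.01, max_iterations=200, seed=0):
    rng = random.Random(seed)
    n = len(vectors[0])
    # One coefficient set per historical customer quantitative feature vector.
    population = [
        {"power": [rng.uniform(0, 2) for _ in range(n)],
         "multiple": [rng.uniform(0, 1) for _ in range(n)]}
        for _ in vectors
    ]
    for _ in range(max_iterations):
        if fitting_value(vectors, overdue_values, population) <= threshold:
            break  # current coefficients become the final coefficients
        population = [crossover_mutate(population, i, r1, r2, rng)
                      for i in range(len(population))]
    return population


# Hypothetical data: four historical customers, two features each.
vectors = [[1.0, 2.0], [2.0, 1.5], [3.0, 0.5], [1.5, 2.5]]
overdue_values = [0.0, 1.0, 2.0, 1.0]
final = genetic_iteration(vectors, overdue_values, threshold=0.5)
print(len(final), "coefficient sets")
```

Because the scheme described has no selection step, this sketch is not guaranteed to reach the threshold; it returns whatever coefficients it holds when the iteration cap is hit. At least three historical vectors are needed so that two crossover parents can be drawn.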
12. The method of claim 11, wherein obtaining a sub-fitting value corresponding to the historical customer quantitative feature vector based on the overdue value corresponding to the historical overdue label of the corresponding historical customer quantitative feature vector and the fitness comprises:
subtracting the fitness from the overdue value to obtain a fitness difference value;
and taking the square of the absolute value of the fitness difference value as the sub-fitting value.
13. The method of claim 1, wherein determining a corresponding second overdue type of the current client based on the preset current customer quantitative feature vector, the final power coefficient and the final multiple coefficient comprises:
correspondingly substituting element values of vector elements of the current customer quantitative feature vector into the fitness function corresponding to each historical customer quantitative feature vector, and obtaining a current customer adaptation value corresponding to the historical customer quantitative feature vector according to the final power coefficient and the final multiple coefficient of the historical customer quantitative feature vector in the fitness function;
subtracting the corresponding current customer adaptation value from the overdue value corresponding to the historical overdue label of the historical customer quantitative feature vector to obtain an initial proximity value, and taking the absolute value of the initial proximity value to obtain a proximity difference value;
and determining a second overdue type of the current client according to the historical overdue label corresponding to the historical customer quantitative feature vector with the minimum proximity difference value.
14. The method of claim 13, wherein correspondingly substituting element values of vector elements of the current customer quantitative feature vector into the fitness function corresponding to each historical customer quantitative feature vector comprises:
and replacing the element value in the sub-fitness parameter of the corresponding vector element in the fitness function by the element value of the vector element in the current customer quantitative feature vector.
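Claims 13 and 14 amount to a nearest-match lookup, which can be sketched as below. The fitness form (sum of multiple × value^power), the data layout, and the label names are assumptions for illustration only.

```python
def fitness(values, powers, multiples):
    # Assumed form: sum of multiple * value ** power.
    return sum(m * (v ** p) for v, p, m in zip(values, powers, multiples))


def second_overdue_type(current, final_coeffs, overdue_values, labels):
    """Return the label of the historical vector with the smallest
    proximity difference value |overdue value - current adaptation value|."""
    def proximity(i):
        # Substitute the current customer's element values into the
        # fitness function of historical vector i (claim 14).
        adaptation = fitness(current,
                             final_coeffs[i]["power"],
                             final_coeffs[i]["multiple"])
        return abs(overdue_values[i] - adaptation)

    best = min(range(len(final_coeffs)), key=proximity)
    return labels[best]


# Hypothetical final coefficients for two historical customers.
coeffs = [{"power": [1.0], "multiple": [1.0]},
          {"power": [2.0], "multiple": [1.0]}]
print(second_overdue_type([2.0], coeffs,
                          overdue_values=[0.0, 4.0],
                          labels=["not_overdue", "overdue"]))
```

Here the first historical function gives an adaptation value of 2 (difference 2 from its overdue value 0) and the second gives 4 (difference 0 from its overdue value 4), so the second label is selected.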
15. The method of claim 1, wherein determining the future overdue rate of the current customer based on the first and second overdue types comprises:
determining a first overdue rate corresponding to the first overdue type based on the historical overdue label corresponding to the first overdue type and a preset correspondence between historical overdue labels of different values and overdue rates;
determining a second overdue rate corresponding to the second overdue type based on the historical overdue label corresponding to the second overdue type and the preset correspondence between historical overdue labels of different values and overdue rates;
determining an average of the first overdue rate and the second overdue rate as the future overdue rate.
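The combination step of claim 15 can be illustrated with a short Python sketch. The label values and the rates in the lookup table are invented for the example; the claim only requires that some preset correspondence between label values and overdue rates exists.

```python
def future_overdue_rate(first_type, second_type, rate_by_label):
    """Average the overdue rates mapped from the two overdue types."""
    first_rate = rate_by_label[first_type]
    second_rate = rate_by_label[second_type]
    return (first_rate + second_rate) / 2


# Hypothetical preset correspondence between label values and overdue rates.
rate_by_label = {0: 0.05, 1: 0.40, 2: 0.90}
print(future_overdue_rate(0, 2, rate_by_label))
```

Averaging gives the decision-tree result and the genetic-fit result equal weight in the final estimate.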
16. A credit card repayment reminding device is characterized by comprising:
the decision tree processing module is used for determining a first overdue type of the corresponding current client according to a preset decision tree and the current customer qualitative feature vector; the decision tree is associated with a plurality of preset historical customer qualitative feature vectors and historical overdue labels corresponding to the historical customer qualitative feature vectors;
the fitness processing module is used for constructing a corresponding fitness function according to a preset historical customer quantitative feature vector; performing genetic iteration based on the fitness function, and determining a plurality of final power coefficients and final multiple coefficients corresponding to the historical customer quantitative feature vectors;
the repayment reminding module is used for determining a second overdue type of the corresponding current client based on a preset current customer quantitative feature vector, the final power coefficient and the final multiple coefficient; and determining the future overdue rate of the current client according to the first overdue type and the second overdue type, and carrying out credit card repayment reminding on the current client based on the future overdue rate.
17. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method according to any of claims 1-15 when executing the program.
18. A computer-readable medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1-15.
CN202210966354.6A 2022-08-12 2022-08-12 Credit card repayment reminding method and device Pending CN115239481A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210966354.6A CN115239481A (en) 2022-08-12 2022-08-12 Credit card repayment reminding method and device

Publications (1)

Publication Number Publication Date
CN115239481A true CN115239481A (en) 2022-10-25

Family

ID=83678448

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210966354.6A Pending CN115239481A (en) 2022-08-12 2022-08-12 Credit card repayment reminding method and device

Country Status (1)

Country Link
CN (1) CN115239481A (en)

Similar Documents

Publication Publication Date Title
US20210166140A1 (en) Method and apparatus for training risk identification model and server
Francés et al. The cryptocurrency market: A network analysis
Li et al. Credit scoring by incorporating dynamic networked information
CN111340611A (en) Risk early warning method and device
CN111340612A (en) Account risk identification method and device and electronic equipment
CN116401379A (en) Financial product data pushing method, device, equipment and storage medium
US20220198556A1 (en) Inventory affordability and policy distance calculator
Shmilovici et al. Measuring the efficiency of the intraday forex market with a universal data compression algorithm
Halbleib et al. Forecasting covariance matrices: A mixed approach
CN110991992B (en) Processing method and device of business process information, storage medium and electronic equipment
CN115375357A (en) Customer loss early warning method and device
US20120226690A1 (en) Optimization of output data associated with a population
CN115239481A (en) Credit card repayment reminding method and device
CN115795345A (en) Information processing method, device, equipment and storage medium
CN114925275A (en) Product recommendation method and device, computer equipment and storage medium
Teles et al. Classification methods applied to credit scoring with collateral
CN114897607A (en) Data processing method and device for product resources, electronic equipment and storage medium
CN113807964A (en) Method, equipment and storage medium for predicting stock price and determining parameters
CN113849580A (en) Subject rating prediction method and device, electronic equipment and storage medium
CA3090143A1 (en) Systems and methods of generating resource allocation insights based on datasets
CN112632197A (en) Service relation processing method and device based on knowledge graph
CN115409596A (en) Bill cash-out service information pushing method and device
CN113836244B (en) Sample acquisition method, model training method, relation prediction method and device
CN111798243A (en) Suspicious transaction online identification method and device
CN113344664A (en) Atypical information overloaded financial product pushing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination