CN111461865A - Data analysis method and device - Google Patents

Publication number
CN111461865A
Authority
CN
China
Prior art keywords
target object
consumption
information
risk
time period
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010246736.2A
Other languages
Chinese (zh)
Other versions
CN111461865B (en
Inventor
黄文强
季蕴青
胡路苹
胡玮
黄雅楠
胡传杰
浮晨琪
李蚌蚌
申亚坤
王畅畅
徐晨敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bank of China Ltd
Original Assignee
Bank of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bank of China Ltd filed Critical Bank of China Ltd
Priority to CN202010246736.2A priority Critical patent/CN111461865B/en
Publication of CN111461865A publication Critical patent/CN111461865A/en
Application granted granted Critical
Publication of CN111461865B publication Critical patent/CN111461865B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03Credit; Loans; Processing thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • G06N20/10Machine learning using kernel methods, e.g. support vector machines [SVM]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Business, Economics & Management (AREA)
  • General Health & Medical Sciences (AREA)
  • Finance (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Accounting & Taxation (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a data analysis method and device. The method comprises: screening a high-risk target object, and acquiring consumption information of the high-risk target object in a first time period and basic information of the client to which the target object belongs; inputting the consumption information of the high-risk target object in the first time period and the basic information of the client to which the high-risk target object belongs into a preset consumption prediction model to obtain predicted consumption information of the high-risk target object in a second time period; obtaining a scoring result of the high-risk target object based on the real consumption information of the high-risk target object in the second time period, the predicted consumption information of the target object in the second time period, and a preset scoring model; and determining whether the high-risk target object has abnormal consumption based on the relationship between the score of the high-risk target object and a preset threshold. In this way, the potential risk of a credit card being lent to others can be discovered in time, and the risk borne by the bank is reduced as far as possible.

Description

Data analysis method and device
Technical Field
The invention relates to the field of finance, in particular to a data analysis method and device.
Background
With rising living standards and changing attitudes toward consumption, credit cards have become a common means of payment. When issuing a credit card, a bank sets the credit limit based on the customer's repayment capability in order to reduce risk.
However, in practice a credit card is often lent to other people. This behavior is likely to increase the repayment pressure on the customer, so that the customer cannot repay on time, which imperceptibly increases the bank's risk.
Disclosure of Invention
In view of this, the embodiment of the invention discloses a data analysis method and device, which predict whether there is a risk of abnormal consumption by analyzing the consumption of a credit card.
The embodiment of the invention discloses a data analysis method, which comprises the following steps:
obtaining repayment information of a target object and identity information of a user corresponding to the target object; the target object is a loan product;
screening a high-risk target object based on the repayment information of the target object and the identity information of the user to which the target object belongs, wherein the high-risk target object represents a target object which is likely to have abnormal consumption; and acquiring consumption information of the high-risk target object in a first time period and basic information of the client to which the target object belongs;
inputting the consumption information of the high-risk target object in a first time period and the basic information of a client to which the high-risk target object belongs into a preset consumption prediction model to obtain the consumption information of the high-risk target object in a second time period;
acquiring real consumption information of the high-risk target object in a second time period;
obtaining a scoring result of the high-risk target object based on the real consumption information, the predicted consumption information and a preset scoring model of the high-risk target object in a second time period; the target object scoring result is determined based on the correlation between the real consumption information and the predicted consumption information, and the scoring model is obtained after training through a preset second data set;
and determining whether the high-risk target object has abnormal consumption conditions or not based on the relation between the score of the high-risk target object and a preset threshold value.
Optionally, the screening a high-risk target object based on the repayment information of the target object and the identity information of the user to which the target object belongs includes:
judging whether the repayment information of the target object is consistent with the identity information of the user to which the target object belongs;
and if the repayment information of the target object is inconsistent with the identity information of the user to which the target object belongs, indicating that the target object is a high-risk target object.
Optionally, the training process of the scoring model includes:
constructing a neural network model;
acquiring a second data set; the second data set comprises real consumption information and predicted consumption information of high-risk target objects in the same time period, and the target objects in the second data set are marked with scoring results;
acquiring initial parameter values of the neural network;
training the neural network model based on initial parameter values and a second data set of the neural network.
Optionally, the training process of the scoring model includes:
acquiring a calculation method for calculating the matching degree of each consumption object in the real consumption information and each consumption object in the predicted consumption information;
determining a scoring rule of each consumption object in the real consumption information;
determining a statistical rule for the scores of all consumption objects in the real consumption information;
and training a preset expert system based on the calculation method of the matching degree of each consumption object in the real consumption information and each consumption object in the predicted consumption information, the scoring rule of each consumption object in the real consumption information, and the statistical rule for the scores of all consumption objects in the real consumption information.
Optionally, the determining whether the high-risk target object has an abnormal consumption condition based on the relationship between the score of the high-risk target object and a preset threshold includes:
if the score of the high-risk target object is greater than the preset threshold, indicating that the consumption behavior of the target object is not abnormal;
and if the score of the high-risk target object is less than or equal to the preset threshold, indicating that the consumption behavior of the target object is abnormal.
Optionally, the method further includes:
determining the total number of consumption objects included in the real consumption information of the second time period;
the threshold value is determined based on the total number of consumption objects included in the second time period real consumption information.
Optionally, the method further includes:
monitoring the use dynamics of the target object under the condition that the target object is detected to have consumption abnormal behaviors;
when it is detected that the target object is used again, acquiring the face information of the consumer using the target object;
judging whether the face information of the consumer is consistent with the preset face information of the user to which the target object belongs;
and if the face information of the consumer is not consistent with the preset face information of the user to which the target object belongs, sending a consumption abnormity prompt to the user to which the target object belongs.
The embodiment of the invention discloses a data analysis device, which comprises:
the first acquisition unit is used for acquiring repayment information of the target object and identity information of a user corresponding to the target object; the target object is a loan product;
the screening unit is used for screening the high-risk target object based on the repayment information of the target object and the identity information of the user to which the target object belongs; the high-risk target object represents a target object which is likely to have abnormal consumption, and consumption information of the high-risk target object in a first time period and basic information of a client to which the target object belongs are acquired;
the second acquisition unit is used for acquiring consumption information of a high-risk target object in a first time period and basic information of a client to which the target object belongs;
the consumption prediction model is used for inputting the consumption information of the high-risk target object in a first time period and the basic information of a client to which the high-risk target object belongs into a preset consumption prediction model to obtain the consumption information of the high-risk target object in a second time period;
the third acquisition unit is used for acquiring the real consumption information of the high-risk target object in a second time period;
the scoring result determining unit is used for obtaining a scoring result of the high-risk target object based on the real consumption information, the predicted consumption information and a preset scoring model of the high-risk target object in a second time period; the target object scoring result is determined based on the correlation between the real consumption information and the predicted consumption information, and the scoring model is obtained after training through a preset second data set;
and the abnormal consumption identification unit is used for determining whether the high-risk target object has an abnormal consumption condition or not based on the relation between the score of the high-risk target object and a preset threshold value.
Optionally, the screening unit includes:
the first screening subunit is used for judging whether the repayment information of the target object is consistent with the identity information of the user to which the target object belongs;
and the second screening subunit is used for indicating that the target object is a high-risk target object if the repayment information of the target object is inconsistent with the identity information of the user to which the target object belongs.
Optionally, the device further includes:
a training unit of a first scoring model, used for:
constructing a neural network model;
acquiring a second data set; the second data set comprises real consumption information and predicted consumption information of high-risk target objects in the same time period, and the target objects in the second data set are marked with scoring results;
acquiring initial parameter values of the neural network;
training the neural network model based on initial parameter values and a second data set of the neural network.
The embodiment of the invention discloses a data analysis method. A high-risk target object is first screened based on the repayment information of the target object and the identity information of the user to which the target object belongs, which reduces the volume of target-object data to be examined and improves data processing efficiency. Then, consumption information of the high-risk target object in a first time period and basic information of the client to which the target object belongs are acquired; the consumption information of the high-risk target object in the first time period and the basic information of the client to which the high-risk target object belongs are input into a preset consumption prediction model to obtain predicted consumption information of the high-risk target object in a second time period; and a scoring result of the high-risk target object is obtained based on the real consumption information of the high-risk target object in the second time period, the predicted consumption information of the target object in the second time period, and a preset scoring model, wherein the scoring result of the target object is determined based on the correlation between the real consumption information and the predicted consumption information, and the scoring model is obtained by training on a preset second data set. Whether the high-risk target object has abnormal consumption is then determined based on the relationship between the score of the high-risk target object and a preset threshold. In this way, the potential risk of a credit card being lent to others can be discovered in time, and the risk borne by the bank is reduced as far as possible.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings in the following description are only embodiments of the present invention, and those skilled in the art can obtain other drawings from the provided drawings without creative effort.
FIG. 1 is a schematic flow chart of a data analysis method according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a scoring model training method according to an embodiment of the present invention;
FIG. 3 is a schematic flow chart of a further scoring model training method according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a data analysis device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, a schematic flow chart of a data analysis method provided in an embodiment of the present invention is shown, where the method includes:
S101: obtaining repayment information of a target object and identity information of the user to which the target object belongs; the target object is a loan product;
In this embodiment, the loan product includes a credit card or another credit product that supports installment repayment.
In this embodiment, the repayment information of the target object includes information about the repayment party that repays the credit product, specifically the identity information of the repayment party. The identity information of the user to which the target object belongs refers to the identity information registered by the user when applying for the credit product. The identity information of the repayment party may be acquired in multiple modes, which are not limited in this embodiment, and may include, for example, the following two modes:
Mode one: receiving the identity information of the repayment party uploaded by the user;
Mode two: obtaining the repayment mode of the target object;
and retrieving the identity information of the repayment party of the target object based on the repayment mode of the target object.
For example, the repayment mode may include bank card repayment, WeChat repayment, Alipay repayment, and the like. If the repayment mode is bank card repayment, the identity information of the client who holds the repaying bank card account can be retrieved. If the repayment mode is WeChat repayment or Alipay repayment, the customer information of the repaying WeChat account or Alipay account can be retrieved.
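As a non-limiting illustration, mode two can be sketched as a lookup keyed by the repayment mode. This is a minimal sketch under stated assumptions: the `Identity` fields, the `ACCOUNT_HOLDERS` registry, the mode keys `bank_card`, `wechat` and `alipay`, and the account identifiers are all hypothetical and stand in for whatever records the bank or payment provider actually exposes.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Identity:
    """Identity of an account holder (fields are illustrative assumptions)."""
    name: str
    id_number: str


# Hypothetical account-holder registries, one per repayment mode.
# A real system would query the bank's or payment provider's records instead.
ACCOUNT_HOLDERS: Dict[str, Dict[str, Identity]] = {
    "bank_card": {"card-001": Identity("Zhang San", "ID-1001")},
    "wechat": {"wx-002": Identity("Zhang San", "ID-1001")},
    "alipay": {"ali-003": Identity("Li Si", "ID-2002")},
}


def repayer_identity(repayment_mode: str, account: str) -> Identity:
    """Retrieve the repayment party's identity based on the repayment mode."""
    holders = ACCOUNT_HOLDERS.get(repayment_mode)
    if holders is None:
        raise ValueError(f"unsupported repayment mode: {repayment_mode}")
    if account not in holders:
        raise LookupError(f"no holder recorded for account: {account}")
    return holders[account]


if __name__ == "__main__":
    print(repayer_identity("alipay", "ali-003"))
```

Keeping one registry per repayment mode mirrors the text: the repayment mode decides which source is consulted for the repayment party's identity.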
S102: screening a high-risk target object based on the repayment information of the target object and the identity information of the user to which the target object belongs; the high-risk target object represents a target object which may have abnormal consumption;
In this embodiment, to reduce the risk caused by credit card lending, the bank's back-office system would need to monitor the usage of all credit cards. However, the number of customers holding credit cards is very large, and monitoring all of them easily reduces data processing efficiency.
The applicant also found that, when repaying a credit card, a client may repay with assets under the client's own name or with assets under another person's name. For example, the client may repay the credit card through a bank card in the client's own name, or through a bank card in another person's name; the bank does not restrict the manner of repayment. However, through big data analysis the applicant found that when a client repays through assets under another person's name, the client is highly likely to have abnormal consumption, that is, the client has likely lent the credit card to another person.
Therefore, to improve data processing efficiency, clients may be screened in the following manner to identify the target objects with higher risk. Specifically, the method includes:
judging whether the repayment information of the target object is consistent with the identity information of the user to which the target object belongs;
and if the repayment information of the target object is inconsistent with the identity information of the user to which the target object belongs, indicating that the target object is a high-risk target object.
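The screening rule above may be sketched as follows. This is an illustrative sketch only: the `Repayment` record, the `card_owner` mapping, and the use of a single identifier string to represent identity information are assumptions, not part of the embodiment.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Repayment:
    card_id: str      # the target object, e.g. a credit card
    repayer_id: str   # identity of the party that made the repayment


def screen_high_risk(repayments: Iterable[Repayment],
                     card_owner: Dict[str, str]) -> List[str]:
    """Return the target objects whose repayer differs from the registered user."""
    high_risk = set()
    for r in repayments:
        # Inconsistent identity: the repayment came from someone other than the owner.
        if card_owner[r.card_id] != r.repayer_id:
            high_risk.add(r.card_id)
    return sorted(high_risk)


if __name__ == "__main__":
    owners = {"c1": "u1", "c2": "u2"}
    history = [Repayment("c1", "u1"), Repayment("c2", "u9"), Repayment("c2", "u2")]
    print(screen_high_risk(history, owners))
```

In this sketch a single inconsistent repayment is enough to flag a target object; whether one mismatch or several should be required is a design choice the text leaves open.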
In this embodiment, the high-risk target object is a target object with a high possibility of having consumption abnormality, and in order to further confirm whether the high-risk target object has an abnormal consumption situation, the following operations are performed for each high-risk target object:
S103: acquiring consumption information of a high-risk target object in a first time period and basic information of the client to which the target object belongs;
In this embodiment, the consumption information in the first time period is the consumption information of the target object (for example, a credit card) in a certain historical time period.
The consumption information includes information on the commodities purchased by the user. In the case of a credit card, the historical consumption information is the information on the commodities purchased by the user with the credit card.
The basic information of the client to which the target object belongs may include, for example: age, family status, educational background, income level, hobbies, and the like.
For example, if the target object is a credit card, the basic information of the client corresponding to the target object is the basic information registered by the client when applying for the credit card, and may include, for example: age, family status, educational background, income level, hobbies, and the like.
S104: inputting the consumption information of the high-risk target object in a first time period and the basic information of a client to which the high-risk target object belongs into a preset consumption prediction model to obtain the consumption information of the high-risk target object in a second time period;
the consumption prediction model is obtained after training based on a first data set, and the first data set comprises consumption information of the high-risk target object in a third time period and basic information of a client to which the high-risk target belongs;
in this embodiment, the consumption information in the third time period represents consumption information of the target object in a certain time period in the historical time.
The second time period is the time period to be predicted, the first time period is a historical time period earlier than the second time period, and the third time period is earlier than the first time period.
For example, if consumption information of the credit card for September 2020 is to be predicted, the second time period can be understood as September 2020. The first time period and the third time period are both historical time periods earlier than September 2020, and the third time period is earlier than the first time period; for example, the third time period may be January to July 2020, and the first time period may be August 2020.
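The ordering of the three time periods can be made concrete with a short sketch. It assumes, purely for illustration, that the periods do not overlap and that each consumption record carries a date; the `Period` type, the `assign_periods` function and the sample records are hypothetical.

```python
from datetime import date
from typing import Dict, List, NamedTuple, Sequence, Tuple


class Period(NamedTuple):
    start: date
    end: date  # inclusive


def assign_periods(records: Sequence[Tuple[date, str]],
                   third: Period, first: Period, second: Period) -> Dict[str, List[str]]:
    """Split dated consumption records into the third, first and second time periods."""
    # Assumed ordering: third period, then first period, then second period.
    if not (third.end < first.start and first.end < second.start):
        raise ValueError("expected third < first < second in time")
    buckets: Dict[str, List[str]] = {"third": [], "first": [], "second": []}
    for day, item in records:
        for name, period in (("third", third), ("first", first), ("second", second)):
            if period.start <= day <= period.end:
                buckets[name].append(item)
                break
    return buckets


if __name__ == "__main__":
    third = Period(date(2020, 1, 1), date(2020, 7, 31))   # training data for the model
    first = Period(date(2020, 8, 1), date(2020, 8, 31))   # model input
    second = Period(date(2020, 9, 1), date(2020, 9, 30))  # period to be predicted
    records = [(date(2020, 3, 5), "coffee"), (date(2020, 8, 9), "book"),
               (date(2020, 9, 2), "laptop")]
    print(assign_periods(records, third, first, second))
```

Records from the third period would train the consumption prediction model, records from the first period would be its input, and records from the second period would be the real consumption compared against the prediction.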
In this embodiment, the consumption prediction model is obtained by training a preset machine learning model according to the consumption information of the target object in the third time period and the basic information of the client to which the target object belongs. The preset machine learning model may be any model capable of predicting consumption conditions through training. In this embodiment, the machine learning model is not limited, and may be a convolutional neural network model, an SVM model, or a random forest model.
S105: acquiring real consumption information of the high-risk target object in a second time period;
In this embodiment, the real consumption information in the second time period is the actual consumption of the target object in the second time period.
For example, if the target object is a credit card, consumption information can be obtained from the consumption records of the credit card, so the real consumption information in the second time period can be obtained from the consumption records of the credit card in the second time period.
S106: obtaining a scoring result of the high-risk target object based on the real consumption information, the predicted consumption information and a preset scoring model of the high-risk target object in a second time period; the target object scoring result is determined based on the correlation between the real consumption information and the predicted consumption information, and the scoring model is obtained after training through a preset second data set;
in this embodiment, the scoring model may be obtained through a plurality of training modes, which are not limited in this embodiment, and preferably, the following two training modes are provided in this embodiment:
Training mode one:
S201: constructing a neural network model;
in this embodiment, the neural network model may include a plurality of types, and is not limited in this embodiment.
S202: acquiring a second data set; the second data set comprises real consumption information and predicted consumption information of high-risk target objects in the same time period, and the target objects in the second data set are marked with scoring results;
in this embodiment, the target object may be marked by a preset rule, where the rule is based on a principle of correlation between real consumption information and predicted consumption information, and specifically may be understood as:
calculating the matching degree of each consumption object in the real consumption information and the consumption object in the predicted consumption information;
scoring each consumption object in the real consumption information based on the matching degree of each consumption object in the real consumption information and the consumption object in the predicted consumption information;
and calculating the score of each target object based on the score of each consumption object in the real consumption information.
For each consumption object in the real consumption information, the higher its matching degree with the consumption objects in the predicted consumption information, the higher its score, and the lower the matching degree, the lower its score; if it matches none of them, the score may be zero or negative.
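The labelling rule above may be sketched as follows. This is a minimal sketch under stated assumptions: consumption objects are represented as hypothetical `(name, category)` pairs, the matching degrees 1.0 (same object), 0.5 (same category) and 0.0 (no match) are illustrative, and the score values (ten times the matching degree, or a penalty of -1 for no match) are arbitrary choices consistent with "higher match, higher score".

```python
from typing import List, Tuple

Item = Tuple[str, str]  # (name, category), an assumed representation


def match_degree(real_item: Item, predicted: List[Item]) -> float:
    """Best matching degree of one real consumption object against the prediction."""
    best = 0.0
    for candidate in predicted:
        if candidate == real_item:
            return 1.0                # identical consumption object
        if candidate[1] == real_item[1]:
            best = max(best, 0.5)     # same category only
    return best


def object_score(degree: float) -> float:
    """Higher matching degree gives a higher score; no match gives a negative score."""
    return -1.0 if degree == 0.0 else 10.0 * degree


def label_score(real: List[Item], predicted: List[Item]) -> float:
    """Score of a target object: sum of the scores of its real consumption objects."""
    return sum(object_score(match_degree(item, predicted)) for item in real)


if __name__ == "__main__":
    real = [("coffee", "food"), ("tea", "food"), ("laptop", "electronics")]
    predicted = [("coffee", "food"), ("novel", "books")]
    print(label_score(real, predicted))
```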
In addition, each consumption object in the real consumption information may be scored through a preset scoring model, where the preset scoring model may be obtained by training on a third data set; the third data set includes real consumption information and predicted consumption information of target objects in the same time period, and each consumption object in the real consumption information is labelled with a score.
Alternatively, the matching degree between the real consumption information and the predicted consumption information may be calculated by means of an evaluation function.
S203: acquiring initial parameter values of the neural network;
in this embodiment, the initial parameter values of the neural network may include: initial weight and threshold, etc.
S204: training the neural network model based on initial parameter values and a second data set of the neural network.
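Steps S201 to S204 can be illustrated with a deliberately small stand-in. The embodiment does not restrict the type of neural network, so this sketch swaps in a single linear neuron trained by stochastic gradient descent. The two input features (fraction of real consumption objects that are matched and fraction that are unmatched), the labelled scores in `SAMPLES`, the learning rate and the epoch count are all assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], float]  # (features, labelled score)

# Hypothetical second data set: [matched fraction, unmatched fraction] -> score.
SAMPLES: List[Sample] = [
    ([1.0, 0.0], 10.0),
    ([0.5, 0.5], 4.5),
    ([0.0, 1.0], -1.0),
    ([0.5, 0.0], 5.0),
]


def train_neuron(samples: Sequence[Sample], epochs: int = 5000,
                 lr: float = 0.1, seed: int = 0) -> Tuple[List[float], float]:
    """Train one linear neuron with squared-error stochastic gradient descent."""
    rng = random.Random(seed)
    n_features = len(samples[0][0])
    # S203: initial parameter values (small random weights, zero threshold/bias).
    weights = [rng.uniform(-0.1, 0.1) for _ in range(n_features)]
    bias = 0.0
    # S204: adjust the parameters on the labelled data set.
    for _ in range(epochs):
        for features, target in samples:
            pred = bias + sum(w * x for w, x in zip(weights, features))
            error = pred - target
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


def predict(weights: Sequence[float], bias: float, features: Sequence[float]) -> float:
    return bias + sum(w * x for w, x in zip(weights, features))


if __name__ == "__main__":
    w, b = train_neuron(SAMPLES)
    print(round(predict(w, b, [1.0, 0.0]), 3))
```

A practical scoring model would use a richer architecture and far more labelled target objects; the sketch only shows how initial parameter values and a labelled data set enter the training loop.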
Training mode two:
S301: acquiring a calculation method for calculating the matching degree of each consumption object in the real consumption information and each consumption object in the predicted consumption information;
in this embodiment, a plurality of methods may be set to calculate the matching degree between each consumption object in the real consumption information and each consumption object in the predicted consumption information, which is not limited in this embodiment. Preferably, the following method can be employed:
Method one: for each consumption object in the real consumption information:
acquiring attribute information of the consumption object in the real consumption information;
acquiring attribute information of all consumption objects in the predicted consumption information;
and calculating the matching degree between the attribute information of the consumption object in the real consumption information and the attribute information of the consumption objects contained in the predicted consumption information.
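One possible realization of the attribute-based matching degree is sketched below. The embodiment does not fix a formula, so this sketch assumes Jaccard similarity over attribute name/value pairs and takes the best match over all predicted consumption objects; the attribute names `category`, `price_band` and `merchant` are hypothetical.

```python
from typing import Dict, List

Attributes = Dict[str, str]


def attribute_similarity(a: Attributes, b: Attributes) -> float:
    """Jaccard similarity of two attribute sets (an assumed measure)."""
    pairs_a, pairs_b = set(a.items()), set(b.items())
    union = pairs_a | pairs_b
    return len(pairs_a & pairs_b) / len(union) if union else 0.0


def matching_degree(real_object: Attributes, predicted_objects: List[Attributes]) -> float:
    """Matching degree of one real consumption object against all predicted ones."""
    return max((attribute_similarity(real_object, p) for p in predicted_objects),
               default=0.0)


if __name__ == "__main__":
    real = {"category": "food", "price_band": "low", "merchant": "cafe"}
    predicted = [
        {"category": "food", "price_band": "low", "merchant": "bakery"},
        {"category": "travel", "price_band": "high", "merchant": "airline"},
    ]
    print(matching_degree(real, predicted))
```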
Alternatively, the matching degree of each consumption object in the real consumption information and each consumption object in the predicted consumption information may be calculated through a trained classification model;
the training method of the classification model comprises the following steps:
acquiring a third data set; the third data set comprises two ontology data sets, and each ontology data set contains different consumption objects;
generating a matching relation between each consumption object in the two ontology datasets;
calculating the matching degree between the consumption objects based on the attribute information of different dimensions, thereby obtaining a third data set;
and training the classification model based on the third data set to obtain a method for calculating the matching degree of the consumption objects in different consumption information.
S302: determining a scoring rule for each consumption object in the real consumption information, wherein the scoring rule is determined based on the matching degree between each consumption object in the real consumption information and the consumption objects in the predicted consumption information. For example, a higher score may be set for consumption objects with a high matching degree, a lower score for consumption objects with a low matching degree, and a score of 0 or a negative score for consumption objects that cannot be matched.
S303: determining a statistical rule of scores of all consumption objects in the real consumption information;
the statistical rule indicates how to count the scores of all consumption objects in the real consumption information.
For example: the scores of all consumption objects in the real consumption information can be added, or the scores of all consumption objects in the real consumption information can be weighted and summed.
S304: and training a preset expert system based on a calculation method of the matching degree of each consumption object in the real consumption information and each consumption object in the predicted consumption information, a scoring rule of each consumption object in the real consumption information and a statistical rule of scores of all consumption objects in the real consumption information.
The expert system thus obtained can calculate the final score of the target object after inputting the actual consumption information and the predicted consumption information of the target object.
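The scoring rule of S302 and the statistical rule of S303 can be sketched as follows. The cut-off of 0.7 and the score values 2.0, 1.0 and -1.0 are illustrative assumptions, as are the function names; the source only states that high matching degrees receive higher scores, low ones lower scores, and unmatched objects 0 or a negative score.

```python
from typing import Optional, Sequence


def score_object(match: float, high_cutoff: float = 0.7) -> float:
    """Scoring rule: map one matching degree to a score (values are assumptions)."""
    if match >= high_cutoff:
        return 2.0   # high matching degree
    if match > 0.0:
        return 1.0   # low matching degree
    return -1.0      # object could not be matched


def total_score(matches: Sequence[float],
                weights: Optional[Sequence[float]] = None) -> float:
    """Statistical rule: plain sum, or weighted sum when weights are given."""
    if weights is None:
        weights = [1.0] * len(matches)
    if len(weights) != len(matches):
        raise ValueError("weights and matches must have the same length")
    return sum(w * score_object(m) for w, m in zip(weights, matches))
```

For three consumption objects with matching degrees 0.9, 0.4 and 0.0, the unweighted total is 2.0 + 1.0 - 1.0 = 2.0.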
S107: determining whether the high-risk target object has abnormal consumption based on the relation between the score of the high-risk target object and a preset threshold value.
In this embodiment, the total score of the high-risk target object depends on the consumption objects in the real consumption information, and these consumption objects, including their number, change across scenarios and time periods. If the preset threshold value stayed fixed while the number of consumption objects in the real consumption information changed, the accuracy of the determination would be affected. To improve the accuracy of the final determination result, the preset threshold value may therefore change with the number of consumption objects included in the real consumption information. Based on this, the embodiment further includes:
determining the total number of consumption objects included in the real consumption information of the second time period;
the threshold value is determined based on the total number of consumption objects included in the second time period real consumption information.
In this embodiment, the determining whether the consumption behavior of the user is abnormal based on the relationship between the score of the high-risk target object and the preset threshold includes the following two cases:
if the score of the high-risk target object is larger than a preset threshold value, it indicates that the consumption behavior of the target object is not abnormal;
and if the score of the high-risk target object is less than or equal to a preset threshold value, indicating that the consumption behavior of the target object is abnormal.
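The two cases above, together with a threshold that depends on the number of consumption objects, can be sketched as below. The linear scaling (`per_object * num_objects`) is an assumption made for illustration; the source only says the threshold is determined based on the total number of consumption objects in the second time period.

```python
def threshold_for(num_objects: int, per_object: float = 1.0) -> float:
    """Threshold that grows with the number of consumption objects (assumed linear)."""
    return per_object * num_objects


def is_abnormal(score: float, num_objects: int, per_object: float = 1.0) -> bool:
    """Abnormal when the score is less than or equal to the threshold."""
    return score <= threshold_for(num_objects, per_object)
```

With this rule a total score of 3.5 is normal for three consumption objects but abnormal for four, which shows why a fixed threshold would misjudge periods with more purchases.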
For a credit card, if the consumption behavior of the user is abnormal, the current credit card may be considered to have a risk of being lent to another person.
In this embodiment, once abnormal consumption behavior of the target object has been detected, the face information of the consumer is acquired when the target object is detected to be used again; the face information of the consumer is recognized, and it is judged whether it is consistent with the face information of the user to which the target object belongs; if the two are inconsistent, it is verified whether the consumer conforms to a preset identity.
The preset identity may include: a person whom the user to which the target object belongs has authorized to consume with the target object, or a person related within three generations to the user to which the target object belongs.
If it is detected that the consumer is not the user to which the target object belongs and the consumer does not conform to the preset identity, an abnormal consumption prompt is sent to the user to which the target object belongs.
In the embodiment, under the condition that the credit card is monitored to be consumed again, the face information of the consumer can be acquired by calling the shop camera.
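The follow-up check described above reduces to a short decision procedure. In the sketch below, face recognition and the preset-identity lookup are abstracted into boolean inputs; the names `Action` and `follow_up` are hypothetical and not taken from the source.

```python
from enum import Enum


class Action(Enum):
    NO_ACTION = "no_action"
    ALERT_OWNER = "alert_owner"


def follow_up(flagged_abnormal: bool,
              face_matches_owner: bool,
              has_preset_identity: bool) -> Action:
    """Decide whether to prompt the owner when a flagged card is used again."""
    if not flagged_abnormal:
        return Action.NO_ACTION      # card was never flagged; nothing to check
    if face_matches_owner:
        return Action.NO_ACTION      # the owner is the one consuming
    if has_preset_identity:
        return Action.NO_ACTION      # authorized person or close relative
    return Action.ALERT_OWNER        # unknown consumer on a flagged card
```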
In this embodiment, first, a high-risk target object is screened based on the repayment information of the target object and the identity information of the user to which the target object belongs, so that the data volume of the target object to be detected is reduced, and the data processing efficiency is improved. Then, acquiring consumption information of a high-risk target object in a first time period and basic information of a client to which the target object belongs; inputting the consumption information of the high-risk target object in a first time period and the basic information of a client to which the high-risk target object belongs into a preset consumption prediction model to obtain the consumption information of the high-risk target object in a second time period; and obtaining a scoring result of the high-risk target object based on the real consumption information of the high-risk target object in the second time period, the consumption information predicted by the target object in the second time period and a preset scoring model, wherein the scoring result of the target object is determined based on the correlation between the real consumption information and the predicted consumption information, and the scoring model is obtained after training through a preset second data set. And determining whether the high-risk target object has abnormal consumption condition or not based on the relation between the score of the high-risk target object and a preset threshold value. In this way, the potential borrowing risk of the credit card can be found in time, and the risk born by the bank is reduced as much as possible.
Referring to fig. 4, a schematic structural diagram of a data analysis apparatus disclosed in an embodiment of the present invention is shown, and in this embodiment, the apparatus includes:
a first obtaining unit 401, configured to obtain repayment information of a target object and identity information of a user corresponding to the target object; the target object is a loan product;
a screening unit 402, configured to screen a high-risk target object based on the repayment information of the target object and the identity information of the user to which the target object belongs; the high-risk target object represents a target object which is likely to have abnormal consumption, and consumption information of the high-risk target object in a first time period and basic information of a client to which the target object belongs are acquired;
a second obtaining unit 403, configured to obtain consumption information of a high-risk target object in a first time period and basic information of a client to which the target object belongs;
the consumption prediction model 404 is configured to input consumption information of the high-risk target object in a first time period and basic information of a client to which the high-risk target object belongs into a preset consumption prediction model, so as to obtain consumption information of the high-risk target object in a second time period;
a third obtaining unit 405, configured to obtain real consumption information of the high-risk target object in a second time period;
the scoring result determining unit 406 is configured to obtain a scoring result of the high-risk target object based on the real consumption information, the predicted consumption information, and a preset scoring model of the high-risk target object in the second time period; the target object scoring result is determined based on the correlation between the real consumption information and the predicted consumption information, and the scoring model is obtained after training through a preset second data set;
the abnormal consumption identification unit 407 is configured to determine whether an abnormal consumption condition exists in the high-risk target object based on a relationship between the score of the high-risk target object and a preset threshold.
Optionally, the screening unit includes:
the first screening subunit is used for judging whether the repayment information of the target object is consistent with the identity information of the user to which the target object belongs;
and the second screening subunit is used for indicating that the target object is a high-risk target object if the repayment information of the target object is inconsistent with the identity information of the user to which the target object belongs.
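A minimal sketch of the two screening subunits follows, assuming that each repayment record carries an identifier of the payer that can be compared with the identifier of the card owner. The `TargetObject` fields and the function names are hypothetical; the source does not define what "consistent" means in detail.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TargetObject:
    object_id: str                      # e.g. a credit card identifier
    owner_id: str                       # identity of the user the card belongs to
    repayment_payer_ids: List[str] = field(default_factory=list)


def is_high_risk(obj: TargetObject) -> bool:
    """High risk when any repayment was made by someone other than the owner."""
    return any(payer != obj.owner_id for payer in obj.repayment_payer_ids)


def screen(objects: List[TargetObject]) -> List[TargetObject]:
    """Keep only the high-risk target objects for the later, costlier steps."""
    return [obj for obj in objects if is_high_risk(obj)]
```

Filtering first keeps the prediction and scoring steps limited to the subset of target objects that actually need checking.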
Optionally, the apparatus further includes:
a training unit of a first scoring model, configured for:
constructing a neural network model;
acquiring a second data set; the second data set comprises real consumption information and predicted consumption information of high-risk target objects in the same time period, and the target objects in the second data set are marked with scoring results;
acquiring initial parameter values of the neural network;
training the neural network model based on initial parameter values and a second data set of the neural network.
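The training steps listed above can be illustrated with a very small network written in plain Python. This is only a sketch under several assumptions: the real and predicted consumption information are taken to be already encoded as a numeric feature vector, the labelled scoring result is a single number, and the architecture (one hidden tanh layer, squared-error loss, plain stochastic gradient descent) is chosen for brevity rather than taken from the source.

```python
import math
import random
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], float]  # (feature vector, labelled score)


class TinyScoringNet:
    """One hidden tanh layer with a linear output, trained by plain SGD."""

    def __init__(self, n_in: int, n_hidden: int = 4, seed: int = 0) -> None:
        rng = random.Random(seed)  # source of the initial parameter values
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _hidden(self, x: Sequence[float]) -> List[float]:
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.w1, self.b1)]

    def predict(self, x: Sequence[float]) -> float:
        h = self._hidden(x)
        return sum(w * hi for w, hi in zip(self.w2, h)) + self.b2

    def train(self, data: Sequence[Sample], epochs: int = 200, lr: float = 0.05) -> None:
        for _ in range(epochs):
            for x, y in data:
                h = self._hidden(x)
                err = sum(w * hi for w, hi in zip(self.w2, h)) + self.b2 - y
                for j, hj in enumerate(h):
                    # Gradient through tanh, using w2[j] before it is updated.
                    grad_h = err * self.w2[j] * (1.0 - hj * hj)
                    self.w2[j] -= lr * err * hj
                    self.b1[j] -= lr * grad_h
                    for i, xi in enumerate(x):
                        self.w1[j][i] -= lr * grad_h * xi
                self.b2 -= lr * err


def mean_squared_error(net: TinyScoringNet, data: Sequence[Sample]) -> float:
    return sum((net.predict(x) - y) ** 2 for x, y in data) / len(data)
```

In practice the second data set would be far larger and a standard framework would be used; the sketch only shows the sequence of constructing the model, fixing initial parameter values, and fitting to labelled samples.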
Optionally, the training unit of the second scoring model is configured to:
acquiring a calculation method for calculating the matching degree of each consumption object in the real consumption information and each consumption object in the predicted consumption information;
determining a scoring rule of each consumption object in the real consumption information;
determining a statistical rule for the scores of all consumption objects in the real consumption information;
and training a preset expert system based on the calculation method of the matching degree of each consumption object in the real consumption information and each consumption object in the predicted consumption information, the scoring rule of each consumption object in the real consumption information, and the statistical rule for the scores of all consumption objects in the real consumption information.
Optionally, the abnormal consumption identification unit is specifically configured to:
if the score of the high-risk target object is larger than a preset threshold value, it indicates that the consumption behavior of the target object is not abnormal;
and if the score of the high-risk target object is less than or equal to a preset threshold value, indicating that the consumption behavior of the target object is abnormal.
Optionally, the apparatus is further configured for:
determining the total number of consumption objects included in the real consumption information of the second time period;
the threshold value is determined based on the total number of consumption objects included in the second time period real consumption information.
Optionally, the apparatus further includes:
an anomaly-prompting detection unit for:
monitoring the use dynamics of the target object under the condition that the target object is detected to have consumption abnormal behaviors;
when the target object is monitored to be reused, acquiring the face information of a consumer using the target object;
judging whether the face information of the consumer is consistent with the preset face information of the user to which the target object belongs;
and if the face information of the consumer is not consistent with the preset face information of the user to which the target object belongs, sending a consumption abnormity prompt to the user to which the target object belongs.
Through the device of the embodiment, the high-risk target object is screened based on the repayment information of the target object and the identity information of the user to which the target object belongs, so that the data volume of the target object needing to be detected is reduced, and the data processing efficiency is improved. Then, acquiring consumption information of a high-risk target object in a first time period and basic information of a client to which the target object belongs; inputting the consumption information of the high-risk target object in a first time period and the basic information of a client to which the high-risk target object belongs into a preset consumption prediction model to obtain the consumption information of the high-risk target object in a second time period; and obtaining a scoring result of the high-risk target object based on the real consumption information of the high-risk target object in the second time period, the consumption information predicted by the target object in the second time period and a preset scoring model, wherein the scoring result of the target object is determined based on the correlation between the real consumption information and the predicted consumption information, and the scoring model is obtained after training through a preset second data set. And determining whether the high-risk target object has abnormal consumption condition or not based on the relation between the score of the high-risk target object and a preset threshold value. In this way, the potential borrowing risk of the credit card can be found in time, and the risk born by the bank is reduced as much as possible.
It should be noted that, in the present specification, the embodiments are all described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments may be referred to each other.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method of data analysis, comprising:
obtaining repayment information of a target object and identity information of a user corresponding to the target object; the target object is a loan product;
screening a high-risk target object based on the repayment information of the target object and the identity information of the user to which the target object belongs; the high-risk target object represents a target object which is likely to have abnormal consumption, and consumption information of the high-risk target object in a first time period and basic information of a client to which the target object belongs are acquired;
inputting the consumption information of the high-risk target object in a first time period and the basic information of a client to which the high-risk target object belongs into a preset consumption prediction model to obtain the consumption information of the high-risk target object in a second time period;
acquiring real consumption information of the high-risk target object in a second time period;
obtaining a scoring result of the high-risk target object based on the real consumption information, the predicted consumption information and a preset scoring model of the high-risk target object in a second time period; the target object scoring result is determined based on the correlation between the real consumption information and the predicted consumption information, and the scoring model is obtained after training through a preset second data set;
and determining whether the high-risk target object has abnormal consumption conditions or not based on the relation between the score of the high-risk target object and a preset threshold value.
2. The method of claim 1, wherein the screening the high-risk target object based on the repayment information of the target object and the identity information of the user to which the target object belongs comprises:
judging whether the repayment information of the target object is consistent with the identity information of the user to which the target object belongs;
and if the repayment information of the target object is inconsistent with the identity information of the user to which the target object belongs, indicating that the target object is a high-risk target object.
3. The method of claim 1, wherein the training process of the scoring model comprises:
constructing a neural network model;
acquiring a second data set; the second data set comprises real consumption information and predicted consumption information of high-risk target objects in the same time period, and the target objects in the second data set are marked with scoring results;
acquiring initial parameter values of the neural network;
training the neural network model based on initial parameter values and a second data set of the neural network.
4. The method of claim 1, wherein the training process of the scoring model comprises:
acquiring a calculation method for calculating the matching degree of each consumption object in the real consumption information and each consumption object in the predicted consumption information;
determining a scoring rule of each consumption object in the real consumption information;
determining a statistical rule for the scores of all consumption objects in the real consumption information;
and training a preset expert system based on the calculation method of the matching degree of each consumption object in the real consumption information and each consumption object in the predicted consumption information, the scoring rule of each consumption object in the real consumption information, and the statistical rule for the scores of all consumption objects in the real consumption information.
5. The method of claim 1, wherein determining whether the high-risk customer has abnormal consumption based on the relationship between the score of the high-risk target object and a preset threshold comprises:
if the score of the high-risk target object is larger than a preset threshold value, it indicates that the consumption behavior of the target object is not abnormal;
and if the score of the high-risk target object is less than or equal to a preset threshold value, indicating that the consumption behavior of the target object is abnormal.
6. The method of claim 1, further comprising:
determining the total number of consumption objects included in the real consumption information of the second time period;
the threshold value is determined based on the total number of consumption objects included in the second time period real consumption information.
7. The method of claim 1, further comprising:
monitoring the use dynamics of the target object under the condition that the target object is detected to have consumption abnormal behaviors;
when the target object is monitored to be reused, acquiring the face information of a consumer using the target object;
judging whether the face information of the consumer is consistent with the preset face information of the user to which the target object belongs;
and if the face information of the consumer is not consistent with the preset face information of the user to which the target object belongs, sending a consumption abnormity prompt to the user to which the target object belongs.
8. A data analysis apparatus, comprising:
the first acquisition unit is used for acquiring repayment information of the target object and identity information of a user corresponding to the target object; the target object is a loan product;
the screening unit is used for screening the high-risk target object based on the repayment information of the target object and the identity information of the user to which the target object belongs; the high-risk target object represents a target object which is likely to have abnormal consumption, and consumption information of the high-risk target object in a first time period and basic information of a client to which the target object belongs are acquired;
the second acquisition unit is used for acquiring consumption information of a high-risk target object in a first time period and basic information of a client to which the target object belongs;
the consumption prediction model is used for inputting the consumption information of the high-risk target object in a first time period and the basic information of a client to which the high-risk target object belongs into a preset consumption prediction model to obtain the consumption information of the high-risk target object in a second time period;
the third acquisition unit is used for acquiring the real consumption information of the high-risk target object in a second time period;
the scoring result determining unit is used for obtaining a scoring result of the high-risk target object based on the real consumption information, the predicted consumption information and a preset scoring model of the high-risk target object in a second time period; the target object scoring result is determined based on the correlation between the real consumption information and the predicted consumption information, and the scoring model is obtained after training through a preset second data set;
and the abnormal consumption identification unit is used for determining whether the high-risk target object has an abnormal consumption condition or not based on the relation between the score of the high-risk target object and a preset threshold value.
9. The apparatus of claim 8, wherein the screening unit comprises:
the first screening subunit is used for judging whether the repayment information of the target object is consistent with the identity information of the user to which the target object belongs;
and the second screening subunit is used for indicating that the target object is a high-risk target object if the repayment information of the target object is inconsistent with the identity information of the user to which the target object belongs.
10. The apparatus of claim 8, further comprising:
a training unit of a first scoring model to:
constructing a neural network model;
acquiring a second data set; the second data set comprises real consumption information and predicted consumption information of high-risk target objects in the same time period, and the target objects in the second data set are marked with scoring results;
acquiring initial parameter values of the neural network;
training the neural network model based on initial parameter values and a second data set of the neural network.
CN202010246736.2A 2020-03-31 2020-03-31 Data analysis method and device Active CN111461865B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010246736.2A CN111461865B (en) 2020-03-31 2020-03-31 Data analysis method and device

Publications (2)

Publication Number Publication Date
CN111461865A true CN111461865A (en) 2020-07-28
CN111461865B CN111461865B (en) 2024-02-02

Family

ID=71684266

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010246736.2A Active CN111461865B (en) 2020-03-31 2020-03-31 Data analysis method and device

Country Status (1)

Country Link
CN (1) CN111461865B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104839962A (en) * 2015-04-28 2015-08-19 百度在线网络技术(北京)有限公司 Smart wallet, information processing method thereof and device
CN105844501A (en) * 2016-05-18 2016-08-10 上海亿保健康管理有限公司 Consumption behavior risk control system and method
CN107203883A (en) * 2016-03-17 2017-09-26 阿里巴巴集团控股有限公司 A kind of risk control method and equipment
CN107679982A (en) * 2017-09-29 2018-02-09 电子科技大学 A kind of credit card risk checking method based on point process
CN109785162A (en) * 2018-12-13 2019-05-21 平安科技(深圳)有限公司 Medical insurance method for detecting abnormality, device, equipment and computer storage medium
US20190188769A1 (en) * 2017-12-14 2019-06-20 Wells Fargo Bank, N.A. Customized predictive financial advisory for a customer
CN110895758A (en) * 2019-12-02 2020-03-20 中国银行股份有限公司 Screening method, device and system for credit card account with cheating transaction

Also Published As

Publication number Publication date
CN111461865B (en) 2024-02-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant