CN110991650A - Method and device for training card maintenance identification model and identifying card maintenance behavior - Google Patents

Publication number
CN110991650A
Authority
CN
China
Prior art keywords
card
credit card
account
training
transaction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911162068.9A
Other languages
Chinese (zh)
Inventor
陈燕
王萌
朱晓丹
姚均霖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
4Paradigm Beijing Technology Co Ltd
Original Assignee
4Paradigm Beijing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 4Paradigm Beijing Technology Co Ltd
Priority to CN201911162068.9A
Publication of CN110991650A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Technology Law (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Development Economics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

A method and a device for training a card maintenance identification model and for identifying card maintenance behavior are disclosed. A first training sample is constructed from a first credit card account that a service party has judged to exhibit card maintenance behavior; the first training sample comprises at least one sample feature and a sample label indicating that the first credit card account exhibits card maintenance behavior. A second training sample is constructed from a second credit card account, that is, an account remaining in the set of credit card accounts maintained by the service party after the first credit card accounts are removed; the second training sample comprises at least one sample feature and a sample label indicating that the second credit card account does not exhibit card maintenance behavior. A card maintenance identification model for identifying whether a credit card account exhibits card maintenance behavior is then trained on at least one first training sample and at least one second training sample. A card maintenance identification model with high coverage and accuracy can thereby be obtained, and the validity of the label data of the constructed training samples can be ensured at the source.

Description

Method and device for training card maintenance identification model and identifying card maintenance behavior
Technical Field
The invention relates to the field of artificial intelligence, and in particular to a method and a device for training a card maintenance identification model, and to a method and a device for identifying whether a credit card account exhibits card maintenance behavior.
Background
Card maintenance (also rendered below as "card keeping", "card raising", or "card holding") means using part of a credit card's credit limit through purchases, cash withdrawals, and the like, then, after the bill date, spending or withdrawing the remaining credit limit and using that money to repay the amount already spent; repeating this operation allows each bill to be repaid.
At present, card maintenance customers are detected mainly with expert rules. Specifically, business rules for detecting card maintenance customers are formulated from the experience of business experts in combination with actual risk events, and are applied to all transactions within a period of time. The experts' experience comes mainly from domain knowledge and the review of existing risk events, and therefore targets card maintenance behaviors and abnormal behaviors that have already occurred and are relatively obvious. As a result, expert rules formulated in this way have few dimensions, high thresholds, and low coverage, and card maintenance customers are easily missed.
Disclosure of Invention
Exemplary embodiments of the invention aim to overcome the defect in the prior art that card maintenance customers are easily missed when they are detected based on expert rules.
According to a first aspect of the present invention, a training method for a card raising recognition model is provided, which includes: constructing a first training sample according to a first credit card account judged to have the card-keeping behavior by a service party, wherein the first training sample comprises a sample label used for indicating that the first credit card account has the card-keeping behavior and at least one sample characteristic; constructing a second training sample according to a second credit card account existing in the set after the first credit card account is removed from the credit card account set maintained by the business party, wherein the second training sample comprises a sample label used for indicating that no card-holding behavior exists in the second credit card account and at least one sample characteristic; and training a card-holding identification model for identifying whether card-holding behaviors exist in the credit card account or not based on at least one piece of the first training sample and at least one piece of the second training sample.
Optionally, the step of constructing the first training sample comprises: according to a first judging date that the first credit card account is judged to have card-keeping behavior by a service party, acquiring account information of the first credit card account in a first preset time range before the first judging date; determining characteristics of the first credit card account based on the acquired account information; constructing a first training sample corresponding to the first credit card account based on the label and the feature for indicating that card-holding behavior exists for the first credit card account.
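As an illustration only, the window-based construction of a positive sample described above might be sketched in Python as follows. The function name `build_positive_sample`, the 90-day window, and the two toy features are assumptions made for this sketch; the source does not prescribe them.

```python
from datetime import date, timedelta

def build_positive_sample(transactions, decision_date, window_days=180):
    """Build one positive (label=1) sample from transactions before decision_date."""
    start = decision_date - timedelta(days=window_days)
    # Keep only account information inside the window [start, decision_date).
    in_window = [t for t in transactions if start <= t["date"] < decision_date]
    features = {
        "txn_count": len(in_window),                            # hypothetical feature
        "txn_amount_sum": sum(t["amount"] for t in in_window),  # hypothetical feature
    }
    return {"label": 1, "features": features}

txns = [
    {"date": date(2019, 1, 10), "amount": 500.0},   # before the window, ignored
    {"date": date(2019, 5, 2), "amount": 3000.0},   # inside the window
    {"date": date(2019, 6, 30), "amount": 1200.0},  # after the decision date, ignored
]
sample = build_positive_sample(txns, decision_date=date(2019, 6, 1), window_days=90)
```

Restricting the features to data dated before the decision date keeps information that only became available after the service party's judgment out of the sample.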
Optionally, the step of constructing the second training sample comprises: determining a second judging date of the second credit card account according to a first judging date of one or more first credit card accounts judged by a service party to have card-keeping behavior, so that the date distribution of the second judging date is consistent or basically consistent with the date distribution of the first judging date; acquiring account information of the second credit card account within a second preset time range before the second judgment date; determining a characteristic of the second credit card account based on the acquired account information; constructing a second training sample corresponding to the second credit card account based on the label and the feature indicating that no card-holding behavior exists for the second credit card account.
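One possible way to keep the distribution of second decision dates close to that of the first decision dates is to reuse the positive dates themselves. The sketch below cycles through a shuffled copy of them; `assign_decision_dates` and its parameters are hypothetical names, and the source does not say which matching procedure is used.

```python
import random
from collections import Counter
from datetime import date

def assign_decision_dates(negative_ids, positive_dates, seed=0):
    """Give each negative account a pseudo decision date taken from the positive dates.

    Cycling through a shuffled copy of the positive dates keeps the proportions
    of the assigned dates close to the positive date distribution.
    """
    rng = random.Random(seed)
    pool = list(positive_dates)
    rng.shuffle(pool)
    return {acc: pool[i % len(pool)] for i, acc in enumerate(negative_ids)}

# Two positive accounts were judged on 1 March, one on 1 April.
positive_dates = [date(2019, 3, 1), date(2019, 3, 1), date(2019, 4, 1)]
assigned = assign_decision_dates(["n1", "n2", "n3", "n4", "n5", "n6"], positive_dates)
date_counts = Counter(assigned.values())
```

With six negative accounts, the 2:1 ratio of the positive dates is reproduced exactly; with counts that are not a multiple of the pool size, the match is only approximate.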
Optionally, the training method of the card-raising recognition model further includes: constructing a second training sample for a second credit card account on which transaction activities occurred within a third predetermined time range before the second decision date; or rejecting second training samples corresponding to second credit card accounts which do not have transaction behaviors within a third preset time range before the second judgment date.
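The activity filter above can be sketched as a simple predicate. The helper `is_active` and the 90-day lookback are assumptions for illustration; the source only speaks of a "third predetermined time range".

```python
from datetime import date, timedelta

def is_active(txn_dates, decision_date, lookback_days=90):
    """True if at least one transaction falls in [decision_date - lookback, decision_date)."""
    start = decision_date - timedelta(days=lookback_days)
    return any(start <= d < decision_date for d in txn_dates)

# Transaction dates per candidate negative account (toy data).
accounts = {
    "A": [date(2019, 5, 20)],  # recent activity, kept
    "B": [date(2018, 1, 1)],   # activity too old, dropped
    "C": [],                   # dormant account, dropped
}
decision = date(2019, 6, 1)
kept = [acc for acc, dates in accounts.items() if is_active(dates, decision)]
```

Dropping dormant accounts avoids negative samples whose features are all empty, which would otherwise be trivially separable from the positive samples.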
Optionally, the account information includes at least one of: account transaction information; credit information of a user associated with the credit card account.
Optionally, the features are classified into transaction-class features and credit-class features.
Optionally, the transaction-class features include at least one of the following feature dimensions: monthly transaction condition, transaction merchant abnormality, consumption after repayment, transaction pattern between the bill date and the repayment date, transaction event reasonableness, and transaction pattern aggregation, wherein each feature dimension comprises one or more features; and/or the credit-class features include at least one of the following feature dimensions: credit card arrears condition, asset and liability condition, and personal credit condition, wherein each feature dimension comprises one or more features.
Optionally, the features relating to the monthly transaction condition include at least one of: the monthly consumption amount; the monthly number of consumption transactions; the monthly cash withdrawal amount; the monthly number of cash withdrawals; the monthly repayment amount; the monthly number of repayments; the monthly special-offer transaction amount; the monthly number of special-offer transactions; the monthly test transaction amount; the monthly number of test transactions; the monthly proportions of different transaction types; the monthly number of transactions whose amount exceeds a predetermined amount, wherein the predetermined amount is a value divisible by ten; the average monthly credit limit utilization rate; and the number of times the monthly credit limit utilization rate exceeds a predetermined threshold.
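A few of the monthly features listed above could be computed along the following lines. This is a minimal sketch: the function `monthly_features`, the threshold of 1000, and the approximation of utilization as monthly consumption divided by the credit limit are all assumptions, not definitions from the source.

```python
from collections import defaultdict
from datetime import date

def monthly_features(transactions, credit_limit, large_amount=1000):
    """transactions: list of (date, amount) consumption records."""
    by_month = defaultdict(list)
    for txn_date, amount in transactions:
        by_month[(txn_date.year, txn_date.month)].append(amount)
    features = {}
    for month, amounts in by_month.items():
        total = sum(amounts)
        features[month] = {
            "consumption_amount": total,
            "consumption_count": len(amounts),
            # Transactions above a round threshold (a value divisible by ten).
            "large_txn_count": sum(1 for a in amounts if a > large_amount),
            # Assumed simplification: monthly spend as a share of the limit.
            "utilization": total / credit_limit,
        }
    return features

txns = [(date(2019, 1, 5), 1200), (date(2019, 1, 20), 300), (date(2019, 2, 3), 5000)]
feats = monthly_features(txns, credit_limit=10000)
```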
Optionally, the features relating to consumption after repayment include at least one of: the number of transactions exceeding a predetermined amount within a fixed time range before and after each repayment; the number of times that the ratio of a transaction amount to the repayment amount falls within a predetermined proportion within a predetermined time range before and after each repayment; and the credit limit utilization rate within a predetermined time range after repayment.
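The second feature in this dimension (a transaction close in time to a repayment and close in amount to it) might be counted as sketched below. The names `repay_then_spend_count`, `window_days`, and `ratio_tol` are hypothetical, and the 3-day window and 10% tolerance are illustrative values only.

```python
from datetime import date

def repay_then_spend_count(repayments, purchases, window_days=3, ratio_tol=0.1):
    """Count repayments that have a purchase of similar amount within window_days."""
    count = 0
    for r_date, r_amt in repayments:
        for p_date, p_amt in purchases:
            near = abs((p_date - r_date).days) <= window_days
            similar = abs(p_amt - r_amt) <= ratio_tol * r_amt
            if near and similar:
                count += 1
                break  # count each repayment at most once
    return count

repayments = [(date(2019, 3, 5), 10000), (date(2019, 4, 5), 8000)]
purchases = [(date(2019, 3, 6), 9800), (date(2019, 4, 20), 8000)]
matches = repay_then_spend_count(repayments, purchases)
```

Here only the March repayment is matched: the April purchase has the same amount as the April repayment but falls 15 days later, outside the window.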
Optionally, the characteristics relating to the pattern of transactions during the bill day to the payment day include at least one of: a transaction amount related to a predetermined type of transaction activity occurring within a predetermined time window between the billing date and the payment date; the transaction amount related to the predetermined type of transaction action occurring within a predetermined time window between the bill date and the repayment date is in proportion to the transaction amount related to the predetermined type of transaction action within the bill period; the number of times of predetermined types of transaction actions occurring within a predetermined time window between the billing date and the payment date; a ratio of a number of predetermined types of transaction activities occurring within a predetermined time window between a billing day and a payment day to a number of times the predetermined types of transaction activities occur within a billing period, wherein the predetermined types of transaction activities include at least one of: consumption behavior, cash-out behavior, payment-on-behalf behavior, and repayment behavior.
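The four features above (amount, amount ratio, count, and count ratio) can be illustrated for one transaction type. The sketch below assumes the billing period is the closed interval from `cycle_start` to `bill_date` and the window of interest is everything after the bill date up to and including `due_date`; these boundaries and the function name are assumptions.

```python
from datetime import date

def bill_to_due_features(purchases, cycle_start, bill_date, due_date):
    """purchases: list of (date, amount) for one predetermined transaction type."""
    cycle = [a for d, a in purchases if cycle_start <= d <= bill_date]
    window = [a for d, a in purchases if bill_date < d <= due_date]
    cycle_amount = sum(cycle)
    return {
        "window_amount": sum(window),
        "window_count": len(window),
        # Ratios fall back to 0.0 when the billing period has no transactions.
        "amount_ratio": sum(window) / cycle_amount if cycle_amount else 0.0,
        "count_ratio": len(window) / len(cycle) if cycle else 0.0,
    }

purchases = [
    (date(2019, 5, 10), 1000), (date(2019, 5, 25), 1000),  # inside the billing period
    (date(2019, 6, 5), 4000), (date(2019, 6, 12), 2000),   # between bill and due date
]
f = bill_to_due_features(purchases, date(2019, 5, 2), date(2019, 6, 1), date(2019, 6, 20))
```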
Optionally, the features relating to transaction merchant abnormality include at least one of: the degree of deviation, with respect to merchant type, of the consumption amount at a merchant that transacts with the credit card account; and the monthly transaction time interval at a merchant that transacts with the credit card account.
Optionally, the features relating to transaction event reasonableness include at least one of: the number of changes of offline transaction location within a predetermined time range; and the number of changes of the terminal device used for offline transactions of the credit card account within a predetermined time range.
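Both features in this dimension reduce to counting how often a value changes between consecutive transactions. A minimal sketch, with the hypothetical helper `change_count` and integer timestamps standing in for real transaction times:

```python
def change_count(events):
    """events: list of (timestamp, value); count value changes in time order."""
    ordered = [value for _, value in sorted(events)]
    return sum(1 for prev, cur in zip(ordered, ordered[1:]) if prev != cur)

# Offline transaction locations, deliberately given out of time order.
locations = [(3, "Shanghai"), (1, "Beijing"), (2, "Beijing"), (4, "Beijing")]
terminals = [(1, "T1"), (2, "T1")]
location_changes = change_count(locations)
terminal_changes = change_count(terminals)
```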
Optionally, the features relating to transaction pattern aggregation include at least one of: whether, counting the transactions within a predetermined time range, there is a date with more transactions than a predetermined threshold; whether, counting the transactions within a predetermined time range, there is a place with more transactions than a predetermined threshold; whether, counting the transactions within a predetermined time range, there is a terminal with more transactions than a predetermined threshold; whether, counting only the transactions within a predetermined time range whose amount exceeds a predetermined value, there is a date with more such transactions than a predetermined threshold; whether, counting only the transactions within a predetermined time range whose amount exceeds a predetermined value, there is a place with more such transactions than a predetermined threshold; whether, counting only the transactions within a predetermined time range whose amount exceeds a predetermined value, there is a terminal with more such transactions than a predetermined threshold; whether, counting the transactions within a predetermined time range after hotspot transactions are removed, there is a date with more transactions than a predetermined threshold; whether, counting the transactions within a predetermined time range after hotspot transactions are removed, there is a place with more transactions than a predetermined threshold; and whether, counting the transactions within a predetermined time range after hotspot transactions are removed, there is a date with more transactions than a predetermined threshold.
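The aggregation features share one pattern: group transactions by a key (date, place, or terminal), optionally after filtering, and check whether any group exceeds a threshold. The sketch below covers the unfiltered and the large-amount variants; `has_cluster` and its parameters are hypothetical, and hotspot removal is left out because the source does not define it here.

```python
from collections import Counter

def has_cluster(transactions, key, threshold, min_amount=None):
    """True if some value of `key` occurs in more than `threshold` transactions."""
    if min_amount is not None:
        # Large-amount variant: keep only transactions above min_amount.
        transactions = [t for t in transactions if t["amount"] > min_amount]
    counts = Counter(t[key] for t in transactions)
    return any(n > threshold for n in counts.values())

txns = [
    {"date": "2019-05-01", "terminal": "T1", "amount": 5000},
    {"date": "2019-05-01", "terminal": "T1", "amount": 4800},
    {"date": "2019-05-01", "terminal": "T2", "amount": 20},
    {"date": "2019-05-09", "terminal": "T1", "amount": 30},
]
date_cluster = has_cluster(txns, "date", threshold=2)
terminal_cluster = has_cluster(txns, "terminal", threshold=2)
large_terminal_cluster = has_cluster(txns, "terminal", threshold=2, min_amount=1000)
```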
Optionally, the features relating to the credit card arrears condition include at least one of: the number of times the number of overdue periods exceeds a predetermined number of periods; the number of times the overdue amount exceeds a predetermined value; the number of accounts with overdue payments; the ratio of the overdue amount to the amount due; the number of loss reports on the account; and the number of times the account was frozen.
Optionally, the features relating to the asset and liability condition include at least one of: the number of loans classified as special mention; the number of loans classified as substandard; the number of loans classified as doubtful; the number of loans classified as loss; and a debt ratio index obtained by dividing the debt amount by the average of the most recently approved credit card limits.
Optionally, the features relating to the personal credit condition include at least one of: the number of outstanding loans; the outstanding loan balance; the balance of uncancelled credit cards; the used balance of uncancelled credit cards; the outstanding balance of uncancelled quasi-credit cards; the number of defaults; the default amount; the longest overdue period of a credit card on which a default occurred; the number of overdraft months of a quasi-credit card on which a default occurred; the number of external guarantees; the external guarantee amount; and the guaranteed principal balance.
Optionally, the step of training a card-keeping recognition model for recognizing whether card-keeping behavior exists in the credit card account comprises: training a first card-holding recognition model for recognizing whether a credit card account has card-holding behaviors or not based on the sample labels and the transaction characteristics of the training samples; training a second card-holding recognition model for recognizing whether a card-holding behavior exists in the credit card account or not based on the sample label and the credit-class characteristics of the training sample; and using the first card maintenance identification model and the second card maintenance identification model together as a card maintenance identification model for identifying whether card maintenance behaviors exist in a credit card account.
Optionally, the training method of the card-raising recognition model further includes: assigning a first weight to the first card-raising identification model; and giving a second weight to the second card raising identification model, wherein the second weight is smaller than the first weight.
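The source states only that the second weight is smaller than the first; it does not say how the weights enter the final result. One plausible reading, shown purely as an assumption, is a weighted average of the two models' scores. The function name and the default weights 0.7 and 0.3 are illustrative.

```python
def combined_score(txn_score, credit_score, w_txn=0.7, w_credit=0.3):
    """Weighted average of the transaction-model and credit-model scores."""
    if not w_credit < w_txn:
        raise ValueError("the second (credit) weight must be smaller than the first")
    return (w_txn * txn_score + w_credit * credit_score) / (w_txn + w_credit)

# The transaction-class model dominates the combined result.
score = combined_score(txn_score=0.9, credit_score=0.2)
```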
Optionally, the training method of the card-raising recognition model further includes: analyzing characteristics of the credit card accounts to determine the importance of each characteristic to determining whether a card-raising behavior exists for the credit card account; and selecting one or more characteristics as sample characteristics according to the sequence of the importance from large to small.
Optionally, the step of determining the importance of each of the features to the determination of whether the card-holding behavior of the credit card account exists comprises: grouping according to the values of the characteristics; calculating a first proportion of the number of credit card accounts with card maintenance behaviors in each group to the number of all credit card accounts with card maintenance behaviors in a credit card account set, and a second proportion of the number of credit card accounts without card maintenance behaviors in each group to the number of all credit card accounts without card maintenance behaviors in the credit card account set; determining the importance of the features to determine whether card-holding behavior exists for the credit card account, wherein the importance of the features is equal to the sum of the importance of the features under each group, and the importance of the features under a single group is positively correlated with the difference between the first ratio and the second ratio.
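The grouping procedure above resembles the information value statistic commonly used in credit scoring. The sketch below uses that formula, in which each group contributes (p1 - p2) * ln(p1 / p2); this per-group formula is an assumption, since the source says only that the per-group importance is positively correlated with the difference between the two proportions. The function name, `bin_edges`, and the smoothing constant `eps` are likewise hypothetical.

```python
import math

def feature_importance(values, labels, bin_edges, eps=1e-6):
    """Sum per-group importance; labels are 1 (card maintenance) or 0 (none)."""
    n_bins = len(bin_edges) + 1
    pos, neg = [0] * n_bins, [0] * n_bins
    for v, y in zip(values, labels):
        b = sum(v >= edge for edge in bin_edges)  # index of the group for value v
        if y == 1:
            pos[b] += 1
        else:
            neg[b] += 1
    total_pos, total_neg = sum(pos), sum(neg)
    if total_pos == 0 or total_neg == 0:
        raise ValueError("both classes must be present")
    importance = 0.0
    for p, n in zip(pos, neg):
        p1 = max(p / total_pos, eps)  # first proportion, smoothed to avoid log(0)
        p2 = max(n / total_neg, eps)  # second proportion, smoothed likewise
        importance += (p1 - p2) * math.log(p1 / p2)
    return importance

labels = [0, 0, 1, 1]
separating = feature_importance([1, 2, 8, 9], labels, bin_edges=[5])
uninformative = feature_importance([1, 8, 2, 9], labels, bin_edges=[5])
```

A feature whose groups split the two classes cleanly receives a large value, while a feature whose groups contain both classes in equal proportions receives zero, so sorting by this value yields the descending-importance order used for feature selection.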
Optionally, the card-raising identification model is a gradient boosting decision tree model.
Optionally, the training method of the card-raising recognition model further includes: and acquiring new sample data, and performing incremental training on the card-raising identification model based on the new sample data.
According to a second aspect of the present invention, there is provided a method for identifying whether a card-holding behavior of a credit card exists, comprising: acquiring account information of a credit card account to be identified; and identifying whether card-keeping behavior exists in the credit card account to be identified by using a card-keeping identification model based on the account information, wherein the card-keeping identification model is obtained by training according to the training method of the first aspect of the invention.
Optionally, the step of identifying whether the credit card account to be identified has card-keeping behavior by using a card-keeping identification model comprises: extracting one or more characteristics from the account information to construct a prediction sample; inputting the prediction sample into the card-keeping identification model to obtain a score which is output by the card-keeping identification model and used for representing the probability of the card-keeping behavior of the credit card account to be identified.
Optionally, the method further comprises: and if the overdue information which exceeds a preset term number exists in the credit card account to be identified, outputting the identification result of whether the credit card account to be identified has the card-holding behavior and the overdue information in a correlated manner.
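The identification step, together with the associated output of overdue information, might look as follows. Everything here is a hypothetical sketch: `identify`, the dictionary layout, the limit of 3 periods, and `toy_score`, which merely stands in for the trained model's probability output.

```python
def identify(account, score_fn, max_overdue_periods=3):
    """Score one account and attach overdue information when it exceeds the limit."""
    result = {"account_id": account["id"], "score": score_fn(account["features"])}
    overdue = account.get("overdue_periods", 0)
    if overdue > max_overdue_periods:
        result["overdue_periods"] = overdue  # output in association with the result
    return result

def toy_score(features):
    # Stand-in for the trained model (hypothetical, not the source's model).
    return min(1.0, features["large_txn_count"] / 10)

r1 = identify({"id": "A1", "features": {"large_txn_count": 8}, "overdue_periods": 5}, toy_score)
r2 = identify({"id": "A2", "features": {"large_txn_count": 2}}, toy_score)
```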
According to a third aspect of the present invention, a device for training a card maintenance identification model is provided, comprising: a first construction unit, configured to construct a first training sample according to a first credit card account judged by a service party to have card maintenance behavior, wherein the first training sample comprises a sample label indicating that the first credit card account has card maintenance behavior and at least one sample feature; a second construction unit, configured to construct a second training sample according to a second credit card account remaining in the set of credit card accounts maintained by the service party after the first credit card account is removed, wherein the second training sample comprises a sample label indicating that the second credit card account has no card maintenance behavior and at least one sample feature; and a training unit, configured to train, based on at least one first training sample and at least one second training sample, a card maintenance identification model for identifying whether a credit card account has card maintenance behavior.
Optionally, the first building unit comprises: the first acquisition unit is used for acquiring account information of the first credit card account in a first preset time range before a first judgment date according to the first judgment date that the first credit card account is judged to have the card-keeping behavior by a service party; a first determination unit configured to determine a characteristic of the first credit card account based on the acquired account information; a first construction subunit, configured to construct a first training sample corresponding to the first credit card account based on the label indicating that the card-holding behavior exists for the first credit card account and the feature.
Optionally, the second building unit comprises: a judging date determining unit, configured to determine a second judging date of the second credit card account according to a first judging date on which one or more first credit card accounts are judged to have card-holding behavior by a service party, so that a date distribution of the second judging date is consistent or substantially consistent with a date distribution of the first judging date; a second acquisition unit, configured to acquire account information of the second credit card account within a second predetermined time period before the second determination date; a second determination unit configured to determine a characteristic of the second credit card account based on the acquired account information; a second construction subunit, configured to construct a second training sample corresponding to the second credit card account based on the label and the feature indicating that no card-holding behavior exists for the second credit card account.
Optionally, the second constructing subunit constructs a second training sample for a second credit card account on which a transaction occurred within a third predetermined time range before the second determination date; or the device further comprises a rejecting unit, which is used for rejecting the second training sample corresponding to the second credit card account without transaction behavior within a third predetermined time range before the second determination date.
Optionally, the account information includes at least one of: account transaction information; credit information of a user associated with the credit card account.
Optionally, the features are classified into transaction-class features and credit-class features.
Optionally, the transaction-class features include at least one of the following feature dimensions: monthly transaction condition, transaction merchant abnormality, consumption after repayment, transaction pattern between the bill date and the repayment date, transaction event reasonableness, and transaction pattern aggregation, wherein each feature dimension comprises one or more features; and/or the credit-class features include at least one of the following feature dimensions: credit card arrears condition, asset and liability condition, and personal credit condition, wherein each feature dimension comprises one or more features.
Optionally, the features relating to the monthly transaction condition include at least one of: the monthly consumption amount; the monthly number of consumption transactions; the monthly cash withdrawal amount; the monthly number of cash withdrawals; the monthly repayment amount; the monthly number of repayments; the monthly special-offer transaction amount; the monthly number of special-offer transactions; the monthly test transaction amount; the monthly number of test transactions; the monthly proportions of different transaction types; the monthly number of transactions whose amount exceeds a predetermined amount, wherein the predetermined amount is a value divisible by ten; the average monthly credit limit utilization rate; and the number of times the monthly credit limit utilization rate exceeds a predetermined threshold.
Optionally, the features relating to consumption after repayment include at least one of: the number of transactions exceeding a predetermined amount within a fixed time range before and after each repayment; the number of times that the ratio of a transaction amount to the repayment amount falls within a predetermined proportion within a predetermined time range before and after each repayment; and the credit limit utilization rate within a predetermined time range after repayment.
Optionally, the characteristics relating to the pattern of transactions during the bill day to the payment day include at least one of: a transaction amount related to a predetermined type of transaction activity occurring within a predetermined time window between the billing date and the payment date; the transaction amount related to the predetermined type of transaction action occurring within a predetermined time window between the bill date and the repayment date is in proportion to the transaction amount related to the predetermined type of transaction action within the bill period; the number of times of predetermined types of transaction actions occurring within a predetermined time window between the billing date and the payment date; a ratio of a number of predetermined types of transaction activities occurring within a predetermined time window between a billing day and a payment day to a number of times the predetermined types of transaction activities occur within a billing period, wherein the predetermined types of transaction activities include at least one of: consumption behavior, cash-out behavior, payment-on-behalf behavior, and repayment behavior.
Optionally, the features relating to transaction merchant abnormality include at least one of: the degree of deviation, with respect to merchant type, of the consumption amount at a merchant that transacts with the credit card account; and the monthly transaction time interval at a merchant that transacts with the credit card account.
Optionally, the features relating to transaction event reasonableness include at least one of: the number of changes of offline transaction location within a predetermined time range; and the number of changes of the terminal device used for offline transactions of the credit card account within a predetermined time range.
Optionally, the features relating to transaction pattern aggregation include at least one of: whether, counting the transactions within a predetermined time range, there is a date with more transactions than a predetermined threshold; whether, counting the transactions within a predetermined time range, there is a place with more transactions than a predetermined threshold; whether, counting the transactions within a predetermined time range, there is a terminal with more transactions than a predetermined threshold; whether, counting only the transactions within a predetermined time range whose amount exceeds a predetermined value, there is a date with more such transactions than a predetermined threshold; whether, counting only the transactions within a predetermined time range whose amount exceeds a predetermined value, there is a place with more such transactions than a predetermined threshold; whether, counting only the transactions within a predetermined time range whose amount exceeds a predetermined value, there is a terminal with more such transactions than a predetermined threshold; whether, counting the transactions within a predetermined time range after hotspot transactions are removed, there is a date with more transactions than a predetermined threshold; whether, counting the transactions within a predetermined time range after hotspot transactions are removed, there is a place with more transactions than a predetermined threshold; and whether, counting the transactions within a predetermined time range after hotspot transactions are removed, there is a date with more transactions than a predetermined threshold.
Optionally, the features relating to the credit card arrears condition include at least one of: the number of times the number of overdue periods exceeds a predetermined number of periods; the number of times the overdue amount exceeds a predetermined value; the number of accounts with overdue payments; the ratio of the overdue amount to the amount due; the number of loss reports on the account; and the number of times the account was frozen.
Optionally, the features relating to the asset and liability condition include at least one of: the number of loans classified as special mention; the number of loans classified as substandard; the number of loans classified as doubtful; the number of loans classified as loss; and a debt ratio index obtained by dividing the debt amount by the average of the most recently approved credit card limits.
Optionally, the features relating to the personal credit condition include at least one of: the number of outstanding loans; the outstanding loan balance; the balance of uncancelled credit cards; the used balance of uncancelled credit cards; the outstanding balance of uncancelled quasi-credit cards; the number of defaults; the default amount; the longest overdue period of a credit card on which a default occurred; the number of overdraft months of a quasi-credit card on which a default occurred; the number of external guarantees; the external guarantee amount; and the guaranteed principal balance.
Optionally, the training unit comprises: a first training unit, configured to train, based on the sample labels and the transaction-class features of the training samples, a first card maintenance identification model for identifying whether a credit card account has card maintenance behavior; and a second training unit, configured to train, based on the sample labels and the credit-class features of the training samples, a second card maintenance identification model for identifying whether a credit card account has card maintenance behavior; wherein the first card maintenance identification model and the second card maintenance identification model together form the card maintenance identification model for identifying whether a credit card account has card maintenance behavior.
Optionally, the training device for the card maintenance identification model further comprises: a weight distribution unit, configured to assign a first weight to the first card maintenance identification model and a second weight to the second card maintenance identification model, the second weight being smaller than the first weight.
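The text assigns weights to the two sub-models but does not state how their outputs are combined. As a minimal sketch, assuming each sub-model outputs a probability-like score and that the combination is a normalized weighted average (both are assumptions; `combine_scores` and the default weights are hypothetical):

```python
def combine_scores(transaction_score: float, credit_score: float,
                   first_weight: float = 0.7, second_weight: float = 0.3) -> float:
    """Weighted average of the two sub-model scores (hypothetical combination rule)."""
    # The text only requires that the second weight be smaller than the first.
    if not second_weight < first_weight:
        raise ValueError("the second weight must be smaller than the first weight")
    total = first_weight + second_weight
    return (first_weight * transaction_score + second_weight * credit_score) / total
```

With these assumed weights, the transaction-based model dominates the final score, which matches the requirement that the credit-based model carries less weight.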
Optionally, the training device for the card maintenance identification model further comprises: an analysis unit, configured to analyze the features of the credit card accounts to determine the importance of each feature for judging whether a credit card account has card maintenance behavior; and a selecting unit, configured to select one or more features as sample features in descending order of importance.
Optionally, the analysis unit comprises: a grouping unit, configured to group the accounts according to the values of a feature; a calculating unit, configured to calculate, for each group, a first proportion of the number of credit card accounts with card maintenance behavior in the group to the number of all credit card accounts with card maintenance behavior in the credit card account set, and a second proportion of the number of credit card accounts without card maintenance behavior in the group to the number of all credit card accounts without card maintenance behavior in the credit card account set; and an importance determination unit, configured to determine the importance of the feature for judging whether a credit card account has card maintenance behavior, wherein the importance of the feature equals the sum of its importance in each group, and its importance in a single group is positively correlated with the difference between the first proportion and the second proportion.
Optionally, the card maintenance identification model is a gradient boosting decision tree model.
Optionally, the training device for the card maintenance identification model further comprises: a sample data acquisition unit, configured to acquire new sample data; and the training unit is further configured to perform incremental training on the card maintenance identification model based on the new sample data.
According to a third aspect of the present invention, there is provided an apparatus for identifying whether a credit card has card maintenance behavior, comprising: an acquisition unit, configured to acquire account information of a credit card account to be identified; and an identification unit, configured to identify, based on the account information and using a card maintenance identification model, whether the credit card account to be identified has card maintenance behavior, wherein the card maintenance identification model is obtained by training according to the training method of the first aspect of the invention.
Optionally, the identification unit comprises: an extraction unit, configured to extract one or more features from the account information to construct a prediction sample; and an operation unit, configured to input the prediction sample into the card maintenance identification model to obtain a score, output by the model, representing the probability that the credit card account to be identified has card maintenance behavior.
Optionally, the apparatus further comprises: an output unit, configured to output, in association with each other, the identification result of whether the credit card account to be identified has card maintenance behavior and the overdue information, in the case where the credit card account to be identified has overdue information exceeding a predetermined number of periods.
According to a fifth aspect of the present invention, there is provided a system comprising at least one computing device and at least one storage device storing instructions, wherein the instructions, when executed by the at least one computing device, cause the at least one computing device to perform a method as set forth in the first or second aspect of the present invention.
According to a sixth aspect of the present invention, there is provided a computer-readable storage medium storing instructions that, when executed by at least one computing device, cause the at least one computing device to perform the method as set forth in the first or second aspect of the present invention.
In the method and device for training a card maintenance identification model and identifying card maintenance behavior according to exemplary embodiments of the invention, training samples are constructed and a machine learning method is used to perform supervised training on them, so that a card maintenance identification model with higher coverage and accuracy can be obtained. Because the label data of the training samples is confirmed by the business party, the validity of the label data is ensured at the source.
Drawings
These and/or other aspects and advantages of the present invention will become more apparent and more readily appreciated from the following detailed description of the embodiments of the invention, taken in conjunction with the accompanying drawings of which:
FIG. 1 illustrates a flowchart of a method of training a card maintenance identification model according to an exemplary embodiment of the present invention;
FIG. 2 illustrates a flowchart of a method for identifying whether a credit card has card maintenance behavior according to an exemplary embodiment of the present invention;
FIG. 3 illustrates a block diagram of a training device for a card maintenance identification model according to an exemplary embodiment of the present invention;
FIG. 4 illustrates a block diagram of an apparatus for identifying whether a credit card has card maintenance behavior according to an exemplary embodiment of the present invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, exemplary embodiments thereof will be described in further detail below with reference to the accompanying drawings and detailed description.
FIG. 1 illustrates a flowchart of a method of training a card maintenance identification model according to an exemplary embodiment of the present invention. The method shown in FIG. 1 may be implemented entirely in software by a computer program, or may be executed by a specially configured computing device.
Referring to FIG. 1, in step S110, a first training sample is constructed from a first credit card account that the business party has determined to have card maintenance behavior, where the first training sample includes a sample label indicating that the first credit card account has card maintenance behavior and at least one sample feature.
The first credit card account is a credit card account that the business party has determined to have card maintenance behavior. The business party is the credit card issuer and may be, but is not limited to, a bank or another commercial institution. The sample label of the first training sample indicates that the first credit card account has card maintenance behavior and can be regarded as a ground-truth label provided by the business party. The first training sample can therefore be regarded as a "black sample", that is, a sample with card maintenance behavior.
The sample features of the first training sample come from the account information of the first credit card account. That is, the sample features may be determined based on the account information of the first credit card account. The process of determining sample features from account information is described further below and is not repeated here.
Card maintenance behavior is time-sensitive. If no time limit is imposed and the sample features are determined from the account information of the first credit card account within an arbitrary time period, the features extracted from that information may not represent card maintenance behavior even though the sample label indicates that it exists, and the resulting training sample is unsuitable for model training. For example, suppose a first credit card account shows card maintenance behavior only from January to March. If the account information from April onward is used to obtain the sample features, those features cannot reflect the card maintenance behavior of the account, yet the sample label indicates that the behavior exists; training on such a sample reduces the accuracy of the trained model.
The invention therefore introduces the date on which the business party determined that the first credit card account has card maintenance behavior (referred to as the first determination date for ease of distinction). Account information of the first credit card account within a first predetermined time range before the first determination date can be acquired, and the features of the first credit card account are determined based on the acquired account information. The first predetermined time range may be set according to actual conditions and may be, for example, the two months or the quarter before the first determination date.
Thus, by introducing the first determination date, determining the features of the first credit card account from its account information within the first predetermined time range before that date, and constructing the first training sample from those features, the sample features of the first training sample reflect card maintenance behavior, and its sample label can be regarded as the ground-truth label for the first credit card account. The quality of the first training sample is thereby ensured.
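The windowing described above can be sketched as follows. This is a minimal illustration under stated assumptions: account records are dictionaries with a `date` field, the window is expressed in days, and `records_before`, `build_sample` and `extract_features` are hypothetical names that do not come from the patent.

```python
from datetime import date, timedelta

def records_before(records, determination_date, window_days=60):
    """Keep only records dated within `window_days` before the determination date."""
    start = determination_date - timedelta(days=window_days)
    return [r for r in records if start <= r["date"] < determination_date]

def build_sample(records, determination_date, label, extract_features, window_days=60):
    """Build one labelled training sample from the windowed account records."""
    window = records_before(records, determination_date, window_days)
    return {"features": extract_features(window), "label": label}

# Example: only the February record falls inside the 60-day window before 1 April.
records = [{"date": date(2019, 2, 15), "amount": 100.0},
           {"date": date(2019, 5, 10), "amount": 50.0}]
sample = build_sample(records, date(2019, 4, 1), 1, lambda rs: {"n_records": len(rs)})
```

The same helper could be reused for white samples by passing the second determination date and a label of 0.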
In step S120, a second training sample is constructed from a second credit card account, that is, an account remaining in the set of credit card accounts maintained by the business party after the first credit card accounts are removed, where the second training sample includes a sample label indicating that the second credit card account has no card maintenance behavior and at least one sample feature.
The second credit card account is a credit card account that remains in the set of credit card accounts maintained by the business party after the first credit card accounts are removed. Since the first credit card accounts are those that the business party has determined to have card maintenance behavior, the second credit card account is regarded as an account without card maintenance behavior. The second training sample can be regarded as a "white sample", that is, a sample without card maintenance behavior.
The sample features of the second training sample come from the account information of the second credit card account. That is, the sample features may be determined based on the account information of the second credit card account. The process of determining sample features from account information is described further below and is not repeated here.
Again, card maintenance behavior is time-sensitive. If no time limit is imposed and the sample features are determined from the account information of the second credit card account within an arbitrary time period, the meaning of the extracted features may be inconsistent with the sample label: the features may reflect card maintenance behavior while the label indicates its absence, and the resulting training sample is unsuitable for model training. For example, suppose a second credit card account was a card maintenance account more than two years ago but has not been one in the last two years. If the account information from more than two years ago is used to obtain the sample features, those features reflect card maintenance behavior while the sample label indicates its absence; training on such a sample reduces the accuracy of the trained model.
For this purpose, the invention proposes determining a determination date for each second credit card account (referred to as the second determination date for ease of distinction) based on the first determination dates on which one or more first credit card accounts were determined by the business party to have card maintenance behavior, such that the distribution of the second determination dates is the same or substantially the same as the distribution of the first determination dates. Account information of the second credit card account within a second predetermined time range before the second determination date is acquired, the features of the second credit card account are determined based on the acquired account information, and a second training sample corresponding to the second credit card account is constructed from those features and a label indicating that the account has no card maintenance behavior. The second predetermined time range may be set according to actual conditions and may be, for example, the two months before the second determination date.
Thus, by making the distribution of the second determination dates the same or substantially the same as that of the first determination dates, determining the features of the second credit card account from its account information within the second predetermined time range before the second determination date, and constructing the second training sample from those features, the sample features of the second training sample reflect the absence of card maintenance behavior, consistent with the sample label of the second training sample, which indicates that absence. The quality of the second training sample is thereby ensured.
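The text does not say how the second determination dates are generated. One simple way to make their distribution match that of the first determination dates is to sample, with replacement, from the observed first determination dates. The sketch below assumes this approach; `assign_second_determination_dates` and its parameters are hypothetical.

```python
import random
from datetime import date

def assign_second_determination_dates(first_dates, second_account_ids, seed=0):
    """Give each white-sample account a date drawn from the empirical
    distribution of the black-sample determination dates."""
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    return {account_id: rng.choice(first_dates) for account_id in second_account_ids}

first_dates = [date(2019, 1, 10), date(2019, 1, 10), date(2019, 3, 5)]
assigned = assign_second_determination_dates(first_dates, ["a", "b", "c", "d"], seed=1)
```

Because dates that occur more often among the first accounts are drawn more often, the two distributions agree in expectation; with few accounts the match is only approximate.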
The business party generally judges whether a credit card account has card maintenance behavior from the account's transaction behavior, so a first credit card account generally has transaction behavior before the first determination date on which it was determined to have card maintenance behavior. The second determination date of a second credit card account, however, is generated from the first determination dates, so the second credit card account may or may not have transaction behavior before its second determination date. If it has none, there may be no transaction-related account information within the second predetermined time range before the second determination date; the features that can be extracted from such sparse account information are limited, and the quality of a second training sample built on them cannot be guaranteed.
The invention therefore provides that, when constructing second training samples, a second training sample may be constructed only for second credit card accounts that have transaction behavior within a third predetermined time range before the second determination date; alternatively, after the second training samples are constructed, those corresponding to second credit card accounts with no transaction behavior within the third predetermined time range before the second determination date may be removed. The third predetermined time range may be the same as or different from the second predetermined time range. As an example, the third predetermined time range may be the month containing the second determination date: a second training sample may be constructed for a second credit card account that has balance-changing transactions (for example, cash withdrawals and transfers) in that month, or the second training samples of second credit card accounts with no balance-changing transactions in that month may be removed.
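The filtering step can be sketched as below, taking the example in which the third predetermined time range is the calendar month containing the second determination date. The record layout and the names `active_in_determination_month` and `filter_second_accounts` are assumptions made for illustration.

```python
from datetime import date

def active_in_determination_month(transactions, determination_date):
    """True if any transaction falls in the calendar month of the determination date."""
    target = (determination_date.year, determination_date.month)
    return any((t["date"].year, t["date"].month) == target for t in transactions)

def filter_second_accounts(accounts, determination_dates):
    """Keep only white-sample accounts with activity in their determination month."""
    return [a for a in accounts
            if active_in_determination_month(a["transactions"], determination_dates[a["id"]])]

accounts = [
    {"id": "a", "transactions": [{"date": date(2019, 3, 2)}]},
    {"id": "b", "transactions": [{"date": date(2018, 11, 20)}]},
    {"id": "c", "transactions": []},
]
dates = {"a": date(2019, 3, 15), "b": date(2019, 3, 15), "c": date(2019, 3, 15)}
kept = filter_second_accounts(accounts, dates)
```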
In step S130, a card maintenance identification model for identifying whether a credit card account has card maintenance behavior is trained based on at least one first training sample and at least one second training sample.
As described above, the first training sample may be regarded as a "black sample" and the second training sample as a "white sample". Based on at least one black sample and at least one white sample, the card maintenance identification model can be trained by supervised learning. Considering that most of the features are continuous, the card maintenance identification model may be, but is not limited to, a gradient boosting decision tree (GBDT) model, which handles continuous variables well. That is, a GBDT algorithm may be used, in which each new decision tree fits the residuals of the preceding trees, to find a better decision path and thereby obtain the card maintenance identification model. The structure and training process of the card maintenance identification model are not described in further detail here.
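The residual-fitting idea behind gradient boosting can be illustrated with a deliberately small sketch. It uses depth-1 trees (stumps) and squared-error loss, which is a simplification chosen for brevity and not the model configuration of the invention; all function names and hyperparameters are hypothetical, and the output is a raw score, not a calibrated probability.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that best fits the residuals (squared error)."""
    best = None
    for j in range(len(xs[0])):
        for threshold in sorted({x[j] for x in xs}):
            left = [r for x, r in zip(xs, residuals) if x[j] <= threshold]
            right = [r for x, r in zip(xs, residuals) if x[j] > threshold]
            if not left or not right:
                continue
            lmean, rmean = sum(left) / len(left), sum(right) / len(right)
            err = (sum((r - lmean) ** 2 for r in left)
                   + sum((r - rmean) ** 2 for r in right))
            if best is None or err < best[0]:
                best = (err, j, threshold, lmean, rmean)
    return best[1:] if best else None

def stump_predict(stump, x):
    j, threshold, lmean, rmean = stump
    return lmean if x[j] <= threshold else rmean

def fit_boosted_stumps(xs, ys, n_rounds=20, learning_rate=0.5):
    """Each round fits a new stump to the residuals left by the previous rounds."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        if stump is None:
            break
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, x) for p, x in zip(preds, xs)]
    return base, stumps, learning_rate

def predict(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)

# Toy data: one continuous feature; label 1 = black sample, 0 = white sample.
model = fit_boosted_stumps([[1.0], [2.0], [8.0], [9.0]], [0, 0, 1, 1])
```

In practice a library implementation of gradient boosted trees with a classification loss would normally be used instead of this toy version.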
As an example, misjudged accounts among the first training samples, that is, first credit card accounts that the business party erroneously determined to have card maintenance behavior, may be eliminated. Based on customer complaints, such misjudged accounts can be removed from the first training samples and added to the second training samples.
The basic flow of the training method of the card maintenance identification model of the invention has been described in detail with reference to FIG. 1. A first training sample (a black sample) is constructed from a first credit card account with card maintenance behavior provided by the business party, and a second training sample (a white sample) is constructed with reference to the first training sample, so that the validity of the label data of the constructed training samples is ensured at the source.
Further details of the training method of the card maintenance identification model of the invention are described below.
The account information referred to herein may include, but is not limited to, account transaction information and/or credit information of a user associated with a credit card account. The account transaction information refers to information generated based on transaction behavior of a credit card account, and the account transaction information can be acquired from a service party. The credit information of the user associated with the credit card account, that is, the credit information under the identification number to which the credit card account belongs, may include, but is not limited to, credit investigation records, and the coverage of the credit investigation records may be all the credit investigation data under the identification number.
The features determined based on the account transaction information may be referred to as transaction-like features and the features determined based on the credit information may be referred to as credit-like features. Thus, the features in the training samples mentioned above (first training sample/second training sample) can be classified into transaction-like features and credit-like features.
1. Transaction-like features
The transaction-class features may include, but are not limited to, at least one of the following feature dimensions: monthly transaction condition, transaction merchant anomalies, consumption after repayment, transaction pattern between the billing date and the repayment date, transaction event reasonableness, and transaction pattern aggregation, where each feature dimension comprises one or more features.
The monthly transaction condition describes the account's transactions in each month. Features related to the monthly transaction condition may include, but are not limited to, at least one of the following: monthly consumption amount; monthly number of consumption transactions; monthly cash withdrawal amount; monthly number of cash withdrawals; monthly repayment amount; monthly number of repayments; monthly contracted-merchant transaction amount, where a contracted-merchant transaction is a transaction with a contracted merchant; monthly number of contracted-merchant transactions; monthly test transaction amount; monthly number of test transactions; monthly number of different transaction types, where the transaction types may include, but are not limited to, one or more of consumption, cancellation, contracted-merchant transaction, and test transaction; monthly number of transactions exceeding a predetermined amount, where the predetermined amount is a multiple of ten, such as, but not limited to, ten, one hundred, or one thousand; monthly average credit limit usage; and the number of months in which credit limit usage exceeds a predetermined threshold (e.g., 90%, 70%, 50%).
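A few of the monthly features listed above can be computed with a simple aggregation. The sketch assumes transactions are dictionaries with `date`, `type` and `amount` fields and that limit usage is the monthly consumption amount divided by the credit limit; these field names, the type labels, and the usage definition are all assumptions for illustration.

```python
from collections import defaultdict
from datetime import date

def monthly_features(transactions, credit_limit, large_amount=1000.0):
    """Aggregate per-month amounts, counts and limit usage from raw transactions."""
    months = defaultdict(lambda: {"spend_amount": 0.0, "spend_count": 0,
                                  "withdrawal_amount": 0.0, "withdrawal_count": 0,
                                  "large_count": 0})
    for t in transactions:
        m = months[(t["date"].year, t["date"].month)]
        if t["type"] == "consumption":
            m["spend_amount"] += t["amount"]
            m["spend_count"] += 1
        elif t["type"] == "withdrawal":
            m["withdrawal_amount"] += t["amount"]
            m["withdrawal_count"] += 1
        if t["amount"] > large_amount:  # transactions exceeding the predetermined amount
            m["large_count"] += 1
    for m in months.values():
        m["limit_usage"] = m["spend_amount"] / credit_limit
    return dict(months)

txs = [
    {"date": date(2019, 1, 5), "type": "consumption", "amount": 1200.0},
    {"date": date(2019, 1, 20), "type": "consumption", "amount": 300.0},
    {"date": date(2019, 1, 21), "type": "withdrawal", "amount": 500.0},
    {"date": date(2019, 2, 1), "type": "consumption", "amount": 100.0},
]
features = monthly_features(txs, credit_limit=3000.0)
```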
Transaction merchant anomalies concern the merchants that transact with the credit card account and reflect whether those merchants are abnormal. Features related to transaction merchant anomalies may include, but are not limited to, at least one of the following: the consumption amount deviation of a merchant transacting with the credit card account relative to its merchant type, where the deviation characterizes the difference between the average amount of the merchant's transactions with other credit cards and the average transaction amount for that merchant type; the deviation is positively correlated with this difference, that is, the larger the difference, the larger the deviation; when the deviation exceeds a predetermined threshold, the merchant may be considered abnormal, and a credit card account transacting with an abnormal merchant is more likely to involve card maintenance behavior; and the monthly transaction time distribution of a merchant transacting with the credit card account, from which it can be determined whether the merchant's transactions are concentrated in a certain period (such as fixed dates each month); for example, if a merchant's monthly transactions are concentrated on a few fixed dates, the merchant may be judged abnormal.
Consumption after repayment may characterize, without limitation, whether the card is frequently repaid and then immediately used again, and whether spending is concentrated after repayment. As an example, features related to consumption after repayment include at least one of: the number of transaction behaviors (such as consumption and cash withdrawal) exceeding a predetermined amount (such as 100 or 500) within a predetermined time range (such as a 12/24/48-hour window) before and after each repayment; the number of times the transaction amount of the transaction behaviors within a predetermined time range before and after each repayment and the repayment amount are within a predetermined proportion (such as 50%, 30%, 20%) of each other; and the credit limit usage within a predetermined time range (such as 5/10 days) after repayment.
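The first of these features, the number of large transactions close to a repayment, can be sketched as follows. Transactions are assumed to be `(timestamp, amount)` pairs, and each transaction is counted once even if it is near several repayments; the function name and defaults are hypothetical.

```python
from datetime import datetime, timedelta

def count_large_transactions_near_repayments(repayment_times, transactions,
                                             window_hours=24, min_amount=500.0):
    """Count transactions above `min_amount` within the window around any repayment."""
    window = timedelta(hours=window_hours)
    count = 0
    for tx_time, amount in transactions:
        near_repayment = any(abs(tx_time - r) <= window for r in repayment_times)
        if amount > min_amount and near_repayment:
            count += 1
    return count

repayments = [datetime(2019, 3, 10, 12, 0)]
txs = [
    (datetime(2019, 3, 10, 13, 0), 800.0),  # 1 hour after repayment, large
    (datetime(2019, 3, 9, 18, 0), 600.0),   # 18 hours before repayment, large
    (datetime(2019, 3, 10, 14, 0), 100.0),  # close to repayment but small
    (datetime(2019, 3, 15, 9, 0), 900.0),   # large but far from repayment
]
```

Running the function with several window sizes (12, 24 and 48 hours, as in the text) yields one feature per window.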
Features related to the transaction pattern between the billing date and the repayment date may include, but are not limited to, at least one of: the transaction amount of a predetermined type of transaction behavior occurring within a predetermined time window (e.g., the first 10 days or the last 10 days) between the billing date and the repayment date; the proportion of that transaction amount to the transaction amount of the predetermined type of transaction behavior within the whole billing period; the number of times a predetermined type of transaction behavior occurs within a predetermined time window (e.g., the first 10 days or the last 10 days) between the billing date and the repayment date; and the proportion of that number to the number of transactions of the predetermined type within the whole billing period. The predetermined type of transaction behavior may include, but is not limited to, at least one of: consumption, cash withdrawal, payment on behalf, and repayment.
Transaction event reasonableness characterizes whether a transaction event is reasonable. Features related to transaction event reasonableness may include, but are not limited to, at least one of: the number of changes in the location of offline transactions of the credit card account within a predetermined time range, where offline transactions mainly refer to the POS entry mode, that is, transactions in which the magnetic stripe or chip of the credit card is read on site, with or without contact, or the transaction is entered manually on site; for example, the number of changes of country, province, or city within half an hour, half an hour to 1 hour, 1 to 2 hours, 2 to 6 hours, and 6 to 12 hours may be counted; and the number of changes of the terminal device (such as a POS machine) used for offline transactions of the credit card account within a predetermined time range; for example, the number of changes of the terminal number within 2, 5, or 10 minutes of an offline transaction may be counted.
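A location-change count of this kind can be sketched as below. The sketch assumes offline transactions are `(timestamp, city)` pairs and counts a change whenever two consecutive transactions occur in different cities within the given time gap; this pairwise definition and the function name are assumptions, since the text does not fix the exact counting rule.

```python
from datetime import datetime, timedelta

def count_location_changes(transactions, max_gap_minutes=30):
    """Count consecutive offline transactions in different places within the gap."""
    ordered = sorted(transactions, key=lambda t: t[0])
    max_gap = timedelta(minutes=max_gap_minutes)
    changes = 0
    for (t0, place0), (t1, place1) in zip(ordered, ordered[1:]):
        if place0 != place1 and (t1 - t0) <= max_gap:
            changes += 1
    return changes

offline = [
    (datetime(2019, 3, 1, 10, 0), "Beijing"),
    (datetime(2019, 3, 1, 10, 20), "Shanghai"),   # city change within 20 minutes
    (datetime(2019, 3, 1, 15, 0), "Shanghai"),
    (datetime(2019, 3, 1, 18, 0), "Guangzhou"),   # city change, but 3 hours later
]
```

The same function can count terminal changes by passing terminal numbers in place of city names and a shorter gap such as 2, 5 or 10 minutes.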
Features related to transaction pattern aggregation may include, but are not limited to, at least one of: whether, counting the transaction behaviors within a predetermined time range (such as the past week, half month, or quarter), there is a date on which the number of transactions exceeds a predetermined threshold; whether, counting the transaction behaviors within a predetermined time range (such as the past week, half month, or quarter), there is a place at which the number of transactions exceeds a predetermined threshold; whether, counting the transaction behaviors within a predetermined time range (such as the past week, half month, or quarter), there is a terminal at which the number of transactions exceeds a predetermined threshold, where the terminal may be a terminal device used for offline transactions of the credit card account, such as a POS machine; whether, counting only the transaction behaviors within a predetermined time range (such as the past week, half month, or quarter) whose transaction amount exceeds a predetermined value (such as 100 or 500), there is a date on which the number of such transactions exceeds a predetermined threshold; whether, counting only the transaction behaviors within a predetermined time range (such as the past week, half month, or quarter) whose transaction amount exceeds a predetermined value (such as 100 or 500), there is a place at which the number of such transactions exceeds a predetermined threshold; whether, counting only the transaction behaviors within a predetermined time range (such as the past week, half month, or quarter) whose transaction amount exceeds a predetermined value (such as 100 or 500), there is a terminal at which the number of such transactions exceeds a predetermined threshold; whether, counting the transaction behaviors within a predetermined time range (such as the past week, half month, or quarter) after hotspot transaction behaviors are removed, there is a date on which the number of transactions exceeds a predetermined threshold; whether, counting the transaction behaviors within a predetermined time range (such as the past week, half month, or quarter) after hotspot transaction behaviors are removed, there is a place at which the number of transactions exceeds a predetermined threshold; and whether, counting the transaction behaviors within a predetermined time range (such as the past week, half month, or quarter) after hotspot transaction behaviors are removed, there is a date on which the number of transactions exceeds a predetermined threshold. A hotspot transaction behavior may refer to a transaction behavior whose statistical frequency exceeds a predetermined threshold (e.g., 10000).
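These aggregation features share one shape: count transactions per key (date, place, or terminal), optionally after an amount filter, and check whether any count exceeds a threshold. A minimal sketch, assuming transactions are dictionaries and that the input has already been restricted to the predetermined time range (the field names and `has_concentration` are hypothetical):

```python
from collections import Counter

def has_concentration(transactions, key, threshold, min_amount=None):
    """True if some value of `key` appears in more than `threshold` transactions,
    optionally counting only transactions whose amount exceeds `min_amount`."""
    counts = Counter(t[key] for t in transactions
                     if min_amount is None or t["amount"] > min_amount)
    return any(c > threshold for c in counts.values())

txs = [
    {"date": "2019-03-01", "terminal": "T1", "amount": 600.0},
    {"date": "2019-03-01", "terminal": "T1", "amount": 200.0},
    {"date": "2019-03-01", "terminal": "T2", "amount": 800.0},
    {"date": "2019-03-02", "terminal": "T1", "amount": 50.0},
]
```

Removing hotspot transaction behaviors would be one more filter applied to `transactions` before the call.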
2. Credit class features
Credit-class features may include, but are not limited to, at least one of the following feature dimensions: credit card debt condition, asset and liability condition, and personal credit condition, where each feature dimension comprises one or more features.
The credit card debt condition is the debt condition of the credit cards registered under the identity of the user associated with the credit card account. Features related to the credit card debt condition may include, but are not limited to, at least one of: the number of times the number of overdue periods exceeds a predetermined number of periods; the number of times the overdue amount exceeds a predetermined value; the number of accounts that have become overdue; the proportion of the overdue amount to the amount due; the number of times an account was classified as a stagnant account; and the number of times the card was frozen.
The overdue time is the number of days from the repayment due date to the actual repayment date. The number of overdue periods can be written as "M" followed by a number; for example, M1 indicates one overdue period, i.e., 1-29 days overdue, and M2 indicates two overdue periods, i.e., 30-59 days overdue. The predetermined number of periods may be, but is not limited to, 1, 2, 3, or more periods.
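The mapping from overdue days to the "M" notation follows directly from the two examples given (M1 for 1-29 days, M2 for 30-59 days). The sketch below assumes the same 30-day step continues for later periods, which the text does not state explicitly; `overdue_periods` is a hypothetical name.

```python
def overdue_periods(days_overdue: int) -> int:
    """Map overdue days to the M-number: 1-29 days -> 1, 30-59 days -> 2, and so on."""
    if days_overdue <= 0:
        return 0  # not overdue
    return days_overdue // 30 + 1

def count_exceeding(days_overdue_history, predetermined_periods):
    """Number of times the overdue period count exceeds the predetermined number."""
    return sum(1 for d in days_overdue_history if overdue_periods(d) > predetermined_periods)
```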
A stagnant account is a receivable that has passed its repayment deadline and has not been recovered despite collection efforts; if it remains in this state for a long time, it may become a bad debt. The number of stagnant accounts is the number of times an account was classified as stagnant by the bank that issued the card. The number of freezes is the number of times the credit card was frozen by the issuing bank.
The asset and liability condition characterizes the assets and liabilities under the identity of the user associated with the credit card account. Features related to the asset and liability condition may include, but are not limited to, at least one of: the number of times the loan grade is "special mention"; the number of times the loan grade is "substandard"; the number of times the loan grade is "doubtful"; the number of times the loan grade is "loss"; and a debt ratio index obtained by dividing the debt amount by the average of the most recently approved credit card limits. The loan grade refers to the five-category classification of loan quality according to the borrower's actual repayment ability, under which loans are divided by risk into five categories: normal, special mention, substandard, doubtful, and loss. The debt ratio index may be used to describe a person's level of indebtedness.
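The debt ratio index is a single division, shown here as a short sketch. The function and parameter names are hypothetical, and how the "most recently approved" limits are selected is left to the caller because the text does not specify it.

```python
def debt_ratio_index(debt_amount: float, latest_approved_limits: list) -> float:
    """Debt amount divided by the average of the most recently approved card limits."""
    if not latest_approved_limits:
        raise ValueError("at least one approved credit limit is required")
    average_limit = sum(latest_approved_limits) / len(latest_approved_limits)
    return debt_amount / average_limit
```

For example, a debt of 30000 against approved limits of 10000 and 20000 gives an average limit of 15000 and an index of 2.0.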
The features related to the personal credit condition may include, but are not limited to, at least one of: the number of outstanding loans; the outstanding loan balance; the balance of credit cards that have not been settled; the used balance of credit cards that have not been settled; the balance of credit card accounts that have not been cancelled; the number of defaults; the default amount; the longest overdue duration of a credit card on which a default occurred; the number of overdraft months of a credit card on which a default occurred; the number of external guarantees; the external guarantee amount; and the guaranteed principal balance.
The invention may perform feature extraction on account information of a credit card account to obtain one or more of the features described above. After feature extraction, the features of the obtained credit card account may be analyzed to determine the importance of each feature to determining whether the credit card account has a card-raising behavior, and one or more features may be selected as sample features according to the descending order of importance.
The invention can analyze the importance of the features in various ways. As an example, grouping may be performed according to the value of the features; calculating a first proportion of the number of credit card accounts with card maintenance behaviors in each group to the number of all credit card accounts with card maintenance behaviors in a credit card account set, and a second proportion of the number of credit card accounts without card maintenance behaviors in each group to the number of all credit card accounts without card maintenance behaviors in the credit card account set; determining the importance of the features to judging whether the credit card account has card-keeping behavior, wherein the importance of the features is equal to the sum of the importance of the features under each group, and the importance of the features under a single group is positively correlated with the difference between the first proportion and the second proportion.
Taking the feature of "amount consumed per month" as an example, the features may be grouped according to the following value intervals, wherein the distribution of the number of credit card accounts with card-keeping behavior and the number of credit card accounts without card-keeping behavior in each group is shown in the following table.
Consumption amount per month (yuan) | Accounts with card-maintenance behavior | Accounts without card-maintenance behavior
<1000 | 2500 | 47500
[1000, 2500] | 3000 | 27000
[2500, 5000] | 3000 | 12000
>5000 | 1500 | 3500
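In the present embodiment, the importance IV_i of the i-th group can be calculated by the following formula:

$$\mathrm{IV}_i = \left(\frac{\#y_i}{\#y_t} - \frac{\#n_i}{\#n_t}\right) \times \ln\left(\frac{\#y_i/\#y_t}{\#n_i/\#n_t}\right)$$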
Here, #y_i denotes the number of credit card accounts with card-maintenance behavior in the i-th group, #y_t denotes the number of all credit card accounts with card-maintenance behavior in the credit card account set, and #y_i/#y_t is the first proportion; #n_i denotes the number of credit card accounts without card-maintenance behavior in the i-th group, #n_t denotes the number of all credit card accounts without card-maintenance behavior in the credit card account set, and #n_i/#n_t is the second proportion. Taking the above table as an example, the credit card account set includes 100000 credit card accounts, with #y_t = 10000 and #n_t = 90000. The specific process of calculating the importance of the monthly consumption amount under each group by using the above formula is not repeated here.
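The calculation that the text leaves out can be sketched as follows. This is a worked example under one assumption: the importance of a group is taken to be the standard information value (IV) form, (first proportion − second proportion) × ln(first proportion / second proportion), which satisfies the stated property of growing with the difference between the two proportions. The function and variable names are illustrative.

```python
import math
from typing import List, Tuple

# (accounts with card-maintenance behavior, accounts without) per group,
# taken from the table for "consumption amount per month".
GROUPS: List[Tuple[int, int]] = [
    (2500, 47500),  # < 1000
    (3000, 27000),  # [1000, 2500]
    (3000, 12000),  # [2500, 5000]
    (1500, 3500),   # > 5000
]


def group_importance(y_i: int, n_i: int, y_t: int, n_t: int) -> float:
    """Importance of one group, assuming the standard information-value form.

    Zero counts would make the logarithm undefined; real data would need
    smoothing, which is omitted in this sketch.
    """
    first = y_i / y_t   # first proportion
    second = n_i / n_t  # second proportion
    return (first - second) * math.log(first / second)


def feature_importance(groups: List[Tuple[int, int]]) -> float:
    """Importance of the feature: the sum of its group importances."""
    y_t = sum(y for y, _ in groups)
    n_t = sum(n for _, n in groups)
    return sum(group_importance(y, n, y_t, n_t) for y, n in groups)


per_group = [group_importance(y, n, 10000, 90000) for y, n in GROUPS]
total = feature_importance(GROUPS)
```

Under this assumed formula the feature's total importance for the table comes to roughly 0.49. The second group contributes nothing because its two proportions are equal (both 0.3), while the first and last groups contribute most because accounts with and without card-maintenance behavior are distributed very differently there.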
After training samples consisting of a sample label, transaction-class features, and credit-class features are obtained, models can be trained separately on the transaction-class features and on the credit-class features, yielding a first card-maintenance recognition model trained on the transaction-class features and a second card-maintenance recognition model trained on the credit-class features.
Specifically, a first card-holding identification model for identifying whether card-holding behavior exists in the credit card account can be trained based on the sample labels and the transaction class characteristics of the training samples, a second card-holding identification model for identifying whether card-holding behavior exists in the credit card account can be trained based on the sample labels and the credit class characteristics of the training samples, and the first card-holding identification model and the second card-holding identification model are taken together as a card-holding identification model for finally identifying whether card-holding behavior exists in the credit card account. The training samples are the first training sample and the second training sample mentioned above.
Account transaction information of a credit card account can be observed directly by the business party, whereas the credit information of the user associated with the credit card account, although a useful reference for judging whether card-maintenance behavior exists, is less direct. Therefore, the first card-maintenance recognition model can be given a first weight and the second card-maintenance recognition model a second weight, where the second weight is smaller than the first weight. For example, the first weight may be 70% and the second weight may be 30%.
When judging whether the credit card account to be identified has card-maintenance behavior, the transaction score obtained by the first card-maintenance recognition model and the credit score obtained by the second card-maintenance recognition model can be combined by weighted averaging into a final comprehensive score, and whether the credit card account to be identified has card-maintenance behavior is judged according to the comprehensive score.
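The weighted combination can be written out as a short worked equation. The symbols below are introduced here for illustration only: s_1 and s_2 are the scores from the first and second models, w_1 and w_2 are their weights, and the score values 0.9 and 0.2 are invented for the example.

$$S = w_1 s_1 + w_2 s_2, \qquad w_1 + w_2 = 1, \quad w_2 < w_1$$

$$S = 0.7 \times 0.9 + 0.3 \times 0.2 = 0.63 + 0.06 = 0.69$$

With these example values, a strong transaction score dominates the comprehensive score even though the credit score is low, which reflects the larger weight given to the first model.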
The method can also periodically acquire new sample data and perform incremental training on the card-maintenance recognition model based on the new sample data, so that the model is updated with the new samples. In this way, continuous iterative updating of the card-maintenance recognition model can be ensured. The new sample data may be training samples constructed from the credit card accounts that the business party newly determines each month to have card-maintenance behavior.
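The description does not specify how incremental training is carried out. As one possible sketch, assuming a model that can be updated by stochastic gradient descent, the toy logistic regression below is first trained on an initial batch and later updated in place with a new batch, without revisiting the old samples. The class `OnlineLogisticModel`, its method names, the feature vectors, and the learning-rate and epoch values are all hypothetical stand-ins for whatever model is actually used.

```python
import math
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (feature vector, label: 1 or 0)


class OnlineLogisticModel:
    """Toy logistic regression updated by stochastic gradient descent."""

    def __init__(self, n_features: int, learning_rate: float = 0.1) -> None:
        self.weights: List[float] = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, x: Sequence[float]) -> float:
        """Probability that the account has card-maintenance behavior."""
        z = self.bias + sum(w * xi for w, xi in zip(self.weights, x))
        return 1.0 / (1.0 + math.exp(-z))

    def partial_fit(self, samples: Sequence[Sample], epochs: int = 50) -> None:
        """Update the existing weights in place using only the given samples."""
        for _ in range(epochs):
            for x, y in samples:
                error = self.predict_proba(x) - y
                self.weights = [
                    w - self.learning_rate * error * xi
                    for w, xi in zip(self.weights, x)
                ]
                self.bias -= self.learning_rate * error


# Initial training on the first batch of samples.
model = OnlineLogisticModel(n_features=2)
model.partial_fit([([1.0, 0.0], 1), ([0.0, 1.0], 0)])

# Later, new sample data arrives and the same model is updated incrementally.
before = model.predict_proba([1.0, 1.0])
model.partial_fit([([1.0, 1.0], 1)])
after = model.predict_proba([1.0, 1.0])
```

The point of the design is that the second call starts from the weights learned so far, so the monthly batch of newly confirmed accounts adjusts the model rather than replacing it.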
In summary, conventional expert rules are usually simple combinations of a limited number of feature dimensions, and the mined rules remain essentially unchanged once set. In practice, however, the variable factors are numerous and the complexity is high, so an effective rule system cannot be obtained by manual comparison and observation; machine learning in a big-data setting is therefore particularly effective. The invention provides a machine-learning-based method that performs supervised training on a large number of samples, derives a large number of feature fields from account transaction information and credit information, and finally trains a card-maintenance recognition model; the trained model is then used to score new data to be tested and to output the probability of card-maintenance behavior.
Fig. 2 illustrates a flowchart of a method for identifying whether a credit card has a card-holding behavior according to an exemplary embodiment of the present invention. The method illustrated in fig. 2 may be implemented entirely in software via a computer program, and the method illustrated in fig. 2 may also be executed by a specifically-configured computing device.
Referring to fig. 2, in step S210, account information of a credit card account to be recognized is acquired.
The credit card account to be identified refers to a credit card account to be judged whether a card-keeping behavior exists or not. The account information may be account information for the credit card account to be identified within a predetermined time period prior to identification. The account information may include, but is not limited to, at least one of: account transaction information; credit information of a user with whom the credit card account is associated. For the account transaction information and the credit information, the above description may be referred to, and details are not repeated here.
In step S220, based on the account information, whether card-maintenance behavior exists in the credit card account to be identified is identified by using a card-maintenance recognition model. The card-maintenance recognition model can be obtained by training according to the training method of the card-maintenance recognition model described above. For the card-maintenance recognition model and its training process, reference may be made to the above description, and details are not repeated here.
By way of example, one or more characteristics can be extracted from account information of the credit card account to be identified, a prediction sample is constructed, and the prediction sample is input into the card-keeping identification model to obtain a score value which is output by the card-keeping identification model and used for representing the probability of the card-keeping behavior of the credit card account to be identified.
Taking the card-holding identification model composed of the first card-holding identification model and the second card-holding identification model as described above as an example, the acquired account information of the credit card account to be identified may include account transaction information and credit information. One or more transaction class characteristics can be extracted from the account transaction information, a first prediction sample composed of the transaction class characteristics is constructed, and the first prediction sample is input into a first card-holding identification model to obtain a first score which is output by the first card-holding identification model and used for representing the probability of card-holding behaviors of the credit card account to be identified. One or more credit characteristics are extracted from the credit information, a second prediction sample formed by the credit characteristics is constructed, and the second prediction sample is input into a second card-holding recognition model to obtain a second score which is output by the second card-holding recognition model and used for representing the probability of the card-holding behavior of the credit card account to be recognized. And then, based on the first weight of the first card-keeping identification model and the second weight of the second card-keeping identification model, carrying out weighted summation on the first score and the second score, and taking the weighted summation result as a score which is finally used for representing the probability of the card-keeping behavior of the credit card account to be identified.
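The two-model prediction flow described above can be sketched as follows. The sketch treats each trained model as an opaque callable that maps a prediction sample to a probability; the feature names, the stub models, and the function names are hypothetical, and the default weights follow the 70%/30% example given earlier.

```python
from typing import Callable, List, Mapping, Sequence

# A model is any callable mapping a prediction sample to a probability.
Model = Callable[[Sequence[float]], float]


def build_prediction_sample(info: Mapping[str, float], names: Sequence[str]) -> List[float]:
    """Pick the named features out of the account information, in order."""
    return [float(info[name]) for name in names]


def combined_score(
    transaction_info: Mapping[str, float],
    credit_info: Mapping[str, float],
    first_model: Model,
    second_model: Model,
    transaction_features: Sequence[str],
    credit_features: Sequence[str],
    first_weight: float = 0.7,
    second_weight: float = 0.3,
) -> float:
    """Weighted sum of the first (transaction) and second (credit) scores."""
    first_sample = build_prediction_sample(transaction_info, transaction_features)
    second_sample = build_prediction_sample(credit_info, credit_features)
    first_score = first_model(first_sample)
    second_score = second_model(second_sample)
    return first_weight * first_score + second_weight * second_score


# Usage with stub models standing in for the trained models.
score = combined_score(
    {"monthly_amount": 6000.0, "transaction_count": 40.0},
    {"overdue_times": 1.0},
    first_model=lambda sample: 0.8,
    second_model=lambda sample: 0.4,
    transaction_features=["monthly_amount", "transaction_count"],
    credit_features=["overdue_times"],
)
```

Keeping the two prediction samples separate mirrors the text: each model sees only the feature class it was trained on, and the models meet only at the final weighted summation.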
After the card-maintenance recognition model is used to identify whether card-maintenance behavior exists in the credit card account to be identified, the identification result may be output, and overdue information of the credit card account may be output in association with it. The overdue information mentioned here may refer to whether the credit card account has been overdue within a predetermined past period (e.g., the past half year). An M1 overdue may simply be caused by the customer forgetting the repayment date, and an M2 overdue is still relatively short, whereas an M3 overdue is more likely a malicious failure to repay. Therefore, optionally, in the case that overdue information exceeding the predetermined number of periods (such as M3) exists for the credit card account to be identified, the identification result of whether card-maintenance behavior exists in the credit card account to be identified and the overdue information may be output in association.
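A minimal sketch of this associated output is given below. It assumes that the overdue history is available as a list of overdue period counts, that "exceeding the predetermined number of periods (such as M3)" means reaching M3 or more, and that a score threshold of 0.5 decides the identification result; the field names, the threshold, and the function name are illustrative assumptions.

```python
from typing import Dict, Sequence


def build_output(
    account_id: str,
    score: float,
    overdue_periods_history: Sequence[int],
    score_threshold: float = 0.5,   # assumed decision threshold
    predetermined_periods: int = 3,  # assumed to mean "M3 or above"
) -> Dict[str, object]:
    """Return the identification result, attaching overdue info when severe."""
    result: Dict[str, object] = {
        "account_id": account_id,
        "score": score,
        "card_maintenance": score >= score_threshold,
    }
    worst = max(overdue_periods_history, default=0)
    if worst >= predetermined_periods:
        # Only overdue records at or beyond the predetermined period are attached.
        result["overdue"] = f"M{worst}"
    return result
```

For example, an account with history `[1, 3]` would carry an `"overdue": "M3"` entry alongside its identification result, while an account whose worst overdue is M2 would be reported without overdue information.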
Fig. 3 is a block diagram illustrating a structure of a training apparatus for a card-maintenance recognition model according to an exemplary embodiment of the present invention. The functional modules of the training apparatus for the card-maintenance recognition model can be implemented by hardware, software, or a combination of hardware and software implementing the principles of the present disclosure. It will be appreciated by those skilled in the art that the functional blocks described in fig. 3 may be combined or divided into sub-blocks to implement the principles of the invention described above. Thus, the description herein may support any possible combination, or division, or further definition of the functional modules described herein.
The functional modules that the training device of the card-raising recognition model can have and the operations that each functional module can execute are briefly described below, and for the details related thereto, reference may be made to the description above in conjunction with fig. 1, which is not described herein again.
Referring to fig. 3, the training apparatus 300 for the card-maintenance recognition model includes a first construction unit 310, a second construction unit 320, and a training unit 330.
The first construction unit 310 is configured to construct a first training sample according to the first credit card account determined by the service party to have the card-holding behavior, where the first training sample includes a sample label indicating that the first credit card account has the card-holding behavior and at least one sample feature.
The first construction unit 310 may include a first acquisition unit, a first determination unit, and a first construction sub-unit. The first acquisition unit is used for acquiring account information of the first credit card account in a first preset time range before a first judgment date according to the first judgment date that the business party judges that the card-keeping behavior exists; the first determining unit is used for determining the characteristics of the first credit card account based on the acquired account information; the first construction subunit is used for constructing a first training sample corresponding to the first credit card account based on the label and the characteristic used for indicating that the card-keeping behavior of the first credit card account exists.
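The three sub-steps (acquire account information in a window before the decision date, determine features, attach the label) can be sketched as below. The sketch assumes that account information is a list of dated transactions and that the window is expressed in days; the three features computed and all names are illustrative, since the description leaves the concrete feature set open.

```python
from datetime import date, timedelta
from typing import Dict, List, Tuple

Transaction = Tuple[date, float]  # (transaction date, amount)


def build_training_sample(
    transactions: List[Transaction],
    decision_date: date,
    window_days: int,
    label: int,
) -> Dict[str, object]:
    """Build one training sample from the window before the decision date."""
    start = decision_date - timedelta(days=window_days)
    # Acquire: keep only transactions inside [start, decision_date).
    amounts = [amount for day, amount in transactions if start <= day < decision_date]
    # Determine features (a small illustrative subset).
    features = {
        "transaction_count": len(amounts),
        "transaction_total": sum(amounts),
        "max_transaction": max(amounts, default=0.0),
    }
    # Construct: label 1 marks an account judged to have card-maintenance behavior.
    return {"label": label, "features": features}
```

Excluding transactions on or after the decision date keeps information that only became available after the business party's judgment out of the sample.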
The second constructing unit 320 is configured to construct a second training sample according to a second credit card account existing in the set after the first credit card account is removed from the set of credit card accounts maintained by the service party, where the second training sample includes a sample label indicating that there is no card-holding behavior for the second credit card account and at least one sample feature.
The second constructing unit 320 may include a determination date determining unit, a second acquiring unit, a second determining unit, and a second constructing sub-unit. The judging date determining unit is used for determining a second judging date of the second credit card account according to a first judging date of one or more first credit card accounts judged by a service party to have card-holding behavior, so that the date distribution situation of the second judging date is consistent or basically consistent with the date distribution situation of the first judging date; the second acquisition unit is used for acquiring account information of the second credit card account within a second preset time range before the second judgment date; the second determination unit is used for determining the characteristics of the second credit card account based on the acquired account information; the second construction subunit is configured to construct a second training sample corresponding to the second credit card account based on the label and the feature indicating that no card-holding behavior exists for the second credit card account.
The second constructing subunit may construct a second training sample for a second credit card account for which a transaction occurred within a third predetermined time period before the second decision date; or the training device 300 may further include a rejecting unit, configured to reject the second training sample corresponding to the second credit card account for which no transaction has occurred within a third predetermined time range before the second determination date.
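One way to make the second decision dates follow the distribution of the first decision dates is to draw each one at random from the list of first decision dates, so that frequent dates are drawn more often. The sketch below does this and then drops second accounts with no transaction in the window before their assigned date. Random sampling with replacement is an assumed technique, not one stated in the description, and the function names and the day-based window are illustrative.

```python
import random
from datetime import date, timedelta
from typing import Dict, Mapping, Sequence


def assign_second_decision_dates(
    second_accounts: Sequence[str],
    first_decision_dates: Sequence[date],
    seed: int = 0,
) -> Dict[str, date]:
    """Draw a decision date for each second account from the first dates."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    return {account: rng.choice(first_decision_dates) for account in second_accounts}


def had_transaction(
    transaction_dates: Sequence[date], decision_date: date, window_days: int
) -> bool:
    """True if any transaction falls in the window before the decision date."""
    start = decision_date - timedelta(days=window_days)
    return any(start <= day < decision_date for day in transaction_dates)


def keep_active(
    assigned: Mapping[str, date],
    transactions_by_account: Mapping[str, Sequence[date]],
    window_days: int,
) -> Dict[str, date]:
    """Reject second accounts with no transaction in the third time range."""
    return {
        account: decision_date
        for account, decision_date in assigned.items()
        if had_transaction(transactions_by_account.get(account, []), decision_date, window_days)
    }
```

Filtering before feature extraction corresponds to the first alternative in the text, while building all samples and discarding inactive ones afterwards corresponds to the rejecting unit; both leave the same set of second training samples.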
The account information may include account transaction information and/or credit information of a user with which the credit card account is associated. The features may be classified into transaction-like features and credit-like features. For the feature dimensions that the transaction-class features and the credit-class features may include and the feature types that may be specifically included in each feature dimension, reference may be made to the above description, and details are not repeated here.
The training unit 330 is configured to train a card-holding recognition model for recognizing whether a card-holding behavior exists in the credit card account based on at least one of the first training samples and at least one of the second training samples.
The training unit 330 may include a first training unit and a second training unit. The first training unit is used for training a first card-holding identification model for identifying whether a credit card account has card-holding behaviors or not based on the sample labels and the transaction characteristics of the training samples; the second training unit is used for training a second card-holding identification model for identifying whether a card-holding behavior exists in a credit card account or not based on the sample label and the credit-class characteristics of the training sample; the first card maintenance identification model and the second card maintenance identification model form a card maintenance identification model for identifying whether card maintenance behaviors exist in a credit card account.
The training apparatus 300 for card-keeping identification models may further include a weight assignment unit configured to assign a first weight to the first card-keeping identification model and assign a second weight to the second card-keeping identification model, where the second weight is smaller than the first weight.
The training apparatus 300 for the card-maintenance recognition model may further include an analysis unit and a selection unit. The analysis unit is used for analyzing the features of the credit card account to determine the importance of each feature for judging whether the credit card account has card-maintenance behavior; the selection unit is used for selecting one or more features as sample features in descending order of importance.
The analysis unit may include a grouping unit, a calculation unit, and an importance determination unit. The grouping unit is used for grouping according to the value of the characteristic; the calculating unit is used for calculating a first proportion of the number of credit card accounts with card maintenance behaviors in each group to the number of all credit card accounts with card maintenance behaviors in the credit card account set, and a second proportion of the number of credit card accounts without card maintenance behaviors in each group to the number of all credit card accounts without card maintenance behaviors in the credit card account set; the importance determination unit is used for determining the importance of the features on judging whether the credit card account has card-keeping behavior, wherein the importance of the features is equal to the sum of the importance of the features under each group, and the importance of the features under a single group is positively correlated with the difference between the first proportion and the second proportion.
The training device 300 for the card-holding identification model may further include a sample data obtaining unit, configured to obtain new sample data, and at this time, the training unit 330 performs incremental training on the card-holding identification model based on the new sample data.
It should be understood that the specific implementation of the training apparatus 300 for the card-maintenance recognition model according to an exemplary embodiment of the present invention can be carried out with reference to the above description of the training method of the card-maintenance recognition model in conjunction with fig. 1, and is not described here again.
Fig. 4 is a block diagram illustrating a structure of an apparatus for recognizing whether a card-holding behavior of a credit card exists according to an exemplary embodiment of the present invention. The functional modules of the apparatus for identifying the presence of card-holding behavior on a credit card may be implemented by hardware, software, or a combination of hardware and software implementing the principles of the present disclosure. It will be appreciated by those skilled in the art that the functional blocks described in fig. 4 may be combined or divided into sub-blocks to implement the principles of the invention described above. Thus, the description herein may support any possible combination, or division, or further definition of the functional modules described herein.
The functional modules that the apparatus for identifying whether a credit card has card-maintenance behavior may have, and the operations that each functional module can execute, are briefly described below; for the related details, reference may be made to the description above in conjunction with fig. 2, and they are not repeated here.
Referring to fig. 4, the apparatus 400 for identifying whether a credit card has a card-holding behavior includes an obtaining unit 410 and an identifying unit 420.
The obtaining unit 410 is used for obtaining account information of a credit card account to be identified; and
the identifying unit 420 is configured to identify whether a card-holding behavior exists in the credit card account to be identified, based on the account information, by using a card-holding identification model, where the card-holding identification model may be obtained by training according to the training method of the card-holding identification model of the present invention.
The recognition unit 420 may include an extraction unit and an operation unit. The extraction unit is used for extracting one or more characteristics from the account information to construct a prediction sample; and the operation unit is used for inputting the prediction sample into the card-keeping identification model so as to obtain a score which is output by the card-keeping identification model and is used for representing the probability of the card-keeping behavior of the credit card account to be identified.
The apparatus 400 for identifying whether card-maintenance behavior exists on a credit card may further include an output unit configured to output, in the case that overdue information exceeding a predetermined number of periods exists for the credit card account to be identified, the identification result of whether card-maintenance behavior exists in the credit card account to be identified and the overdue information in association with each other.
It should be understood that, according to the embodiment of the present invention, the specific implementation manner of the apparatus 400 for identifying whether a card-holding behavior exists on a credit card can be implemented with reference to the related description of the method for identifying whether a card-holding behavior exists on a credit card in conjunction with fig. 2, and will not be described herein again.
The training method of the card-keeping recognition model, the method and the device for recognizing whether the credit card has the card-keeping behavior according to the exemplary embodiment of the invention are described above with reference to fig. 1 to 4. It should be understood that the above-described method may be implemented by a program recorded on a computer-readable medium, for example, according to an exemplary embodiment of the present invention, there may be provided a computer-readable storage medium storing instructions, wherein the computer program for executing the training method of the card-keeping recognition model of the present invention (for example, as shown in fig. 1) or the method for recognizing whether a card-keeping behavior of a credit card (for example, as shown in fig. 2) is recorded on the computer-readable medium.
The computer program in the computer-readable medium may be executed in an environment deployed in a computer device such as a client, a host, a proxy device, a server, and the like, and it should be noted that the computer program may be used to perform additional steps other than the steps shown in fig. 1 or fig. 2 or perform more specific processing when the steps are performed, and the content of the additional steps and the further processing are described with reference to fig. 1 and fig. 2, and will not be described again to avoid repetition.
It should be noted that the training apparatus for the card-maintenance recognition model and the apparatus for identifying whether a credit card has card-maintenance behavior according to the exemplary embodiment of the present invention may rely entirely on the operation of the computer program to realize the corresponding functions; that is, each unit corresponds to a step in the functional architecture of the computer program, so that the whole apparatus can be invoked through a dedicated software package (e.g., a lib library) to realize the corresponding functions.
Alternatively, each of the devices shown in fig. 3 and 4 may be implemented by hardware, software, firmware, middleware, microcode, or any combination thereof. When implemented in software, firmware, middleware or microcode, the program code or code segments to perform the corresponding operations may be stored in a computer-readable medium such as a storage medium, so that a processor may perform the corresponding operations by reading and executing the corresponding program code or code segments.
For example, exemplary embodiments of the present invention may also be implemented as a computing device including a storage component and a processor, the storage component having stored therein a set of computer-executable instructions that, when executed by the processor, perform the training method of the card-maintenance recognition model or the method for identifying whether card-maintenance behavior exists for a credit card.
In particular, the computing devices may be deployed in servers or clients, as well as on node devices in a distributed network environment. Further, the computing device may be a PC computer, tablet device, personal digital assistant, smart phone, web application, or other device capable of executing the set of instructions described above.
The computing device need not be a single computing device, but can be any device or collection of circuits capable of executing the instructions (or sets of instructions) described above, individually or in combination. The computing device may also be part of an integrated control system or system manager, or may be configured as a portable electronic device that interfaces locally or remotely (e.g., via wireless transmission).
In the computing device, the processor may include a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a programmable logic device, a special purpose processor system, a microcontroller, or a microprocessor. By way of example, and not limitation, processors may also include analog processors, digital processors, microprocessors, multi-core processors, processor arrays, network processors, and the like.
Some operations described in the training method of the card-maintenance recognition model or the method for identifying whether a credit card has card-maintenance behavior according to an exemplary embodiment of the present invention may be implemented by software, some operations may be implemented by hardware, and the operations may also be implemented by a combination of hardware and software.
The processor may execute instructions or code stored in one of the memory components, which may also store data. Instructions and data may also be transmitted and received over a network via a network interface device, which may employ any known transmission protocol.
The memory component may be integral to the processor, e.g., having RAM or flash memory disposed within an integrated circuit microprocessor or the like. Further, the storage component may comprise a stand-alone device, such as an external disk drive, storage array, or any other storage device usable by a database system. The storage component and the processor may be operatively coupled or may communicate with each other, such as through an I/O port, a network connection, etc., so that the processor can read files stored in the storage component.
Further, the computing device may also include a video display (such as a liquid crystal display) and a user interaction interface (such as a keyboard, mouse, touch input device, etc.). All components of the computing device may be connected to each other via a bus and/or a network.
Operations involved in the training method of the card-maintenance recognition model or the method for identifying whether a credit card has card-maintenance behavior according to an exemplary embodiment of the present invention may be described as various interconnected or coupled functional blocks or functional diagrams. However, these functional blocks or functional diagrams may equally be integrated into a single logic device or operate according to non-exact boundaries.
For example, as described above, the training apparatus for a card-holding recognition model or the apparatus for recognizing whether a credit card has a card-holding behavior according to an exemplary embodiment of the present invention may include a storage unit and a processor, wherein the storage unit stores therein a set of computer-executable instructions that, when executed by the processor, perform the above-mentioned training method for a card-holding recognition model or the above-mentioned method for recognizing whether a credit card has a card-holding behavior.
While exemplary embodiments of the invention have been described above, it should be understood that the above description is illustrative only and not exhaustive, and that the invention is not limited to the exemplary embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the invention. Therefore, the protection scope of the present invention should be subject to the scope of the claims.

Claims (10)

1. A training method of a card raising recognition model comprises the following steps:
constructing a first training sample according to a first credit card account judged to have the card-keeping behavior by a service party, wherein the first training sample comprises a sample label used for indicating that the first credit card account has the card-keeping behavior and at least one sample characteristic;
constructing a second training sample according to a second credit card account existing in the set after the first credit card account is removed from the credit card account set maintained by the business party, wherein the second training sample comprises a sample label used for indicating that no card-holding behavior exists in the second credit card account and at least one sample characteristic; and
training a card-keeping recognition model for recognizing whether card-keeping behaviors exist in the credit card account or not based on at least one piece of the first training sample and at least one piece of the second training sample.
2. The training method of the card-holding recognition model according to claim 1, wherein the step of constructing the first training sample comprises:
acquiring, according to a first judgment date on which the service party judged the first credit card account to exhibit card-holding behavior, account information of the first credit card account within a first preset time range before the first judgment date;
determining features of the first credit card account based on the acquired account information; and
constructing the first training sample corresponding to the first credit card account based on the determined features and a label indicating that the first credit card account exhibits card-holding behavior.
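A minimal sketch of the windowing in claim 2, assuming that account information is a list of dated transactions and that the window is half-open (it excludes the judgment date itself). The claim specifies neither the feature set nor the boundary handling, so the three features and all names below are hypothetical.

```python
from datetime import date, timedelta
from typing import Dict, List, Tuple

Transaction = Tuple[date, float]  # (transaction date, amount)


def window_transactions(transactions: List[Transaction], judgment_date: date,
                        window_days: int) -> List[Transaction]:
    """Keep transactions in [judgment_date - window_days, judgment_date)."""
    start = judgment_date - timedelta(days=window_days)
    return [t for t in transactions if start <= t[0] < judgment_date]


def window_features(transactions: List[Transaction]) -> Dict[str, float]:
    """Derive simple aggregate features from the windowed transactions."""
    amounts = [amount for _, amount in transactions]
    count = len(amounts)
    total = sum(amounts)
    return {
        "txn_count": float(count),
        "txn_total": total,
        "txn_mean": total / count if count else 0.0,
    }


def build_first_sample(transactions: List[Transaction], judgment_date: date,
                       window_days: int = 90) -> Dict[str, object]:
    """Combine the window features with the positive (card-holding) label."""
    in_window = window_transactions(transactions, judgment_date, window_days)
    return {"features": window_features(in_window), "label": 1}


history = [
    (date(2019, 1, 10), 100.0),  # before the window
    (date(2019, 3, 1), 200.0),   # inside the window
    (date(2019, 3, 20), 300.0),  # inside the window
    (date(2019, 4, 5), 999.0),   # after the judgment date
]
sample = build_first_sample(history, date(2019, 4, 1), window_days=60)
print(sample)
```

Restricting the inputs to the period before the judgment date keeps information from after the judgment out of the features.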
3. The training method of the card-holding recognition model according to claim 1, wherein the step of constructing the second training sample comprises:
determining a second judgment date for the second credit card account according to the first judgment dates of one or more first credit card accounts that the service party has judged to exhibit card-holding behavior, such that the distribution of second judgment dates is consistent or substantially consistent with the distribution of first judgment dates;
acquiring account information of the second credit card account within a second preset time range before the second judgment date;
determining features of the second credit card account based on the acquired account information; and
constructing the second training sample corresponding to the second credit card account based on the determined features and a label indicating that the second credit card account does not exhibit card-holding behavior.
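One possible way to assign second judgment dates so that their distribution follows the first judgment dates is sketched below. The claim does not prescribe an assignment procedure; cycling through the sorted first judgment dates after a seeded shuffle of the second accounts is an assumption made for illustration, and the function name and account identifiers are hypothetical.

```python
import random
from collections import Counter
from datetime import date
from typing import Dict, List


def assign_second_judgment_dates(second_account_ids: List[str],
                                 first_judgment_dates: List[date],
                                 seed: int = 0) -> Dict[str, date]:
    """Give each second account a date drawn from the first judgment dates.

    Cycling through the sorted first dates reproduces their proportions
    exactly when the number of second accounts is a multiple of the number
    of first dates, and approximately otherwise.
    """
    rng = random.Random(seed)
    shuffled = list(second_account_ids)
    rng.shuffle(shuffled)  # avoid tying dates to the input account order
    dates = sorted(first_judgment_dates)
    return {acct: dates[i % len(dates)] for i, acct in enumerate(shuffled)}


first_dates = [date(2019, 3, 1), date(2019, 3, 1), date(2019, 6, 1)]
second_ids = ["N1", "N2", "N3", "N4", "N5", "N6"]
assigned = assign_second_judgment_dates(second_ids, first_dates)
print(Counter(assigned.values()))
```

Sampling with replacement from the first judgment dates would be an alternative; it matches the distribution only in expectation, whereas the cycling above matches it deterministically.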
4. The training method of the card-holding recognition model according to claim 3, further comprising:
constructing a second training sample for a second credit card account on which transaction behavior occurred within a third preset time range before the second judgment date; or
rejecting second training samples corresponding to second credit card accounts on which no transaction behavior occurred within the third preset time range before the second judgment date.
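The activity filter of claim 4 can be sketched as below, assuming each second account is represented by its list of transaction dates and that the third preset time range is a half-open window ending at the second judgment date. The function names, the 30-day default, and the account identifiers are hypothetical.

```python
from datetime import date, timedelta
from typing import Dict, List


def has_recent_activity(transaction_dates: List[date], judgment_date: date,
                        window_days: int) -> bool:
    """True if any transaction falls in [judgment_date - window_days, judgment_date)."""
    start = judgment_date - timedelta(days=window_days)
    return any(start <= d < judgment_date for d in transaction_dates)


def filter_active_accounts(accounts: Dict[str, List[date]],
                           judgment_dates: Dict[str, date],
                           window_days: int = 30) -> List[str]:
    """Keep only second accounts that transacted inside the window."""
    return [
        acct for acct, txn_dates in accounts.items()
        if has_recent_activity(txn_dates, judgment_dates[acct], window_days)
    ]


second_accounts = {
    "X": [date(2019, 3, 25)],  # active shortly before the judgment date
    "Y": [date(2019, 1, 5)],   # last active well before the window
    "Z": [],                   # never transacted
}
second_dates = {acct: date(2019, 4, 1) for acct in second_accounts}
kept = filter_active_accounts(second_accounts, second_dates, window_days=30)
print(kept)
```

Both alternatives in the claim lead to the same training set: accounts can be screened before samples are built, as here, or samples for inactive accounts can be built and then rejected.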
5. The training method of the card-holding recognition model according to claim 2 or 3, wherein the account information comprises at least one of the following:
account transaction information; and
credit information of a user associated with the credit card account.
6. A method for identifying the presence of card-holding behavior for a credit card, comprising:
acquiring account information of a credit card account to be identified; and
identifying, based on the account information and using a card-holding recognition model, whether the credit card account to be identified exhibits card-holding behavior, wherein the card-holding recognition model is trained according to the training method of any one of claims 1 to 5.
7. A training apparatus for a card-holding recognition model, comprising:
a first construction unit, configured to construct a first training sample according to a first credit card account that a service party has judged to exhibit card-holding behavior, wherein the first training sample comprises at least one sample feature and a sample label indicating that the first credit card account exhibits card-holding behavior;
a second construction unit, configured to construct a second training sample according to a second credit card account that remains in a set of credit card accounts maintained by the service party after the first credit card account is removed from the set, wherein the second training sample comprises at least one sample feature and a sample label indicating that the second credit card account does not exhibit card-holding behavior; and
a training unit, configured to train, based on at least one first training sample and at least one second training sample, a card-holding recognition model for recognizing whether a credit card account exhibits card-holding behavior.
8. An apparatus for identifying the presence of card-holding behavior on a credit card, comprising:
an acquisition unit, configured to acquire account information of a credit card account to be identified; and
an identification unit, configured to identify, based on the account information and using a card-holding recognition model, whether the credit card account to be identified exhibits card-holding behavior, wherein the card-holding recognition model is trained according to the training method of any one of claims 1 to 5.
9. A system comprising at least one computing device and at least one storage device storing instructions that, when executed by the at least one computing device, cause the at least one computing device to perform the method of any of claims 1 to 6.
10. A computer-readable storage medium storing instructions that, when executed by at least one computing device, cause the at least one computing device to perform the method of any of claims 1 to 6.
CN201911162068.9A 2019-11-25 2019-11-25 Method and device for training card maintenance identification model and identifying card maintenance behavior Pending CN110991650A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911162068.9A CN110991650A (en) 2019-11-25 2019-11-25 Method and device for training card maintenance identification model and identifying card maintenance behavior

Publications (1)

Publication Number Publication Date
CN110991650A true CN110991650A (en) 2020-04-10

Family

ID=70086343

Country Status (1)

Country Link
CN (1) CN110991650A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104636912A (en) * 2015-02-13 2015-05-20 银联智惠信息服务(上海)有限公司 Identification method and device for withdrawal of credit cards
CN107103171A (en) * 2016-02-19 2017-08-29 阿里巴巴集团控股有限公司 The modeling method and device of machine learning model
CN108389125A (en) * 2018-02-27 2018-08-10 挖财网络技术有限公司 The overdue Risk Forecast Method and device of credit applications
CN109034209A (en) * 2018-07-03 2018-12-18 阿里巴巴集团控股有限公司 The training method and device of the real-time identification model of active risk
CN109460795A (en) * 2018-12-17 2019-03-12 北京三快在线科技有限公司 Classifier training method, apparatus, electronic equipment and computer-readable medium
CN109978033A (en) * 2019-03-15 2019-07-05 第四范式(北京)技术有限公司 The method and apparatus of the building of biconditional operation people's identification model and biconditional operation people identification
CN110009174A (en) * 2018-12-13 2019-07-12 阿里巴巴集团控股有限公司 Risk identification model training method, device and server
CN110046200A (en) * 2018-11-07 2019-07-23 阿里巴巴集团控股有限公司 Text trust model analysis method, equipment and device

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111507829A (en) * 2020-04-22 2020-08-07 广州东百信息科技有限公司 Overseas credit card wind control model iteration method, device, equipment and storage medium
CN111754337A (en) * 2020-06-30 2020-10-09 上海观安信息技术股份有限公司 Method and system for identifying credit card maintenance contract group
CN111754337B (en) * 2020-06-30 2024-02-23 上海观安信息技术股份有限公司 Method and system for identifying credit card maintenance card present community
CN115545088A (en) * 2022-02-22 2022-12-30 北京百度网讯科技有限公司 Model construction method, classification method and device and electronic equipment
CN115545088B (en) * 2022-02-22 2023-10-24 北京百度网讯科技有限公司 Model construction method, classification method, device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination