CN114463113A - Method and device for supplementing positive samples in credit investigation risk control modeling


Info

Publication number
CN114463113A
Authority
CN
China
Prior art keywords
credit
report
overdue
credit investigation
investigation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210099499.0A
Other languages
Chinese (zh)
Inventor
周晓瑞
卓正兴
杨青
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Du Xiaoman Technology Beijing Co Ltd
Original Assignee
Du Xiaoman Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Du Xiaoman Technology Beijing Co Ltd
Priority to CN202210099499.0A
Publication of CN114463113A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/03 Credit; Loans; Processing thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Abstract

The application provides a method and a device for supplementing positive samples in credit investigation risk control modeling. The method comprises: selecting, from the full set of credit investigation reports held by an institution, a plurality of credit investigation reports corresponding to users who have not used credit at the institution; for each of the selected credit investigation reports, parsing the loan records in the report and determining whether an overdue loan record exists, and if so, recording the detailed information corresponding to the overdue loan record, otherwise adding a non-overdue mark to the report; and screening out positive samples for credit investigation risk control modeling from the parsed reports. The method infers that the overdue performance, at other institutions, of users who have not used credit at the institution would also be their overdue performance on loans from the institution, and uses it as a training label to expand the positive samples for training the credit investigation risk control model. This effectively alleviates the problems of insufficient positive samples and sample imbalance in credit investigation risk control modeling, and enhances the risk identification and generalization capabilities of the model.

Description

Method and device for supplementing positive samples in credit investigation risk control modeling
Technical Field
The application relates to the field of computer technology, and in particular to a technical scheme for supplementing positive samples in credit investigation risk control modeling.
Background
In the credit field, a risk control model is built to identify customers who are likely to become overdue; whether to grant a loan is then decided according to the overdue probability given by the risk control model together with policy rules. Compared with other machine learning tasks, risk control modeling has the notable characteristic that positive samples make up a small proportion of the training data set, so the class distribution is highly imbalanced. As a result, the learned model pays more attention to negative samples, its sensitivity to positive samples is reduced, and its performance in actual prediction suffers. Here, positive samples are samples with overdue behavior; because the proportion of overdue samples in the training set is usually low, they form the minority class. Negative samples are samples without overdue behavior; most samples in the training set are of this kind, so they form the majority class.
In the prior art, data imbalance in risk control modeling is mainly addressed by resampling the training set. The specific methods include undersampling, oversampling, and SMOTE (Synthetic Minority Over-sampling Technique). Oversampling randomly replicates minority samples (positive samples), balancing the ratio of positive to negative samples by enlarging the minority class. Undersampling is the opposite: it randomly discards majority samples (negative samples) to shrink the majority class. Both undersampling and oversampling balance the ratio of positive to negative samples without constructing new samples. SMOTE, by contrast, finds the n nearest positive samples of each target positive sample and constructs new positive samples by random linear interpolation between the target positive sample and those neighbors. The random linear interpolation formula is as follows:
$$x_{new} = x_i + \mathrm{rand}(0,1) \cdot (y_j - x_i), \quad j = 1, 2, \ldots, n$$
where $x_{new}$ denotes the newly constructed positive sample, $x_i$ denotes the target positive sample, and $y_j$ denotes the $j$-th neighboring positive sample of the target positive sample. A new positive sample is constructed by adding to the target positive sample a perturbation whose size is a random ratio, between 0 and 1, of the distance to the neighboring positive sample.
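The interpolation formula above can be illustrated with a minimal Python sketch. The function name `smote_interpolate` and the sample values are assumptions made for illustration only; the neighbor search that a full SMOTE implementation performs is omitted, and the neighbor is simply supplied by the caller.

```python
import random
from typing import List, Sequence


def smote_interpolate(
    x_i: Sequence[float], y_j: Sequence[float], rng: random.Random
) -> List[float]:
    """Build one synthetic positive sample between x_i and its neighbor y_j."""
    ratio = rng.random()  # plays the role of rand(0, 1); shared by all features
    # x_new = x_i + ratio * (y_j - x_i), applied feature by feature
    return [a + ratio * (b - a) for a, b in zip(x_i, y_j)]


rng = random.Random(0)   # fixed seed so the sketch is repeatable
x_i = [1.0, 10.0]        # target positive sample (illustrative values)
y_j = [3.0, 20.0]        # one neighboring positive sample (illustrative values)
x_new = smote_interpolate(x_i, y_j, rng)
```

Because a single ratio is applied to every feature, the synthetic sample always lies on the line segment joining the two real samples.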
Oversampling alleviates the imbalance simply and directly. However, because the supplemented positive samples are mere copies of existing positive samples, the variance of the positive samples becomes smaller than the true variance and the model over-emphasizes the existing positive samples; if some positive samples are mislabeled or noisy, that error or noise is multiplied. The obvious disadvantage of oversampling is therefore overfitting to the positive samples. Undersampling alleviates the imbalance by randomly deleting negative samples; its main disadvantage is that potentially useful data important for model training may be discarded, biasing the trained model. SMOTE increases the proportion of positive samples by constructing new positive samples after clustering the positive samples, which mitigates the variance reduction caused by oversampling while avoiding the loss of important data caused by undersampling.
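The two resampling baselines discussed above can be sketched as follows. This is an illustrative sketch only: the function names and the `(features, label)` tuple layout are assumptions, label 1 stands for an overdue (positive) sample, and the code assumes at least one positive sample and fewer positives than negatives.

```python
import random
from typing import List, Tuple

Sample = Tuple[List[float], int]  # (features, label); label 1 = overdue (positive)


def random_oversample(samples: List[Sample], rng: random.Random) -> List[Sample]:
    """Copy randomly chosen positives until both classes have the same size."""
    pos = [s for s in samples if s[1] == 1]
    neg = [s for s in samples if s[1] == 0]
    # Every added sample is a copy of an existing positive; nothing new is built.
    extra = [rng.choice(pos) for _ in range(len(neg) - len(pos))]
    return samples + extra


def random_undersample(samples: List[Sample], rng: random.Random) -> List[Sample]:
    """Keep all positives and a random subset of negatives of the same size."""
    pos = [s for s in samples if s[1] == 1]
    neg = [s for s in samples if s[1] == 0]
    # The negatives that are not drawn here are discarded from training.
    return pos + rng.sample(neg, len(pos))
```

The sketch makes both drawbacks visible: oversampling adds no feature vector that was not already present, and undersampling drops negatives that might have been informative.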
Disclosure of Invention
The application aims to provide a technical scheme for supplementing positive samples in credit investigation risk control modeling.
According to an embodiment of the present application, there is provided a method for supplementing positive samples in credit investigation risk control modeling, wherein the method includes:
selecting, from the full set of credit investigation reports of the institution, a plurality of credit investigation reports corresponding to users who have not used credit at the institution;
for each credit investigation report in the plurality of credit investigation reports, parsing the loan records in the credit investigation report and determining whether an overdue loan record exists; if so, recording detailed information corresponding to the overdue loan record; otherwise, adding a non-overdue mark to the credit investigation report;
and screening out positive samples for credit investigation risk control modeling from the parsed plurality of credit investigation reports.
According to another embodiment of the present application, there is provided an apparatus for supplementing positive samples in credit investigation risk control modeling, wherein the apparatus includes:
a module for selecting, from the full set of credit investigation reports of the institution, a plurality of credit investigation reports corresponding to users who have not used credit at the institution;
a module for, for each credit investigation report in the plurality of credit investigation reports, parsing the loan records in the credit investigation report and determining whether an overdue loan record exists; if so, recording detailed information corresponding to the overdue loan record; otherwise, adding a non-overdue mark to the credit investigation report;
and a module for screening out positive samples for credit investigation risk control modeling from the parsed plurality of credit investigation reports.
There is also provided, in accordance with another embodiment of the present application, a computer device, wherein the computer device includes: a memory for storing one or more programs; and one or more processors coupled with the memory, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method for supplementing positive samples in credit investigation risk control modeling described herein.
According to another embodiment of the present application, there is also provided a computer-readable storage medium having stored thereon a computer program executable by a processor to perform the method for supplementing positive samples in credit investigation risk control modeling described herein.
Compared with the prior art, the present application has the following advantages. By analyzing whether overdue loan records exist in the credit investigation reports of users who have not used credit at the institution, and applying certain screening conditions, usable positive samples can be screened out and added to the training set for credit investigation risk control modeling. By inferring that such users' overdue performance at other institutions would also be their overdue performance on loans from the institution, and using it as a training label, the positive samples for training the credit investigation risk control model are expanded, which effectively alleviates the problems of insufficient positive samples and sample imbalance and enhances the risk identification and generalization capabilities of the model. The supplemented positive samples are not simple copies of existing positive samples, so the reduction in positive-sample variance caused by oversampling is avoided. Moreover, the supplemented positive samples come from the same source as the real positive samples, namely users who applied for credit at the institution, so the two groups have similar distributions and similar variances, which makes the screened samples reasonable as supplementary positive samples. Unlike undersampling, the scheme does not discard negative samples and thus avoids the risk of losing useful data. Unlike SMOTE, the scheme does not construct virtual samples but mines users' real overdue performance in related loan behavior; this avoids the problem that SMOTE cannot construct samples with enumerated-type features or missing values, and the supplemented real samples also help in learning a credit investigation risk control model with good interpretability.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the detailed description of non-limiting embodiments made with reference to the following drawings:
FIG. 1 shows a schematic flow chart of a method for supplementing positive samples in credit investigation risk control modeling according to an embodiment of the present application;
FIG. 2 shows a system flow diagram for supplementing positive samples in credit investigation risk control modeling according to an example of the present application;
FIG. 3 shows a flow diagram of credit investigation report selection according to an example of the present application;
FIG. 4 shows a flow diagram of loan record parsing according to an example of the present application;
FIG. 5 shows a flow diagram of credit investigation report aggregation according to an example of the present application;
FIG. 6 shows a flow diagram of conditional screening according to an example of the present application;
FIG. 7 shows a schematic structural diagram of an apparatus for supplementing positive samples in credit investigation risk control modeling according to an embodiment of the present application;
FIG. 8 shows an exemplary system that can be used to implement the various embodiments described in this application.
The same or similar reference numbers in the drawings identify the same or similar elements.
Detailed Description
Before discussing exemplary embodiments in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart may describe the operations as a sequential process, many of the operations can be performed in parallel, concurrently, or simultaneously. In addition, the order of the operations may be re-arranged. The process may be terminated when its operations are completed, but may have additional steps not included in the figure. The processes may correspond to methods, functions, procedures, subroutines, and the like.
The term "device" in this context refers to an intelligent electronic device that can perform predetermined processes such as numerical calculations and/or logic calculations by executing predetermined programs or instructions, and may include a processor and a memory, wherein the predetermined processes are performed by the processor executing program instructions prestored in the memory, or performed by hardware such as an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), or performed by a combination of the above two.
The technical scheme of the application is mainly implemented by computer devices, which include network devices and user devices. A network device includes, but is not limited to, a single network server, a server group consisting of a plurality of network servers, or a cloud consisting of a large number of computers or network servers based on cloud computing, where cloud computing is a kind of distributed computing: a super virtual computer consisting of a collection of loosely coupled computers. User devices include, but are not limited to, PCs, tablets, smartphones, IPTV, PDAs, wearable devices, and the like. A computer device may implement the application by operating independently, or by accessing a network and interacting with other computer devices in the network. The network in which the computer device is located includes, but is not limited to, the Internet, a wide area network, a metropolitan area network, a local area network, a VPN, a wireless ad hoc network, and the like.
It should be noted that the above computer devices are only examples; other existing or future computer devices, where applicable to the present application, are also included within the scope of protection of the present application and are incorporated herein by reference.
The methods discussed later herein, some of which are illustrated by flow diagrams, may be implemented by hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof. When implemented in software, firmware, middleware or microcode, the program code or code segments to perform the necessary tasks may be stored in a machine or computer readable medium such as a storage medium. The processor(s) may perform the necessary tasks.
Specific structural and functional details disclosed herein are merely representative and are provided for purposes of describing example embodiments of the present application. This application may, however, be embodied in many alternate forms and should not be construed as limited to only the embodiments set forth herein.
It will be understood that, although the terms first, second, etc. may be used herein to describe various elements, these elements should not be limited by these terms. These terms are only used to distinguish one element from another. For example, a first element may be termed a second element, and, similarly, a second element may be termed a first element, without departing from the scope of example embodiments. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used herein, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be noted that, in some alternative implementations, the functions/acts noted may occur out of the order noted in the figures. For example, two figures shown in succession may, in fact, be executed substantially concurrently, or the figures may sometimes be executed in the reverse order, depending upon the functionality/acts involved.
First, credit investigation risk control modeling refers to a risk control modeling task that takes the People's Bank credit investigation report as its data source, mines users' risk features, and trains a model to predict the likelihood that a user becomes overdue. The People's Bank credit investigation report (also called the "credit investigation report" in this scheme) is a record of personal credit information issued by the People's Bank of China; it records personal identity information and details of historical credit transactions, and is an important data source for financial institutions assessing an applicant's risk. To address the technical problems of the prior-art methods for handling data imbalance in risk control modeling, and in view of data mining on the People's Bank credit investigation report, this scheme provides a positive sample supplement scheme based on inferred performance. According to the application, the historical overdue performance at other institutions of users who have not used credit at the institution can be mined from the People's Bank credit investigation reports, screened under certain conditions, and inferred to be the overdue performance those users would have shown on loans from the institution, thereby supplementing the positive samples for training the credit investigation risk control model.
The embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
FIG. 1 shows a schematic flow chart of a method for supplementing positive samples in credit investigation risk control modeling according to an embodiment of the present application. The method of this embodiment is implemented by a computer device and includes step S11, step S12, and step S13. In step S11, the computer device selects, from the full set of credit investigation reports of the institution, a plurality of credit investigation reports corresponding to users who have not used credit at the institution; in step S12, for each credit investigation report in the plurality of credit investigation reports, the computer device parses the loan records in the report and determines whether an overdue loan record exists, and if so, records detailed information corresponding to the overdue loan record, otherwise adds a non-overdue mark to the report; in step S13, the computer device screens out positive samples for credit investigation risk control modeling from the parsed plurality of credit investigation reports.
In step S11, the computer device selects, from the full set of credit investigation reports of the institution, a plurality of credit investigation reports corresponding to users who have not used credit at the institution. The institution refers to a financial institution that queries users' credit investigation reports and carries out credit investigation risk control modeling to identify users' overdue risk. In some embodiments, users who have not used credit are users who have not drawn on credit at the institution, and include users whose credit application was rejected by the institution and/or users whose credit application was approved by the institution but who did not actually use the credit. Using credit means drawing on a credit line; a user who draws on a credit line has initiated a loan, so the user can be labeled according to whether the loan is repaid normally. It should be noted that a user who has used credit at the institution can be labeled directly according to whether the user repays normally; such users are already in the training set and do not need to be screened.
In some embodiments, a filter condition can be preset, and the credit investigation reports of users who have not used credit at the institution are selected from the institution's full set of credit investigation reports by conditional filtering. In some embodiments, the predetermined filter condition defines which users count as not having used credit at the institution. For example, the predetermined filter condition may select only the credit investigation reports of users whose credit application was rejected by the institution; as another example, it may select the credit investigation reports of all users who have not used credit at the institution, that is, both users whose credit application was rejected and users whose credit application was approved but who did not actually use the credit. In some embodiments, the predetermined filter condition constrains both the users and the credit investigation reports; for example, it defines the users as those whose credit application was rejected by the institution together with those who were approved but did not actually use the credit, and at the same time selects only credit investigation reports whose query time falls within a predetermined time range. In practical applications, the institution may set the predetermined filter condition as needed. In some embodiments, if the number of selected credit investigation reports is greater than a first predetermined number, a new filter condition may be set for a secondary screening.
As an example of step S11, for each credit investigation report in the institution's full set of credit investigation reports, it is determined whether the user corresponding to the report has used credit at the institution; if so, the report is skipped; otherwise, the report is selected.
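The selection in step S11 can be sketched as a simple filter. The record layout (`CreditReport` with `user_id` and `query_date`), the function name, and the optional `query_window` parameter are hypothetical names introduced here for illustration; the optional window corresponds to the embodiment that restricts reports to a predetermined query time range.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional, Set, Tuple


@dataclass
class CreditReport:
    user_id: str
    query_date: date  # time at which the institution queried the report


def select_reports(
    all_reports: Iterable[CreditReport],
    users_with_credit_use: Set[str],
    query_window: Optional[Tuple[date, date]] = None,
) -> List[CreditReport]:
    """Keep reports of users who have not used credit at the institution."""
    selected = []
    for report in all_reports:
        if report.user_id in users_with_credit_use:
            continue  # user already has a real repayment label; skip
        if query_window is not None:
            start, end = query_window
            if not (start <= report.query_date <= end):
                continue  # outside the predetermined query time range
        selected.append(report)
    return selected
```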
In step S12, for each credit investigation report in the plurality of credit investigation reports, the computer device parses the loan records in the report and determines whether an overdue loan record exists; if so, it records detailed information corresponding to the overdue loan record; otherwise, it adds a non-overdue mark to the report. In some embodiments, for each credit investigation report, the loan records in the report are parsed one by one, and a loan is judged to be overdue if its unpaid balance has been greater than 0 for more than 30 days. If a loan is overdue, the detailed information related to the loan is recorded, including but not limited to the loan issuance time, the lending institution, the loan type, the loan balance, the total overdue amount, and the time at which the loan became overdue. If no loan is overdue, no detailed information needs to be recorded, and only a non-overdue mark is added to the credit investigation report (for example, the report is marked as "non-overdue"). In some embodiments, if an overdue loan record exists in a credit investigation report, an overdue mark may be added to the report, for example by marking the report as "overdue". In some embodiments, if a credit investigation report contains more than one overdue loan record, the number of overdue loans may be recorded for subsequent screening. In some embodiments, the query time of each credit investigation report is recorded for subsequent screening.
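A minimal sketch of the parsing in step S12 follows. The data layout is an assumption: `LoanRecord`, `ParsedReport`, and the field `days_unpaid` (the number of days the unpaid balance has stayed above 0) are hypothetical names, since the text does not specify how a credit investigation report is represented. Only the overdue criterion (unpaid balance greater than 0 for more than 30 days) is taken from the description.

```python
from dataclasses import dataclass, field
from typing import List

OVERDUE_DAYS_THRESHOLD = 30  # from the text: unpaid balance > 0 for more than 30 days


@dataclass
class LoanRecord:
    institution: str
    loan_type: str
    unpaid_balance: float
    days_unpaid: int  # hypothetical field: days the unpaid balance has been above 0


@dataclass
class ParsedReport:
    user_id: str
    overdue_loans: List[LoanRecord] = field(default_factory=list)
    flag: str = "non-overdue"


def is_overdue(loan: LoanRecord) -> bool:
    return loan.unpaid_balance > 0 and loan.days_unpaid > OVERDUE_DAYS_THRESHOLD


def parse_report(user_id: str, loans: List[LoanRecord]) -> ParsedReport:
    """Record the details of overdue loans, or leave the non-overdue mark."""
    parsed = ParsedReport(user_id)
    for loan in loans:
        if is_overdue(loan):
            parsed.overdue_loans.append(loan)  # keep details for later screening
    if parsed.overdue_loans:
        parsed.flag = "overdue"
    return parsed
```

The length of `overdue_loans` gives the number of overdue loans, which some embodiments record for subsequent screening.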
In step S13, the computer device screens out positive samples for credit investigation risk control modeling from the parsed plurality of credit investigation reports. In some embodiments, credit investigation reports that meet a preset screening condition are screened out from the parsed reports and supplemented as positive samples for credit investigation risk control modeling. The screening condition may be any condition preset for positive sample screening. In some embodiments, the screening condition is determined based on the actual business requirements of the credit investigation risk control modeling; for example, after screening conditions for basic screening are set, further screening conditions are added for different loan types. It should be noted that any implementation that screens out positive samples for credit investigation risk control modeling from the parsed plurality of credit investigation reports falls within the scope of protection of the present application.
In some embodiments, the step S13 includes a step S131 (not shown) and a step S132 (not shown). In step S131, the computer device aggregates the parsed plurality of credit investigation reports into a plurality of credit investigation report sets from the user dimension; in step S132, the computer device screens out positive samples for credit investigation risk control modeling from the plurality of credit investigation report sets.
In some embodiments, after the credit investigation reports of all users who have not used credit are parsed, all parsed reports are aggregated with the user as the dimension to obtain a plurality of credit investigation report sets. Each set corresponds to exactly one such user; that is, all credit investigation reports of one user are aggregated into the same set, and different users correspond to different sets. Each parsed report carries the information recorded or added during parsing, such as the detailed information corresponding to overdue loan records, the non-overdue mark, the overdue mark, and the query time of the credit investigation report (also referred to as the query time in this application).
In some embodiments, the step S131 further includes: aggregating the plurality of credit investigation reports from the user dimension to obtain a plurality of credit investigation report sets, and sorting the credit investigation reports in each set by their query time. In some embodiments, after the credit investigation reports of all users who have not used credit are parsed, all parsed reports are aggregated with the user as the dimension to obtain a plurality of sets, and within the set of each user the reports are sorted by their query time for subsequent screening.
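The aggregation and sorting of step S131 can be sketched as a group-by on the user followed by a sort on the query time. The `ParsedReport` layout and the function name are hypothetical, introduced only for this illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Dict, Iterable, List


@dataclass
class ParsedReport:
    user_id: str
    query_date: date
    has_overdue: bool  # True if parsing found an overdue loan record


def aggregate_by_user(reports: Iterable[ParsedReport]) -> Dict[str, List[ParsedReport]]:
    """Group parsed reports per user and sort each group by query time."""
    report_sets: Dict[str, List[ParsedReport]] = defaultdict(list)
    for report in reports:
        report_sets[report.user_id].append(report)
    for user_reports in report_sets.values():
        user_reports.sort(key=lambda r: r.query_date)  # oldest query first
    return dict(report_sets)
```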
In some embodiments, step S132 further includes: for each credit investigation report set, judging whether the set includes an overdue credit investigation report; if not, skipping the set; if so, judging whether a non-overdue credit investigation report exists before the overdue credit investigation report; if not, skipping the set; and if so, screening the non-overdue credit investigation report as a positive sample for credit investigation wind control modeling. In these embodiments, since the screened user (i.e., the user corresponding to the screened credit investigation report) became overdue between two credit investigation report queries, it can be inferred that the user would also have become overdue on a loan had the user used credit at the institution during that period, so the user can be supplemented as a positive sample for training in credit investigation wind control modeling. As an example, the following operations are performed for each user (i.e., each credit investigation report set): first, judging whether the user has an overdue loan record in history (i.e., whether an overdue loan record exists in the user's credit investigation report set), and if not, skipping the user (i.e., skipping the set because it does not meet the condition); if the user has an overdue loan record in history but no credit investigation report query without an overdue loan record precedes the queries that show one, skipping the user; if the user has an overdue loan record in history and a credit investigation report query without an overdue loan record precedes the query that shows one, supplementing as a positive sample the credit investigation report without an overdue loan record that precedes the report with one, that is, the last non-overdue credit investigation report before the user's first credit investigation report containing an overdue loan record.
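The basic screening rule described above can be sketched as follows. This is a minimal illustration rather than the applicant's implementation; the `Report` type, its field names, and the function name `pick_positive_sample` are assumptions introduced only for this example.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Report:
    # Hypothetical parsed credit investigation report: only the query date
    # and the overdue / non-overdue mark are needed for this rule.
    query_date: date
    overdue: bool


def pick_positive_sample(reports: List[Report]) -> Optional[Report]:
    """Return the last non-overdue report before the first overdue one.

    Returns None when the report set should be skipped.
    """
    ordered = sorted(reports, key=lambda r: r.query_date)
    first_overdue = next(
        (i for i, r in enumerate(ordered) if r.overdue), None
    )
    if first_overdue is None:
        return None  # no overdue report in the set: skip
    if first_overdue == 0:
        return None  # no non-overdue report precedes it: skip
    # Every report before the first overdue one is non-overdue,
    # so the immediately preceding report is the one to supplement.
    return ordered[first_overdue - 1]
```

A set that returns `None` corresponds to a skipped user; any other return value is the report that would be added to the training set with a positive label.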
In some embodiments, step S132 further includes: for each credit investigation report set, judging whether the set includes an overdue credit investigation report; if not, skipping the set; if so, judging whether a non-overdue credit investigation report exists before the overdue credit investigation report; if not, skipping the set; if so, judging whether a predetermined business screening condition is met according to the detail information corresponding to the overdue loan record in the overdue credit investigation report; if so, screening the non-overdue credit investigation report as a positive sample for credit investigation wind control modeling, and otherwise skipping the set. In these embodiments, after the users who became overdue between two credit investigation report queries have been screened, screening continues based on the predetermined business screening condition, so that users who became overdue on a specific business can be selected; it can therefore be inferred that the screened user would also have become overdue had the user used credit on a similar business of the institution during that period. In some embodiments, the predetermined business screening condition is determined in connection with the actual business for which credit investigation wind control modeling is performed; for example, it includes at least any one of the following: the time interval between the user's credit investigation report query without an overdue loan record and the query with an overdue loan record is greater than a predetermined interval (such as 1 year); the loan type of the overdue loan is a specific type (such as a small cash loan); and the overdue total amount of the overdue loan is greater than a predetermined amount (such as 1000 yuan).
As an example, the following is performed for each user: first, judging whether the user has an overdue loan record in history, and skipping the user if not; if the user has an overdue loan record in history but no credit investigation report query without an overdue loan record precedes the queries that show one, skipping the user; if the user has an overdue loan record in history and a credit investigation report query without an overdue loan record precedes the query that shows one, screening for the credit investigation report that meets the predetermined business screening condition according to the detail information corresponding to the overdue loan record, and supplementing the screened credit investigation report as a positive sample for credit investigation wind control modeling. In this way, on the basis of meeting certain conditions, the present application uses the similarity of loan business across institutions to treat a user's overdue loan behavior at other institutions, as recorded in the credit investigation report, as the overdue loan behavior the user would have shown at the institution; this avoids the drawbacks of conventional sampling-based positive sample supplementation and supplements high-quality positive samples for credit investigation wind control modeling.
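The business screening condition can be expressed as a small predicate. The sketch below is illustrative only: the `OverdueDetail` type, the parameter names, and the convention that an unset (`None`) parameter disables a condition are assumptions made for this example. The description only states that the condition includes at least one of the three checks.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class OverdueDetail:
    # Hypothetical subset of the detail information recorded for an
    # overdue loan record.
    loan_type: str
    overdue_total: float


def meets_business_conditions(
    clean_query: date,
    overdue_query: date,
    detail: OverdueDetail,
    min_gap: Optional[timedelta] = None,
    loan_type: Optional[str] = None,
    min_amount: Optional[float] = None,
) -> bool:
    """Check the configured conditions; a None parameter is not checked."""
    # Interval between the non-overdue query and the overdue query.
    if min_gap is not None and overdue_query - clean_query <= min_gap:
        return False
    # Loan type of the overdue loan.
    if loan_type is not None and detail.loan_type != loan_type:
        return False
    # Overdue total amount of the overdue loan.
    if min_amount is not None and detail.overdue_total <= min_amount:
        return False
    return True
```

With the example values from the description, a call might pass `min_gap=timedelta(days=365)`, `loan_type="small cash loan"`, and `min_amount=1000.0`; treating 1 year as 365 days is a simplification of this sketch.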
Fig. 2 shows a flowchart of a system for supplementing positive samples in credit investigation wind control modeling according to an example of the present application. The specific flow includes: 1) credit investigation report selection, in which the credit investigation reports of users who have not used credit at the institution are selected by conditional filtering for the subsequent analysis and screening steps; 2) loan record analysis, in which, since a credit investigation report records the detail information of each historical loan of a user, each loan record is analyzed in turn and judged as overdue or not; if an overdue loan record exists, the detail information related to it (i.e., the overdue loan attributes) is recorded, and if not, no detail information is recorded; 3) credit investigation report aggregation, in which, after all credit investigation reports of the unused credit users have been analyzed, the analyzed reports are aggregated with the user as the dimension, and each user's aggregated report set is sorted by credit investigation report query time for the subsequent conditional filtering and screening; 4) conditional screening, in which the credit investigation reports that have gone through the analysis and aggregation steps are checked against the conditions, and the overdue users who meet the conditions are selected to supplement the positive samples for training the wind control model.
Fig. 3 is a schematic diagram illustrating a flow for credit investigation report selection according to an example of the present application. The specific flow includes: for each credit investigation report, judging whether the user corresponding to the report has used credit at the institution; if so, skipping the report, and if not, selecting it.
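The selection flow of Fig. 3 amounts to a single membership test per report. The following sketch assumes, purely for illustration, that each report is a dictionary with a `user_id` key and that the institution can supply the set of users who have used credit; neither detail comes from the application.

```python
from typing import Dict, Iterable, List, Set


def select_reports(
    reports: Iterable[Dict], users_with_credit_use: Set[str]
) -> List[Dict]:
    """Keep only reports whose user has not used credit at the institution."""
    selected = []
    for report in reports:
        if report["user_id"] in users_with_credit_use:
            continue  # user has used credit: skip the report
        selected.append(report)  # user has not used credit: select it
    return selected
```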
Fig. 4 is a schematic flow chart for loan record analysis according to an example of the present application. The specific flow includes: for the credit investigation report of an unused credit user, loan record analysis is performed first, that is, each loan record is analyzed in turn and judged as overdue or not. A loan is judged to be overdue if it has been overdue for more than 30 days and the unrepaid amount is greater than 0. If an overdue loan record exists, detail information related to it is recorded, including the loan issuing time, loan issuing institution, loan type, loan balance, overdue total amount, time at which the loan became overdue, and the like; if no overdue loan record exists, no detail information is recorded and the credit investigation report is marked as having no overdue record.
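The overdue test and the per-report analysis of Fig. 4 can be sketched as below. The `LoanRecord` fields, the returned dictionary layout, and the flag strings are assumptions for this example; only the rule itself (more than 30 days overdue and an unrepaid amount greater than 0) comes from the description.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class LoanRecord:
    # Hypothetical fields of one loan record in a credit investigation report.
    lender: str
    loan_type: str
    days_overdue: int
    unpaid_amount: float
    overdue_total: float


def is_overdue(loan: LoanRecord) -> bool:
    # Overdue rule from the description: more than 30 days overdue
    # and an unrepaid amount greater than 0.
    return loan.days_overdue > 30 and loan.unpaid_amount > 0


def analyze_report(loans: List[LoanRecord]) -> Dict:
    """Record details for overdue loans, or mark the report as non-overdue."""
    details = [
        {
            "lender": loan.lender,
            "loan_type": loan.loan_type,
            "overdue_total": loan.overdue_total,
        }
        for loan in loans
        if is_overdue(loan)
    ]
    if details:
        return {"flag": "overdue", "overdue_details": details}
    return {"flag": "no overdue", "overdue_details": []}
```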
Fig. 5 illustrates a flow diagram for aggregation of credit investigation reports according to an example of the present application. As shown in Fig. 5, the analyzed credit investigation reports are aggregated into a plurality of credit investigation report sets, each of which corresponds to a different user (e.g., users 1, 2, ..., n shown in Fig. 5). Each credit investigation report records its query time and whether it is overdue, an overdue credit investigation report additionally records the overdue details (i.e., the detail information described above), and the credit investigation reports in each set are sorted by query time.
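The aggregation of Fig. 5 is a group-by on the user followed by a sort on query time. The sketch below assumes, for illustration only, that each analyzed report is a dictionary with `user_id` and `query_time` keys.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, List


def aggregate_by_user(reports: Iterable[Dict]) -> Dict[str, List[Dict]]:
    """Group analyzed reports per user and sort each group by query time."""
    report_sets: Dict[str, List[Dict]] = defaultdict(list)
    for report in reports:
        report_sets[report["user_id"]].append(report)
    # One report set per user, ordered from earliest to latest query.
    return {
        user: sorted(user_reports, key=lambda r: r["query_time"])
        for user, user_reports in report_sets.items()
    }
```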
Fig. 6 shows a schematic flow chart for conditional screening according to an example of the present application. The specific flow includes: for the credit investigation reports aggregated by user (i.e., each credit investigation report set), first judging whether a credit investigation report with an overdue record exists, and if not, skipping the user; if so, further judging whether a non-overdue credit investigation report exists before the overdue credit investigation report, and if not, skipping the user; if so, further judging whether the other screening conditions are passed, and if not, skipping the user; if so, screening out the positive sample that meets the conditions. For example, for each user (i.e., for each credit investigation report set), it is first determined whether the user has an overdue record in history, and if not, the user is skipped; if the user has an overdue record in history but no credit investigation report query without an overdue record precedes the queries that show one, the user is skipped; if the user has an overdue record in history and a credit investigation report query without an overdue record precedes the query that shows one, screening is carried out according to the detail information of the overdue loan record. The specific screening conditions can be determined in connection with the actual business for which credit investigation wind control modeling is performed; for example, the time interval between the user's credit investigation report query without an overdue loan record and the query with an overdue loan record is greater than 1 year, the loan type of the overdue loan is a small cash loan, the overdue total amount of the overdue loan is greater than 1000 yuan, and the like.
According to the scheme of the present application, the credit investigation reports of users who have not used credit at the institution are analyzed for overdue loan records, and usable positive samples are then screened out under certain screening conditions and added to the training set for credit investigation wind control modeling. By treating the overdue performance of unused credit users at other institutions as their overdue performance on loans at the institution and using it as the label for model training, the positive samples for training the credit investigation wind control model can be expanded, which effectively alleviates the problems of insufficient positive samples and sample imbalance in credit investigation wind control modeling and enhances the risk identification capability and generalization capability of the model. The supplemented positive samples are not simple copies of existing positive samples, so the reduction in positive sample variance caused by over-sampling is avoided. Moreover, the supplemented positive samples come from the same source as the real positive samples, namely users who applied for credit at the institution, so the distributions and variances of the two kinds of samples are close, and it is reasonable to use the screened samples as supplementary positive samples. Unlike under-sampling, the scheme does not discard negative samples and thus avoids the risk of losing useful data. Unlike SMOTE (synthetic minority over-sampling), the scheme does not construct virtual samples but mines the users' real overdue performance in related loan behavior, which solves the problem that SMOTE cannot construct samples with enumerated-type features or missing values; the supplemented real samples also help in learning a credit investigation wind control model with good interpretability.
Fig. 7 shows a schematic structural diagram of an apparatus for supplementing a positive sample in credit investigation wind control modeling according to an embodiment of the present application. The apparatus for supplementing a positive sample in the credit wind control modeling (hereinafter, simply referred to as "first apparatus 1") includes a first module 11, a second module 12, and a third module 13. The first module 11 is used for selecting a plurality of credit investigation reports corresponding to the unused credit users of the institution from the full credit investigation reports of the institution; the second module 12 is configured to, for each credit investigation report in the plurality of credit investigation reports, analyze a loan record in the credit investigation report and determine whether an overdue loan record exists, if so, record detailed information corresponding to the overdue loan record, and otherwise, add a non-overdue flag to the credit investigation report; the third module 13 is used for screening out a positive sample for credit investigation wind control modeling from the analyzed plurality of credit investigation reports.
The first module 11 selects a plurality of credit investigation reports corresponding to the unused credit users of the institution from the total credit investigation reports of the institution. The institution refers to a financial institution that queries users' credit investigation reports and performs credit investigation wind control modeling to identify users' overdue risk. In some embodiments, the unused credit users are users who have not used credit at the institution, including users whose credit application was rejected by the institution and/or users who were granted credit by the institution but did not actually use it. Using credit means drawing on a credit line: a user who uses credit has initiated a loan, and the user can be labeled according to whether the loan is repaid normally. It should be noted that a user who has used credit at the institution can be labeled directly according to whether the user repays normally; such a user is already in the training set and does not need to be screened.
In some embodiments, a filtering condition can be preset, and the credit investigation reports of the unused credit users of the institution are selected from the institution's full set of credit investigation reports by conditional filtering. In some embodiments, the predetermined filtering condition defines the unused credit users of the institution; for example, it restricts selection to the credit investigation reports of users whose credit application was rejected by the institution; as another example, it selects the credit investigation reports of all users who have not used credit at the institution, that is, both users whose credit application was rejected and users who were granted credit but did not actually use it. In some embodiments, the predetermined filtering condition defines both the unused credit users and the credit investigation reports; for example, it defines the unused credit users as users whose credit application was rejected by the institution and users who were granted credit but did not actually use it, and at the same time restricts selection to credit investigation reports whose query time falls within a predetermined time range. In practical applications, the institution may set the predetermined filtering conditions as needed. In some embodiments, if the number of selected credit investigation reports is greater than a first predetermined number, a new filtering condition may be set for secondary screening.
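A configurable filter of this kind might look like the sketch below. The `ReportMeta` type, the status strings `"rejected"`, `"approved_unused"`, and `"used"`, and the inclusive treatment of the time range are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional, Set


@dataclass
class ReportMeta:
    # Hypothetical metadata attached to each credit investigation report.
    user_id: str
    credit_status: str  # assumed values: "rejected", "approved_unused", "used"
    query_date: date


def filter_reports(
    reports: Iterable[ReportMeta],
    statuses: Set[str],
    start: Optional[date] = None,
    end: Optional[date] = None,
) -> List[ReportMeta]:
    """Select reports whose user status and query date match the conditions."""
    selected = []
    for report in reports:
        if report.credit_status not in statuses:
            continue  # user category not covered by the filtering condition
        if start is not None and report.query_date < start:
            continue  # queried before the predetermined time range
        if end is not None and report.query_date > end:
            continue  # queried after the predetermined time range
        selected.append(report)
    return selected
```

If the first pass returns more reports than the first predetermined number, the same function can be called again on its own output with a narrower status set or time range, which corresponds to the secondary screening mentioned above.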
As an example of the first module 11, for each credit investigation report in the total credit investigation reports of the institution, it is determined whether the user corresponding to the credit investigation report uses credit at the institution, if so, the credit investigation report is skipped, otherwise, the credit investigation report is selected.
The second module 12, for each credit investigation report in the plurality of credit investigation reports, analyzes the loan records in the report and judges whether an overdue loan record exists; if so, detail information corresponding to the overdue loan record is recorded, and otherwise a non-overdue mark is added to the report. In some embodiments, for each credit investigation report, each loan record in the report is analyzed in turn, where a loan is judged to be overdue if it has been overdue for more than 30 days and its outstanding amount is greater than 0. If the loan is overdue, detail information related to it is recorded, including but not limited to the loan issuing time, loan issuing institution, loan type, loan balance, overdue total amount, and time at which the loan became overdue; if no loan is overdue, no detail information needs to be recorded and only a non-overdue mark is added to the report (for example, the report is marked as "overdue-free"). In some embodiments, if an overdue loan record exists in the credit investigation report, an overdue mark may be added to the report (for example, the report is marked as "overdue"); in some embodiments, if a credit investigation report contains more than one overdue loan record, the number of overdue loans may be recorded for subsequent screening. In some embodiments, the credit investigation report query time corresponding to each credit investigation report is recorded for subsequent screening.
The third module 13 screens out positive samples for credit investigation wind control modeling from the plurality of analyzed credit investigation reports. In some embodiments, credit investigation reports that meet predetermined screening conditions are screened out from the plurality of analyzed credit investigation reports, and the screened reports are supplemented as positive samples for credit investigation wind control modeling. The screening conditions include any preset conditions for positive sample screening; in some embodiments, they are determined based on the actual business requirements of the credit investigation wind control modeling, for example, after the conditions for basic screening have been set, further conditions are added for different loan types. It should be noted that any implementation that screens out positive samples for credit investigation wind control modeling from a plurality of analyzed credit investigation reports falls within the scope of protection of the present application.
In some embodiments, the third module 13 includes a module 131 (not shown) and a module 132 (not shown). The module 131 aggregates the analyzed multiple credit investigation reports into multiple credit investigation report sets from the user dimension; the module 132 screens out positive samples from the multiple credit reporting sets for credit wind control modeling.
In some embodiments, after the credit investigation reports of all unused credit users have been analyzed, all analyzed reports are aggregated with the user as the dimension to obtain a plurality of credit investigation report sets. Each set uniquely corresponds to one unused credit user; that is, all credit investigation reports of one unused credit user are aggregated into the same set, and different unused credit users correspond to different sets. Each analyzed credit investigation report includes the information recorded or added during analysis, such as the detail information corresponding to overdue loan records, the non-overdue mark, the overdue mark, and the query time of the credit investigation report (also referred to as the query time in this application).
In some embodiments, the module 131 is further configured to: aggregate the plurality of credit investigation reports by user to obtain a plurality of credit investigation report sets, and sort the credit investigation reports in each set by their credit investigation report query time. In some embodiments, after the credit investigation reports of all unused credit users have been analyzed, all analyzed reports are aggregated with the user as the dimension to obtain a plurality of credit investigation report sets, and within each user's set the reports are sorted by query time for subsequent screening.
In some embodiments, the module 132 is further configured to: for each credit investigation report set, judge whether the set includes an overdue credit investigation report; if not, skip the set; if so, judge whether a non-overdue credit investigation report exists before the overdue credit investigation report; if not, skip the set; and if so, screen the non-overdue credit investigation report as a positive sample for credit investigation wind control modeling. In these embodiments, since the screened user (i.e., the user corresponding to the screened credit investigation report) became overdue between two credit investigation report queries, it can be inferred that the user would also have become overdue on a loan had the user used credit at the institution during that period, so the user can be supplemented as a positive sample for training in credit investigation wind control modeling. As an example, the following operations are performed for each user (i.e., each credit investigation report set): first, judging whether the user has an overdue loan record in history (i.e., whether an overdue loan record exists in the user's credit investigation report set), and if not, skipping the user (i.e., skipping the set because it does not meet the condition); if the user has an overdue loan record in history but no credit investigation report query without an overdue loan record precedes the queries that show one, skipping the user; if the user has an overdue loan record in history and a credit investigation report query without an overdue loan record precedes the query that shows one, supplementing as a positive sample the credit investigation report without an overdue loan record that precedes the report with one, that is, the last non-overdue credit investigation report before the user's first credit investigation report containing an overdue loan record.
In some embodiments, the module 132 is further configured to: for each credit investigation report set, judge whether the set includes an overdue credit investigation report; if not, skip the set; if so, judge whether a non-overdue credit investigation report exists before the overdue credit investigation report; if not, skip the set; if so, judge whether a predetermined business screening condition is met according to the detail information corresponding to the overdue loan record in the overdue credit investigation report; if so, screen the non-overdue credit investigation report as a positive sample for credit investigation wind control modeling, and otherwise skip the set. In these embodiments, after the users who became overdue between two credit investigation report queries have been screened, screening continues based on the predetermined business screening condition, so that users who became overdue on a specific business can be selected; it can therefore be inferred that the screened user would also have become overdue had the user used credit on a similar business of the institution during that period. In some embodiments, the predetermined business screening condition is determined in connection with the actual business for which credit investigation wind control modeling is performed; for example, it includes at least any one of the following: the time interval between the user's credit investigation report query without an overdue loan record and the query with an overdue loan record is greater than a predetermined interval (such as 1 year); the loan type of the overdue loan is a specific type (such as a small cash loan); and the overdue total amount of the overdue loan is greater than a predetermined amount (such as 1000 yuan).
As an example, the following is performed for each user: first, judging whether the user has an overdue loan record in history, and skipping the user if not; if the user has an overdue loan record in history but no credit investigation report query without an overdue loan record precedes the queries that show one, skipping the user; if the user has an overdue loan record in history and a credit investigation report query without an overdue loan record precedes the query that shows one, screening for the credit investigation report that meets the predetermined business screening condition according to the detail information corresponding to the overdue loan record, and supplementing the screened credit investigation report as a positive sample for credit investigation wind control modeling. In this way, on the basis of meeting certain conditions, the present application uses the similarity of loan business across institutions to treat a user's overdue loan behavior at other institutions, as recorded in the credit investigation report, as the overdue loan behavior the user would have shown at the institution; this avoids the drawbacks of conventional sampling-based positive sample supplementation and supplements high-quality positive samples for credit investigation wind control modeling.
According to the scheme of the present application, the credit investigation reports of users who have not used credit at the institution are analyzed for overdue loan records, and usable positive samples are then screened out under certain screening conditions and added to the training set for credit investigation wind control modeling. By treating the overdue performance of unused credit users at other institutions as their overdue performance on loans at the institution and using it as the label for model training, the positive samples for training the credit investigation wind control model can be expanded, which effectively alleviates the problems of insufficient positive samples and sample imbalance in credit investigation wind control modeling and enhances the risk identification capability and generalization capability of the model. The supplemented positive samples are not simple copies of existing positive samples, so the reduction in positive sample variance caused by over-sampling is avoided. Moreover, the supplemented positive samples come from the same source as the real positive samples, namely users who applied for credit at the institution, so the distributions and variances of the two kinds of samples are close, and it is reasonable to use the screened samples as supplementary positive samples. Unlike under-sampling, the scheme does not discard negative samples and thus avoids the risk of losing useful data. Unlike SMOTE (synthetic minority over-sampling), the scheme does not construct virtual samples but mines the users' real overdue performance in related loan behavior, which solves the problem that SMOTE cannot construct samples with enumerated-type features or missing values; the supplemented real samples also help in learning a credit investigation wind control model with good interpretability.
The present application further provides a computer device, wherein the computer device includes: a memory for storing one or more programs; one or more processors coupled with the memory, the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method for supplementing positive samples in credit wind control modeling described herein.
The present application also provides a computer-readable storage medium having stored thereon a computer program executable by a processor for performing the method for supplementing positive samples in credit wind control modeling described herein.
The present application also provides a computer program product which, when executed by an apparatus, causes the apparatus to perform the method for supplementing positive samples in credit wind control modeling described herein.
FIG. 8 illustrates an exemplary system that can be used to implement the various embodiments described in this application.
In some embodiments, system 1000 can be implemented as any of the processing devices in the embodiments of the present application. In some embodiments, system 1000 may include one or more computer-readable media (e.g., system memory or NVM/storage 1020) having instructions and one or more processors (e.g., processor(s) 1005) coupled with the one or more computer-readable media and configured to execute the instructions to implement modules to perform the actions described herein.
For one embodiment, system control module 1010 may include any suitable interface controllers to provide any suitable interface to at least one of the processor(s) 1005 and/or to any suitable device or component in communication with system control module 1010.
The system control module 1010 may include a memory controller module 1030 to provide an interface to the system memory 1015. Memory controller module 1030 may be a hardware module, a software module, and/or a firmware module.
System memory 1015 may be used to load and store data and/or instructions, for example, for system 1000. For one embodiment, system memory 1015 may include any suitable volatile memory, such as suitable DRAM. In some embodiments, the system memory 1015 may include a double data rate type four synchronous dynamic random access memory (DDR4 SDRAM).
For one embodiment, system control module 1010 may include one or more input/output (I/O) controllers to provide an interface to NVM/storage 1020 and communication interface(s) 1025.
For example, NVM/storage 1020 may be used to store data and/or instructions. NVM/storage 1020 may include any suitable non-volatile memory (e.g., flash memory) and/or may include any suitable non-volatile storage device(s) (e.g., one or more hard disk drives (HDDs), one or more Compact Disc (CD) drives, and/or one or more Digital Versatile Disc (DVD) drives).
NVM/storage 1020 may include storage resources that are physically part of the device on which system 1000 is installed, or that are accessible by the device without necessarily being part of it. For example, NVM/storage 1020 may be accessed over a network via communication interface(s) 1025.
Communication interface(s) 1025 may provide an interface for system 1000 to communicate over one or more networks and/or with any other suitable device. System 1000 may communicate wirelessly with one or more components of a wireless network according to any of one or more wireless network standards and/or protocols.
For one embodiment, at least one of the processor(s) 1005 may be packaged together with logic for one or more controller(s) of the system control module 1010, e.g., memory controller module 1030. For one embodiment, at least one of the processor(s) 1005 may be packaged together with logic for one or more controller(s) of the system control module 1010 to form a System In Package (SiP). For one embodiment, at least one of the processor(s) 1005 may be integrated on the same die with logic for one or more controller(s) of the system control module 1010. For one embodiment, at least one of the processor(s) 1005 may be integrated on the same die with logic of one or more controllers of the system control module 1010 to form a system on a chip (SoC).
In various embodiments, system 1000 may be, but is not limited to being: a server, a workstation, a desktop computing device, or a mobile computing device (e.g., a laptop computing device, a handheld computing device, a tablet, a netbook, etc.). In various embodiments, system 1000 may have more or fewer components and/or different architectures. For example, in some embodiments, system 1000 includes one or more cameras, a keyboard, a Liquid Crystal Display (LCD) screen (including a touch screen display), a non-volatile memory port, multiple antennas, a graphics chip, an Application Specific Integrated Circuit (ASIC), and speakers.
It will be evident to those skilled in the art that the present application is not limited to the details of the foregoing illustrative embodiments, and that the present application may be embodied in other specific forms without departing from the spirit or essential attributes thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference sign in a claim should not be construed as limiting the claim concerned. Furthermore, the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the system claims may also be implemented by one unit or means in software or hardware. Terms such as "first" and "second" are used to denote names and do not indicate any particular order.

Claims (14)

1. A method for supplementing positive samples in credit investigation risk control modeling, wherein the method comprises:
selecting, from all credit investigation reports held by an institution, a plurality of credit investigation reports corresponding to users who have not used credit at the institution;
for each credit investigation report in the plurality of credit investigation reports, parsing the loan records in the credit investigation report and determining whether an overdue loan record exists; if so, recording detail information corresponding to the overdue loan record; otherwise, adding a non-overdue mark to the credit investigation report;
and screening out positive samples for credit investigation risk control modeling from the parsed plurality of credit investigation reports.
2. The method of claim 1, wherein the users who have not used credit include users whose credit applications were rejected by the institution and/or users who were granted credit by the institution but did not actually use it.
3. The method of claim 1 or 2, wherein the screening out positive samples for credit investigation risk control modeling from the parsed plurality of credit investigation reports comprises:
aggregating the parsed plurality of credit investigation reports, along the user dimension, into a plurality of credit investigation report sets;
and screening out positive samples for credit investigation risk control modeling from the plurality of credit investigation report sets.
4. The method of claim 3, wherein the aggregating the parsed plurality of credit investigation reports, along the user dimension, into a plurality of credit investigation report sets comprises:
aggregating the plurality of credit investigation reports along the user dimension to obtain a plurality of credit investigation report sets, and sorting the credit investigation reports in each credit investigation report set by their corresponding credit investigation time.
5. The method of claim 3, wherein the screening out positive samples for credit investigation risk control modeling from the plurality of credit investigation report sets comprises:
for each credit investigation report set, determining whether the set includes an overdue credit investigation report; if not, skipping the set; if so, determining whether a non-overdue credit investigation report exists before the overdue credit investigation report; if not, skipping the set; and if so, screening out the non-overdue credit investigation report as a positive sample for credit investigation risk control modeling.
6. The method of claim 3, wherein the screening out positive samples for credit investigation risk control modeling from the plurality of credit investigation report sets comprises:
for each credit investigation report set, determining whether the set includes an overdue credit investigation report; if not, skipping the set; if so, determining whether a non-overdue credit investigation report exists before the overdue credit investigation report; if not, skipping the set; if so, determining, according to the detail information corresponding to the overdue loan record in the overdue credit investigation report, whether a preset business screening condition is met; if so, screening out the non-overdue credit investigation report as a positive sample for credit investigation risk control modeling; otherwise, skipping the set.
7. An apparatus for supplementing positive samples in credit investigation risk control modeling, wherein the apparatus comprises:
a module for selecting, from all credit investigation reports held by an institution, a plurality of credit investigation reports corresponding to users who have not used credit at the institution;
a module for, for each credit investigation report in the plurality of credit investigation reports, parsing the loan records in the credit investigation report and determining whether an overdue loan record exists; if so, recording detail information corresponding to the overdue loan record; otherwise, adding a non-overdue mark to the credit investigation report;
and a module for screening out positive samples for credit investigation risk control modeling from the parsed plurality of credit investigation reports.
8. The apparatus of claim 7, wherein the users who have not used credit include users whose credit applications were rejected by the institution and users who were granted credit by the institution but did not actually use it.
9. The apparatus of claim 7 or 8, wherein the module for screening out positive samples for credit investigation risk control modeling from the parsed plurality of credit investigation reports comprises:
a module for aggregating the parsed plurality of credit investigation reports, along the user dimension, into a plurality of credit investigation report sets;
and a module for screening out positive samples for credit investigation risk control modeling from the plurality of credit investigation report sets.
10. The apparatus of claim 9, wherein the module for aggregating the parsed plurality of credit investigation reports, along the user dimension, into a plurality of credit investigation report sets is configured to:
aggregate the plurality of credit investigation reports along the user dimension to obtain a plurality of credit investigation report sets, and sort the credit investigation reports in each credit investigation report set by their corresponding credit investigation time.
11. The apparatus of claim 9, wherein the module for screening out positive samples for credit investigation risk control modeling from the plurality of credit investigation report sets is configured to:
for each credit investigation report set, determine whether the set includes an overdue credit investigation report; if not, skip the set; if so, determine whether a non-overdue credit investigation report exists before the overdue credit investigation report; if not, skip the set; and if so, screen out the non-overdue credit investigation report as a positive sample for credit investigation risk control modeling.
12. The apparatus of claim 11, wherein the module for screening out positive samples for credit investigation risk control modeling from the plurality of credit investigation report sets is configured to:
for each credit investigation report set, determine whether the set includes an overdue credit investigation report; if not, skip the set; if so, determine whether a non-overdue credit investigation report exists before the overdue credit investigation report; if not, skip the set; if so, determine, according to the detail information corresponding to the overdue loan record in the overdue credit investigation report, whether a preset screening condition is met; if so, screen out the non-overdue credit investigation report as a positive sample for credit investigation risk control modeling; otherwise, skip the set.
13. A computer device, wherein the computer device comprises:
a memory for storing one or more programs;
one or more processors coupled to the memory,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method of any one of claims 1 to 6.
14. A computer-readable storage medium on which a computer program is stored, wherein the computer program is executable by a processor to perform the method of any one of claims 1 to 6.
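
The flow recited in method claims 1 to 6 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. The class names, field names (`overdue_days`, `report_date`), and function names are hypothetical and chosen only for illustration. The sketch also makes two assumptions that the claims leave open: it reads claims 5 and 6 as selecting a non-overdue report that precedes an overdue one, and when several such reports exist it keeps only the latest one before the first overdue report. The optional `condition` callable stands in for the preset business screening condition of claim 6.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from datetime import date
from typing import Callable, Dict, Iterable, List, Optional, Set


@dataclass
class LoanRecord:
    overdue: bool
    overdue_days: int = 0  # hypothetical detail field, not named in the claims


@dataclass
class CreditReport:
    user_id: str
    report_date: date  # the "credit investigation time"
    loans: List[LoanRecord]
    overdue_details: List[LoanRecord] = field(default_factory=list)
    non_overdue: bool = False  # the "non-overdue mark", set by parse_report


def select_reports(all_reports: Iterable[CreditReport],
                   credit_using_users: Set[str]) -> List[CreditReport]:
    """Keep reports of users who have not used credit at the institution."""
    return [r for r in all_reports if r.user_id not in credit_using_users]


def parse_report(report: CreditReport) -> CreditReport:
    """Record details of overdue loans, or add the non-overdue mark."""
    report.overdue_details = [loan for loan in report.loans if loan.overdue]
    report.non_overdue = not report.overdue_details
    return report


def aggregate_by_user(reports: Iterable[CreditReport]) -> Dict[str, List[CreditReport]]:
    """Group parsed reports per user and sort each group by report time."""
    sets: Dict[str, List[CreditReport]] = defaultdict(list)
    for report in reports:
        sets[report.user_id].append(report)
    for user_reports in sets.values():
        user_reports.sort(key=lambda r: r.report_date)
    return dict(sets)


Condition = Callable[[List[LoanRecord]], bool]


def screen_positive_samples(report_sets: Dict[str, List[CreditReport]],
                            condition: Optional[Condition] = None) -> List[CreditReport]:
    """Pick, per user, a non-overdue report that precedes an overdue report."""
    samples = []
    for reports in report_sets.values():
        # Index of the first overdue report in time order, if any.
        first_overdue = next(
            (i for i, r in enumerate(reports) if not r.non_overdue), None)
        if first_overdue is None or first_overdue == 0:
            continue  # no overdue report, or no earlier non-overdue report
        overdue_report = reports[first_overdue]
        if condition is not None and not condition(overdue_report.overdue_details):
            continue  # screening condition not met
        # Assumption: keep the latest non-overdue report before the overdue one.
        samples.append(reports[first_overdue - 1])
    return samples
```

Under these assumptions, a pipeline would call `select_reports`, then `parse_report` on each selected report, then `aggregate_by_user`, and finally `screen_positive_samples`, optionally passing a condition such as a minimum number of overdue days.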
CN202210099499.0A 2022-01-27 2022-01-27 Method and device for supplementing positive samples in credit investigation wind control modeling Pending CN114463113A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210099499.0A CN114463113A (en) 2022-01-27 2022-01-27 Method and device for supplementing positive samples in credit investigation wind control modeling

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210099499.0A CN114463113A (en) 2022-01-27 2022-01-27 Method and device for supplementing positive samples in credit investigation wind control modeling

Publications (1)

Publication Number Publication Date
CN114463113A true CN114463113A (en) 2022-05-10

Family

ID=81411205

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210099499.0A Pending CN114463113A (en) 2022-01-27 2022-01-27 Method and device for supplementing positive samples in credit investigation wind control modeling

Country Status (1)

Country Link
CN (1) CN114463113A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115511910A (en) * 2022-08-22 2022-12-23 电子科技大学长三角研究院(湖州) Anti-attack method, system, medium, equipment and terminal for video tracking
CN115511910B (en) * 2022-08-22 2024-01-12 电子科技大学长三角研究院(湖州) Video tracking-oriented attack countermeasure method, system, medium, equipment and terminal

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination