CN113743735A - Risk score generation method and device - Google Patents

Risk score generation method and device

Info

Publication number
CN113743735A
Authority
CN
China
Prior art keywords
user
preset
evaluated
classification model
generating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202110915774.7A
Other languages
Chinese (zh)
Inventor
段祖宁
华波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Xingyun Digital Technology Co Ltd
Original Assignee
Nanjing Xingyun Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Xingyun Digital Technology Co Ltd
Priority to CN202110915774.7A
Publication of CN113743735A
Priority to CA3169372A (CA3169372A1)
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0635 Risk analysis of enterprise or organisation activities
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Human Resources & Organizations (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Software Systems (AREA)
  • Economics (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computing Systems (AREA)
  • Strategic Management (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Marketing (AREA)
  • Game Theory and Decision Science (AREA)
  • Educational Administration (AREA)
  • Development Economics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)

Abstract

The application discloses a method and a device for generating a risk score. The method comprises: obtaining historical behavior data sets of a user to be evaluated for at least two consecutive preset time windows, wherein each historical behavior data set comprises historical behavior data of the user in each of at least two preset behavior dimensions; generating a time sequence matrix for the user from the historical behavior data of each preset behavior dimension in each preset time window; and using a trained preset classification model to predict a risk score for the user from the time sequence matrix. This reduces the dependence of the risk evaluation process on the experience of modelers and improves both the timeliness of model construction and the accuracy of the model's evaluation results.

Description

Risk score generation method and device
Technical Field
The invention relates to the field of data processing, in particular to a method and a device for generating a risk score.
Background
To reduce risk, an enterprise generally needs to assess a user's risk before entering into a financial transaction with that user, so as to avoid economic losses caused by the user lacking the corresponding financial capacity.
In the prior art, enterprises mainly use a traditional statistical model, the scorecard model, to develop risk assessment processes. In this model, derived variables are designed and engineered from the user's data during an observation period, and a binary logistic regression model then fits the relationship between the independent variables and the target variable to obtain the user's risk probability.
Throughout the modeling process, variable derivation is the crucial step and largely determines the discriminative power of the final model. The quality of the derived variables affects the evaluation metrics of the scorecard model and determines its risk assessment capability. Variable derivation therefore requires modelers who both understand the risk business and know how to design variable combinations, which places high demands on their overall competence.
Therefore, a method for generating a risk score is needed that does not depend on variable derivation and can evaluate a user's risk from the user's behavior, so as to solve the above technical problems in the prior art.
Disclosure of Invention
In order to solve the deficiencies of the prior art, the main objective of the present invention is to provide a method and an apparatus for generating a risk score, so as to solve the above technical problems of the prior art.
In order to achieve the above object, the present invention provides a method for generating a risk score in a first aspect, the method comprising:
acquiring historical behavior data sets corresponding to at least two continuous preset time windows of a user to be evaluated, wherein the historical behavior data sets comprise historical behavior data corresponding to the user to be evaluated in at least two preset behavior dimensions respectively;
generating a time sequence matrix corresponding to the user to be evaluated according to the historical behavior data of the user to be evaluated corresponding to the preset behavior dimension in each preset time window;
and predicting and generating a risk score corresponding to the user to be evaluated according to the time sequence matrix by using the trained preset classification model.
In some embodiments, the generating a time sequence matrix corresponding to the user to be evaluated according to the historical behavior data of the user to be evaluated corresponding to the preset behavior dimension in each preset time window includes:
generating a vector of the user to be evaluated corresponding to the preset behavior dimension according to the historical behavior data of the user to be evaluated corresponding to the preset behavior dimension, wherein the vector comprises the historical behavior data of the user to be evaluated corresponding to each preset time window;
and generating a time sequence matrix corresponding to the user to be evaluated according to the vector of the user to be evaluated corresponding to the preset behavior dimension.
In some embodiments, the historical behavior data includes execution counts, and the method further includes, before obtaining the historical behavior data sets corresponding to at least two consecutive preset time windows of the user to be evaluated:
acquiring historical behavior information corresponding to the user to be evaluated in a preset historical time period;
determining, according to the historical behavior information, the execution count of the historical behavior of the user to be evaluated corresponding to each preset behavior dimension in each preset time window;
and determining the historical behavior data sets corresponding to at least two consecutive preset time windows of the user to be evaluated according to the determined execution counts of the historical behavior of the user to be evaluated corresponding to each preset behavior dimension in each preset time window.
In some embodiments, before predicting and generating the risk score corresponding to the user to be evaluated according to the time sequence matrix by using the trained preset classification model, the method further includes training the preset classification model, where a training process of the preset classification model includes:
training the preset classification model by using training data contained in a training sample set;
and verifying whether the preset classification model meets preset conditions or not by using a test sample set, and determining that the preset classification model is a trained preset classification model when the preset classification model is judged to meet the preset conditions.
In some embodiments, the preset classification model includes a long-term memory, and the training of the preset classification model using training data included in a training sample set includes:
the preset classification model determines and discards the part of the long-term memory that is to be discarded, according to a current node corresponding to current training data and the long-term memory corresponding to the classification model;
the preset classification model superposes the long-term memory remaining after the discarding and the current node according to the training data to generate a superposition result;
and the preset classification model updates its long-term memory according to the superposition result.
In some embodiments, the at least two preset behavior dimensions include the number of borrowings of the user to be assessed in the corresponding preset time window, the number of loan applications of the user to be assessed in the corresponding preset time window, and the number of registrations of the user to be assessed in the corresponding preset time window at a loan institution.
In some embodiments, the preset classification model comprises a long short-term memory (LSTM) artificial neural network model.
In a second aspect, the present application provides an apparatus for generating a risk score, the apparatus comprising:
an acquisition module, configured to acquire historical behavior data sets corresponding to a user to be evaluated in at least two consecutive preset time windows, wherein the historical behavior data sets comprise historical behavior data corresponding to the user to be evaluated in at least two preset behavior dimensions respectively;
a generation module, configured to generate a time sequence matrix corresponding to the user to be evaluated according to the historical behavior data of the user to be evaluated corresponding to the preset behavior dimension in each preset time window;
and a scoring module, configured to predict and generate the risk score corresponding to the user to be evaluated according to the time sequence matrix by using the trained preset classification model.
In a third aspect, the present application provides a computer readable storage medium having a computer program stored thereon, wherein the program when executed by a processor implements a method as described in any of the above.
In a fourth aspect, the present application provides an electronic device comprising:
one or more processors;
and memory associated with the one or more processors for storing program instructions that, when read and executed by the one or more processors, perform operations comprising:
acquiring historical behavior data sets corresponding to at least two continuous preset time windows of a user to be evaluated, wherein the historical behavior data sets comprise historical behavior data corresponding to the user to be evaluated in at least two preset behavior dimensions respectively;
generating a time sequence matrix corresponding to the user to be evaluated according to the historical behavior data of the user to be evaluated corresponding to the preset behavior dimension in each preset time window;
and predicting and generating a risk score corresponding to the user to be evaluated according to the time sequence matrix by using the trained preset classification model.
The invention has the following beneficial effects:
The application provides a risk score generation method comprising: obtaining historical behavior data sets corresponding to at least two consecutive preset time windows of a user to be evaluated, wherein the historical behavior data sets comprise historical behavior data corresponding to the user to be evaluated in at least two preset behavior dimensions respectively; generating a time sequence matrix corresponding to the user to be evaluated according to the historical behavior data of the user to be evaluated corresponding to the preset behavior dimension in each preset time window; and predicting and generating the risk score corresponding to the user to be evaluated according to the time sequence matrix by using the trained preset classification model. The risk score is thus generated from the user's time sequence behavior without relying on variable derivation by a modeler, which reduces the dependence of the risk evaluation process on the modeler's experience and improves both the timeliness of model construction and the accuracy of the model's evaluation results.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a classification model provided by an embodiment of the present application;
FIG. 2 is a structure diagram of a time sequence feature provided in an embodiment of the present application;
FIG. 3 is a schematic diagram of a classification model provided herein;
FIG. 4 is a schematic diagram of an exemplary matrix structure provided by an embodiment of the present application;
FIG. 5 is a flow chart of a method provided by an embodiment of the present application;
FIG. 6 is a block diagram of an apparatus according to an embodiment of the present disclosure;
FIG. 7 is a block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As described in the background, the prior art typically uses conventional statistical models that rely on variable derivation for risk assessment before financial transactions with users, so model performance depends on the quality of the variable derivation performed by modelers.
In order to solve the above technical problems, as shown in FIG. 1, the present application provides a risk score generation method in which a trained classification model predicts the corresponding risk score directly from a user's time sequence behavior, thereby avoiding the problem that model performance depends entirely on the professional level of the modeler.
Example one
Specifically, the process of performing user risk assessment by using the risk scoring method disclosed in the embodiment of the present application includes:
S100, collecting historical behavior information corresponding to a user to be evaluated in a historical time period;
Historical behavior information of the user within a preset historical time period can be collected from the credit reports, browsing records, and the like of the user to be evaluated. The historical behavior information includes records of the user performing, within the historical time period, behaviors that indicate the user's financial risk.
Behaviors that may indicate the user's financial risk include, but are not limited to: borrowing from other individuals or enterprises, applying for a loan from a loan institution such as a bank, registering an account with a loan institution, and receiving payment collection notices.
The preset historical time period may be determined according to business needs and may be, for example, the year preceding the risk assessment.
S200, determining, according to the historical behavior information corresponding to the user to be evaluated, the execution count of the historical behavior of the user to be evaluated corresponding to each preset behavior dimension in each preset time window; and determining a historical behavior data set corresponding to the user to be evaluated according to the execution counts;
The preset historical time period can be divided into at least two consecutive, non-overlapping preset time windows according to a preset division rule. For the preset time windows obtained by this division, the number of times the user to be evaluated performed the historical behavior of each preset behavior dimension within each window can be counted. From these execution counts, historical behavior data for each preset behavior dimension can be generated, and the collection of this historical behavior data forms the historical behavior data set. The historical behavior data may further include other behavior data, such as the interval between occurrences of the corresponding historical behavior within the corresponding preset time range (for example, the interval between borrowing behaviors), which is not limited in the present application.
A historical behavior data set corresponding to the user to be evaluated is thus generated from the number of times the user performed the historical behavior of each preset behavior dimension in each preset time window.
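As a minimal sketch of the counting in S200, the following Python assumes that each historical behavior record is a (dimension, date) pair and that every preset time window is 30 days long. The function name `count_events_per_window`, the record format, and the window length are illustrative assumptions, not details given in the application.

```python
from collections import Counter
from datetime import date


def count_events_per_window(events, assessment_date, num_windows, window_days=30):
    """Count events per (behavior dimension, time window).

    events: iterable of (dimension_name, event_date) pairs (assumed format).
    Window 0 covers the most recent `window_days` days before the assessment
    date, window 1 the `window_days` days before that, and so on.
    """
    counts = Counter()
    for dimension, event_date in events:
        age_days = (assessment_date - event_date).days
        if age_days <= 0:
            continue  # ignore events on or after the assessment date
        window = (age_days - 1) // window_days
        if window < num_windows:  # drop events older than the historical period
            counts[(dimension, window)] += 1
    return counts


events = [
    ("application", date(2021, 7, 20)),
    ("application", date(2021, 7, 25)),
    ("borrowing", date(2021, 6, 15)),
    ("registration", date(2020, 1, 1)),  # older than 12 windows, so dropped
]
counts = count_events_per_window(events, date(2021, 8, 1), num_windows=12)
```

Windows here are consecutive and non-overlapping by construction, because each event age maps to exactly one window index.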
S300, generating a time sequence matrix corresponding to the user to be evaluated according to the historical behavior data set;
Specifically, the generating process includes:
S310, generating a vector of the user to be evaluated corresponding to each preset behavior dimension according to the historical behavior data of the user to be evaluated corresponding to that preset behavior dimension;
For example, when the preset behavior dimension is the application behavior and the historical time period is the past 12 months, the corresponding vector is shown in FIG. 1 and includes application-1m, application-2m, application-3m, application-4m, ..., application-12m. Each node in the vector stores the historical behavior data of the user to be evaluated in the corresponding preset time window. For example, application-1m stores the application behavior data of the user to be evaluated in the most recent past month, application-2m stores the application behavior data of the user to be evaluated in the second most recent past month, and so on.
The application behavior data may include historical behavior data related to the application behavior dimension, such as the number of loan applications, the number of credit card applications, and the number of loan extension applications made by the user to be evaluated in the preset time window.
In some embodiments, the preset behavior dimensions may further include a borrowing behavior dimension, a registration behavior dimension, an information receiving dimension, and the like.
The historical behavior data corresponding to the borrowing behavior dimension may include historical behavior data related to borrowing, such as the user to be evaluated borrowing from enterprises or individuals, or spending against a credit line through a credit card.
The historical behavior data corresponding to the registration behavior dimension may include historical behavior data on registrations related to financial transactions, such as the user to be evaluated registering an account with a loan institution.
The historical behavior data corresponding to the information receiving dimension may include historical behavior data related to received information, such as received payment collection notices and received verification text messages from loan institutions.
S320, generating a time sequence matrix corresponding to the user to be evaluated according to all vectors corresponding to the user to be evaluated;
The vectors corresponding to the preset behavior dimensions can be aligned by preset time window and stacked to form the corresponding time sequence feature. FIG. 2 shows an exemplary structure of the time sequence feature, which is an M × N matrix, where M is the number of preset behavior dimensions included and N is the length of each vector, i.e., the number of time windows in each vector.
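The stacking in S310 and S320 can be sketched as follows, assuming the per-window counts are held in a dictionary keyed by (dimension, window index). The function name `build_time_series_matrix` and the dictionary layout are hypothetical choices for illustration only.

```python
def build_time_series_matrix(counts, dimensions, num_windows):
    """Return an M x N list of lists: one row per dimension, one column per window.

    counts: mapping from (dimension, window_index) to an execution count;
    missing entries are treated as zero.
    """
    return [
        [counts.get((dimension, window), 0) for window in range(num_windows)]
        for dimension in dimensions
    ]


# Illustrative counts for M = 3 dimensions and N = 4 time windows.
counts = {("application", 0): 2, ("application", 2): 1, ("borrowing", 1): 1}
dimensions = ["application", "borrowing", "registration"]
matrix = build_time_series_matrix(counts, dimensions, num_windows=4)
for row in matrix:
    print(row)
```

Each row is the vector of one behavior dimension, and the rows share the same window order, which is what aligns them by time window.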
S400, predicting the corresponding risk score according to the time sequence matrix corresponding to the user to be evaluated by using the trained classification model;
The classification model may be any machine learning or deep learning model capable of processing time series data to generate a corresponding classification prediction. Preferably, the classification model may be a long short-term memory (LSTM) recurrent neural network model, which captures the influence of the user's time sequence behavior on risk.
As shown in FIG. 3, the time sequence matrix may be input into the trained classification model, so that the classification model outputs a predicted risk score for the user to be evaluated.
When the predicted risk score exceeds a preset threshold, corresponding early warning information is generated, so that whether to carry out a financial transaction with the user to be evaluated can be decided according to the early warning information.
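The threshold check at the end of S400 might look like the sketch below. The function name `early_warning`, the message format, and the example threshold value are assumptions; the application only states that a warning is generated when the score exceeds a preset threshold.

```python
def early_warning(user_id, risk_score, threshold):
    """Return a warning message if the score exceeds the threshold, else None."""
    if risk_score > threshold:
        return (
            f"Risk warning for user {user_id}: "
            f"score {risk_score:.2f} exceeds threshold {threshold:.2f}"
        )
    return None


# Hypothetical scores and threshold, for illustration only.
for user_id, score in [("u1", 0.91), ("u2", 0.35)]:
    message = early_warning(user_id, score, threshold=0.5)
    if message is not None:
        print(message)
```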
Before the classification model can be used to predict risk scores, it must be trained. The training process of the classification model includes:
S500, acquiring a training sample set and a test sample set;
A corresponding time sequence matrix is generated from the historical behavior information of each sample user. The steps for generating the time sequence matrix from a sample user's historical behavior information are the same as those for the user to be evaluated and are not repeated here.
A training sample is generated from the time sequence feature corresponding to the sample user and a manually labeled risk score. The resulting samples constitute the training sample set and the test sample set of the preset classification model. Specifically, 70% of the samples may be assigned to the training sample set and 30% to the test sample set.
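A 70/30 split of this kind can be sketched as below. Shuffling before splitting and the fixed random seed are assumptions added for illustration; the application does not say how samples are assigned to the two sets.

```python
import random


def split_samples(samples, train_fraction=0.7, seed=0):
    """Shuffle labeled samples and split them into training and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Ten toy samples: (time sequence matrix, manually labeled risk label).
samples = [([[i, i + 1], [i + 2, i + 3]], i % 2) for i in range(10)]
train_set, test_set = split_samples(samples)
```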
S510, training the classification model by using training samples contained in the training sample set;
In some embodiments, when the classification model is a long short-term memory (LSTM) recurrent neural network model, the training of the classification model using the training samples included in the training sample set includes:
S511, inputting the training samples into the classification model according to a preset sequence;
S512, for the currently input training sample, traversing each node in the training sample and inputting the nodes into the classification model in sequence; specifically, the classification model comprises a forget gate, a memory gate, a learning gate, and an output gate.
S513, after a node is input into the classification model, the forget gate first determines, according to the node and the historical long-term memory generated from the node's preceding nodes, whether all or part of the historical long-term memory needs to be forgotten, and, after forgetting the content that needs to be forgotten, generates the corresponding post-forgetting long-term memory;
The historical long-term memory is the long-term memory that the classification model has obtained from the nodes preceding the currently input node, and it stores the long-term state data of the classification model.
S514, the input gate of the classification model superposes the post-forgetting long-term memory corresponding to the node with the short-term memory that the learning gate of the classification model outputs for the currently input node, to generate a superposition result;
S515, the output gate of the classification model generates the long-term memory corresponding to the node according to a Sigmoid function and the superposition result.
After the processing of the node is completed, the node's subsequent nodes are input into the classification model, so that the classification model continues to adjust its long-term memory according to those nodes.
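For orientation, the sketch below implements one step of the standard textbook LSTM cell with a single hidden unit. It is not the application's own formulation: the gate names in the text differ from the standard ones, and the weights, the parameter layout in `params`, and the function name `lstm_step` are all hypothetical. It shows the same pattern as S513 to S515: part of the long-term memory is forgotten, new content from the current node is superposed, and the memories are updated.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One step of a standard single-unit LSTM cell.

    x: list of input features for one time window (one matrix column).
    h_prev: previous short-term memory (hidden state), a float.
    c_prev: previous long-term memory (cell state), a float.
    params: dict mapping gate name -> (input weights, hidden weight, bias).
    """
    def pre_activation(gate):
        w_x, w_h, b = params[gate]
        return sum(w * v for w, v in zip(w_x, x)) + w_h * h_prev + b

    forget = sigmoid(pre_activation("forget"))          # share of long-term memory kept
    write = sigmoid(pre_activation("input"))            # share of new content stored
    candidate = math.tanh(pre_activation("candidate"))  # new content from the current node
    c_new = forget * c_prev + write * candidate         # superpose kept memory and new content
    output = sigmoid(pre_activation("output"))
    h_new = output * math.tanh(c_new)                   # updated short-term memory
    return h_new, c_new


# Hypothetical weights for 3 behavior dimensions; real values come from training.
params = {
    gate: ([0.1, -0.2, 0.05], 0.3, 0.0)
    for gate in ("forget", "input", "candidate", "output")
}
columns = [[2, 0, 0], [0, 1, 0], [1, 0, 0]]  # one list per time window
h, c = 0.0, 0.0
for x in columns:
    h, c = lstm_step(x, h, c, params)
```

In a trained model the final hidden state would typically be passed through a further layer to produce the risk score; that layer is omitted here.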
S520, after the classification model has been trained on all training samples in the training sample set, verifying the classification model by using the samples contained in the test sample set;
S530, when the test result of the classification model on the test sample set meets the preset condition, determining the classification model to be the trained classification model.
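The verification in S520 and S530 can be sketched as follows, under the assumption that the preset condition is a minimum accuracy on the test sample set. The application does not specify the condition; the score threshold, the minimum accuracy, and the function names are illustrative.

```python
def accuracy(predicted_labels, true_labels):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(p == t for p, t in zip(predicted_labels, true_labels))
    return correct / len(true_labels)


def meets_preset_condition(scores, true_labels, score_threshold=0.5, min_accuracy=0.8):
    """Binarize risk scores and compare test accuracy with a minimum value."""
    predicted = [1 if s > score_threshold else 0 for s in scores]
    return accuracy(predicted, true_labels) >= min_accuracy


# Hypothetical model outputs and labels for five test samples.
scores = [0.9, 0.2, 0.7, 0.4, 0.1]
labels = [1, 0, 1, 1, 0]
print(meets_preset_condition(scores, labels))
```

If the condition is not met, training would continue or be repeated with different settings before the model is treated as trained.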
In this application, three different models, LR (logistic regression), XGBoost, and LSTM, were used to model users' time sequence behavior. The structure of the time sequence matrix is shown in FIG. 4: each time sequence matrix contains features for 24 time windows (Window) and 6 preset behavior dimensions (EventType). The performance results obtained by testing the resulting models are shown in Table 1.
TABLE 1
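[Table 1 is provided as an image (BDA0003205542480000091) in the original publication: performance of the LR, XGBoost, and LSTM models.]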
In terms of accuracy, the model obtained by LSTM modeling outperforms both the XGBoost model and the LR model.
By continuously enriching the collected preset behavior dimensions, the performance of the LSTM model can be improved further. The main reason is that a user's behavior events are ordered in time, with dependencies between earlier and later events, and the LSTM model captures such temporal dependencies well.
As model performance improves, the model can better discriminate the credit risk of the user to be evaluated, which reduces the risk of financial transactions with users, achieves a better level of risk control, and yields higher profit.
Example two
Corresponding to the above embodiments, as shown in fig. 5, the present application provides a method for generating a risk score, where the method includes:
510. acquiring historical behavior data sets corresponding to at least two continuous preset time windows of a user to be evaluated, wherein the historical behavior data sets comprise historical behavior data corresponding to the user to be evaluated in at least two preset behavior dimensions respectively;
preferably, the historical behavior data includes execution times, and before the historical behavior data sets corresponding to at least two consecutive preset time windows of the user to be evaluated are acquired, the method further includes:
511. acquiring historical behavior information corresponding to the user to be evaluated in the preset historical time period;
512. determining the execution times of the historical behaviors of the user to be evaluated corresponding to each preset behavior dimension in the preset time window according to the historical behavior information;
513. and determining historical behavior data sets corresponding to at least two continuous preset time windows of the user to be evaluated according to the determined execution times of the user to be evaluated executing the historical behaviors corresponding to each preset behavior in each preset time window.
520. Generating a time sequence matrix corresponding to the user to be evaluated according to the historical behavior data of the user to be evaluated corresponding to the preset behavior dimensionality in each preset time window;
preferably, the generating a time sequence matrix corresponding to the user to be evaluated according to the historical behavior data of the user to be evaluated corresponding to the preset behavior dimension in each preset time window includes:
521. generating a vector of the user to be evaluated corresponding to the preset behavior dimension according to the historical behavior data of the user to be evaluated corresponding to the preset behavior dimension, wherein the vector comprises the historical behavior data of the user to be evaluated corresponding to each preset time window;
522. and generating a time sequence matrix corresponding to the user to be evaluated according to the vector of the user to be evaluated corresponding to the preset behavior dimension.
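As a rough illustration of steps 521 and 522, the sketch below builds one vector per behavior dimension and stacks the vectors into a matrix. The orientation (one row per time window, one column per dimension), the dimension names and the function names are assumptions; the application only states that the matrix is generated from the per-dimension vectors.

```python
def dimension_vector(data_sets, dimension):
    """Vector for one dimension: its value in each consecutive time window."""
    return [window[dimension] for window in data_sets]


def time_sequence_matrix(data_sets, dimensions):
    """Stack the per-dimension vectors so that each row is one time window."""
    vectors = [dimension_vector(data_sets, d) for d in dimensions]
    # zip(*vectors) transposes "one list per dimension" into "one row per window".
    return [list(row) for row in zip(*vectors)]


# Hypothetical per-window counts for three dimensions over three windows.
dimensions = ("borrowing", "loan_application", "institution_registration")
data_sets = [
    {"borrowing": 1, "loan_application": 1, "institution_registration": 0},
    {"borrowing": 0, "loan_application": 2, "institution_registration": 0},
    {"borrowing": 2, "loan_application": 3, "institution_registration": 1},
]
matrix = time_sequence_matrix(data_sets, dimensions)
```

Keeping the windows as rows preserves their chronological order, which is what a sequence model consumes step by step.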
530. And predicting and generating a risk score corresponding to the user to be evaluated according to the time sequence matrix by using the trained preset classification model.
Preferably, before predicting and generating the risk score corresponding to the user to be evaluated according to the time sequence matrix by using the trained preset classification model, the method further includes training the preset classification model, where a training process of the preset classification model includes:
531. training the preset classification model by using training data contained in a training sample set;
532. and verifying whether the preset classification model meets preset conditions or not by using a test sample set, and determining that the preset classification model is a trained preset classification model when the preset classification model is judged to meet the preset conditions.
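The train-then-verify loop of steps 531 and 532 might look like the following sketch. A one-feature logistic model is swapped in purely to keep the example short; it is not the classification model of the application. The accuracy threshold used as the "preset condition", the learning rate and all function names are likewise assumptions.

```python
import math


def predict_probability(weights, x):
    w, b = weights
    return 1.0 / (1.0 + math.exp(-(w * x + b)))


def train_one_epoch(weights, samples, learning_rate=0.5):
    """One pass of stochastic gradient descent over the training sample set."""
    w, b = weights
    for x, label in samples:
        error = predict_probability((w, b), x) - label
        w -= learning_rate * error * x
        b -= learning_rate * error
    return (w, b)


def accuracy(weights, samples):
    hits = sum((predict_probability(weights, x) >= 0.5) == bool(label)
               for x, label in samples)
    return hits / len(samples)


def train_until_condition(train_set, test_set, min_accuracy=0.9, max_epochs=100):
    """Train, then check the assumed preset condition on the test sample set."""
    weights = (0.0, 0.0)
    for _ in range(max_epochs):
        weights = train_one_epoch(weights, train_set)
        if accuracy(weights, test_set) >= min_accuracy:
            return weights, True   # the model counts as trained
    return weights, False          # the condition was never met


train_set = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]
test_set = [(-1.5, 0), (1.5, 1)]
weights, is_trained = train_until_condition(train_set, test_set)
```

Returning a flag rather than raising lets the caller decide what to do when the preset condition is not met, for example collecting more training data.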
Preferably, the preset classification model includes a long-term memory, and training the preset classification model by using the training data contained in the training sample set includes:
533. the preset classification model determines, according to the current node corresponding to the current training data and the long-term memory corresponding to the classification model, the part of the long-term memory to be removed, and removes it;
534. the preset classification model superposes the long-term memory after removal and the current node according to the training data to generate a superposition result;
535. the preset classification model updates its long-term memory according to the superposition result.
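Steps 533 to 535 resemble the cell-state update of a standard long short-term memory cell, and the sketch below follows that common formulation. Mapping "removal" to the forget gate and "superposition" to adding the gated candidate is an interpretation, not something the application states; the scalar parameters and their names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def update_long_term_memory(cell_state, hidden_state, x, params):
    """One scalar cell-state update in the style of a standard LSTM cell."""
    # Forget gate: the fraction of the existing long-term memory that is kept.
    forget = sigmoid(params["w_f"] * x + params["u_f"] * hidden_state + params["b_f"])
    retained = forget * cell_state
    # Input gate and candidate value derived from the current input.
    write = sigmoid(params["w_i"] * x + params["u_i"] * hidden_state + params["b_i"])
    candidate = math.tanh(params["w_c"] * x + params["u_c"] * hidden_state + params["b_c"])
    # Superpose what was retained with the new information; this sum is the
    # updated long-term memory.
    return retained + write * candidate


# Hypothetical, untrained parameters: every weight and bias set to zero.
names = ("w_f", "u_f", "b_f", "w_i", "u_i", "b_i", "w_c", "u_c", "b_c")
zero_params = {name: 0.0 for name in names}
new_cell_state = update_long_term_memory(2.0, 0.0, 1.0, zero_params)
```

With all parameters at zero the forget gate is 0.5 and the candidate is 0, so half of the previous memory is retained and nothing new is added.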
Preferably, the at least two preset behavior dimensions include the number of times of borrowing of the user to be evaluated in the corresponding preset time window, the number of times of loan application of the user to be evaluated in the corresponding preset time window, and the number of times of registration of the user to be evaluated in the corresponding preset time window at a loan institution.
Preferably, the preset classification model comprises a long short-term memory (LSTM) artificial neural network model.
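To make the prediction step concrete, the following sketch runs a single-unit LSTM over a time sequence matrix, one row per time window, and squashes the final hidden state into a score between 0 and 1. The single hidden unit, the sigmoid readout, the parameter layout and the untrained zero parameters are all assumptions for illustration; the application does not disclose the network's size or output layer.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(weights, values):
    return sum(w * v for w, v in zip(weights, values))


def risk_score(matrix, params):
    """Run a one-unit LSTM over the matrix rows and read out a score in (0, 1)."""
    hidden, cell = 0.0, 0.0
    for row in matrix:  # one row per preset time window, in chronological order
        pre = {}
        for name in ("forget", "input", "output", "candidate"):
            w, u, b = params[name]
            pre[name] = dot(w, row) + u * hidden + b
        forget = sigmoid(pre["forget"])
        write = sigmoid(pre["input"])
        output = sigmoid(pre["output"])
        candidate = math.tanh(pre["candidate"])
        cell = forget * cell + write * candidate   # long-term memory update
        hidden = output * math.tanh(cell)          # short-term state
    w_out, b_out = params["readout"]
    return sigmoid(w_out * hidden + b_out)


# Hypothetical parameters: (input weights, recurrent weight, bias) for each gate.
zero_gate = ([0.0, 0.0, 0.0], 0.0, 0.0)
params = {
    "forget": zero_gate,
    "input": zero_gate,
    "output": zero_gate,
    "candidate": zero_gate,
    "readout": (0.0, 0.0),
}
matrix = [[1, 1, 0], [0, 2, 0], [2, 3, 1]]
score = risk_score(matrix, params)
```

In practice the parameters would come from training, and a framework implementation of LSTM would replace this hand-written loop.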
Example three
Corresponding to the first and second embodiments, as shown in Fig. 6, the present application provides a device for generating a risk score, including:
an obtaining module 610, configured to obtain historical behavior data sets corresponding to at least two consecutive preset time windows of a user to be evaluated, where the historical behavior data sets include historical behavior data corresponding to the user to be evaluated in at least two preset behavior dimensions, respectively;
a generating module 620, configured to generate a time sequence matrix corresponding to the user to be evaluated according to the historical behavior data of the user to be evaluated corresponding to the preset behavior dimensions in each preset time window;
and the scoring module 630 is configured to predict and generate a risk score corresponding to the user to be evaluated according to the time sequence matrix by using the trained preset classification model.
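The division into the three modules above could be sketched as follows. The class names, the in-memory store keyed by user ID and the stand-in scoring callable are hypothetical; in particular, the callable is a placeholder for the trained preset classification model and is not a real model.

```python
class AcquisitionModule:
    """Looks up per-window behavior data sets (hypothetical in-memory store)."""

    def __init__(self, store):
        self.store = store  # user_id -> list of per-window dicts

    def acquire(self, user_id):
        data_sets = self.store[user_id]
        if len(data_sets) < 2:
            raise ValueError("at least two consecutive time windows are required")
        return data_sets


class GenerationModule:
    """Turns the data sets into a matrix with one row per time window."""

    def __init__(self, dimensions):
        if len(dimensions) < 2:
            raise ValueError("at least two behavior dimensions are required")
        self.dimensions = dimensions

    def generate(self, data_sets):
        return [[window[d] for d in self.dimensions] for window in data_sets]


class ScoringModule:
    """Wraps any callable that maps a matrix to a score."""

    def __init__(self, model):
        self.model = model  # stands in for the trained classification model

    def score(self, matrix):
        return self.model(matrix)


class RiskScoreDevice:
    def __init__(self, acquisition, generation, scoring):
        self.acquisition = acquisition
        self.generation = generation
        self.scoring = scoring

    def risk_score(self, user_id):
        data_sets = self.acquisition.acquire(user_id)
        matrix = self.generation.generate(data_sets)
        return self.scoring.score(matrix)


store = {"u1": [{"borrowing": 1, "loan_application": 2},
                {"borrowing": 0, "loan_application": 1}]}
# Placeholder model: total activity scaled into [0, 1]; not a trained model.
placeholder_model = lambda matrix: min(1.0, sum(map(sum, matrix)) / 10.0)
device = RiskScoreDevice(AcquisitionModule(store),
                         GenerationModule(("borrowing", "loan_application")),
                         ScoringModule(placeholder_model))
result = device.risk_score("u1")
```

Passing the model in as a callable keeps the scoring module independent of how the model was trained or which framework provides it.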
Preferably, the generating module 620 is further configured to generate a vector of the to-be-evaluated user corresponding to the preset behavior dimension according to the historical behavior data of the to-be-evaluated user corresponding to the preset behavior dimension, where the vector includes the historical behavior data of the to-be-evaluated user corresponding to each preset time window; and generating a time sequence matrix corresponding to the user to be evaluated according to the vector of the user to be evaluated corresponding to the preset behavior dimension.
Preferably, the historical behavior data includes execution times, and the obtaining module 610 is further configured to: collect historical behavior information corresponding to the user to be evaluated within a preset historical time period; determine, according to the historical behavior information, the execution times of the historical behaviors of the user to be evaluated corresponding to each preset behavior dimension in the preset time window; and determine the historical behavior data sets corresponding to at least two consecutive preset time windows of the user to be evaluated according to the determined execution times of the historical behaviors that the user to be evaluated executed for each preset behavior dimension in each preset time window.
Preferably, the device further comprises a training module, configured to train the preset classification model by using training data included in a training sample set; and verifying whether the preset classification model meets preset conditions or not by using a test sample set, and determining that the preset classification model is a trained preset classification model when the preset classification model is judged to meet the preset conditions.
Preferably, the preset classification model includes a long-term memory, and the training module is further configured so that: the preset classification model determines, according to the current node corresponding to the current training data and the long-term memory corresponding to the classification model, the part of the long-term memory to be removed, and removes it; the preset classification model superposes the long-term memory after removal and the current node according to the training data to generate a superposition result; and the preset classification model updates its long-term memory according to the superposition result.
Preferably, the at least two preset behavior dimensions include the number of times of borrowing of the user to be evaluated in the corresponding preset time window, the number of times of loan application of the user to be evaluated in the corresponding preset time window, and the number of times of registration of the user to be evaluated in the corresponding preset time window at a loan institution.
Preferably, the preset classification model comprises a long short-term memory (LSTM) artificial neural network model.
Example four
The present application provides a computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method according to any implementation of Example two.
Example five
Corresponding to the above method and apparatus, an embodiment of the present application provides an electronic device, including:
one or more processors; and memory associated with the one or more processors for storing program instructions that, when read and executed by the one or more processors, perform operations comprising:
acquiring historical behavior data sets corresponding to at least two continuous preset time windows of a user to be evaluated, wherein the historical behavior data sets comprise historical behavior data corresponding to the user to be evaluated in at least two preset behavior dimensions respectively;
generating a time sequence matrix corresponding to the user to be evaluated according to the historical behavior data of the user to be evaluated corresponding to the preset behavior dimensions in each preset time window;
and predicting and generating a risk score corresponding to the user to be evaluated according to the time sequence matrix by using the trained preset classification model.
Fig. 7 illustrates an architecture of an electronic device, which may include, in particular, a processor 1510, a video display adapter 1511, a disk drive 1512, an input/output interface 1513, a network interface 1514, and a memory 1520. The processor 1510, video display adapter 1511, disk drive 1512, input/output interface 1513, network interface 1514, and memory 1520 may be communicatively coupled via a communication bus 1530.
The processor 1510 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more Integrated circuits, and is configured to execute related programs to implement the technical solution provided by the present Application.
The memory 1520 may be implemented in the form of a ROM (Read Only Memory), a RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. The memory 1520 may store an operating system 1521 for controlling operation of the electronic device 1500 and a Basic Input Output System (BIOS) 1522 for controlling low-level operation of the electronic device 1500. In addition, a web browser 1523, a data storage management system 1524, an icon font processing system 1525, and the like can also be stored. The icon font processing system 1525 may be an application program that implements the operations of the foregoing steps in this embodiment of the application. In summary, when the technical solution provided by the present application is implemented by software or firmware, the relevant program codes are stored in the memory 1520 and called for execution by the processor 1510.
The input/output interface 1513 is used for connecting an input/output module to realize information input and output. The input/output module may be configured as a component in the device (not shown) or may be external to the device to provide a corresponding function. The input devices may include a keyboard, a mouse, a touch screen, a microphone, various sensors, etc., and the output devices may include a display, a speaker, a vibrator, an indicator light, etc.
The network interface 1514 is used to connect a communication module (not shown) to enable the device to communicatively interact with other devices. The communication module can realize communication in a wired mode (such as USB, network cable and the like) and also can realize communication in a wireless mode (such as mobile network, WIFI, Bluetooth and the like).
The bus 1530 includes a path to transfer information between the various components of the device, such as the processor 1510, the video display adapter 1511, the disk drive 1512, the input/output interface 1513, the network interface 1514, and the memory 1520.
In addition, the electronic device 1500 may also obtain information of specific pickup conditions from a virtual resource object pickup condition information database for performing condition judgment, and the like.
It should be noted that although the above devices only show the processor 1510, the video display adapter 1511, the disk drive 1512, the input/output interface 1513, the network interface 1514, the memory 1520, the bus 1530, etc., in a specific implementation, the devices may also include other components necessary for proper operation. Furthermore, it will be understood by those skilled in the art that the apparatus described above may also include only the components necessary to implement the solution of the present application, and not necessarily all of the components shown in the figures.
From the above description of the embodiments, it is clear to those skilled in the art that the present application can be implemented by software plus necessary general hardware platform. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, or the like, and includes several instructions for enabling a computer device (which may be a personal computer, a cloud server, or a network device) to execute the method according to the embodiments or some parts of the embodiments of the present application.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, the system or system embodiments are substantially similar to the method embodiments and therefore are described in a relatively simple manner, and reference may be made to some of the descriptions of the method embodiments for related points. The above-described system and system embodiments are only illustrative, wherein the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (10)

1. A method for generating a risk score, the method comprising:
acquiring historical behavior data sets corresponding to at least two continuous preset time windows of a user to be evaluated, wherein the historical behavior data sets comprise historical behavior data corresponding to the user to be evaluated in at least two preset behavior dimensions respectively;
generating a time sequence matrix corresponding to the user to be evaluated according to the historical behavior data of the user to be evaluated corresponding to the preset behavior dimensions in each preset time window;
and predicting and generating a risk score corresponding to the user to be evaluated according to the time sequence matrix by using the trained preset classification model.
2. The method for generating a risk score according to claim 1, wherein the generating a time sequence matrix corresponding to the user to be evaluated according to the historical behavior data of the user to be evaluated corresponding to the preset behavior dimensions in each preset time window comprises:
generating a vector of the user to be evaluated corresponding to the preset behavior dimension according to the historical behavior data of the user to be evaluated corresponding to the preset behavior dimension, wherein the vector comprises the historical behavior data of the user to be evaluated corresponding to each preset time window;
and generating a time sequence matrix corresponding to the user to be evaluated according to the vector of the user to be evaluated corresponding to the preset behavior dimension.
3. The method for generating a risk score according to claim 1, wherein the historical behavior data includes execution times, and before the acquiring of the historical behavior data sets corresponding to at least two consecutive preset time windows of the user to be evaluated, the method further includes:
acquiring historical behavior information corresponding to the user to be evaluated in a preset historical time period;
determining the execution times of the historical behaviors of the user to be evaluated corresponding to each preset behavior dimension in the preset time window according to the historical behavior information;
and determining the historical behavior data sets corresponding to at least two consecutive preset time windows of the user to be evaluated according to the determined execution times of the historical behaviors that the user to be evaluated executed for each preset behavior dimension in each preset time window.
4. The method for generating a risk score according to any one of claims 1-3, wherein before the risk score corresponding to the user to be evaluated is predicted and generated according to the time sequence matrix by using the trained preset classification model, the method further comprises training the preset classification model, and the training process of the preset classification model comprises:
training the preset classification model by using training data contained in a training sample set;
and verifying whether the preset classification model meets preset conditions or not by using a test sample set, and determining that the preset classification model is a trained preset classification model when the preset classification model is judged to meet the preset conditions.
5. The method for generating a risk score according to claim 4, wherein the preset classification model includes a long-term memory, and the training of the preset classification model by using the training data contained in the training sample set includes:
the preset classification model determines, according to the current node corresponding to the current training data and the long-term memory corresponding to the classification model, the part of the long-term memory to be removed, and removes it;
the preset classification model superposes the long-term memory after removal and the current node according to the training data to generate a superposition result;
and the preset classification model updates its long-term memory according to the superposition result.
6. The method for generating a risk score according to any one of claims 1-3, wherein the at least two preset behavior dimensions include the number of times of borrowing of the user to be evaluated within the corresponding preset time window, the number of times of loan application of the user to be evaluated within the corresponding preset time window, and the number of times of registration of the user to be evaluated at a loan institution within the corresponding preset time window.
7. The method for generating a risk score according to any one of claims 1-3, wherein the preset classification model comprises a long short-term memory (LSTM) artificial neural network model.
8. An apparatus for generating a risk score, the apparatus comprising:
an acquisition module, configured to acquire historical behavior data sets corresponding to at least two consecutive preset time windows of a user to be evaluated, wherein the historical behavior data sets comprise historical behavior data corresponding to the user to be evaluated in at least two preset behavior dimensions respectively;
a generation module, configured to generate a time sequence matrix corresponding to the user to be evaluated according to the historical behavior data of the user to be evaluated corresponding to the preset behavior dimensions in each preset time window;
and a scoring module, configured to predict and generate the risk score corresponding to the user to be evaluated according to the time sequence matrix by using the trained preset classification model.
9. A computer-readable storage medium, on which a computer program is stored, which program, when being executed by a processor, carries out the method according to any one of claims 1-7.
10. An electronic device, characterized in that the electronic device comprises:
one or more processors;
and memory associated with the one or more processors for storing program instructions that, when read and executed by the one or more processors, perform operations comprising:
acquiring historical behavior data sets corresponding to at least two continuous preset time windows of a user to be evaluated, wherein the historical behavior data sets comprise historical behavior data corresponding to the user to be evaluated in at least two preset behavior dimensions respectively;
generating a time sequence matrix corresponding to the user to be evaluated according to the historical behavior data of the user to be evaluated corresponding to the preset behavior dimensions in each preset time window;
and predicting and generating a risk score corresponding to the user to be evaluated according to the time sequence matrix by using the trained preset classification model.
CN202110915774.7A 2021-08-10 2021-08-10 Risk score generation method and device Withdrawn CN113743735A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110915774.7A CN113743735A (en) 2021-08-10 2021-08-10 Risk score generation method and device
CA3169372A CA3169372A1 (en) 2021-08-10 2022-08-02 Method and apparatus for generating a risk score

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110915774.7A CN113743735A (en) 2021-08-10 2021-08-10 Risk score generation method and device

Publications (1)

Publication Number Publication Date
CN113743735A true CN113743735A (en) 2021-12-03

Family

ID=78730649

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110915774.7A Withdrawn CN113743735A (en) 2021-08-10 2021-08-10 Risk score generation method and device

Country Status (2)

Country Link
CN (1) CN113743735A (en)
CA (1) CA3169372A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108734338A (en) * 2018-04-24 2018-11-02 阿里巴巴集团控股有限公司 Credit risk forecast method and device based on LSTM models
CN109753049A (en) * 2018-12-21 2019-05-14 国网江苏省电力有限公司南京供电分公司 The exceptional instructions detection method of one provenance net load interaction industrial control system
CN112270547A (en) * 2020-10-27 2021-01-26 上海淇馥信息技术有限公司 Financial risk assessment method and device based on feature construction and electronic equipment
CN112990464A (en) * 2021-03-12 2021-06-18 东北师范大学 Knowledge tracking method and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
杨智伟;刘灏;毕天姝;杨奇逊;: "基于长短期记忆网络的PMU不良数据检测方法", 电力系统保护与控制, no. 07 *

Also Published As

Publication number Publication date
CA3169372A1 (en) 2023-02-10

Similar Documents

Publication Publication Date Title
CN108846520B (en) Loan overdue prediction method, loan overdue prediction device and computer-readable storage medium
US8838491B2 (en) Method and system for an integrated approach to collections cycle optimization
US20200090268A1 (en) Method and apparatus for determining level of risk of user, and computer device
CN110417721A (en) Safety risk estimating method, device, equipment and computer readable storage medium
US11854088B1 (en) Methods and systems for improving the underwriting process
CN106529773A (en) Online credit and fraud risk evaluation method based on identifying code type question answering
EP3706017A1 (en) System and method for determining reasons for anomalies using cross entropy ranking of textual items
CN112598294A (en) Method, device, machine readable medium and equipment for establishing scoring card model on line
CN112328869A (en) User loan willingness prediction method and device and computer system
CN112200402B (en) Risk quantification method, device and equipment based on risk portrait
CN113919432A (en) Classification model construction method, data classification method and device
US11488185B2 (en) System and method for unsupervised abstraction of sensitive data for consortium sharing
CN112734566A (en) Credit limit acquisition method and device and computer equipment
US20210133586A1 (en) System and method for unsupervised abstraction of sensitive data for realistic modeling
US20210133783A1 (en) System and method for unsupervised abstraction of sensitive data for detection model sharing across entities
CN113743735A (en) Risk score generation method and device
CN114298825A (en) Method and device for extremely evaluating repayment volume
CN111899093B (en) Method and device for predicting default loss rate
CN113987351A (en) Artificial intelligence based intelligent recommendation method and device, electronic equipment and medium
CN113159924A (en) Method and device for determining trusted client object
US20210133644A1 (en) System and method for unsupervised abstraction of sensitive data for consortium sharing
CN112967062A (en) User identity recognition method based on cautious degree
CN110570301A (en) Risk identification method, device, equipment and medium
CN110362981A (en) The method and system of abnormal behaviour are judged based on credible equipment fingerprint
CN116797343A (en) Risk assessment method, model training method, device, medium and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20211203
