CN117435999A

CN117435999A - Risk assessment method, apparatus, device and medium

Info

Publication number: CN117435999A
Application number: CN202311399849.6A
Authority: CN
Inventors: 李凯
Original assignee: Ping An Life Insurance Company of China Ltd
Current assignee: Ping An Life Insurance Company of China Ltd
Priority date: 2023-10-25
Filing date: 2023-10-25
Publication date: 2024-01-23

Abstract

The embodiments of the application provide a risk assessment method, apparatus, device and medium. The risk assessment method comprises: acquiring object feature data of a target object, and inputting the object feature data into a pre-trained risk evaluation model to obtain a risk evaluation result. The risk evaluation model is obtained through the following steps: acquiring a training data set; training a plurality of decision trees according to the training data set to obtain a plurality of decision tree models; and carrying out a first operability analysis on the plurality of decision tree models and a second operability analysis during the running process, so as to determine one of the plurality of decision tree models as the risk evaluation model and deploy it. By setting a plurality of decision tree models, determining according to the operability analysis a risk evaluation model that matches how object feature data is collected in the current period of the service, and updating it at any time, the accuracy of risk evaluation is improved; and because the risk evaluation model can be replaced from among the plurality of decision tree models, the risk evaluation process is suitable for various scenarios of the same service.

Description

Risk assessment method, apparatus, device and medium

Technical Field

The present application relates to the field of financial technology, and in particular, to a risk assessment method, apparatus, device, and medium.

Background

At present, in financial industries such as loans, insurance, etc., risk assessment means are widely used for risk prevention and control of financial businesses. In the related art, risk assessment can be performed on financial services through a decision tree model.

When risk assessment is performed with a decision tree model, the person to be assessed can be assessed accurately when relatively complete assessment data is acquired at an early stage. In the actual business process, however, complete assessment data is difficult to acquire, so the risk assessment result of the decision tree model is inaccurate.

Disclosure of Invention

The embodiment of the application mainly aims to provide a risk assessment method, device, equipment and medium, aiming at improving the accuracy of risk assessment.

To achieve the above object, a first aspect of an embodiment of the present application proposes a risk assessment method, including:

acquiring object feature data of a target object;

inputting the object characteristic data into a pre-trained risk evaluation model to obtain a risk evaluation result;

wherein, the risk evaluation model is obtained by the following steps:

acquiring a training data set, wherein the training data set comprises object feature sample data corresponding to a plurality of sample objects and labels corresponding to the sample objects;

performing multiple decision tree training according to object feature sample data corresponding to the multiple sample objects and labels corresponding to the sample objects to obtain multiple decision tree models;

respectively carrying out first operability analysis on a plurality of decision tree models, determining one of the decision tree models as the risk evaluation model according to an analysis result, and deploying the risk evaluation model;

and responding to a preset operation process detection condition, carrying out second operability analysis on the risk evaluation model, and when the risk evaluation model does not meet a preset deployment requirement, respectively carrying out second operability analysis on a plurality of candidate decision tree models so as to redetermine one of the candidate decision tree models as the risk evaluation model, wherein the candidate decision tree models are the decision tree models other than the current risk evaluation model.

In some possible embodiments of the present application, the preset operation process detection condition is a detection condition of a preset monitoring table, where the preset monitoring table is used to store the risk evaluation result and the object feature data; the preset deployment requirement is represented by a preset accuracy threshold;

after obtaining the risk evaluation result, the method further comprises:

storing the risk evaluation result and the object characteristic data into the preset monitoring table;

the responding to the preset operation process detection condition, performing a second operability analysis on the risk evaluation model, and when the risk evaluation model does not meet the preset deployment requirement, performing the second operability analysis on the plurality of candidate decision tree models respectively to redetermine one of the plurality of candidate decision tree models as the risk evaluation model, comprises:

responding to the detection conditions of the preset monitoring table, and carrying out accuracy analysis on the prediction result of the risk evaluation model according to the preset monitoring table to obtain an analysis result;

when the analysis result shows that the accuracy rate of the risk evaluation model is lower than the preset accuracy rate threshold, respectively analyzing the operation data of each candidate decision tree model according to the preset monitoring table to obtain the analysis result corresponding to each candidate decision tree model;

and re-determining one of the candidate decision tree models as the risk evaluation model according to the analysis result corresponding to each candidate decision tree model.

In some possible embodiments of the present application, the preset operation process detection condition includes data quality judgment; the preset deployment requirement is represented by a model label;

performing data quality judgment on the object feature data to determine the model label corresponding to the object feature data;

and determining one of the decision tree models as a target decision tree model according to the model label, and replacing the risk evaluation model with the target decision tree model when the target decision tree model is different from the current risk evaluation model.

In some possible embodiments of the present application, the object feature data includes basic feature data and business feature data;

inputting the object characteristic data into a pre-trained risk evaluation model to obtain a risk evaluation result, wherein the method comprises the following steps of:

performing disease decision on the basic characteristic data according to the risk evaluation model to obtain a personal disease risk evaluation result of the target object;

carrying out default risk decision according to the personal disease risk evaluation result and the basic characteristic data to obtain a default risk decision result;

and performing a data missing judgment on the business feature data, and determining the risk evaluation result according to the default risk decision result when the judgment result indicates that the business feature data cannot support the subsequent default risk decision.

In some possible embodiments of the present application, the determining the risk assessment result according to the default risk decision result includes:

carrying out probability analysis on the default risk decision result according to the preset monitoring table and the training data set together to obtain risk probability corresponding to the default risk decision result;

and performing scoring mapping according to the risk probability to obtain a risk score corresponding to the target object, and determining the risk evaluation result according to the risk score.
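The scoring mapping step can be illustrated with a minimal sketch. The application does not specify the mapping; the log-odds scorecard form below and the parameters `base_score`, `base_odds` and `pdo` (points to double the odds) are assumptions chosen purely for illustration.

```python
import math


def probability_to_score(p_risk: float,
                         base_score: float = 600.0,
                         base_odds: float = 50.0,
                         pdo: float = 20.0) -> float:
    """Map a risk probability to a score (higher score = lower risk).

    Hypothetical scorecard-style mapping: non-risk:risk odds equal to
    `base_odds` receive `base_score`, and each doubling of those odds
    adds `pdo` points. None of these values come from the application.
    """
    # Clamp the probability so that neither the division nor the log fails.
    p = min(max(p_risk, 1e-6), 1.0 - 1e-6)
    odds_non_risk = (1.0 - p) / p
    factor = pdo / math.log(2.0)
    offset = base_score - factor * math.log(base_odds)
    return offset + factor * math.log(odds_non_risk)
```

Under these assumed parameters a lower risk probability always yields a higher risk score, so the risk evaluation result could then be read off from score bands chosen by the business.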

In some possible embodiments of the present application, after the risk evaluation model is redetermined in response to the preset operation process detection condition, the method further includes:

responding to disease statistics data updating, acquiring updated disease statistics data to update a plurality of disease labels, and updating a preset label pool according to the updated disease labels, wherein the disease labels are used for generating decision branches of the decision tree model; the preset label pool comprises a plurality of disease labels, and is used for storing the disease labels;

retraining a plurality of decision tree models according to the training data set and the updated preset label pool so as to update the plurality of decision tree models.

In some possible embodiments of the present application, the training data set includes a training set and a verification set;

performing multiple decision tree training according to object feature sample data corresponding to the multiple sample objects and labels corresponding to the sample objects to obtain multiple decision tree models, including:

respectively training a plurality of decision trees according to the training set to obtain a plurality of initial decision tree models;

respectively verifying a plurality of initial decision tree models according to the verification set to obtain a plurality of decision tree models;

performing feature missing classification on the verification set to obtain a plurality of classification sets, wherein the classification sets correspond to a plurality of data missing categories one by one;

setting a corresponding model label for each decision tree model according to the verification results of that decision tree model on each classification set, wherein each verification result is obtained by verifying one of the classification sets with one of the plurality of initial decision tree models;

the determining the model label corresponding to the object feature data by performing data quality judgment on the object feature data includes:

judging the data missing type according to the object characteristic data to obtain data missing type information;

and determining the model label corresponding to the object characteristic data according to the data missing class information.
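The label assignment described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the feature groups in `BASIC_FIELDS` and `BUSINESS_FIELDS`, the four category names, and the rule "the model with the highest verification accuracy on a classification set is labelled for that category" are all assumptions.

```python
from typing import Dict, Mapping, Optional

# Hypothetical feature groups; the application does not enumerate them.
BASIC_FIELDS = ("age", "medical_record")
BUSINESS_FIELDS = ("credit", "consumption")


def missing_category(record: Mapping[str, Optional[float]]) -> str:
    """Classify a record by which feature group has missing values."""
    basic_missing = any(record.get(f) is None for f in BASIC_FIELDS)
    business_missing = any(record.get(f) is None for f in BUSINESS_FIELDS)
    if basic_missing and business_missing:
        return "both_missing"
    if basic_missing:
        return "basic_missing"
    if business_missing:
        return "business_missing"
    return "complete"


def assign_model_labels(
    accuracy: Mapping[str, Mapping[str, float]],
) -> Dict[str, str]:
    """Map each data-missing category to the model that verified best on it.

    `accuracy[model][category]` is the verification accuracy of `model`
    on the classification set for `category` (assumed to be precomputed).
    """
    categories = {c for per_model in accuracy.values() for c in per_model}
    labels: Dict[str, str] = {}
    for category in sorted(categories):
        # A model with no result for this category can never win it.
        labels[category] = max(
            accuracy,
            key=lambda m: accuracy[m].get(category, float("-inf")),
        )
    return labels
```

At evaluation time, `missing_category` gives the data missing class information of the incoming object feature data, and a lookup in the dictionary returned by `assign_model_labels` gives the name of the decision tree model to use.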

To achieve the above object, a second aspect of the embodiments of the present application proposes a risk assessment apparatus, including:

the object feature data acquisition module is used for acquiring object feature data of a target object;

the risk evaluation result acquisition module is used for inputting the object characteristic data into a pre-trained risk evaluation model to obtain a risk evaluation result; the risk evaluation model is obtained through the following steps:

respectively carrying out first operability analysis on a plurality of decision tree models, and determining one of the decision tree models as the risk evaluation model according to an analysis result;

and responding to a preset operation process detection condition, carrying out second operability analysis on the risk evaluation model, and when the risk evaluation model does not meet a preset deployment requirement, respectively carrying out second operability analysis on a plurality of candidate decision tree models so as to redetermine one of the candidate decision tree models as the risk evaluation model, wherein the candidate decision tree models are the decision tree models other than the current risk evaluation model.

To achieve the above object, a third aspect of the embodiments of the present application proposes an electronic device, which includes a memory and a processor, the memory storing a computer program, the processor implementing the method according to the first aspect when executing the computer program.

To achieve the above object, a fourth aspect of the embodiments of the present application proposes a computer-readable storage medium storing a computer program that, when executed by a processor, implements the method of the first aspect.

The application provides a risk assessment method, a risk assessment device, risk assessment equipment and risk assessment media, wherein the risk assessment method comprises the following steps: acquiring object feature data of a target object; inputting the object characteristic data into a pre-trained risk evaluation model to obtain a risk evaluation result; wherein, the risk evaluation model is obtained by the following steps: acquiring a training data set, wherein the training data set comprises object feature sample data corresponding to a plurality of sample objects and labels corresponding to the sample objects; performing multiple decision tree training according to object feature sample data corresponding to the multiple sample objects and labels corresponding to the sample objects to obtain multiple decision tree models; respectively carrying out first operability analysis on a plurality of decision tree models, determining one of the decision tree models as the risk evaluation model according to an analysis result, and deploying the risk evaluation model; and responding to a preset operation process detection condition, carrying out second operability analysis on the risk evaluation model, and when the risk evaluation model does not meet a preset deployment requirement, respectively carrying out second operability analysis on a plurality of candidate decision tree models so as to redetermine one of the candidate decision tree models as the risk evaluation model, wherein the candidate decision tree models are the decision tree models other than the current risk evaluation model. 
According to the embodiments of the application, a plurality of decision tree models are set, and a risk evaluation model that matches how evaluation data is collected in the current period of the service is determined according to the operability analysis and updated at any time. This reduces the influence on risk evaluation accuracy of individual missing data, or of systematic missing data caused by service changes, during the service process, so that the risk evaluation model matches the current service condition and the accuracy of risk evaluation is improved; and because the risk evaluation model can be replaced from among the plurality of decision tree models, the risk evaluation process can adapt to various scenarios of the same service.

Drawings

FIG. 1 is a schematic diagram of steps of a risk assessment method according to one embodiment of the present application;

FIG. 2 is a schematic diagram of steps of one embodiment of acquiring the risk assessment model of FIG. 1;

FIG. 3 is a schematic diagram illustrating steps of a risk assessment method according to another embodiment of the present application;

FIG. 4 is a step schematic diagram of a substep embodiment of step S204 in FIG. 2;

FIG. 5 is a step schematic diagram of another substep embodiment of step S204 in FIG. 2;

FIG. 6 is a step schematic diagram of a substep embodiment of step S102 of FIG. 1;

FIG. 7 is a step schematic diagram of a substep embodiment of step S603 in FIG. 6;

FIG. 8 is a schematic step diagram of another embodiment of the acquisition of the risk assessment model of FIG. 1;

FIG. 9 is a step schematic diagram of a substep embodiment of step S201 of FIG. 2;

FIG. 10 is a step schematic diagram of a substep embodiment of step S501 of FIG. 5;

FIG. 11 is a schematic structural diagram of a risk assessment device according to an embodiment of the present application;

FIG. 12 is a schematic diagram of the hardware structure of an electronic device according to an embodiment of the present application.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.

It should be noted that although functional block division is performed in a device diagram and a logic sequence is shown in a flowchart, in some cases, the steps shown or described may be performed in a different order than the block division in the device, or in the flowchart. The terms first, second and the like in the description and in the claims and in the above-described figures, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the present application.

First, several terms used in this application are explained:

Artificial intelligence (AI): a new technical science that researches and develops theories, methods, technologies and application systems for simulating, extending and expanding human intelligence. Artificial intelligence is a branch of computer science that attempts to understand the nature of intelligence and to produce new intelligent machines that can react in a manner similar to human intelligence; research in this field includes robotics, language recognition, image recognition, natural language processing, and expert systems. Artificial intelligence can simulate the information processes of human consciousness and thinking. Artificial intelligence is also the theory, method, technique, and application system that uses a digital computer or a digital computer-controlled machine to simulate, extend, and expand human intelligence, sense the environment, acquire knowledge, and use knowledge to obtain optimal results.

Based on the above, the embodiment of the application provides a risk assessment method, a risk assessment device and a risk assessment medium, which aim to improve the accuracy of risk assessment.

The risk assessment method, device, equipment and medium provided by the embodiment of the application are specifically described through the following embodiments, and the risk assessment method in the embodiment of the application is described first.

The embodiments of the application can acquire and process the related data based on artificial intelligence technology. Artificial intelligence (AI) is the theory, method, technique and application system that uses a digital computer or a digital computer-controlled machine to simulate, extend and expand human intelligence, sense the environment, acquire knowledge and use knowledge to obtain optimal results.

Artificial intelligence infrastructure technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision, robotics, biometric recognition, speech processing, natural language processing, and machine learning/deep learning.

The risk assessment method provided by the embodiments of the application can be applied to a terminal, to a server, or to software running in the terminal or the server. In some embodiments, the terminal may be a smartphone, tablet computer, notebook computer, desktop computer, etc.; the server may be configured as an independent physical server, as a server cluster or distributed system formed by a plurality of physical servers, or as a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, and big data and artificial intelligence platforms; the software may be an application or the like that implements the risk assessment method, but is not limited to the above forms.

The subject application is operational with numerous general purpose or special purpose computer system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.

It should be noted that, in each specific embodiment of the present application, when related processing is required according to user information, user behavior data, user history data, user location information, and other data related to user identity or characteristics, permission or consent of the user is obtained first, and the collection, use, processing, and the like of these data comply with related laws and regulations and standards. In addition, when the embodiment of the application needs to acquire the sensitive personal information of the user, the independent permission or independent consent of the user is acquired through a popup window or a jump to a confirmation page or the like, and after the independent permission or independent consent of the user is explicitly acquired, necessary user related data for enabling the embodiment of the application to normally operate is acquired.

Referring to fig. 1, fig. 1 is a schematic diagram illustrating steps of a risk assessment method according to an embodiment of the present application, where the method in fig. 1 may include, but is not limited to, steps S101 to S102.

Step S101, object feature data of a target object is acquired.

Step S102, inputting the object characteristic data into a pre-trained risk evaluation model to obtain a risk evaluation result.

It should be understood that the object feature data herein includes, but is not limited to, basic feature data and business feature data. The basic feature data includes, but is not limited to, personal information such as name, birthplace, usual place of residence, hobbies, lifestyle, and medical records; the business feature data includes, but is not limited to, credit data, consumption data, behavior data, and the like, obtained from the historical business behavior of the target object.

It should be understood that the risk evaluation model herein is a decision tree model, which can take various specific forms, for example an ID3 decision tree model, a C4.5 decision tree model, or a CART decision tree model; the embodiments of the present application do not limit this.

Referring to fig. 2, fig. 2 is a schematic diagram illustrating steps of an embodiment of acquiring the risk evaluation model in fig. 1. In some possible embodiments of the present application, the steps for obtaining the pre-trained risk evaluation model include, but are not limited to, the following steps.

Step S201, a training data set is acquired.

Step S202, training a plurality of decision trees according to object feature sample data corresponding to the plurality of sample objects and labels corresponding to the sample objects to obtain a plurality of decision tree models.

Step S203, a first operability analysis is performed on the plurality of decision tree models, and one of the plurality of decision tree models is determined as a risk evaluation model and deployed according to the analysis result.

Step S204, responding to the detection condition of the preset operation process, performing second operability analysis on the risk evaluation model, and when the risk evaluation model does not meet the preset deployment requirement, performing second operability analysis on the plurality of candidate decision tree models respectively so as to re-determine one of the plurality of candidate decision tree models as the risk evaluation model.

It should be appreciated that the training data set herein includes object feature sample data corresponding to a plurality of sample objects and labels corresponding to sample objects, where the object feature sample data includes, but is not limited to, base feature data, credit data, consumption data, and behavior data, and where the labels corresponding to sample objects are used to determine whether model predictions and true results are the same.

It should be understood that the candidate decision tree model herein refers to a decision tree model other than the current risk assessment model.

It should be understood that the specific form of the second operability analysis herein is various, and may be the following embodiments, or may be other embodiments, which are not limited in this application.

In an embodiment, the training data set includes a training set and a verification set; the preset operation process detection condition is that the risk evaluation model has been running for a certain period of time; the second operability analysis evaluates the accuracy of the risk evaluation model in actual use; and the preset deployment requirement is that the accuracy is the highest. Different initial decision tree models are trained with the training set to obtain a plurality of decision tree models. Before the plurality of decision tree models are put into actual risk evaluation, the first operability analysis is performed on each decision tree model with the verification set, and the decision tree model with the highest accuracy is put into actual risk evaluation as the risk evaluation model. The decision tree models not selected serve as candidate decision tree models that can replace the current risk evaluation model at any time.

After the risk evaluation model has been put into practical use and has run for a period of time, the preset operation process detection condition is met. The second operability analysis is then carried out to detect the current evaluation accuracy of the risk evaluation model in actual use. If the current evaluation accuracy is lower than a certain value, the risk evaluation model has a higher probability of erroneous evaluation.

As the service develops, services in different periods require different specific types of object feature data to be acquired from the target object during actual operation, which may cause the evaluation accuracy to fall below that value. In other words, the current risk evaluation model may no longer suit the service carried out in the current period, and the risk evaluation model needs to be replaced to match that service and reduce the probability of erroneous evaluation. The decision tree model with the highest evaluation accuracy is determined from the plurality of candidate decision tree models and used as the new risk evaluation model for risk evaluation.
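The selection and replacement logic of this embodiment can be sketched as follows. The sketch assumes that each model is a callable returning a predicted label and that true labels are available for the records gathered during actual use; the application does not state how such ground truth is obtained. The names `select_model`, `reselect_if_needed` and `threshold` are hypothetical.

```python
from typing import Callable, Dict, Sequence, Tuple

Model = Callable[[dict], int]   # returns a predicted label
Sample = Tuple[dict, int]       # (object feature data, true label)


def accuracy(model: Model, samples: Sequence[Sample]) -> float:
    """Fraction of samples whose prediction equals the true label."""
    if not samples:
        return 0.0
    hits = sum(1 for features, label in samples if model(features) == label)
    return hits / len(samples)


def select_model(models: Dict[str, Model], samples: Sequence[Sample]) -> str:
    """First operability analysis: name of the most accurate model."""
    return max(models, key=lambda name: accuracy(models[name], samples))


def reselect_if_needed(models: Dict[str, Model], current: str,
                       monitored: Sequence[Sample], threshold: float) -> str:
    """Second operability analysis on labelled records from actual use.

    Keeps the current model while its accuracy reaches `threshold`;
    otherwise returns the most accurate candidate, i.e. the best model
    other than the current one.
    """
    if accuracy(models[current], monitored) >= threshold:
        return current
    candidates = {n: m for n, m in models.items() if n != current}
    if not candidates:
        return current
    return select_model(candidates, monitored)
```

`select_model` would be called once with the verification set before deployment, and `reselect_if_needed` whenever the running-time condition is met.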

In one embodiment, the training data set is used to build an extreme gradient boosting (eXtreme Gradient Boosting, XGB) decision tree model, a light gradient boosting machine (Light Gradient Boosting Machine, LGB) decision tree model, and a synthetic decision tree model obtained by fusing the XGB decision tree model and the LGB decision tree model. There may be one or more synthetic decision tree models, depending on the model fusion weights used. The second operability analysis can be performed on the XGB decision tree model, the LGB decision tree model and the synthetic decision tree model at different service stages to determine the risk evaluation model.
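One way to read "model fusion weights" is a weighted average of the probabilities output by the two models. The sketch below assumes this reading; the application does not state the fusion formula. The base models are represented by plain callables so that the example does not depend on any particular gradient boosting library.

```python
from typing import Callable, Mapping, Sequence

ProbModel = Callable[[Mapping[str, float]], float]  # returns a risk probability


def fuse(models: Sequence[ProbModel], weights: Sequence[float]) -> ProbModel:
    """Build a synthetic model as a weighted average of model probabilities."""
    if len(models) != len(weights) or not models:
        raise ValueError("models and weights must be non-empty and equal length")
    total = float(sum(weights))
    if total <= 0 or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative with a positive sum")
    # Normalize so the fused output stays a probability in [0, 1].
    normalized = [w / total for w in weights]

    def fused(features: Mapping[str, float]) -> float:
        return sum(w * m(features) for w, m in zip(normalized, models))

    return fused
```

Calling `fuse` with different weight pairs yields different synthetic decision tree models, each of which can be placed alongside the two base models as a candidate.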

In an embodiment, the preset operation process detection condition is that the object feature data acquired in step S101 is about to be input into the risk evaluation model; the second operability analysis is an analysis of the applicability of the risk evaluation model to the object feature data; and the preset deployment requirement is that the risk evaluation model matches the object feature data. The second operability analysis is carried out each time before the object feature data is input into the risk evaluation model: the object feature data to be evaluated is analyzed to determine which of the plurality of decision tree models is optimal for it, and if the current risk evaluation model is not the optimal one for the current object feature data, one of the other decision tree models is determined as the risk evaluation model. That is, a risk evaluation model is determined each time object feature data is acquired, so that the decision tree models can be called flexibly.
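The per-input determination described in this embodiment can be sketched as a small router. The class name `ModelRouter` and the `label_of` callable (standing in for the analysis of the object feature data) are hypothetical, and the models are again plain callables.

```python
from typing import Callable, Dict, Mapping, Optional

Features = Mapping[str, Optional[float]]
Model = Callable[[Features], float]


class ModelRouter:
    """Re-determines the risk evaluation model before each evaluation."""

    def __init__(self, models: Dict[str, Model],
                 label_of: Callable[[Features], str], initial: str) -> None:
        self.models = models        # decision tree models keyed by name
        self.label_of = label_of    # maps feature data to a model name
        self.current = initial      # name of the deployed model

    def evaluate(self, features: Features) -> float:
        target = self.label_of(features)
        # Switch only when a known model other than the current one is
        # the better match; an unknown name leaves the deployment as is.
        if target in self.models and target != self.current:
            self.current = target
        return self.models[self.current](features)
```

A `label_of` function could, for example, return one model name when the feature data is complete and another when business features are missing.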

By setting a plurality of decision tree models, determining according to the operability analysis a risk evaluation model that matches how evaluation data is collected in the current period of the service, and updating the risk evaluation model at any time, the influence on risk evaluation accuracy of individual missing data, or of systematic missing data caused by service changes, during the service process is reduced, so that the risk evaluation model matches the current service condition and the accuracy of risk evaluation is improved; and because the risk evaluation model can be replaced from among the plurality of decision tree models, the risk evaluation process can adapt to various scenarios of the same service.

Referring to fig. 3, fig. 3 is a schematic diagram illustrating steps of a risk assessment method according to another embodiment of the present application. In some possible embodiments of the present application, the preset operation process detection condition is a detection condition of a preset monitoring table, where the preset monitoring table is used to store the object feature data obtained in step S101 and the risk evaluation result obtained in step S102. After step S102, the method includes, but is not limited to, the following step.

Step S301, storing the risk evaluation result and the object feature data in a preset monitoring table.

In an embodiment, after the risk evaluation model obtains a risk evaluation result according to the object feature data, the risk evaluation result and the object feature data are stored in the preset monitoring table for subsequent operability analysis.

It should be understood that, after it is determined to replace the risk evaluation model, the stored data can be processed in various ways, and different ways of processing the preset monitoring table change the data used in the next operability analysis. The processing may follow the embodiments below or other embodiments, which the embodiments of the present application do not limit.

In an embodiment, when it is determined to replace the risk evaluation model, all risk evaluation results and object feature data in the preset monitoring table are cleared.

In an embodiment, when it is determined to replace the risk evaluation model, all risk evaluation results and object feature data in the preset monitoring table are retained.

Taking a specific embodiment as an example, the decision tree models comprise an XGB decision tree model, an LGB decision tree model, and a synthetic decision tree model that combines the XGB and LGB decision tree models. When the previous risk evaluation model is the XGB decision tree model and the replacement is the synthetic decision tree model, all risk evaluation results and object feature data in the preset monitoring table are retained. When the previous risk evaluation model is the XGB decision tree model and the replacement is the LGB decision tree model, all risk evaluation results and object feature data in the preset monitoring table are cleared. When the previous risk evaluation model is the LGB decision tree model and the replacement is the synthetic decision tree model, all risk evaluation results and object feature data in the preset monitoring table are retained. When the previous risk evaluation model is the LGB decision tree model and the replacement is the XGB decision tree model, all risk evaluation results and object feature data in the preset monitoring table are cleared. When the previous risk evaluation model is the synthetic decision tree model, all risk evaluation results and object feature data in the preset monitoring table are retained.
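The retention rule in this specific embodiment can be sketched as a small lookup. This is only an illustration: the names `ModelKind` and `keep_monitoring_table` are hypothetical and do not appear in the application.

```python
from enum import Enum


class ModelKind(Enum):
    XGB = "xgb"
    LGB = "lgb"
    SYNTHETIC = "synthetic"  # combination of the XGB and LGB models


def keep_monitoring_table(previous: ModelKind, replacement: ModelKind) -> bool:
    """Return True if the monitoring table is retained after the replacement."""
    # Replacing the synthetic model always retains the table.
    if previous is ModelKind.SYNTHETIC:
        return True
    # Single model -> synthetic model retains the table;
    # XGB <-> LGB clears it.
    return replacement is ModelKind.SYNTHETIC
```

A plausible reading of this rule is that records produced by one single model remain relevant to the synthetic model, which contains that model, but not to the other single model; the application itself does not state the rationale.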

Referring to fig. 4, fig. 4 is a schematic step diagram of an embodiment of a substep of step S204 in fig. 2. In some possible embodiments of the present application, the preset deployment requirement is represented by a preset accuracy threshold; step S204 includes, but is not limited to, the following substeps.

Step S401, in response to the detection condition of the preset monitoring table being met, performing accuracy analysis on the prediction results of the risk evaluation model according to the preset monitoring table to obtain an analysis result.

Step S402, when the analysis result indicates that the accuracy of the risk evaluation model is lower than the preset accuracy threshold, performing operation data judgment on each candidate decision tree model according to the preset monitoring table to obtain an analysis result corresponding to each candidate decision tree model.

Step S403, re-determining one of the candidate decision tree models as the risk evaluation model according to the analysis result corresponding to each candidate decision tree model.

It should be understood that the detection condition of the preset monitoring table may take various forms. For example, a second runnability analysis is performed each time the number of risk evaluation results recorded in the preset monitoring table reaches a preset number. As another example, the analysis is triggered by the length of time over which risk evaluation results have been recorded in the preset monitoring table, and a second runnability analysis is performed when that duration reaches a preset length. The embodiments of the present application are not limited in this regard.
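The two example detection conditions can be sketched as follows. The threshold values and the names `TriggerConfig` and `should_run_analysis` are assumptions chosen for illustration; the application does not specify concrete values.

```python
from dataclasses import dataclass


@dataclass
class TriggerConfig:
    max_records: int = 1000       # hypothetical preset number of recorded results
    max_seconds: float = 86400.0  # hypothetical preset recording duration


def should_run_analysis(record_count: int, seconds_recording: float,
                        cfg: TriggerConfig) -> bool:
    """Return True when either detection condition of the monitoring table is met."""
    return (record_count >= cfg.max_records
            or seconds_recording >= cfg.max_seconds)
```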

In an embodiment, in response to the detection condition of the preset monitoring table being met, the real risk evaluation results corresponding to all object feature data recorded in the preset monitoring table are obtained, and accuracy analysis is performed by comparing these real results with the risk evaluation results recorded in the table, yielding an analysis result. When the analysis result shows that the accuracy of the risk evaluation model is lower than the preset accuracy threshold, the current risk evaluation model is not suitable for the service in the current period and needs to be replaced. Because the preset monitoring table records object feature data related to the service in the current period, operation data judgment is performed on all candidate decision tree models using the preset monitoring table to obtain a plurality of analysis results. According to these analysis results, the candidate decision tree model that best matches the service in the current period is identified and determined to be the new risk evaluation model.

It should be appreciated that when a candidate decision tree model is determined to be the new risk evaluation model, the old risk evaluation model becomes a candidate decision tree model until it is next selected.
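A minimal sketch of steps S401 to S403 follows, assuming that the "operation data judgment" of a candidate is simply its accuracy on the records in the monitoring table and that models are plain callables. Both are simplifying assumptions, and the names `accuracy` and `reselect_model` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Model = Callable[[dict], int]


def accuracy(predicted: List[int], actual: List[int]) -> float:
    """Fraction of predictions that equal the real results."""
    if not actual:
        return 0.0
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def reselect_model(current: str, models: Dict[str, Model],
                   rows: List[Tuple[dict, int]], real_results: List[int],
                   threshold: float) -> str:
    """rows holds (object feature data, recorded risk evaluation result)."""
    recorded = [result for _, result in rows]
    # S401: accuracy of the deployed model on the monitoring table.
    if accuracy(recorded, real_results) >= threshold:
        return current
    # S402: score every candidate (every model except the current one).
    features = [f for f, _ in rows]
    scores = {name: accuracy([m(f) for f in features], real_results)
              for name, m in models.items() if name != current}
    if not scores:
        return current
    # S403: the best-scoring candidate becomes the risk evaluation model.
    return max(scores, key=scores.get)
```

After the swap, the old model simply remains in `models` and is treated as a candidate in the next round.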

In this embodiment of the application, the risk evaluation model is monitored, and a risk evaluation model that no longer meets the deployment requirement is replaced in time. This prevents risk evaluation accuracy from dropping because the risk evaluation model does not match the service in the current period, and thereby improves the accuracy of risk evaluation.

Referring to fig. 5, fig. 5 is a schematic step diagram of another substep embodiment of step S204 in fig. 2. In some possible embodiments of the present application, the preset operation process detection condition includes data quality judgment, and the preset deployment requirement is represented by a model tag; step S204 includes, but is not limited to, the following substeps.

Step S501, performing data quality judgment on the object feature data to determine a model tag corresponding to the object feature data.

Step S502, one of the decision tree models is determined as a target decision tree model according to the model label, and when the target decision tree model is different from the current risk evaluation model, the risk evaluation model is replaced by the target decision tree model.

It should be understood that the data quality judgment may be performed in various ways. It may be implemented by the following embodiments or by other embodiments, and the present application is not limited in this regard.

In an embodiment, data quality judgment is performed on the object feature data to determine which types of data in the object feature data are missing. Based on this missing-data condition, the decision tree model applicable to the current object feature data is identified and its model tag is determined. The decision tree model that corresponds to the model tag among the plurality of decision tree models is then determined to be the target decision tree model. When the target decision tree model is the same as the current risk evaluation model, no model replacement is performed. When the target decision tree model differs from the current risk evaluation model, the risk evaluation model is replaced with the target decision tree model, so that the target decision tree model becomes the new risk evaluation model.

In an embodiment, when the object feature data corresponds to a plurality of model tags, that is, when a plurality of decision tree models are applicable to the current object feature data, the decision tree models corresponding to those model tags are determined to be candidate target decision tree models, and a target decision tree model is selected from among them. If the target decision tree model differs from the current risk evaluation model, the target decision tree model is determined to be the new risk evaluation model; otherwise, no model replacement is performed.

It should be understood that the target decision tree model may be selected from the plurality of candidate target decision tree models in various ways, for example by random selection, or according to the number of times each candidate decision tree model has previously been selected. The embodiments of the present application are not limited in this regard.
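The tag-based selection can be sketched as below. The sketch picks, among the models whose tags match the object feature data, the one selected most often so far, which is only one of the selection rules mentioned above; ties are broken alphabetically as an added assumption. The function names and tag strings are hypothetical.

```python
from typing import Dict, Optional, Set


def select_target(tags: Set[str], model_tags: Dict[str, Set[str]],
                  times_selected: Dict[str, int]) -> Optional[str]:
    """Pick the target model among those carrying at least one matching tag."""
    candidates = sorted(name for name, owned in model_tags.items()
                        if owned & tags)
    if not candidates:
        return None
    # Most frequently selected candidate wins; max() keeps the first
    # (alphabetically smallest) name on ties.
    return max(candidates, key=lambda name: times_selected.get(name, 0))


def maybe_replace(current: str, tags: Set[str],
                  model_tags: Dict[str, Set[str]],
                  times_selected: Dict[str, int]) -> str:
    """Return the model to deploy; unchanged if the target equals the current one."""
    target = select_target(tags, model_tags, times_selected)
    if target is None or target == current:
        return current
    times_selected[target] = times_selected.get(target, 0) + 1
    return target
```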

In this embodiment of the application, data quality judgment on the object feature data is used to select the decision tree model suited to the condition of the current object feature data. Risk evaluation can therefore be performed by a suitable decision tree model under a variety of conditions, which reduces the probability of risk evaluation errors caused by missing data in the object feature data.

Referring to fig. 6, fig. 6 is a schematic step diagram of an embodiment of a substep of step S102 in fig. 1. In some possible embodiments of the present application, the object feature data includes basic feature data and business feature data, and step S102 includes, but is not limited to, the following substeps.

Step S601, performing a disease decision on the basic feature data according to the risk evaluation model to obtain a personal disease risk evaluation result of the target object.

Step S602, performing a default risk decision according to the personal disease risk evaluation result and the basic feature data to obtain a default risk decision result.

Step S603, performing data missing judgment on the business feature data, and determining the risk evaluation result according to the default risk decision result when the judgment result indicates that the business feature data cannot support the subsequent default risk decision.

It should be understood that the default risk decision result here refers to the result obtained when one decision branch of the decision tree model performs a decision action: the result of the preceding branch determines which branch is entered, and the default risk decision result is then obtained in that branch according to the object feature data.

It should be understood that the risk evaluation result may be determined from the default risk decision result in various ways. For example, the default risk decision result is converted into a risk score, and the risk evaluation result is obtained according to the risk score. As another example, the default risk decision result is directly determined to be the risk evaluation result. The embodiments of the present application are not limited in this regard.

In one embodiment, when the plurality of decision tree models are trained, data on diseases to which individuals are susceptible, such as chronic diseases and rare diseases, is collected from the websites of various disease control centers, and decision judgments on personal disease susceptibility are added to the decision tree models, which improves the risk evaluation effect. In actual evaluation, the risk evaluation model performs a disease decision on the basic feature data to obtain the personal disease risk evaluation result of the target object. This determines which diseases the target object is susceptible to, so that the impact on the business if the target object contracts such a disease is also included in the scope of risk evaluation. After the personal disease risk evaluation result is obtained, a decision is made according to certain personal characteristics to determine whether factors such as the target object's usual place of residence and hobbies affect the business. Because the disease decision has already been made, the decision falls into the specific branch corresponding to the personal disease risk evaluation result; on the basis of that branch, the default risk decision is made according to the basic feature data to obtain the default risk decision result. Once training of a decision tree model is complete, its input data requirement is fixed, that is, the types of data the input must contain. If a certain type of data is missing, the data input to the decision tree model cannot meet this requirement, so the model cannot make a decision in a specific branch and therefore cannot make decisions in subsequent stages or branches, or makes a wrong decision, and the risk evaluation result is wrong.

Business feature data is difficult to obtain, so it is very often missing. Therefore, after the default risk decision is made according to the basic feature data, data missing judgment is performed on the business feature data to determine which business feature data is missing. When all of the business feature data is missing, the part of the decision tree model that makes decisions according to the business feature data cannot be executed. In that case, the subsequent decisions of the decision tree model are stopped, the risk evaluation result is determined according to the default risk decision result obtained from the basic feature data, and the risk evaluation is complete.

In an embodiment, some decision tree models built on particular algorithm frameworks, such as the XGB decision tree model, are able to make decisions on incomplete data. Based on what they learned in training, these models make a predictive decision at a branch whose data is missing, so the model can always obtain a branch decision result there and continue to the next branch, completing the decision of the whole decision tree model. However, such predictive decisions carry a relatively large risk of decision offset, so the final decision result of the model carries a risk of systematic offset.

In an actual risk evaluation scenario, when the business feature data is partially missing and the missing part causes the offset risk of the risk evaluation result to exceed a certain threshold, no default risk decision is made according to the business feature data after the decision based on the basic feature data. Instead, the risk evaluation result is determined according to the default risk decision result obtained from the basic feature data, so as to reduce the offset risk.
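The fallback described above can be sketched as follows. The application does not say how the offset risk is measured, so this sketch uses the fraction of required business fields that are missing as a stand-in; that proxy, the default threshold, and the names `missing_ratio` and `final_decision` are all assumptions.

```python
from typing import Callable, Mapping, Optional, Sequence


def missing_ratio(business: Mapping[str, Optional[float]],
                  required: Sequence[str]) -> float:
    """Fraction of required business fields that are absent or None."""
    if not required:
        return 0.0
    missing = sum(1 for key in required if business.get(key) is None)
    return missing / len(required)


def final_decision(basic_decision: str,
                   business: Mapping[str, Optional[float]],
                   required: Sequence[str],
                   business_stage: Callable[[str, Mapping], str],
                   max_missing_ratio: float = 0.5) -> str:
    """Return the default risk decision on which the risk evaluation result is based."""
    # Too much business data missing (including the all-missing case):
    # stop and keep the decision obtained from the basic feature data.
    if missing_ratio(business, required) > max_missing_ratio:
        return basic_decision
    # Otherwise continue into the business-feature branches of the model.
    return business_stage(basic_decision, business)
```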

In this embodiment of the application, the disease decision and the data missing judgment on the business feature data reduce the degree to which missing business feature data makes the risk evaluation result unreliable, and improve the utilization of the basic feature data.

Referring to fig. 7, fig. 7 is a schematic step diagram of an embodiment of a substep of step S603 in fig. 6. In some possible embodiments of the present application, step S603 includes, but is not limited to, the following substeps.

Step S701, performing probability analysis on the default risk decision result according to both the preset monitoring table and the training data set to obtain a risk probability corresponding to the default risk decision result.

Step S702, performing scoring mapping according to the risk probability to obtain a risk score corresponding to the target object, and determining a risk evaluation result according to the risk score.

Because both the training data set and the historical object feature data in the preset monitoring table that has already undergone risk evaluation are directly related to the service, they can be used to calculate the risk probability corresponding to the default risk decision result, that is, the probability that the default risk decision result actually occurs in the real service. This improves the accuracy of risk evaluation.

In an embodiment, the real risk occurrence results corresponding to the object feature data in the preset monitoring table that has already undergone risk evaluation are obtained. The risk probability that the default risk decision result of the current object feature data corresponds to the real result is then calculated from two sources: the default risk decision results and real results of historical object feature data of the same or a similar type as the current object feature data, and the real results corresponding to object feature sample data of the same or a similar type as the current object feature data.
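One simple way to read this probability analysis is as an empirical frequency: among historical records and training samples that received the same default risk decision result, count how often the risk actually occurred. The sketch below takes that reading. It assumes the records passed in have already been filtered to the same or a similar type as the current object feature data, and the name `risk_probability` is hypothetical.

```python
from typing import Iterable, Optional, Tuple

# (default risk decision result, whether the risk really occurred)
Record = Tuple[str, bool]


def risk_probability(decision: str,
                     monitoring_records: Iterable[Record],
                     training_records: Iterable[Record]) -> Optional[float]:
    """Empirical probability that the risk occurs given the decision result."""
    outcomes = [occurred
                for source in (monitoring_records, training_records)
                for d, occurred in source
                if d == decision]
    if not outcomes:
        return None  # no comparable history: probability cannot be estimated
    return sum(outcomes) / len(outcomes)
```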

It should be understood that the risk evaluation result may be determined from the risk score in various ways, which may be the following embodiments or other embodiments, and the present application is not limited in this regard.

In one embodiment, the risk score is compared to a preset score threshold to determine if the target object is at high risk on the business.

In an embodiment, the service is divided into multiple grades, each grade is provided with a score interval, and the corresponding service grade number is determined according to the score interval where the risk score is located.
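The score mapping of step S702 and the two ways of turning a score into a result can be sketched as below. The linear mapping to a 0 to 100 scale, the threshold of 60, and the grade boundaries are all assumptions for illustration; the application does not fix any of them.

```python
import bisect
from typing import Sequence


def risk_score(probability: float, max_score: float = 100.0) -> float:
    """Map a risk probability in [0, 1] linearly onto a score (assumed mapping)."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return round(probability * max_score, 2)


def is_high_risk(score: float, threshold: float = 60.0) -> bool:
    """Compare the score with a preset score threshold."""
    return score >= threshold


def service_grade(score: float,
                  boundaries: Sequence[float] = (25.0, 50.0, 75.0)) -> int:
    """Return the 1-based grade whose score interval contains the score.

    With the default boundaries: grade 1 is [0, 25), grade 2 is [25, 50),
    grade 3 is [50, 75), and grade 4 is 75 and above.
    """
    return bisect.bisect_right(boundaries, score) + 1
```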

In this embodiment of the application, the risk score is obtained by calculating the risk probability, and the risk evaluation result is determined from the risk score. This reduces cases in which a particular type of data missing from the object feature data causes a large gap between the risk evaluation result and the real result of the target object, and thereby improves the accuracy of risk evaluation.

Referring to fig. 8, fig. 8 is a schematic step diagram of another embodiment of acquiring the risk assessment model in fig. 1. In some possible embodiments of the present application, the method further includes, but is not limited to, the following steps after step S204.

Step S801, in response to an update of the disease statistics, acquiring the updated disease statistics to update a plurality of disease tags, and updating a preset tag pool according to the updated disease tags.

Step S802, retraining the plurality of decision tree models according to the training data set and the updated preset label pool so as to update the plurality of decision tree models.

It should be understood that the disease tags here are used to generate decision branches of the decision tree model. The preset tag pool stores a plurality of disease tags, and a plurality of disease tags are selected from the preset tag pool to generate a plurality of decision branches of the decision tree model.

In one embodiment, when the disease statistics are updated, the updated disease statistics are obtained, and the preset tag pool is updated according to them by deleting individual existing disease tags, modifying individual existing disease tags, and/or adding new disease tags. A plurality of important disease tags are then determined according to how strongly the disease corresponding to each disease tag in the updated preset tag pool affects the service. The corresponding decision branches of each decision tree model are updated according to these disease tags, and each updated decision tree model is retrained according to the training data set, so as to update the decision tree models.
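The tag pool update and the choice of important disease tags can be sketched as follows. The sketch represents the pool as a mapping from disease tag to a numeric impact on the service; that representation, the example values, and the names `update_tag_pool` and `important_tags` are assumptions, and retraining itself is omitted.

```python
from typing import Dict, Iterable, List, Optional


def update_tag_pool(pool: Dict[str, float],
                    removed: Iterable[str] = (),
                    modified: Optional[Dict[str, float]] = None,
                    added: Optional[Dict[str, float]] = None) -> Dict[str, float]:
    """Return a new pool after deleting, modifying and adding disease tags."""
    drop = set(removed)
    updated = {tag: impact for tag, impact in pool.items() if tag not in drop}
    for tag, impact in (modified or {}).items():
        if tag in updated:  # only existing tags can be modified
            updated[tag] = impact
    updated.update(added or {})
    return updated


def important_tags(pool: Dict[str, float], top_k: int) -> List[str]:
    """Tags with the largest impact on the service, highest first."""
    return sorted(pool, key=lambda tag: (-pool[tag], tag))[:top_k]
```

The tags returned by `important_tags` would then drive which decision branches are regenerated before the models are retrained on the training data set.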

In one embodiment, after updating the preset tag pool, a knowledge graph about the business is built according to each disease tag in the updated preset tag pool, and the correlation between each disease tag and the business and the relation between each disease tag and other contents are determined, so that the disease tags useful for business risk assessment are determined, and the step of updating the decision tree model is performed according to the disease tags.

In this embodiment of the application, updating the decision tree models improves the timeliness of the risk evaluation model, so that risk evaluation of the business stays matched to the latest disease data and the risk evaluation model maintains a high evaluation accuracy over time.

Referring to fig. 9 and 10, fig. 9 is a schematic step diagram of an embodiment of a sub-step of step S201 in fig. 2, and fig. 10 is a schematic step diagram of an embodiment of a sub-step of step S501 in fig. 5. In some possible embodiments of the present application, step S201 includes, but is not limited to, the following substeps.

Step S901, training a plurality of decision trees according to the training set to obtain a plurality of initial decision tree models.

Step S902, respectively verifying the plurality of initial decision tree models according to the verification set to obtain a plurality of decision tree models.

Step S903, performing feature missing classification on the verification set to obtain a plurality of classification sets.

Step S904, according to the verification result of each decision tree model for each classification set, corresponding model labels are generated for each decision tree model.

Step S501 includes, but is not limited to, the following substeps.

Step S1001, performing data missing class judgment according to the object feature data to obtain data missing class information.

Step S1002, determining a model tag corresponding to the object feature data according to the data missing class information.

It should be appreciated that the plurality of classification sets correspond one-to-one to a plurality of data missing classes.

In one embodiment, the verification set is used to verify the plurality of initial decision tree models so as to adjust the weight parameters in the initial decision tree models, and the plurality of decision tree models are obtained after verification. The verification set contains a variety of object feature sample data, and some of it lacks certain types of data and therefore does not meet the input requirement. For this reason, feature missing classification is performed on the data in the verification set to obtain a plurality of classification sets. For example, samples in which all business feature data is missing form one class, samples in which only credit data is missing form another class, and so on.
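The feature missing classification can be sketched as a grouping of samples by which business fields are absent. The field names in `BUSINESS_KEYS` and the class names are hypothetical; the same `missing_class` function could also serve the data missing class judgment of step S1001 at evaluation time.

```python
from collections import defaultdict
from typing import Dict, List, Mapping

BUSINESS_KEYS = ("income", "credit")  # hypothetical business feature fields


def missing_class(sample: Mapping[str, object]) -> str:
    """Name the data missing class of one sample."""
    missing = [key for key in BUSINESS_KEYS if sample.get(key) is None]
    if not missing:
        return "complete"
    if len(missing) == len(BUSINESS_KEYS):
        return "all_business_missing"
    return "only_" + "_".join(missing) + "_missing"


def split_by_missing_class(samples: List[Mapping[str, object]]
                           ) -> Dict[str, List[Mapping[str, object]]]:
    """Group verification samples into one classification set per missing class."""
    groups: Dict[str, List[Mapping[str, object]]] = defaultdict(list)
    for sample in samples:
        groups[missing_class(sample)].append(sample)
    return dict(groups)
```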

After classification, each decision tree model is verified on every item of data in the verification set to obtain a verification result for each item. For each decision tree model, the risk evaluation accuracy for each classification set is then calculated from the verification results of all data in that classification set.

When the risk evaluation accuracy of a decision tree model for one or more of the classification sets is higher than a preset accuracy threshold, the corresponding one or more model tags are set for that decision tree model. After the risk evaluation model is put into use, the data missing class of the object feature data is judged to obtain data missing class information, and one or more model tags are determined according to the data missing class information so as to determine the corresponding risk evaluation model.

In an embodiment, model tags are assigned by highest risk evaluation accuracy. Specifically, for each classification set, all decision tree models are ranked by their risk evaluation accuracy on that classification set, and the model tag corresponding to the classification set is assigned to the decision tree model with the highest accuracy.
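Both tagging rules can be sketched from a table of per-class accuracies. The accuracy values are made up for illustration, the function names are hypothetical, and ties in the best-model rule are broken alphabetically as an added assumption.

```python
from typing import Dict, Set

# accuracy_table[model][classification set] = risk evaluation accuracy
AccuracyTable = Dict[str, Dict[str, float]]


def tags_by_threshold(table: AccuracyTable, threshold: float) -> Dict[str, Set[str]]:
    """A model receives the tag of every class on which it exceeds the threshold."""
    return {model: {cls for cls, acc in per_class.items() if acc > threshold}
            for model, per_class in table.items()}


def tags_by_best(table: AccuracyTable) -> Dict[str, Set[str]]:
    """Each class tag goes only to the model with the highest accuracy on it."""
    tags: Dict[str, Set[str]] = {model: set() for model in table}
    classes = sorted({cls for per_class in table.values() for cls in per_class})
    for cls in classes:
        # sorted() makes the alphabetically first model win a tie.
        best = max(sorted(table), key=lambda model: table[model].get(cls, 0.0))
        tags[best].add(cls)
    return tags
```

Under the threshold rule several models may share a tag, which is the situation handled by selecting among multiple candidate target decision tree models; under the best-model rule each tag maps to exactly one model.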

Referring to fig. 11, an embodiment of the present application further provides a risk assessment apparatus, which may implement the risk assessment method, where the apparatus 1100 includes:

the object feature data obtaining module 1101 is configured to obtain object feature data of a target object.

The risk evaluation result obtaining module 1102 is configured to input the object feature data to a pre-trained risk evaluation model, and obtain a risk evaluation result.

It should be understood that the risk assessment model herein is obtained by:

training a plurality of decision trees according to object feature sample data corresponding to a plurality of sample objects and labels corresponding to the sample objects to obtain a plurality of decision tree models;

performing a first runnability analysis on each of the plurality of decision tree models, and determining one of the plurality of decision tree models as the risk evaluation model according to the analysis result;

and, in response to a preset operation process detection condition, performing a second runnability analysis on the risk evaluation model, and when the risk evaluation model does not meet the preset deployment requirement, performing the second runnability analysis on each of the plurality of candidate decision tree models so as to re-determine one of the plurality of candidate decision tree models as the risk evaluation model.

It should be appreciated that the candidate decision tree model is a decision tree model other than the current risk assessment model.

The specific implementation of the risk assessment device is substantially the same as the specific embodiment of the risk assessment method described above, and will not be described herein.

The embodiment of the application also provides electronic equipment, which comprises a memory and a processor, wherein the memory stores a computer program, and the processor realizes the risk assessment method when executing the computer program. The electronic equipment can be any intelligent terminal including a tablet personal computer, a vehicle-mounted computer and the like.

Referring to fig. 12, fig. 12 illustrates a hardware structure of an electronic device according to another embodiment, and an electronic device 1200 includes:

the processor 1201 may be implemented by a general purpose CPU (Central Processing Unit), a microprocessor, an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits, etc., and is used for executing related programs to implement the technical solutions provided by the embodiments of the present application;

memory 1202 may be implemented in the form of read-only memory (Read-Only Memory, ROM), static storage, dynamic storage, or random access memory (Random Access Memory, RAM). Memory 1202 may store an operating system and other application programs. When the technical solutions provided in the embodiments of the present application are implemented through software or firmware, the relevant program codes are stored in memory 1202 and invoked by processor 1201 to execute the risk assessment method of the embodiments of the present application;

An input/output interface 1203 for implementing information input and output;

the communication interface 1204 is configured to implement communication interaction between the device and other devices, and may implement communication in a wired manner (e.g., USB, network cable, etc.), or in a wireless manner (e.g., mobile network, Wi-Fi, Bluetooth, etc.);

a bus 1205 for transferring information between various components of the device such as the processor 1201, memory 1202, input/output interface 1203, and communication interface 1204;

wherein the processor 1201, the memory 1202, the input/output interface 1203 and the communication interface 1204 enable communication connection between each other inside the device via a bus 1205.

The embodiment of the application also provides a computer readable storage medium, wherein the computer readable storage medium stores a computer program, and the computer program realizes the risk assessment method when being executed by a processor.

The memory, as a non-transitory computer readable storage medium, may be used to store non-transitory software programs as well as non-transitory computer executable programs. In addition, the memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory optionally includes memory remotely located relative to the processor, the remote memory being connectable to the processor through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The embodiments described in the present application are intended to describe the technical solutions of the embodiments of the present application more clearly, and do not limit the technical solutions provided by the embodiments of the present application. Those skilled in the art will know that, as technology evolves and new application scenarios appear, the technical solutions provided by the embodiments of the present application are equally applicable to similar technical problems.

It will be appreciated by those skilled in the art that the technical solutions shown in the figures do not constitute limitations of the embodiments of the present application, and may include more or fewer steps than shown, or may combine certain steps, or different steps.

The above described apparatus embodiments are merely illustrative, wherein the units illustrated as separate components may or may not be physically separate, i.e. may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

Those of ordinary skill in the art will appreciate that all or some of the steps of the methods, systems, functional modules/units in the devices disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof.

The terms "first," "second," "third," "fourth," and the like in the description of the present application and in the above-described figures, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that embodiments of the present application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

It should be understood that in this application, "at least one" means one or more, and "a plurality" means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist. For example, "A and/or B" may represent: only A is present, only B is present, or both A and B are present, where A and B may be singular or plural. The character "/" generally indicates that the associated objects before and after it are in an "or" relationship. "At least one of the following" or a similar expression means any combination of the listed items, including any combination of single items or plural items. For example, at least one of a, b or c may represent: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b, c may be single or plural.

In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the above-described division of units is merely a logical function division, and there may be another division manner in actual implementation, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.

The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.

The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application may be embodied, in essence or in whole or in part, in the form of a software product stored in a storage medium, the software product including a plurality of instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods of the various embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing a program.

The preferred embodiments of the present application have been described above with reference to the accompanying drawings; this description does not limit the scope of the claims of the embodiments of the present application. Any modifications, equivalent substitutions, and improvements made by those skilled in the art without departing from the scope and spirit of the embodiments of the present application shall fall within the scope of the claims of the embodiments of the present application.

Claims

1. A risk assessment method, the risk assessment method comprising:

acquiring object feature data of a target object;

wherein, the risk evaluation model is obtained by the following steps:

2. The risk assessment method according to claim 1, wherein the preset operation process detection condition is a detection condition of a preset monitoring table for storing the risk assessment result and the object feature data; the preset deployment requirement is represented by a preset accuracy threshold;

after obtaining the risk evaluation result, the method further comprises:

3. The risk assessment method according to claim 1, wherein the preset operation process detection condition includes data quality judgment; the preset deployment requirement is represented by a model tag;

4. The risk assessment method of claim 2, wherein the object feature data comprises base feature data and business feature data;

5. The risk assessment method of claim 4, wherein the determining the risk assessment result from the default risk decision result comprises:

6. The risk assessment method according to claim 1, wherein after redefining the risk assessment model in response to a preset running process detection condition, the method further comprises:

7. The risk assessment method according to claim 3, wherein the training data set comprises a training set and a validation set;

8. A risk assessment apparatus, characterized in that the risk assessment apparatus comprises:

a risk evaluation result acquisition module, configured to input the object feature data into a pre-trained risk evaluation model to obtain a risk evaluation result; wherein the risk evaluation model is obtained by the following steps:

9. An electronic device comprising a memory storing a computer program and a processor implementing the method of any of claims 1 to 7 when the computer program is executed by the processor.

10. A computer readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the method of any one of claims 1 to 7.