CN117611362A - Protocol scheme pushing method based on pay risk prediction and related equipment - Google Patents

Protocol scheme pushing method based on pay risk prediction and related equipment

Info

Publication number
CN117611362A
Authority
CN
China
Prior art keywords
pay
feature
data set
risk prediction
historical
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410008332.8A
Other languages
Chinese (zh)
Inventor
黄俊强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Property and Casualty Insurance Company of China Ltd
Original Assignee
Ping An Property and Casualty Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Property and Casualty Insurance Company of China Ltd
Priority to CN202410008332.8A
Publication of CN117611362A
Legal status: Pending

Classifications

    • G PHYSICS
        • G06 COMPUTING; CALCULATING OR COUNTING
            • G06F ELECTRIC DIGITAL DATA PROCESSING
                • G06F 18/00 Pattern recognition
                    • G06F 18/20 Analysing
                        • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
                            • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
                        • G06F 18/24 Classification techniques
                        • G06F 18/25 Fusion techniques
            • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
                • G06N 3/00 Computing arrangements based on biological models
                    • G06N 3/02 Neural networks
                        • G06N 3/04 Architecture, e.g. interconnection topology
                            • G06N 3/0464 Convolutional networks [CNN, ConvNet]
                        • G06N 3/08 Learning methods
                            • G06N 3/084 Backpropagation, e.g. using gradient descent
                • G06N 5/00 Computing arrangements using knowledge-based models
                    • G06N 5/01 Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
            • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
                • G06Q 10/00 Administration; Management
                    • G06Q 10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
                    • G06Q 10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
                        • G06Q 10/063 Operations research, analysis or management
                            • G06Q 10/0635 Risk analysis of enterprise or organisation activities
                • G06Q 40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
                    • G06Q 40/08 Insurance

Abstract

The application belongs to the field of financial technology and relates to a protocol scheme pushing method based on pay risk prediction and related equipment. The method comprises: cleaning and preprocessing a historical claim case original data set to obtain a historical claim case data set; performing feature engineering construction on the historical claim case data set to obtain a pay feature data set; adding data tags to the pay feature data set to obtain a pay sample data set, and dividing the pay sample data set into a training set and a test set; training a pre-constructed neural network model with the training set and verifying it with the test set to obtain a pay risk prediction model; inputting real-time to-be-predicted pay data into the pay risk prediction model to obtain a risk prediction result; and adjusting the procedure proportion based on the risk prediction result. In addition, the present application relates to blockchain technology, in which the historical claim case data set may be stored. The method and the device improve the accuracy and stability of the model's risk prediction and realize dynamic adjustment of the procedure proportion.

Description

Protocol scheme pushing method based on pay risk prediction and related equipment
Technical Field
The application relates to the technical fields of artificial intelligence and financial technology, and in particular to a protocol scheme pushing method based on pay risk prediction and related equipment.
Background
Reinsurance refers to the practice of an insurer transferring part of its risk to other insurers or reinsurance companies. In the reinsurance business, a reservation distribution agreement is an agreement between the reinsurance company and the ceding insurer that specifies the proportion of risk accepted by the reinsurance company within a certain period of time. At present, a reinsurer usually determines the procedure proportion in the reservation distribution agreement according to historical claim data and a risk assessment model. This approach has defects: for example, it cannot respond to market changes in time and cannot make full use of the historical claim data, so the risk prediction is inaccurate and the procedure proportion of the agreement is difficult to determine.
Disclosure of Invention
An objective of the embodiments of the present application is to provide a protocol scheme pushing method based on pay risk prediction and related equipment, so as to solve the technical problems in the prior art that pay risk prediction cannot cope with market changes in time and historical claim data cannot be fully utilized, resulting in inaccurate model risk prediction and difficulty in determining the procedure proportion of the agreement.
In order to solve the above technical problems, the embodiments of the present application provide a protocol scheme pushing method based on pay risk prediction, which adopts the following technical scheme:
Acquiring a historical claim case original data set, and cleaning and preprocessing the historical claim case original data set to obtain a historical claim case data set;
carrying out characteristic engineering construction on the historical claim case data set to obtain a claim payment characteristic data set;
according to the classification result, based on a preset data labeling tool, adding a data tag to each piece of the pay data in the pay characteristic data set to obtain a pay sample data set, and dividing the pay sample data set into a training set and a test set;
inputting the training set into a pre-constructed neural network model for training to obtain a trained neural network model;
inputting the test set into the trained neural network model for verification, and outputting a pay risk prediction model;
obtaining the data to be predicted for the claims in real time, and inputting the data to be predicted for the claims into the claim risk prediction model to obtain a risk prediction result;
and adjusting the procedure proportion based on the risk prediction result, and outputting a corresponding protocol scheme according to the adjusted procedure proportion.
Further, the step of performing feature engineering construction on the historical claim case data set to obtain a claim payment feature data set includes:
Extracting features of the historical claim case data set to obtain feature variables of multiple dimensions;
dividing the historical claim case data set into a plurality of sub-training sets, and constructing classification features and condition features corresponding to each sub-training set according to all the feature variables;
calculating a Gini index of each classification feature in each sub-training set;
adding the Gini indexes of the classification features in each sub-training set to obtain the feature importance of each feature variable;
screening out feature variables with the feature importance degree smaller than or equal to a preset threshold value as the pay features;
and constructing the pay characteristic data set by the pay characteristic and the corresponding characteristic data.
Further, the neural network model comprises an input layer, a hidden layer, an attention layer, a feature fusion layer and an output layer; the step of inputting the training set into a pre-constructed neural network model for training to obtain a trained neural network model comprises the following steps:
inputting the training set into the input layer for feature extraction to obtain a pay feature vector;
feature learning is carried out on the pay feature vectors through the hidden layer, so that a plurality of pay implicit vectors and the implicit weight of each pay implicit vector are obtained;
Adopting the attention layer to extract attention characteristics of the pay implicit vector and the implicit weight to obtain an attention characteristic vector;
carrying out feature fusion on the attention feature vector and the pay feature vector through the feature fusion layer to obtain a pay enhancement feature vector;
calculating the pay reinforcing feature vector through the output layer to obtain a prediction classification result;
calculating a loss value of the prediction classification result according to a preset loss function;
and adjusting network parameters of the neural network model based on the loss value, and continuing to train iteratively until the model converges, and outputting the trained neural network model.
Further, the step of inputting the training set into the input layer to perform feature extraction to obtain a pay feature vector includes:
inputting the training set into the input layer, and calling an encoding module of the input layer to encode each piece of pay data in the training set to obtain a plurality of encoding feature vectors;
and calling a convolution module of the input layer to carry out convolution feature extraction on the plurality of coding feature vectors to obtain the pay feature vector.
Further, the step of performing feature learning on the pay feature vectors through the hidden layer to obtain a plurality of pay implicit vectors and the implicit weight of each pay implicit vector includes:
Extracting features of the pay feature vectors of each piece of pay feature data through a forward layer and a backward layer of the hidden layer to respectively obtain a forward hidden state feature and a backward hidden state feature;
splicing the forward hidden state features and the backward hidden state features according to positions to obtain hidden state features of each piece of the pay feature data;
and calculating a plurality of pay implicit vectors and implicit weights of each pay implicit vector according to the hidden layer state characteristics.
Further, the step of adjusting the network parameters of the neural network model based on the loss value, continuing iterative training until the model converges, and outputting the trained neural network model includes:
according to the loss value and the back propagation algorithm, iteratively updating network parameters of the neural network model until the neural network model converges;
and determining the network parameters converged by the current neural network model as target parameters, and obtaining a trained neural network model according to the target parameters.
Further, the step of adjusting the procedure proportion based on the risk prediction result includes:
calculating an estimated loss ratio based on the risk prediction result;
and calculating the risk agreement cost according to the estimated loss ratio, and adjusting the procedure proportion based on the risk agreement cost.
In order to solve the above technical problems, the embodiments of the present application further provide a protocol scheme pushing device based on pay risk prediction, which adopts the following technical scheme:
the acquisition module is used for acquiring a historical claim case original data set, and cleaning and preprocessing the historical claim case original data set to obtain a historical claim case data set;
the characteristic engineering module is used for carrying out characteristic engineering construction on the historical claim case data set to obtain a claim payment characteristic data set;
the labeling module is used for adding a data tag to each piece of the pay data in the pay characteristic data set based on a preset data labeling tool according to the classification result to obtain a pay sample data set, and dividing the pay sample data set into a training set and a test set;
the training module is used for inputting the training set into a pre-constructed neural network model for training to obtain a trained neural network model;
the test module is used for inputting the test set into the trained neural network model for verification and outputting a pay risk prediction model;
The prediction module is used for acquiring the to-be-predicted pay data in real time, inputting the to-be-predicted pay data into the pay risk prediction model, and obtaining a risk prediction result;
and the adjusting module is used for adjusting the procedure proportion based on the risk prediction result and outputting a corresponding protocol scheme according to the adjusted procedure proportion.
In order to solve the above technical problems, the embodiments of the present application further provide a computer device, which adopts the following technical schemes:
the computer device includes a memory and a processor, the memory having stored therein computer readable instructions which, when executed by the processor, implement the steps of the protocol scheme pushing method based on pay risk prediction described above.
In order to solve the above technical problems, embodiments of the present application further provide a computer readable storage medium, which adopts the following technical solutions:
the computer readable storage medium has stored thereon computer readable instructions which, when executed by a processor, implement the steps of the protocol scheme pushing method based on pay risk prediction as described above.
Compared with the prior art, the embodiment of the application has the following main beneficial effects:
According to the method, the obtained historical claim case data set is subjected to feature engineering construction to obtain the claim feature data set, the claim feature data set is marked to obtain the claim sample data set, the claim sample data set is used for training and verifying the neural network model to obtain the claim risk prediction model, the historical claim case data can be fully utilized for model training, and the accuracy and stability of the model on risk prediction are improved; predicting real-time to-be-predicted pay data through a pay risk prediction model so as to fully consider market changes and further improve risk prediction accuracy; and the procedure proportion is adjusted according to the risk prediction result, so that the rationality and the reliability of the output protocol scheme are improved, and the risk control capability and the benefit level of the reinsurance reservation separation protocol are further improved.
Drawings
For a clearer description of the solution in the present application, a brief description will be given below of the drawings that are needed in the description of the embodiments of the present application, it being obvious that the drawings in the following description are some embodiments of the present application, and that other drawings may be obtained from these drawings without inventive effort for a person of ordinary skill in the art.
FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;
FIG. 2 is a flow chart of one embodiment of a method of agreement proposal pushing based on claim risk prediction according to the present application;
FIG. 3 is a flow chart of one embodiment of step S202 of FIG. 2;
FIG. 4 is a flow chart of one embodiment of step S204 of FIG. 2;
FIG. 5 is a schematic structural view of one embodiment of a claim risk prediction based protocol scenario pushing device according to the present application;
FIG. 6 is a schematic structural diagram of one embodiment of a computer device according to the present application.
Detailed Description
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used in the description of the applications herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "comprising" and "having" and any variations thereof in the description and claims of the present application and in the description of the figures above are intended to cover non-exclusive inclusions. The terms first, second and the like in the description and in the claims or in the above-described figures, are used for distinguishing between different objects and not necessarily for describing a sequential or chronological order.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
In order to better understand the technical solutions of the present application, the following description will clearly and completely describe the technical solutions in the embodiments of the present application with reference to the accompanying drawings.
The embodiment of the application can acquire and process the related data based on artificial intelligence technology. Artificial intelligence (AI) is the theory, method, technique and application system that uses a digital computer or a digital-computer-controlled machine to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain optimal results.
Artificial intelligence infrastructure technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and other directions.
The present application provides a method for pushing a protocol scheme based on claim risk prediction, which can be applied to a system architecture 100 shown in fig. 1, where the system architecture 100 may include terminal devices 101, 102, 103, a network 104 and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like. Various communication client applications, such as a web browser application, a shopping class application, a search class application, an instant messaging tool, a mailbox client, social platform software, etc., may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, desktop computers, and the like.
The server 105 may be a server providing various services, such as a background server providing support for pages displayed on the terminal devices 101, 102, 103.
It should be noted that, the protocol scheme pushing method based on the pay risk prediction provided in the embodiments of the present application is generally executed by a server/terminal device, and accordingly, the protocol scheme pushing device based on the pay risk prediction is generally disposed in the server/terminal device.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to fig. 2, a flowchart of one embodiment of a pay risk prediction based protocol scenario pushing method according to the present application is shown, comprising the steps of:
step S201, acquiring a historical claim case original data set, and cleaning and preprocessing the historical claim case original data set to obtain the historical claim case data set.
In this embodiment, the original data set of the history claim case may be transmitted through a wired connection manner or a wireless connection manner. It should be noted that the wireless connection may include, but is not limited to, 3G/4G connections, wiFi connections, bluetooth connections, wiMAX connections, zigbee connections, UWB (ultra wideband) connections, and other now known or later developed wireless connection means.
The historical claim case original data set can be collected from a plurality of data sources; for example, a web crawler can be compiled and, after the data sources are configured, targeted crawling can be performed to obtain a large amount of historical claim case original data. Other modes of obtaining the historical claim case original data can also be adopted, and this is not limited here. The obtained historical claim case original data comprises basic information, scheme information and historical pay information, wherein the basic information comprises policy application time, applicant identity information, applicant age, applicant contact information and the like; the scheme information comprises the contract form of the scheme type, business class, group, insurance type, insured amount, premium, discount rate, commission rate, procedure cost and the like; the historical pay information comprises the main risk, effective date, expiration date, accident location, accident time, policy pay status and the like.
In this embodiment, data cleaning is performed on each piece of original data in the historical claim case original data set. Data cleaning is a process of re-examining and verifying the data in order to delete duplicate information and correct existing errors.
After cleaning, preprocessing the cleaned original data set of the history claim case, specifically as follows:
1) Missing value filling: different filling methods are adopted for different types of fields. For example, qualitative fields such as the applicant identity information, the insured identity information, the insured age and the contract form of the scheme type are filled with a preset number, for example -1; for quantitative fields such as the insured amount, premium, discount rate, commission rate and procedure cost, mode filling is used;
2) Encoding: each piece of historical claim case original data in the cleaned data set is encoded, where one-hot encoding or mean encoding may be adopted.
In this embodiment, after the original data of the historical claim case is cleaned and preprocessed, the data of the historical claim case is obtained, so that the accuracy and the integrity of the data can be ensured.
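For illustration, a minimal Python sketch of this cleaning and preprocessing step is given below; the field names, the -1 sentinel for qualitative fields and the use of pandas are assumptions based on the description above, not part of the claimed method.

```python
import pandas as pd

def preprocess_claims(raw: pd.DataFrame) -> pd.DataFrame:
    """Clean and preprocess the raw historical claim cases (illustrative sketch)."""
    df = raw.drop_duplicates().copy()  # data cleaning: remove duplicate records

    # Qualitative fields: fill missing values with a preset sentinel such as -1 (assumed column names)
    qualitative = ["applicant_id", "insured_id", "insured_age", "contract_form"]
    for col in qualitative:
        if col in df.columns:
            df[col] = df[col].fillna(-1)

    # Quantitative fields: fill missing values with the column mode (assumed column names)
    quantitative = ["insured_amount", "premium", "discount_rate", "commission_rate", "procedure_fee"]
    for col in quantitative:
        if col in df.columns and not df[col].mode().empty:
            df[col] = df[col].fillna(df[col].mode().iloc[0])

    # One-hot encode categorical fields (mean/target encoding is an alternative mentioned above)
    df = pd.get_dummies(df, columns=[c for c in ["contract_form"] if c in df.columns])
    return df
```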
It is emphasized that the historical claim data set may also be stored in a node of a blockchain in order to further ensure the privacy and security of the historical claim data set.
The blockchain referred to in the application is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanism, encryption algorithm and the like. The Blockchain (Blockchain), which is essentially a decentralised database, is a string of data blocks that are generated by cryptographic means in association, each data block containing a batch of information of network transactions for verifying the validity of the information (anti-counterfeiting) and generating the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.
Step S202, performing feature engineering construction on the historical claim case data set to obtain a pay feature data set.
In this embodiment, feature variables that can describe the case of the claim are abstracted by feature extraction and processing of the historical claim dataset.
In some optional implementations, feature screening is performed on the historical claim case data set by using a CART algorithm, and the step of performing feature engineering construction on the historical claim case data set to obtain the claim feature data set includes:
step S301, extracting features of a historical claim case data set to obtain feature variables of multiple dimensions;
step S302, dividing a historical claim case data set into a plurality of sub-training sets, and constructing classification features and condition features corresponding to each sub-training set according to all feature variables;
step S303, calculating the base index of each classification feature in each sub-training set;
step S304, adding the base indexes of the classification features in each sub-training set to obtain the feature importance of each feature variable;
step S305, screening out feature variables with feature importance less than or equal to a preset threshold value as the pay features;
and step S306, the pay characteristic and the corresponding characteristic data form a pay characteristic data set.
Feature extraction is performed according to the feature attribute factors of each piece of historical claim case data in the historical claim case data set. The feature attribute factors are feature variables characterizing the claim case and include, but are not limited to, policy application time, applicant identity information, applicant age, applicant contact information, contract form, business class, group, insurance type, insured amount, premium, discount rate, commission rate, procedure cost, main risk, effective date, expiration date, accident location, accident time, policy pay status and the like.
Data in a preset proportion are randomly sampled with replacement from the historical claim case data set to serve as a sub-training set, and this is repeated several times to obtain a plurality of sub-training sets. For example, the preset proportion may be 70%.
Classification features and condition features of each sub-training set are constructed according to all the feature variables, and a decision tree is built based on the classification features and the condition features, wherein the condition features are the prediction results of the decision tree. It should be appreciated that the classification features and condition features of the different sub-training sets are not identical.
In this embodiment, the Gini index of each classification feature is calculated as follows:

$\mathrm{Gini}(P) = 1 - \sum_{i=1}^{k} p_i^2$

wherein $P$ is a sub-training set, $\mathrm{Gini}(P)$ is the Gini index of a classification feature in the sub-training set, $k$ is the number of prediction categories of the classification feature, and $p_i$ is the sample proportion of the $i$-th prediction category under the same classification feature.
The larger the Gini index, the greater the uncertainty of the samples; the smaller the Gini index, the stronger the correlation of the feature and the purer the classification. Therefore, the classification feature with the smallest Gini index in each sub-training set is successively selected as the optimal feature to construct the decision tree corresponding to that sub-training set.
After the decision trees are constructed, the Gini index of each classification feature in each sub-training set is calculated accordingly, and the Gini indexes of each classification feature across the sub-training sets are added to obtain the Gini index sum of that classification feature; since each classification feature is a feature variable, the Gini index sum is taken as the feature importance of the corresponding feature variable.
And comparing the feature importance of each feature variable with a preset threshold value, and screening feature variables with feature importance smaller than or equal to the preset threshold value as the pay features, wherein the preset threshold value can be set according to actual conditions.
Feature data corresponding to the pay features are obtained from the historical claim case data set, and the pay feature data set is constructed from the pay features and the corresponding feature data.
By constructing the pay feature data set through feature engineering, the historical claim case data can be fully utilized to obtain pay features that are strongly correlated with pay risk evaluation; relying on this pay feature data set for model training makes the method more accurate, objective and reliable.
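A minimal sketch of the Gini-based feature screening described above is given below; the column handling, the 70% bootstrap ratio, the number of sub-training sets and the importance threshold are illustrative assumptions.

```python
import numpy as np
import pandas as pd

def gini_index(labels: pd.Series) -> float:
    """Gini(P) = 1 - sum_i p_i^2 over the prediction-category proportions p_i."""
    p = labels.value_counts(normalize=True).to_numpy()
    return 1.0 - float(np.sum(p ** 2))

def screen_pay_features(data: pd.DataFrame, label_col: str,
                        n_subsets: int = 10, ratio: float = 0.7,
                        threshold: float = 3.0) -> list:
    """Sum each feature's Gini index over bootstrap sub-training sets and keep the
    features whose summed importance is at most the preset threshold (sketch)."""
    features = [c for c in data.columns if c != label_col]
    importance = {f: 0.0 for f in features}
    for seed in range(n_subsets):
        # Sample a sub-training set with replacement (70% of the data by default)
        sub = data.sample(frac=ratio, replace=True, random_state=seed)
        for f in features:
            # Weighted Gini index of the label after splitting on feature f
            # (assumes categorical or pre-binned feature values)
            gini = sum(len(g) / len(sub) * gini_index(g[label_col])
                       for _, g in sub.groupby(f))
            importance[f] += gini
    return [f for f, v in importance.items() if v <= threshold]
```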
Step S203, according to the classification result, based on a preset data labeling tool, adding a data tag to each piece of the pay data in the pay characteristic data set to obtain a pay sample data set, and dividing the pay sample data set into a training set and a test set.
In this embodiment, the claim feature data set is classified according to a preset claim risk condition to obtain a classification result, and a corresponding classification result is added to each piece of claim data in the claim feature data set as a data tag according to the classification result and a preset data labeling tool.
The data labeling tool is used for labeling the data, and can be a tool such as doccano.
Dividing the pay sample dataset into a training set and a test set, wherein the training set: test set = 8:2.
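For example, assuming a binary high/low risk condition derived from a hypothetical claim-ratio field, the labeled pay sample data set could be split 8:2 as follows (the file name, threshold and use of scikit-learn are assumptions):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# claim_features: the pay feature data set produced by feature engineering (assumed schema)
claim_features = pd.read_csv("claim_features.csv")  # hypothetical file
# Data label from a preset pay-risk condition, e.g. historical claim ratio > 0.8 (assumption)
claim_features["label"] = (claim_features["claim_ratio"] > 0.8).astype(int)

train_set, test_set = train_test_split(
    claim_features, test_size=0.2, random_state=42, stratify=claim_features["label"]
)  # training set : test set = 8 : 2
```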
Step S204, inputting the training set into a pre-constructed neural network model for training, and obtaining a trained neural network model.
The neural network model comprises an input layer, a hidden layer, an attention layer, a feature fusion layer and an output layer, which are connected in sequence.
The training process of the neural network model is specifically as follows:
Step S401, inputting the training set into the input layer for feature extraction to obtain the pay feature vector.
The input layer comprises a coding module and a convolution module, a training set is input into the input layer, and the coding module is called to code each piece of pay data in the training set to obtain a plurality of coding feature vectors; and calling a convolution module to carry out convolution feature extraction on the plurality of coding feature vectors to obtain the pay feature vector.
The encoding module performs an embedding operation and a positional encoding operation: the pay data is first embedded to obtain embedded features, and positional encoding is then applied to the embedded features to obtain a plurality of encoding feature vectors containing position information.
The positional encoding is computed as:

$PE(pos, i) = \sin\left(pos / 10000^{i/d_{model}}\right)$   (1)

$PE(pos, i) = \cos\left(pos / 10000^{(i-1)/d_{model}}\right)$   (2)

wherein $pos$ denotes the position of the current character in the pay data, $d_{model}$ denotes the dimension of the model, and $i$ denotes the position in the current character embedding matrix; formula (1) is used when $i$ is even and formula (2) when $i$ is odd.
The convolution module convolves the encoding feature vectors with a preset convolution kernel to obtain a dense low-dimensional vector, namely the pay feature vector.
The accuracy of the extracted features can be improved by performing feature extraction on the pay data through the encoding and convolution operations of the input layer.
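A hedged PyTorch sketch of such an input layer is shown below; the vocabulary size, embedding dimension and convolution kernel width are illustrative assumptions, not values specified by the application.

```python
import torch
import torch.nn as nn

class ClaimInputLayer(nn.Module):
    """Embedding + sinusoidal positional encoding + 1-D convolution (sketch)."""
    def __init__(self, vocab_size: int = 10000, d_model: int = 128, kernel: int = 3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.conv = nn.Conv1d(d_model, d_model, kernel, padding=kernel // 2)
        self.d_model = d_model

    def positional_encoding(self, seq_len: int) -> torch.Tensor:
        pos = torch.arange(seq_len).unsqueeze(1).float()
        i = torch.arange(0, self.d_model, 2).float()
        angle = pos / torch.pow(10000.0, i / self.d_model)
        pe = torch.zeros(seq_len, self.d_model)
        pe[:, 0::2] = torch.sin(angle)   # even dimensions use sin, formula (1)
        pe[:, 1::2] = torch.cos(angle)   # odd dimensions use cos, formula (2)
        return pe

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: integer-encoded pay data of shape (batch, seq)
        x = self.embed(tokens) + self.positional_encoding(tokens.size(1))
        # Conv1d expects (batch, channels, seq); the output is the pay feature vector
        return self.conv(x.transpose(1, 2)).transpose(1, 2)
```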
Step S402, performing feature learning on the pay feature vectors through the hidden layer to obtain a plurality of pay implicit vectors and the implicit weight of each pay implicit vector.
The hidden layer comprises a forward layer and a backward layer. Feature extraction is carried out on the pay feature vector of each piece of pay feature data through the forward layer and the backward layer to obtain a forward hidden state feature and a backward hidden state feature respectively; the forward and backward hidden state features are spliced position-wise to obtain the hidden layer state feature of each piece of pay feature data; and a plurality of pay implicit vectors and the implicit weight of each pay implicit vector are calculated according to the hidden layer state features.
Assume the pay feature vector is $X = (x_1, x_2, \ldots, x_{t-1}, x_t, x_{t+1}, \ldots, x_n)$, wherein $x_t$ denotes the multidimensional pay feature vector at moment $t$ and the input dimension at moment $t$ is the number of input features; the forward hidden state feature output at moment $t$ is denoted $\vec{h}_t$.
The forward layer comprises a forgetting gate, an input gate, an output gate and a memory unit.
Based on the forward hidden state feature of the previous moment $\vec{h}_{t-1}$, the current input pay feature vector $x_t$, the forget gate weight matrix $W_f$ and the forget gate bias vector $b_f$, the value of the forget gate $f_t$ is calculated; $f_t$ determines how much of the previously learned information $C_{t-1}$ is retained. The calculation formula is:

$f_t = \sigma\left(W_f \cdot [\vec{h}_{t-1}, x_t] + b_f\right)$

wherein $f_t \in [0,1]$ represents the selection weight of the node at time $t$ for the memory cell at time $t-1$; $\sigma$ is the nonlinear function $\sigma(x) = 1/(1+e^{-x})$, where $x$ denotes the input of the activation function.

The input gate determines what new information is added to the memory cell: based on the forward hidden state feature of the previous moment $\vec{h}_{t-1}$, the current input pay feature vector $x_t$, the input gate weight matrix $W_i$ and the input gate bias vector $b_i$, the value of the input gate $i_t$ is calculated as:

$i_t = \sigma\left(W_i \cdot [\vec{h}_{t-1}, x_t] + b_i\right)$

wherein $i_t \in [0,1]$ represents the selection weight of the node at time $t$ for the current node information, i.e. the weight coefficient of the updated information.

Using the activation function $\tanh$, the hidden state feature of the previous moment $\vec{h}_{t-1}$ and the current input pay feature vector $x_t$ generate the temporary state of the memory cell $\tilde{C}_t$:

$\tilde{C}_t = \tanh\left(W_C \cdot [\vec{h}_{t-1}, x_t] + b_C\right)$

wherein $W_C$ is the memory cell weight matrix and $b_C$ is the memory cell bias vector.

According to the value of the forget gate $f_t$ and the value of the input gate $i_t$, the old memory cell state is updated to add the new information; the memory cell state at the current moment is:

$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$

The output of the memory cell is likewise determined from $\vec{h}_{t-1}$ and $x_t$: the judging condition, i.e. the value of the output gate, is calculated first as:

$o_t = \sigma\left(W_o \cdot [\vec{h}_{t-1}, x_t] + b_o\right)$

wherein $o_t$ is the selection weight of the memory information in the memory cell at time $t$, $W_o$ is the output gate weight matrix and $b_o$ is the output gate bias vector.

Finally, the forward hidden state feature at the current moment is calculated from the current memory cell state and the value of the output gate:

$\vec{h}_t = o_t \odot \tanh(C_t)$

In the above formulas, $[\vec{h}_{t-1}, x_t]$ denotes the vector obtained by concatenating $x_t$ and $\vec{h}_{t-1}$.
In this embodiment, the forward hidden state features at all moments are spliced to obtain the forward hidden state feature sequence $\vec{H} = (\vec{h}_1, \vec{h}_2, \ldots, \vec{h}_n)$.
In this embodiment, the backward layer also includes a forgetting gate, an input gate, an output gate and a memory unit, and the calculation process is the same as the calculation process of the forward hidden state feature, which is not described herein.
The pay feature vector $X = (x_1, x_2, \ldots, x_{t-1}, x_t, x_{t+1}, \ldots, x_n)$ is likewise input into the backward layer, through which feature extraction is performed to obtain the backward hidden state features $\overleftarrow{h}_t$.
The forward hidden state features and the backward hidden state features are spliced position-wise to obtain the hidden layer state feature of each piece of pay feature data, $h_t = [\vec{h}_t, \overleftarrow{h}_t]$, i.e. $H = (h_1, h_2, \ldots, h_{t-1}, h_t, h_{t+1}, \ldots, h_m)$.
A plurality of pay implicit vectors and the implicit weight of each pay implicit vector are then calculated from the hidden layer state features $H = (h_1, h_2, \ldots, h_{t-1}, h_t, h_{t+1}, \ldots, h_m)$.
By forward and backward feature learning of the pay feature vector, pay related features can be more fully learned, the receptive field of the vector is improved, and therefore the calculation processing speed of the model is improved.
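The forward/backward hidden-state computation described above corresponds to a bidirectional LSTM; a compact PyTorch sketch is given below, where the hidden size and the linear scoring of the implicit weights are assumptions.

```python
import torch
import torch.nn as nn

class ClaimHiddenLayer(nn.Module):
    """Bidirectional LSTM: forward and backward hidden states are concatenated
    position-wise, then projected to per-step implicit weights (sketch)."""
    def __init__(self, d_model: int = 128, hidden: int = 64):
        super().__init__()
        self.bilstm = nn.LSTM(d_model, hidden, batch_first=True, bidirectional=True)
        self.score = nn.Linear(2 * hidden, 1)  # implicit weight per pay implicit vector

    def forward(self, x: torch.Tensor):
        h, _ = self.bilstm(x)                          # (batch, seq, 2*hidden): h_t = [fwd ; bwd]
        weights = torch.softmax(self.score(h), dim=1)  # implicit weight of each hidden vector
        return h, weights
```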
In step S403, the attention layer is used to extract attention features of the implicit vector and the implicit weight of the pay, so as to obtain an attention feature vector.
An attention mechanism is adopted to calculate attention from the pay implicit vectors and their implicit weights; the calculation of attention features helps the model select features favorable for prediction.
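One possible form of this attention feature extraction is a weighted sum of the pay implicit vectors by their implicit weights; this is an assumption, since the application does not fix the exact attention formulation.

```python
import torch

def attention_feature(hidden: torch.Tensor, weights: torch.Tensor) -> torch.Tensor:
    """Weighted sum of hidden vectors by their implicit weights -> attention feature vector.
    hidden: (batch, seq, dim), weights: (batch, seq, 1)."""
    return torch.sum(weights * hidden, dim=1)  # (batch, dim)
```

Used together with the hidden-layer sketch above, attention_feature(h, weights) would yield the attention feature vector that is subsequently fused with the pay feature vector.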
Step S404, carrying out feature fusion on the attention feature vector and the pay feature vector through the feature fusion layer to obtain a pay enhancement feature vector.
Feature fusion of the attention feature vector and the pay feature vector yields the pay enhancement feature vector, the optimal feature representation of the pay condition.
Step S405, calculating the pay enhancement feature vector through the output layer to obtain a prediction classification result.
The output layer may adopt a softmax layer, which computes on the pay enhancement feature vector and outputs the prediction classification result.
Step S406, calculating the loss value of the prediction classification result according to the preset loss function.
The loss value of the prediction classification result is calculated with the preset loss function over the training samples, wherein $n$ is the number of samples in the training set, $\hat{y}_i$ denotes the predicted classification result of the $i$-th sample, and $y_i$ denotes its true classification result, i.e. the data label.
Step S407, adjusting the network parameters of the neural network model based on the loss value, continuing iterative training until the model converges, and outputting the trained neural network model.
According to the loss value and the back propagation algorithm, iteratively updating the network parameters of the neural network model until the neural network model converges; and determining the network parameters converged by the current neural network model as target parameters, and obtaining the trained neural network model according to the target parameters.
The Back Propagation (BP) algorithm refers to a process of adjusting the weights and biases of network parameters according to the prediction error.
It should be appreciated that the smaller the loss value, the closer the predicted outcome is to the actual outcome. And the server iteratively adjusts the network parameters of the model according to the magnitude of the loss value, then trains again and calculates the corresponding loss value until the model converges, and determines the network parameters with the optimal effect, thereby obtaining the trained neural network model and improving the accuracy and stability of the model.
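A hedged sketch of the iterative training with backpropagation is given below; the Adam optimizer, learning rate, epoch count and cross-entropy loss are assumptions, since the application only refers to a preset loss function.

```python
import torch
import torch.nn as nn

def train_model(model: nn.Module, loader, epochs: int = 20, lr: float = 1e-3) -> nn.Module:
    """Iteratively update network parameters with backpropagation until convergence (sketch)."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()          # assumed form of the preset loss function
    for epoch in range(epochs):
        for features, labels in loader:        # loader yields (pay features, data labels)
            optimizer.zero_grad()
            loss = criterion(model(features), labels)
            loss.backward()                    # back-propagate the loss value
            optimizer.step()                   # adjust the network parameters
    return model                               # parameters at convergence = target parameters
```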
Step S205, inputting the test set into the trained neural network model for verification, and outputting the pay risk prediction model.
In this embodiment, the mean absolute error (Mean Absolute Error, MAE) and root mean square error (Root Mean Squared Error, RMSE) evaluation functions can be used to verify the performance of the model and to perform error analysis on the prediction results.
Specifically, inputting the test set into the trained neural network model, and outputting a test predicted value; respectively calculating root mean square error and average absolute error according to the test predicted value and the truth value label; and when the root mean square error is smaller than or equal to a first preset threshold value and the average absolute error is smaller than or equal to a second preset threshold value, outputting the trained neural network model as a pay risk prediction model.
The root mean square error is calculated as:

$RMSE = \sqrt{\dfrac{1}{m}\sum_{i=1}^{m}\left(y_i - \hat{y}_i\right)^2}$

The mean absolute error is calculated as:

$MAE = \dfrac{1}{m}\sum_{i=1}^{m}\left|y_i - \hat{y}_i\right|$

wherein $m$ is the number of sample data in the test set, $y_i$ denotes the truth data label of the $i$-th sample, and $\hat{y}_i$ denotes the test predicted value of the $i$-th sample.
And retraining the pre-constructed neural network model when the root mean square error is greater than a first preset threshold or the average absolute error is greater than a second preset threshold.
The first preset threshold value and the second preset threshold value can be the same value or different values, and can be set according to actual needs. The model is verified through the RMSE and the MAE, so that the model fitting effect can be better evaluated, and the model prediction accuracy is improved.
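A minimal sketch of this RMSE/MAE validation step follows; the threshold values are illustrative assumptions.

```python
import numpy as np

def validate(y_true: np.ndarray, y_pred: np.ndarray,
             rmse_threshold: float = 0.1, mae_threshold: float = 0.05) -> bool:
    """Accept the trained model only if both RMSE and MAE are within the preset thresholds."""
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    mae = np.mean(np.abs(y_true - y_pred))
    return rmse <= rmse_threshold and mae <= mae_threshold  # otherwise retrain the model
```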
Step S206, obtaining the to-be-predicted pay data in real time, and inputting the to-be-predicted pay data into the pay risk prediction model to obtain a risk prediction result.
Obtaining the to-be-predicted pay data in real time and analyzing it with the pay risk prediction model helps the reinsurer accurately predict future pay risks, so that timely and accurate decisions can be made.
Step S207, adjusting the procedure proportion based on the risk prediction result, and outputting the corresponding protocol scheme according to the adjusted procedure proportion.
Specifically, an estimated loss ratio is calculated based on the risk prediction result; the risk agreement cost is then calculated from the estimated loss ratio, and the procedure proportion is adjusted based on the risk agreement cost.
The procedure proportion is the procedure fee (commission) proportion. The estimated loss ratio is obtained from the mapping relation between the risk prediction result and the loss ratio; the higher the loss ratio, the higher the risk agreement cost. The currently calculated risk agreement cost is compared with the original risk agreement cost, and the procedure fee proportion is adjusted according to the comparison result.
The future loss ratio is calculated based on the risk prediction result; through accurate loss ratio prediction, the reinsurance company can better control risk and dynamically adjust the commission proportion to improve its revenue level.
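An illustrative sketch of the procedure-proportion adjustment is given below; the mapping from risk score to loss ratio, the cost base and the adjustment step are assumptions not fixed by the application.

```python
def adjust_commission_rate(risk_score: float, base_rate: float,
                           original_cost: float, step: float = 0.01) -> float:
    """Estimate the loss ratio from the risk prediction result, derive the risk
    agreement cost, and nudge the commission proportion accordingly (sketch)."""
    estimated_loss_ratio = min(1.0, max(0.0, risk_score))    # assumed risk-to-loss-ratio mapping
    risk_agreement_cost = estimated_loss_ratio * 1_000_000   # assumed cost base
    if risk_agreement_cost > original_cost:
        # Higher expected claims than before: lower the commission proportion (assumed direction)
        return max(0.0, base_rate - step)
    return base_rate + step
```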
According to the method, the obtained historical claim case data set is subjected to feature engineering construction to obtain the claim feature data set, the claim feature data set is marked to obtain the claim sample data set, the claim sample data set is used for training and verifying the neural network model to obtain the claim risk prediction model, the historical claim case data can be fully utilized for model training, and the accuracy and stability of the model on risk prediction are improved; predicting real-time to-be-predicted pay data through a pay risk prediction model so as to fully consider market changes and further improve risk prediction accuracy; and the procedure proportion is adjusted according to the risk prediction result, so that the rationality and the reliability of the output protocol scheme are improved, and the risk control capability and the benefit level of the reinsurance reservation separation protocol are further improved.
The subject application is operational with numerous general purpose or special purpose computer system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
Those skilled in the art will appreciate that implementing all or part of the above described methods may be accomplished by computer readable instructions stored in a computer readable storage medium that, when executed, may comprise the steps of the embodiments of the methods described above. The storage medium may be a nonvolatile storage medium such as a magnetic disk, an optical disk, a Read-Only Memory (ROM), or a random access Memory (Random Access Memory, RAM).
It should be understood that, although the steps in the flowcharts of the figures are shown in order as indicated by the arrows, these steps are not necessarily performed in order as indicated by the arrows. The steps are not strictly limited in order and may be performed in other orders, unless explicitly stated herein. Moreover, at least some of the steps in the flowcharts of the figures may include a plurality of sub-steps or stages that are not necessarily performed at the same time, but may be performed at different times, the order of their execution not necessarily being sequential, but may be performed in turn or alternately with other steps or at least a portion of the other steps or stages.
With further reference to fig. 5, as an implementation of the method shown in fig. 2, the present application provides an embodiment of a protocol scheme pushing device based on risk prediction for reimbursement, where an embodiment of the device corresponds to the embodiment of the method shown in fig. 2, and the device may be specifically applied to various electronic devices.
As shown in fig. 5, the protocol scheme pushing device 500 based on pay risk prediction according to the present embodiment includes: an acquisition module 501, a feature engineering module 502, a labeling module 503, a training module 504, a testing module 505, a prediction module 506, and an adjustment module 507. Wherein:
The acquisition module 501 is configured to acquire a historical claim case original data set, and clean and preprocess the historical claim case original data set to obtain a historical claim case data set;
the feature engineering module 502 is configured to perform feature engineering construction on the historical claim case data set to obtain a claim payment feature data set;
the labeling module 503 is configured to add a data tag to each piece of the pay data in the pay feature data set according to the classification result based on a preset data labeling tool, obtain a pay sample data set, and divide the pay sample data set into a training set and a test set;
the training module 504 is configured to input the training set into a pre-constructed neural network model for training, so as to obtain a trained neural network model;
the test module 505 is configured to input the test set into the trained neural network model for verification, and output a risk prediction model for reimbursement;
the prediction module 506 is configured to obtain, in real time, the claim data to be predicted, and input the claim data to be predicted into the claim risk prediction model, so as to obtain a risk prediction result;
the adjustment module 507 is configured to adjust a procedure proportion based on the risk prediction result, and output a corresponding protocol scheme according to the adjusted procedure proportion.
It is emphasized that the historical claim data set may also be stored in a node of a blockchain in order to further ensure the privacy and security of the historical claim data set.
Based on the protocol scheme pushing device 500 based on the claim risk prediction, the obtained historical claim case data set is subjected to feature engineering construction to obtain a claim feature data set, the claim feature data set is marked to obtain a claim sample data set, and the claim sample data set is used for training and verifying a neural network model to obtain a claim risk prediction model, so that the historical claim case data can be fully utilized for model training, and the accuracy and stability of the model on risk prediction are improved; predicting real-time to-be-predicted pay data through a pay risk prediction model so as to fully consider market changes and further improve risk prediction accuracy; and the procedure proportion is adjusted according to the risk prediction result, so that the rationality and the reliability of the output protocol scheme are improved, and the risk control capability and the benefit level of the reinsurance reservation separation protocol are further improved.
In some alternative implementations, the feature engineering module 502 includes:
the extraction submodule is used for carrying out feature extraction on the historical claim case data set to obtain feature variables with multiple dimensions;
the dividing sub-module is used for dividing the historical claim case data set into a plurality of sub-training sets, and constructing classification features and condition features corresponding to each sub-training set according to all the feature variables;
the computing sub-module is used for computing the Gini index of each classification feature in each sub-training set;
the adding sub-module is used for adding the Gini indexes of the classification features in each sub-training set to obtain the feature importance of each feature variable;
the screening submodule is used for screening out feature variables with the feature importance degree smaller than or equal to a preset threshold value as the pay features;
and the composition sub-module is used for composing the pay characteristic and the corresponding characteristic data into the pay characteristic data set.
By constructing the claim characteristic data set through characteristic engineering, the historical claim case data can be fully utilized to obtain the claim characteristic with larger correlation with the claim risk evaluation, and model training is facilitated by relying on the claim characteristic data set, so that the method is more accurate and objective and has high reliability.
In some alternative implementations, the neural network model includes an input layer, a hidden layer, an attention layer, a feature fusion layer, and an output layer; training module 504 includes:
The feature extraction sub-module is used for inputting the training set into the input layer to perform feature extraction so as to obtain a pay feature vector;
the implicit characteristic learning sub-module is used for carrying out characteristic learning on the pay characteristic vectors through the hidden layer to obtain a plurality of pay implicit vectors and the implicit weight of each pay implicit vector;
the attention calculation sub-module is used for extracting attention characteristics of the pay implicit vector and the implicit weight by adopting the attention layer to obtain attention characteristic vectors;
the fusion sub-module is used for carrying out feature fusion on the attention feature vector and the pay feature vector through the feature fusion layer to obtain a pay enhancement feature vector;
the prediction sub-module is used for calculating the pay reinforcing feature vector through the output layer to obtain a prediction classification result;
the loss calculation sub-module is used for calculating the loss value of the prediction classification result according to a preset loss function;
and the iteration updating sub-module is used for adjusting network parameters of the neural network model based on the loss value, continuing to iterate training until the model converges, and outputting the trained neural network model.
In some optional implementations of this embodiment, the feature extraction submodule includes:
the coding unit is used for inputting the training set into the input layer, and calling a coding module of the input layer to code each piece of pay data in the training set to obtain a plurality of coding feature vectors;
and the convolution unit is used for calling a convolution module of the input layer to carry out convolution feature extraction on the plurality of coding feature vectors so as to obtain a pay feature vector.
The accuracy of the extracted features can be improved by performing feature extraction on the pay data through the encoding and convolution operations of the input layer.
In some optional implementations of the present embodiment, the implicit feature learning submodule includes:
the hidden feature extraction unit is used for extracting features of the pay feature vectors of each piece of pay feature data through a forward layer and a backward layer of the hidden layer to respectively obtain a forward hidden state feature and a backward hidden state feature;
the splicing unit is used for splicing the forward hidden state features and the backward hidden state features position by position to obtain the hidden layer state features of each piece of pay feature data;
and the implicit characteristic calculation unit is used for calculating a plurality of implicit vectors of the pay and the implicit weight of each implicit vector of the pay according to the hidden layer state characteristics.
Learning the pay feature vectors in both the forward and backward directions allows pay-related features to be captured more fully and enlarges the receptive field of each vector, thereby improving the computational processing speed of the model.
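One plausible reading of this forward/backward scheme is a bidirectional recurrent hidden layer whose two directions are concatenated position by position; the sketch below uses an LSTM and a softmax over a learned score to produce the implicit weights, both of which are editorial assumptions rather than details given in the disclosure.

```python
import torch
import torch.nn as nn

class PayHiddenLayer(nn.Module):
    def __init__(self, in_dim=64, hidden_dim=128):
        super().__init__()
        # forward layer and backward layer realised as one bidirectional LSTM
        self.rnn = nn.LSTM(in_dim, hidden_dim, bidirectional=True, batch_first=True)
        self.score = nn.Linear(2 * hidden_dim, 1)

    def forward(self, pay_feature_vectors):                    # (batch, seq, in_dim)
        states, _ = self.rnn(pay_feature_vectors)              # (batch, seq, 2 * hidden_dim)
        # states already concatenates the forward and backward hidden state features position-wise
        implicit_vectors = states                               # pay implicit vectors
        implicit_weights = torch.softmax(self.score(states), dim=1)  # implicit weight per vector
        return implicit_vectors, implicit_weights
```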
In some optional implementations of the present embodiment, the iterative update submodule includes:
the iteration unit is used for carrying out iterative updating on the network parameters of the neural network model according to the loss value and the back propagation algorithm until the neural network model converges;
and the determining unit is used for determining the network parameters converged by the current neural network model as target parameters and obtaining a trained neural network model according to the target parameters.
The network parameters of the model are adjusted iteratively according to the magnitude of the loss value: the model is retrained, the corresponding loss value is recomputed, and the best-performing network parameters are determined once the model converges. This yields a trained neural network model and improves the accuracy and stability of the model.
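A conventional way to realise this loss-driven iterative update is a standard back-propagation training loop; the cross-entropy loss, the Adam optimiser and the convergence tolerance below are assumed defaults, since the disclosure only refers to "a preset loss function" and a back propagation algorithm.

```python
import torch
import torch.nn as nn

def train_pay_risk_model(model, train_loader, epochs=50, lr=1e-3, tol=1e-4):
    """Iteratively update the network parameters from the loss value until convergence."""
    criterion = nn.CrossEntropyLoss()                     # the preset loss function (assumed)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    previous_loss = float("inf")
    for epoch in range(epochs):
        epoch_loss = 0.0
        for features, labels in train_loader:
            optimizer.zero_grad()
            loss = criterion(model(features), labels)     # loss value of the prediction classification result
            loss.backward()                               # back propagation algorithm
            optimizer.step()                              # adjust the network parameters
            epoch_loss += loss.item()
        epoch_loss /= max(len(train_loader), 1)
        if abs(previous_loss - epoch_loss) < tol:         # simple convergence criterion
            break
        previous_loss = epoch_loss
    return model                                          # trained neural network model
```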
In some alternative implementations, the adjustment module 507 includes:
the odds ratio calculation sub-module is used for calculating an estimated odds ratio based on the risk prediction result;
and the adjustment sub-module is used for calculating a risk agreement cost according to the estimated odds ratio and adjusting the procedure proportion based on the risk agreement cost.
With accurate odds ratio predictions, the reinsurer can control risk more effectively and adjust the procedure (commission) proportion dynamically, thereby raising the revenue level.
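The disclosure does not give the actuarial formulas for this adjustment; the sketch below is one hedged interpretation in which the "estimated odds ratio" is read as an expected loss ratio, the risk agreement cost is the expected payout plus a risk margin, and the procedure (commission) proportion is reduced linearly as that cost rises. Every constant, function name and formula here is an assumption introduced for illustration only.

```python
def estimated_odds_ratio(risk_scores, premiums, insured_amounts):
    """Expected loss ratio: predicted payout per unit of premium (assumed reading)."""
    expected_payout = sum(p * a for p, a in zip(risk_scores, insured_amounts))
    return expected_payout / max(sum(premiums), 1e-9)

def risk_agreement_cost(odds_ratio, total_premium, risk_margin=0.05):
    """Risk agreement cost: expected payout plus a proportional risk margin (assumed)."""
    return total_premium * (odds_ratio + risk_margin)

def adjusted_procedure_proportion(base_proportion, cost, total_premium,
                                  sensitivity=0.5, floor=0.05, cap=0.45):
    """Lower the procedure (commission) proportion as the cost share of premium rises."""
    cost_share = cost / max(total_premium, 1e-9)
    proportion = base_proportion - sensitivity * (cost_share - base_proportion)
    return min(max(proportion, floor), cap)

# usage sketch with invented numbers
scores = [0.005, 0.010, 0.008]          # model risk predictions per policy
premiums = [1000.0, 1500.0, 800.0]
amounts = [50000.0, 80000.0, 30000.0]
ratio = estimated_odds_ratio(scores, premiums, amounts)
cost = risk_agreement_cost(ratio, sum(premiums))
print(adjusted_procedure_proportion(0.30, cost, sum(premiums)))
```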
In order to solve the above technical problem, an embodiment of the present application further provides a computer device. Referring specifically to Fig. 6, Fig. 6 is a basic structural block diagram of the computer device according to the present embodiment.
The computer device 6 comprises a memory 61, a processor 62 and a network interface 63 that are communicatively connected to one another via a system bus. It is noted that only a computer device 6 having the components 61-63 is shown in the figure, but it should be understood that not all of the illustrated components need to be implemented, and more or fewer components may be implemented instead. Those skilled in the art will appreciate that the computer device here is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions; its hardware includes, but is not limited to, a microprocessor, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA), a digital signal processor (Digital Signal Processor, DSP), an embedded device, and the like.
The computer device may be a desktop computer, a notebook computer, a palmtop computer, a cloud server or another computing device. The computer device may interact with the user through a keyboard, a mouse, a remote controller, a touch pad, a voice control device, or the like.
The memory 61 includes at least one type of readable storage medium, including flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disk, and the like. In some embodiments, the memory 61 may be an internal storage unit of the computer device 6, such as a hard disk or an internal memory of the computer device 6. In other embodiments, the memory 61 may also be an external storage device of the computer device 6, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a Flash Card provided on the computer device 6. Of course, the memory 61 may also include both an internal storage unit and an external storage device of the computer device 6. In this embodiment, the memory 61 is generally used to store the operating system and the various application software installed on the computer device 6, such as the computer readable instructions of the protocol scheme pushing method based on pay risk prediction. Further, the memory 61 may be used to temporarily store various types of data that have been output or are to be output.
In some embodiments, the processor 62 may be a central processing unit (Central Processing Unit, CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 62 is typically used to control the overall operation of the computer device 6. In this embodiment, the processor 62 is configured to execute the computer readable instructions stored in the memory 61 or to process data, for example to execute the computer readable instructions of the protocol scheme pushing method based on pay risk prediction.
The network interface 63 may comprise a wireless network interface or a wired network interface, and the network interface 63 is typically used to establish a communication connection between the computer device 6 and other electronic devices.
When the processor of this embodiment executes the computer readable instructions stored in the memory, the steps of the protocol scheme pushing method based on pay risk prediction of the above embodiments are realized. The acquired historical claim case data set is subjected to feature engineering construction to obtain a pay feature data set; the pay feature data set is labeled to obtain a pay sample data set; and the pay sample data set is used to train and verify the neural network model to obtain the pay risk prediction model, so that the historical claim case data can be fully utilized for model training and the accuracy and stability of the risk prediction are improved. The pay data to be predicted, acquired in real time, are predicted through the pay risk prediction model, so that market changes are fully taken into account and the risk prediction accuracy is further improved. The procedure proportion is adjusted according to the risk prediction result, which improves the rationality and reliability of the output protocol scheme and further improves the risk control capability and revenue level of the reinsurance reservation separation protocol.
The present application further provides another embodiment, namely a computer readable storage medium storing computer readable instructions that can be executed by at least one processor, so that the at least one processor performs the steps of the protocol scheme pushing method based on pay risk prediction described above. The acquired historical claim case data set is subjected to feature engineering construction to obtain a pay feature data set; the pay feature data set is labeled to obtain a pay sample data set; and the pay sample data set is used to train and verify the neural network model to obtain the pay risk prediction model, so that the historical claim case data can be fully utilized for model training and the accuracy and stability of the risk prediction are improved. The pay data to be predicted, acquired in real time, are predicted through the pay risk prediction model, so that market changes are fully taken into account and the risk prediction accuracy is further improved. The procedure proportion is adjusted according to the risk prediction result, which improves the rationality and reliability of the output protocol scheme and further improves the risk control capability and revenue level of the reinsurance reservation separation protocol.
From the above description of the embodiments, it will be clear to those skilled in the art that the method of the above embodiments may be implemented by means of software plus a necessary general-purpose hardware platform, and of course may also be implemented by hardware, although in many cases the former is the preferred implementation. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product stored in a storage medium (such as ROM/RAM, a magnetic disk or an optical disk) and comprising several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to perform the method described in the embodiments of the present application.
It is apparent that the embodiments described above are only some, and not all, of the embodiments of the present application; the preferred embodiments are shown in the drawings, but they do not limit the patent scope of the present application. This application may be embodied in many different forms, and these embodiments are provided so that the present disclosure will be thorough and complete. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in the foregoing embodiments or substitute equivalents for some of their technical features. Any equivalent structure made using the contents of the specification and the drawings of the present application, whether applied directly or indirectly in other related technical fields, likewise falls within the scope of protection of the present application.

Claims (10)

1. A protocol scheme pushing method based on pay risk prediction, characterized by comprising the following steps:
acquiring a historical claim case original data set, and cleaning and preprocessing the historical claim case original data set to obtain a historical claim case data set;
carrying out feature engineering construction on the historical claim case data set to obtain a pay feature data set;
adding, according to the classification result and based on a preset data labeling tool, a data tag to each piece of pay data in the pay feature data set to obtain a pay sample data set, and dividing the pay sample data set into a training set and a test set;
inputting the training set into a pre-constructed neural network model for training to obtain a trained neural network model;
inputting the test set into the trained neural network model for verification, and outputting a pay risk prediction model;
obtaining the data to be predicted for the claims in real time, and inputting the data to be predicted for the claims into the claim risk prediction model to obtain a risk prediction result;
and adjusting the procedure proportion based on the risk prediction result, and outputting a corresponding protocol scheme according to the adjusted procedure proportion.
2. The protocol scheme pushing method based on pay risk prediction according to claim 1, wherein the step of carrying out feature engineering construction on the historical claim case data set to obtain the pay feature data set comprises:
extracting features from the historical claim case data set to obtain feature variables of multiple dimensions;
dividing the historical claim case data set into a plurality of sub-training sets, and constructing the classification features and condition features corresponding to each sub-training set according to all of the feature variables;
calculating a Gini index of each classification feature in each sub-training set;
summing the Gini indices of the classification features in each sub-training set to obtain the feature importance of each feature variable;
screening out the feature variables whose feature importance is smaller than or equal to a preset threshold value as pay features;
and composing the pay features and the corresponding feature data into the pay feature data set.
3. The pay risk prediction-based protocol scheme pushing method according to claim 1, wherein the neural network model comprises an input layer, a hidden layer, an attention layer, a feature fusion layer and an output layer; the step of inputting the training set into a pre-constructed neural network model for training to obtain a trained neural network model comprises the following steps:
inputting the training set into the input layer for feature extraction to obtain a pay feature vector;
performing feature learning on the pay feature vectors through the hidden layer to obtain a plurality of pay implicit vectors and the implicit weight of each pay implicit vector;
adopting the attention layer to extract attention characteristics of the pay implicit vector and the implicit weight to obtain an attention characteristic vector;
carrying out feature fusion on the attention feature vector and the pay feature vector through the feature fusion layer to obtain a pay enhancement feature vector;
calculating the pay enhancement feature vector through the output layer to obtain a prediction classification result;
calculating a loss value of the prediction classification result according to a preset loss function;
and adjusting the network parameters of the neural network model based on the loss value, continuing iterative training until the model converges, and outputting the trained neural network model.
4. The protocol scheme pushing method based on pay risk prediction according to claim 3, wherein the step of inputting the training set into the input layer for feature extraction to obtain the pay feature vector comprises:
inputting the training set into the input layer, and calling the encoding module of the input layer to encode each piece of pay data in the training set to obtain a plurality of encoding feature vectors;
and calling the convolution module of the input layer to perform convolutional feature extraction on the plurality of encoding feature vectors to obtain the pay feature vector.
5. The protocol scheme pushing method based on pay risk prediction according to claim 3, wherein the step of performing feature learning on the pay feature vectors through the hidden layer to obtain a plurality of pay implicit vectors and the implicit weight of each pay implicit vector comprises:
extracting features of the pay feature vectors of each piece of pay feature data through a forward layer and a backward layer of the hidden layer to respectively obtain a forward hidden state feature and a backward hidden state feature;
splicing the forward hidden state features and the backward hidden state features position by position to obtain the hidden layer state features of each piece of pay feature data;
and calculating a plurality of pay implicit vectors and implicit weights of each pay implicit vector according to the hidden layer state characteristics.
6. The protocol scheme pushing method based on pay risk prediction according to claim 3, wherein the step of adjusting the network parameters of the neural network model based on the loss value, continuing iterative training until the model converges, and outputting the trained neural network model comprises:
iteratively updating, according to the loss value and a back propagation algorithm, the network parameters of the neural network model until the neural network model converges;
and determining the network parameters converged by the current neural network model as target parameters, and obtaining a trained neural network model according to the target parameters.
7. The protocol scheme pushing method based on pay risk prediction according to claim 1, wherein the step of adjusting the procedure proportion based on the risk prediction result comprises:
calculating an estimated odds ratio based on the risk prediction result;
and calculating a risk agreement cost according to the estimated odds ratio, and adjusting the procedure proportion based on the risk agreement cost.
8. An agreement scheme pushing device based on claim risk prediction, comprising:
the acquisition module is used for acquiring a historical claim case original data set, and cleaning and preprocessing the historical claim case original data set to obtain a historical claim case data set;
the feature engineering module is used for carrying out feature engineering construction on the historical claim case data set to obtain a pay feature data set;
the labeling module is used for adding, according to the classification result and based on a preset data labeling tool, a data tag to each piece of pay data in the pay feature data set to obtain a pay sample data set, and dividing the pay sample data set into a training set and a test set;
The training module is used for inputting the training set into a pre-constructed neural network model for training to obtain a trained neural network model;
the test module is used for inputting the test set into the trained neural network model for verification and outputting a pay risk prediction model;
the prediction module is used for acquiring the to-be-predicted pay data in real time, inputting the to-be-predicted pay data into the pay risk prediction model, and obtaining a risk prediction result;
and the adjusting module is used for adjusting the procedure proportion based on the risk prediction result and outputting a corresponding protocol scheme according to the adjusted procedure proportion.
9. A computer device, comprising a memory and a processor, the memory having stored therein computer readable instructions which, when executed by the processor, implement the steps of the protocol scheme pushing method based on pay risk prediction according to any one of claims 1 to 7.
10. A computer readable storage medium, wherein computer readable instructions are stored on the computer readable storage medium, and the computer readable instructions, when executed by a processor, implement the steps of the protocol scheme pushing method based on pay risk prediction according to any one of claims 1 to 7.
CN202410008332.8A 2024-01-03 2024-01-03 Protocol scheme pushing method based on pay risk prediction and related equipment Pending CN117611362A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410008332.8A CN117611362A (en) 2024-01-03 2024-01-03 Protocol scheme pushing method based on pay risk prediction and related equipment

Publications (1)

Publication Number Publication Date
CN117611362A 2024-02-27

Family

ID=89956476

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410008332.8A Pending CN117611362A (en) 2024-01-03 2024-01-03 Protocol scheme pushing method based on pay risk prediction and related equipment

Country Status (1)

Country Link
CN (1) CN117611362A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination