CN117787497B - Multi-objective optimization method and terminal applied to automobile insurance pricing - Google Patents
- Publication number: CN117787497B
- Application number: CN202311868143.XA (published as CN202311868143A)
- Authority: CN (China)
- Prior art keywords: score, value, insurance, individual, target value
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Abstract
The invention discloses a multi-objective optimization method and a terminal applied to automobile insurance pricing, relating to the technical field of computer applications. In the multi-objective problem of automobile insurance pricing calculation, a genetic algorithm is used, and the non-dominated solutions stored in an external archive also participate in guiding the particle update through a particle swarm update strategy based on three learning directions, which improves the convergence rate of the algorithm; finally, a group of optimal solutions meeting the requirements of different pricing strategies is obtained, and the problems of premature convergence and slow convergence speed of the genetic algorithm are solved. In addition, a local optimum update strategy based on the Sharpe ratio concept is provided: the local optimal solution of the previous generation and the particles of the new generation are compared in terms of the number of optimized targets and the degree of optimization, and the statistics are used to measure the relative merit of the new and old solutions, so that a better optimization result is obtained.
Description
Technical Field
The invention relates to the technical field of computer application, in particular to a multi-objective optimization method and a terminal applied to automobile insurance pricing.
Background
With the rapid development of the Chinese insurance industry, automobile insurance pricing is gradually shifting from the traditional strategy that focuses only on static factors such as the applicant's age, sex, vehicle type and driving age to an emerging UBI pricing mode that combines the owner's dynamic driving behavior and habit data, so as to promote accurate measurement of driving risk and individualized premium pricing. Faced with the big-data environment, the traditional static pricing guidelines cannot meet the modern automobile insurance industry's requirements of high accuracy and high efficiency for evaluating and predicting automobile insurance risk and loss, resulting in a persistent "high premium, high payout, low efficiency" state of the automobile insurance market, so a new pricing strategy is needed.
Current premium pricing strategies based on driving behavior (UBI) in the modern automobile insurance industry can be divided into four phases:
(1) Collecting and analyzing vehicle information and driving behavior data;
(2) Effective screening of driving behavior factors;
(3) Constructing a driving behavior evaluation model;
(4) Calculating the driving score and linking it to the rate adjustment coefficient to obtain the pricing scheme. Specifically, this is a rate-determination approach directed by the insurance business, as shown in fig. 1: a large amount of driving behavior data is first collected and preliminarily analyzed through the automobile terminal, a preset evaluation system is established, weights are then assigned to the various indexes participating in the evaluation, and finally the driving score of each applicant is calculated to facilitate rate setting.
An effective pricing strategy should take improving customer satisfaction as a guide to enhance enterprise competitiveness while guaranteeing the benefit pursuit of vehicle insurance enterprises, and finally realize fair, reasonable, accurate and differentiated dynamic premium determination. This forms a classical set of multi-objective trade-off problems whose computational difficulty is NP-hard; the factors to be weighed in the whole vehicle insurance pricing process are comprehensive but, at the same time, complex, so an exact algorithm that computes the optimal solution by solving a mathematical model cannot obtain an answer within an acceptable time; it suffers from low computational efficiency and low accuracy, and the obtained result is difficult to use as a reference for decisions.
The patent application document with publication number CN114565131A provides a multi-electricity-price demand response pricing system oriented to the peak-shifting phenomenon, based on a genetic algorithm. That scheme also addresses a multi-objective problem: the optimal electricity price combination is solved through a genetic algorithm, peak clipping and valley filling can finally be realized effectively, and the social energy efficiency level is improved. However, using a genetic algorithm alone to solve the multi-objective problem has a number of drawbacks, including:
(1) Programming a genetic algorithm is complex: the problem needs to be encoded, and the result needs to be decoded after the optimal solution is found;
(2) The local search capability of a genetic algorithm is poor and its search efficiency is low in the later stage of evolution; a good solution can only be obtained through many iterations, and the computing time is long, i.e. the convergence speed is slow;
(3) A genetic algorithm easily converges too early, and the obtained result is not accurate, i.e. the "premature convergence" problem;
In summary, it is necessary to propose a multi-objective optimization method and a terminal applied to automobile insurance pricing to solve the above problems.
Disclosure of Invention
Aiming at the problems existing in the prior art, the invention provides a multi-objective optimization method and a terminal applied to automobile insurance pricing, which achieve a higher convergence speed, higher calculation efficiency and higher accuracy than the prior art.
The technical scheme of the invention is realized as follows:
a multi-objective optimization method applied to automobile insurance pricing comprises the following specific steps:
S1, collecting information of a plurality of insuring users and extracting the driving behavior factors in the information, wherein the driving behavior factors comprise the historical risk times of the insuring user; screening the driving behavior factors by using the Lasso regression method, the screened driving behavior factors serving as evaluation indexes for measuring the risk of the insuring users, wherein the historical risk number does not participate in the screening; the driving behavior factors of one insuring user form one sample, and the screened samples form a data set;
The driving behavior factors further include the user's total months of vehicle driving history, the morning and evening peak driving time, the weekend driving time, the night driving time, the proportion of time with a driving speed greater than 80 km/h, the proportion of time with a driving speed greater than 120 km/h, the number of rapid accelerations, the number of rapid decelerations, the number of rapid turns and the number of traffic rule violations.
S2, setting a sampling rate; dividing the samples in the data set into class A samples and class B samples according to a preset label; generating a plurality of simulated data samples by using the SMOTE algorithm and filling them into the data set until the ratio of the class B samples to the class A samples is equal to the sampling rate;
in view of the label imbalance problem of vehicle insurance driving behavior data, where the ratio of the number of people at risk to the number not at risk often deviates severely from 1:1, the synthetic minority oversampling technique (SMOTE) is adopted to expand the unbalanced samples, and simulated data are generated on the basis of the real data set after feature screening with Lasso regression; the SMOTE algorithm, a synthetic minority oversampling technique commonly used to balance data sets, generates synthetic samples by interpolating between minority-class samples in the feature space, which effectively alleviates the imbalance between samples of different classes and improves the accuracy and robustness of the model;
S3, training by utilizing the data set and the CART decision tree algorithm to obtain a decision agent model, wherein the decision agent model is used for predicting whether each user is in danger or not;
S4, constructing a data-driven scoring mechanism by combining the decision agent model, the current actual pricing strategy of the vehicle insurance enterprise and the historical risk times, and calculating the enterprise-benefit-oriented vehicle insurance enterprise target value score_e and the user-satisfaction-oriented insurance user target value score_c, together with their initial values;
And S5, utilizing a multi-target particle swarm algorithm and combining the methods of the steps S1 to S4 to explore and assign the weights of the evaluation indexes, and obtaining an optimal solution of the pricing strategy of the automobile insurance as the latest actual pricing strategy.
Compared with the prior art, which only optimizes the pricing mode of a certain specific guide, the scheme can realize the definition of more reasonable and more accurate differentiated dynamic premium, and has important practical significance.
As a further optimization of the above scheme, in the Lasso regression method of step S1, the loss function of the Lasso regression is:
L(β) = Σ_{i=1}^{m}(label_i − x_i·β)² + λ·Σ_{j=1}^{p}|β_j|;
where m is the number of samples; p is the number of driving behavior factors in a sample; label_i indicates whether the i-th insured user is at risk, with 0 indicating no risk and 1 indicating risk; X is an m×p matrix in which x_i is the vector containing the driving behavior factors of the i-th sample; λ is a non-negative constant representing the regularization parameter; β and β_j are the model coefficients of the Lasso regression, where β is a p-dimensional vector representing the coefficients of all driving behavior factors and β_j is a scalar representing the coefficient of the j-th driving behavior factor; the coefficients are calculated through the loss function, and the num driving behavior factors with the largest coefficients are selected as evaluation indexes;
Lasso regression is a method that adds an L1-norm penalty, i.e. the sum of the absolute values of the coefficients, to the coefficients of the model on the basis of the least squares method; as the coefficient of the penalty term (also referred to as the regularization parameter) increases, some insignificant coefficients are shrunk to zero; meanwhile, because the L1 norm is non-smooth, the model produces sparse solutions, i.e. only a few coefficients are non-zero, which helps reduce the complexity and variance of the model, so that the coefficients of factors with little influence on the risk number are compressed to 0, thereby achieving the purpose of variable screening;
It is noted that when λ = 0, Lasso regression degenerates into ordinary least-squares linear regression: no penalty term is present and the model only considers the goodness of fit; when the L1-norm penalty term is introduced and λ > 0 is set, a certain degree of penalty and regularization takes effect, and the model balances goodness of fit against complexity. The choice of the regularization parameter λ has a large impact on the performance of the model: if λ is too small, the effect of the penalty term is not obvious and the model may overfit; if λ is too large, the penalty term acts too strongly and the model may underfit.
Therefore, a method is needed to select the optimal lambda value so that the model has the best generalization ability;
Ten-fold cross-validation is a common method for selecting the optimal parameter value. The basic idea is to divide the data set of the applicants' driving behavior into ten parts (or another number of parts), taking one part as the test set each time and the other nine parts as the training set. A Lasso regression model is then fitted with different parameter values on the training set, and the prediction error of the model is calculated on the test set. After ten repetitions, the average prediction error over the ten test sets is obtained for each parameter value, and the parameter value corresponding to the simplest model (i.e., with the fewest non-zero coefficients) whose average prediction error is minimal, or lies within one standard error of the minimum, is selected. Although ten-fold cross-validation is a common method, it has some limitations: first, differences in how the data set is divided introduce a certain randomness and uncertainty into the results; second, ten-fold cross-validation requires setting the range and step size of the parameter values, and if the range is too large or too small, or the step size too coarse or too fine, model selection may become erroneous or inefficient; finally, ten-fold cross-validation can only choose among a limited set of candidate parameter values and cannot consider all possible values in the whole parameter space, so it may overlook some better parameter values;
Therefore, the invention also uses the Bayesian information criterion to verify the result of ten-fold cross-validation, which increases the reliability and accuracy of model selection. The Bayesian information criterion (BIC) is a statistic proposed on the basis of Bayesian theory for comparing the relative merits of different models, and can be expressed as follows:
BIC=-2*ln(L)+k*ln(n);
Where L is the maximum of the likelihood function of the estimated model, k is the number of free parameters in the model, and n is the number of samples. The smaller the BIC value, the better the model. The verification steps are as follows:
(1) Fit a Lasso regression model on the whole data set with the optimal parameter value selected by ten-fold cross-validation, and calculate its likelihood function and BIC value;
(2) Fit Lasso regression models with the other candidate parameter values on the whole data set and calculate their likelihood functions and BIC values;
(3) Compare the BIC values under the different parameter values and select the parameter value with the smallest BIC value;
(4) If the optimal parameter value selected by ten-fold cross-validation is consistent with or close to that selected by BIC, ten-fold cross-validation is effective and reliable; if they differ greatly, ten-fold cross-validation may be biased or unstable, and the range and step size of the parameter values should be reconsidered, or another method used to select the parameter value.
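For illustration only (this is not part of the claimed scheme), a Python sketch of how the λ selected by ten-fold cross-validation could be cross-checked against BIC is given below; the candidate λ grid, the scikit-learn calls and the Gaussian-likelihood approximation of ln(L) are assumptions.

```python
# Illustrative sketch: choose the Lasso regularization parameter lambda by
# ten-fold cross-validation and cross-check it with BIC = -2*ln(L) + k*ln(n).
# Assumptions: features X (m x p) and labels y are prepared; ln(L) is
# approximated under a Gaussian error model, so -2*ln(L) ~ n*ln(RSS/n) + const.
import numpy as np
from sklearn.linear_model import Lasso, LassoCV

def bic_for_lambda(X, y, lam):
    model = Lasso(alpha=lam, max_iter=10000).fit(X, y)
    residuals = y - model.predict(X)
    n = len(y)
    rss = float(np.sum(residuals ** 2))
    k = int(np.sum(model.coef_ != 0)) + 1        # free parameters: non-zero coefficients + intercept
    return n * np.log(rss / n) + k * np.log(n)   # Gaussian approximation of BIC

def select_lambda(X, y, candidates):
    cv_lambda = LassoCV(alphas=candidates, cv=10, max_iter=10000).fit(X, y).alpha_
    bic_lambda = min(candidates, key=lambda lam: bic_for_lambda(X, y, lam))
    # If cv_lambda and bic_lambda agree (or are close), the CV choice is treated as reliable,
    # mirroring verification step (4) above.
    return cv_lambda, bic_lambda
```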
As a further optimization of the above scheme, in step S2, the specific steps for generating the simulated data samples are:
S2-1, converting the samples into feature vectors v_r;
S2-2, for each feature vector v_r, calculating the Euclidean distance between v_r and the other feature vectors to obtain its nearest-neighbor feature vector v_near;
S2-3, creating a new feature vector v_new through a random linear combination of v_r and v_near, namely:
v_new = v_r + (v_r − v_near)·rand(0,1);
S2-4, storing v_new in the data set.
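A minimal sketch of steps S2-1 to S2-4, assuming the minority-class samples are stored as rows of a NumPy array; the function name and helper details are illustrative, and the interpolation follows the formula written above.

```python
# Illustrative sketch of S2-1 to S2-4: generate synthetic minority-class samples
# by combining each sampled feature vector with its nearest neighbor, following
# the formula given above. Variable names and the NumPy usage are assumptions.
import numpy as np

def generate_smote_samples(minority: np.ndarray, n_new: int, rng=None) -> np.ndarray:
    rng = np.random.default_rng(rng)
    new_samples = []
    for _ in range(n_new):
        r = rng.integers(len(minority))
        v_r = minority[r]                                # S2-1: sample as feature vector
        dists = np.linalg.norm(minority - v_r, axis=1)   # S2-2: Euclidean distances
        dists[r] = np.inf                                # exclude the sample itself
        v_near = minority[int(np.argmin(dists))]         # nearest-neighbor feature vector
        v_new = v_r + (v_r - v_near) * rng.random()      # S2-3: random linear combination
        new_samples.append(v_new)                        # S2-4: store v_new
    return np.vstack(new_samples)
```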
As a further optimization of the above scheme, in step S4, the specific procedure of the scoring mechanism is as follows:
S4-1, forward processing is respectively applied to the evaluation indexes in the sample corresponding to an insuring user, and the driving behavior score of that user is calculated by combining the weight distribution of each evaluation index in the current actual pricing strategy of the vehicle insurance enterprise, namely:
s_i = Σ_{k=1}^{num} w_k·x_k;
wherein s_i represents the driving behavior score of the i-th user, num represents the number of evaluation indexes, x_k represents the k-th (forward-processed) evaluation index, and w_k represents the weight of the k-th evaluation index; the larger w_k is, the larger the influence of that index on whether a risk occurs;
The indexes considered by an insurance company generally fall into extremely-large, extremely-small, intermediate and interval types, which means that a value occupies a more favorable position the closer it is to the maximum/minimum, to some particular number, or to some range of values. Taking night driving time as an example, night driving is generally regarded as a behavior that may bring more danger, so the shorter the night driving time, the smaller the probability that the applicant will be at risk; the smaller this index the better, so the night driving time index is an extremely-small-type index and requires forward processing. Forward processing is an operation that unifies the trend of different indexes so that the final indexes can be summarized and analyzed more conveniently and clearly. Here, forward processing converts the extremely-small, interval and intermediate indexes into extremely-large ones, where the method of converting an extremely-small index into an extremely-large one is max(x_i) − x;
S4-2, the decision agent model is used to predict the risk situation of each insuring user, where 0 represents no risk and 1 represents risk; the number of insuring users predicted not to be at risk is counted and recorded as the decision agent critical value n, i.e. the maximum number of users that can be granted the premium benefit, determined after data analysis of all current insuring users; the driving behavior scores are sorted, the n-th score in descending order is denoted p_n, the smallest driving behavior score is denoted s_min, and the largest driving behavior score is denoted s_max; if s_i < p_n, the applicant cannot obtain the company's premium benefit at this time; if s_i ≥ p_n, the applicant will be included in the consideration of the company's premium policy;
A decision agent (decision agent) generally refers to an autonomous entity that is able to observe the surrounding environment and take action based on its goals. Intelligent agents are a core concept of current artificial intelligence research, governing and linking research in various sub-fields. Basically, the decision agent is an autonomous decision system based on an algorithm, which is based on big data and uses machine learning and deep learning algorithms as cores, so that an autonomous decision system with an algorithm decision function can be formed. Decision making is the core of behavior, and intelligent agents can solve problems through learning and reasoning and make decisions based on their goals and circumstances. They can acquire information by interacting with the environment and use this information to update their internal models and knowledge bases. These models and knowledge bases can be used to guide the agent's behavior and help it make better decisions.
S4-3, the predicted risk situation and the driving behavior score s_i of each insuring user are traversed, and the enterprise-benefit-oriented insurance company target value score_e and the user-satisfaction-oriented insurance user satisfaction target value score_c are calculated; the initial values of score_e and score_c are both 0. Both score_e and score_c pursue score maximization: the larger they are, the better the current actual pricing strategy meets the corresponding demand;
As a further optimization of the above scheme, the specific calculation method of the target value score e of the vehicle insurance company is as follows:
If the user is predicted to be at risk and s_i < p_n, then score_e = score_e + 2 + Reward1, where Reward1 is the first additional score; that is, score_e is increased by 2 for clients predicted to be at risk; the first additional score is granted at the same time: for users predicted to be at risk, the closer the driving behavior score is to the lowest score, the more accurately the at-risk user is identified, so the applicant's rate coefficient can be increased and the company's profit improved;
If the user is predicted to be at risk and s_i ≥ p_n, then score_e = score_e − 2; an at-risk client that is not screened out still receives the premium benefit, so score_e is decreased by 2;
If the user is predicted not to be at risk and s_i < p_n, then score_e = score_e − 1; a client who is not at risk cannot obtain the premium benefit, so score_e is decreased by 1.
As a further optimization of the above scheme, the specific calculation method of the user satisfaction target score c is as follows:
If the user is predicted not to be at risk and s_i < p_n, then score_c = score_c − 2; the insuring client is predicted to have no risk yet receives no premium preference, so score_c is decreased by 2;
If the user is predicted to be at risk and s_i ≥ p_n, then score_c = score_c + 1 + Reward2, where Reward2 is the second additional score; the client still enjoys the premium benefit, so score_c is increased by 1; the second additional score is granted at the same time: the current user's driving behavior score is higher than the decision agent critical score, and the smaller the difference between the two, the larger score_c becomes;
If the user is predicted to be at risk and s_i < p_n, then score_c = score_c + Reward3, where Reward3 is the third additional score; the current user's driving behavior score is lower than the decision agent critical score, and the smaller the difference between the two, the larger score_c becomes;
If the user is predicted not to be at risk and s_i ≥ p_n, then score_c = score_c + 2; the insuring client is predicted to have no risk and can normally enjoy the premium preference, so score_c is increased by 2.
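For illustration, the sketch below assembles the score_e and score_c rules above into one update routine; the assignment of the predicted-risk conditions to the four branches and the closed forms of Reward1, Reward2 and Reward3 (simple closeness ratios) are assumptions, not the patented formulas.

```python
# Illustrative sketch of the scoring mechanism in S4-3. The exact forms of
# Reward1/Reward2/Reward3 and the predicted-risk condition of each branch are
# not given in this text; simple closeness ratios and a plausible case mapping
# are assumed here purely for illustration.
def update_scores(score_e, score_c, predicted_risk, s_i, p_n, s_min, s_max):
    """One pass over a single insuring user; returns the updated (score_e, score_c)."""
    if predicted_risk == 1 and s_i < p_n:                   # predicted at risk, no premium benefit
        reward1 = (p_n - s_i) / (p_n - s_min + 1e-12)       # assumed: grows as s_i approaches s_min
        reward3 = (s_i - s_min) / (p_n - s_min + 1e-12)     # assumed: grows as s_i approaches p_n
        score_e += 2 + reward1
        score_c += reward3
    elif predicted_risk == 1 and s_i >= p_n:                # predicted at risk, still receives the benefit
        reward2 = (s_max - s_i) / (s_max - p_n + 1e-12)     # assumed: grows as s_i approaches p_n
        score_e -= 2
        score_c += 1 + reward2
    elif predicted_risk == 0 and s_i < p_n:                 # predicted no risk, denied the benefit
        score_e -= 1
        score_c -= 2
    else:                                                   # predicted no risk, normally enjoys the benefit
        score_c += 2
    return score_e, score_c
```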
As a further optimization of the above scheme, the specific process of step S5 is:
S5-1, setting the maximum number of evaluations MaxGen and the population size NP; initializing the particle population pop, i.e. randomly generating NP particle codes; each particle code comprises two D-dimensional vectors, a position vector X_i = (X_i1, X_i2, …, X_iD) and a velocity vector V_i = (V_i1, V_i2, …, V_iD), where each value in the position vector corresponds to the weight of one evaluation index; the adaptation values of all particle-coded individuals in pop are evaluated according to the methods of steps S1 to S4; the adaptation values comprise the vehicle insurance company target value score_e and the insuring user satisfaction target value score_c; a set of local optimal solutions pBest is initialized, corresponding respectively to each individual in the current pop;
In the multi-objective particle swarm optimization algorithm, each particle has a D-dimensional decision search space; the particles move at a certain velocity in the search space, and the velocity can be dynamically adjusted according to a particle's own experience and the flight experience of its companions; the historical best position of each particle is recorded as pBest = (pBest_1, pBest_2, …, pBest_NP). S5-2, the non-dominated solutions in pop are computed according to the adaptation values, and the obtained non-dominated solutions are stored in the external archive A; a non-dominated solution is a concept from multi-objective optimization and refers to a solution that is not dominated by any other solution in a set of candidate solutions, where one solution dominates another when it is better than or equal to it on all objective functions and strictly better on at least one;
s5-3, generating or updating an optimal solution gBest on a single target value;
S5-4, updating the particle population pop by applying a three-direction learning strategy, namely:
V_id = w·V_id + c_1·r_1·(pBest_id − X_id) + c_2·r_2·(gBest_id − X_id) + c_3·r_3·(Arch_d − X_id);   X_id = X_id + V_id;
wherein V_id represents the velocity of the i-th particle in the d-th dimension, X_id represents the position of the i-th particle in the d-th dimension, d ∈ [1, D]; w is a preset inertia weight, c_1, c_2 and c_3 are preset acceleration factors, and r_1, r_2 and r_3 are random factors (uniform random numbers in [0,1] in standard particle swarm practice); Arch is a randomly selected individual in the external archive A and Arch_d is its d-th component;
S5-5, the local optimal solution pBest is updated using a Sharpe-ratio-inspired strategy; in multi-objective optimization, how to determine whether a new solution is better than an old one is worth discussing: the two solutions are often mutually non-dominated, which makes it difficult for conventional strategies to compare them any further. The invention borrows the Sharpe ratio, often used for investment portfolios, to measure the cost performance of generating a new solution; in investment, the Sharpe ratio represents the return obtained for the risk borne by the fund, and the larger its value, the higher the return obtained per unit of risk.
S5-6, updating the individuals in the external archive A;
S5-7, judging whether the current iteration reaches the maximum evaluation times, if so, ending the iteration, and taking all obtained non-dominant solutions as optimal solutions of the pricing strategies; otherwise, step S5-3 is executed, and iteration is continued.
As a further optimization of the above scheme, the specific steps of step S5-5 are:
S5-5-1, the local optimal solution pBest and the individual at the corresponding position in the population pop are compared on the two target values in turn; if a target value of the individual in pop is greater than the corresponding target value of the individual in pBest, that target value is said to be optimized, otherwise it is said to be weakened; the optimization or weakening amount of a target value is the absolute value of the difference between the same target value of the individuals of pop and pBest at the same position;
S5-5-2, the number of optimized target values and the number of weakened target values in a single individual are counted, and the maximum optimization amount, minimum optimization amount, maximum weakening amount and minimum weakening amount on each single target are obtained;
S5-5-3, the optimization or weakening amount of the individual on a single target value is normalized, namely:
normalized optimization amount = (single-target optimization amount − minimum optimization amount)/(maximum optimization amount − minimum optimization amount);
normalized weakening amount = (single-target weakening amount − minimum weakening amount)/(maximum weakening amount − minimum weakening amount);
where the normalized optimization amount and the normalized weakening amount are the individual's normalized optimization and weakening values on a single target value, respectively;
if both target values of the individual are optimized, the individual's final normalized optimization amount is the sum of its normalized optimization amounts on the two target values; if both target values of the individual are weakened, the individual's final normalized weakening amount is the sum of its normalized weakening amounts on the two target values;
S5-5-4, the Sharpe-ratio-style value Dominate_i of each individual is calculated in turn; if Dominate_i = 1, pBest is updated, i.e. the i-th individual of pop becomes the new pBest at the same position.
As a further optimization of the above scheme, in step S5-6, the individuals in the external archive A are updated with an elite strategy; the specific steps are:
S5-6-1, a population E is set, in which each individual E_i equals the individual Arch_i in A; the data in one randomly selected dimension R of each E_i is perturbed with Gaussian noise, namely:
E_{i,R} = E_{i,R} + (X_{max,R} − X_{min,R})·Gaussian(0,1);
where E_{i,R} represents the position data stored by E_i in the R-th dimension, 1 ≤ R ≤ D; X_{max,R} and X_{min,R} are the preset upper and lower bounds of E_{i,R}, respectively; Gaussian(0,1) is a random value generated from a Gaussian distribution with mean 0 and standard deviation 1;
S5-6-2, it is checked whether E_{i,R} after the perturbation lies within the search range [X_{min,R}, X_{max,R}]; if E_{i,R} > X_{max,R}, then E_{i,R} = X_{max,R}; if E_{i,R} < X_{min,R}, then E_{i,R} = X_{min,R};
s5-6-3, calculating an adaptation value of each E i according to the method of the step 1 to the step 4;
S5-6-4, mixing the individuals in the external archive A, the individuals in the local optimal solution pBest and the individuals in the population E to form a population S;
S5-6-5, after non-dominated sorting of the individuals of S, the non-dominated solutions are preserved and the remaining solutions are rejected; if the number of individuals in S is then smaller than or equal to the preset capacity of A, all solutions of S are directly used as the new archive A; otherwise the crowding degree of each individual in S is calculated, and new archive solutions are selected in order of decreasing crowding degree until the preset capacity of the external archive A is reached. The crowding degree is calculated as follows: all individuals in S are sorted according to a single target value, and the maximum target value f_max and the minimum target value f_min are obtained; for the rs-th individual after sorting, with target value f_rs, the crowding degree on that single target value is (f_{rs+1} − f_{rs-1})/(f_max − f_min); the crowding degrees of each individual over the single target values are accumulated as the individual's final crowding degree. By adding the elite strategy and the archive update strategy after non-dominated sorting, the exploration capability of the algorithm is improved in each generation of particle learning, and the archived solutions in turn guide the update of the new generation of particles, thereby improving the quality of the solutions.
The invention also provides a terminal which comprises a storage device for storing a plurality of instructions and a processor for executing the instructions in the storage device, wherein the instructions are suitable for the processor to load and execute the multi-objective optimization method applied to automobile insurance pricing.
Compared with the prior art, the invention has the following beneficial effects:
(1) The invention provides an improved multi-target particle swarm algorithm with an external archive, which particularly ensures that non-dominant solutions stored in the external archive participate in the process of guiding particle update through particle swarm updating strategies based on three learning directions, thereby improving the convergence rate of the algorithm; finally, a group of optimal solution meeting the requirements of different pricing strategies is obtained;
(2) The invention provides a local optimum update strategy based on the Sharpe ratio concept, tailored to the characteristics of the targets: the local optimal solution of the previous generation and the particles of the new generation are compared in terms of the number of optimized targets and the degree of optimization, and the statistics are used to measure the relative merit of the new and old solutions, so as to obtain a better optimization result.
Drawings
FIG. 1 is a flow chart of a multi-objective optimization method for automobile insurance pricing according to an embodiment of the present invention;
Fig. 2 is a schematic flow chart of a multi-target particle swarm algorithm according to an embodiment of the present invention;
FIG. 3 is a graph of a fit of a best solution to a pricing strategy provided by an embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As shown in fig. 1, the embodiment provides a multi-objective optimization method applied to automobile insurance pricing, which specifically includes the following steps:
S1, collecting information of a plurality of insuring users and extracting the driving behavior factors in the information, wherein the driving behavior factors comprise the historical risk times of the insuring users; in this embodiment the driving behavior factors further comprise the user's total months of vehicle driving history, the morning and evening peak driving time, the weekend driving time, the night driving time, the proportion of time with a driving speed greater than 80 km/h, the proportion of time with a driving speed greater than 120 km/h, the number of rapid accelerations, the number of rapid decelerations, the number of rapid turns and the number of traffic rule violations; descriptive statistical information for these indexes is shown in the following table;
(Descriptive statistics table not reproduced.)
Screening the driving behavior factors by using a Lasso regression method, wherein the driving behavior factors obtained by screening are used as evaluation indexes for measuring risks of the insuring users; wherein the historical risk number does not participate in the screening; a plurality of driving behavior factors of one of the insuring users is one sample; the plurality of screened samples form a data set;
The loss function of the Lasso regression is:
L(β) = Σ_{i=1}^{m}(label_i − x_i·β)² + λ·Σ_{j=1}^{p}|β_j|;
where m is the number of samples; p is the number of driving behavior factors in a sample; label_i indicates whether the i-th insured user is at risk, with 0 indicating no risk and 1 indicating risk; X is an m×p matrix in which x_i is the vector containing the driving behavior factors of the i-th sample; λ is a non-negative constant representing the regularization parameter; β and β_j are the model coefficients of the Lasso regression, where β is a p-dimensional vector representing the coefficients of all driving behavior factors and β_j is a scalar representing the coefficient of the j-th driving behavior factor; the coefficients are calculated through the loss function, and the num driving behavior factors with the largest coefficients are selected as evaluation indexes;
Lasso regression is a method that adds an L1-norm penalty, i.e. the sum of the absolute values of the coefficients, to the coefficients of the model on the basis of the least squares method; as the coefficient of the penalty term (also referred to as the regularization parameter) increases, some insignificant coefficients are shrunk to zero; meanwhile, because the L1 norm is non-smooth, the model produces sparse solutions, i.e. only a few coefficients are non-zero, which helps reduce the complexity and variance of the model, so that the coefficients of factors with little influence on the risk number are compressed to 0, thereby achieving the purpose of variable screening;
S2, setting a sampling rate; dividing the samples in the data set into class A samples and class B samples according to a preset label; generating a plurality of simulated data samples by using the SMOTE algorithm and filling them into the data set until the ratio of the class B samples to the class A samples is equal to the sampling rate; the specific steps for generating the simulated data samples are:
S2-1, converting the samples into feature vectors v_r;
S2-2, for each feature vector v_r, calculating the Euclidean distance between v_r and the other feature vectors to obtain its nearest-neighbor feature vector v_near;
S2-3, creating a new feature vector v_new through a random linear combination of v_r and v_near, namely:
v_new = v_r + (v_r − v_near)·rand(0,1);
S2-4, storing v_new in the data set.
The sampling rate refers to the ratio of the number of minority class samples to the number of majority class samples at the time of oversampling. For example, if the raw dataset has 100 samples for class a and 20 samples for class B, then the sampling rate is 0.2. The sampling rate is used to determine the number of minority class samples in the oversampled dataset. For example, if we set the sampling rate to 1, we mean that we want 100 samples for both class a and class B in the oversampled dataset, so we need to synthesize 80 new class B samples. If we set the sampling rate to 0.5, we mean that we want 100 samples for class a and 50 samples for class B in the oversampled dataset, so we need to synthesize 30 new class B samples.
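Under the same assumptions as the example above, the required number of synthetic samples can be computed as follows (a trivial sketch with hypothetical counts):

```python
# Illustrative only: number of synthetic class-B samples required to reach a
# target sampling rate (minority count / majority count), per the example above.
def synthetic_samples_needed(n_majority: int, n_minority: int, sampling_rate: float) -> int:
    return max(0, int(round(sampling_rate * n_majority)) - n_minority)

# e.g. synthetic_samples_needed(100, 20, 1.0) == 80 and
#      synthetic_samples_needed(100, 20, 0.5) == 30
```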
In view of the label imbalance problem of vehicle insurance driving behavior data, where the ratio of the number of people at risk to the number not at risk often deviates severely from 1:1, the synthetic minority oversampling technique (SMOTE) is adopted to expand the unbalanced samples, and simulated data are generated on the basis of the real data set after feature screening with Lasso regression; the SMOTE algorithm, a synthetic minority oversampling technique commonly used to balance data sets, generates synthetic samples by interpolating between minority-class samples in the feature space, which effectively alleviates the imbalance between samples of different classes and improves the accuracy and robustness of the model;
S3, training by utilizing the data set and the CART decision tree algorithm to obtain a decision agent model, wherein the decision agent model is used for predicting whether each user is in danger or not;
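For illustration, training such a decision agent with a CART-style classifier might look like the sketch below; the scikit-learn API, the hold-out split and the hyper-parameters are assumptions rather than the patented configuration.

```python
# Illustrative sketch of S3: train a CART-style decision tree on the balanced
# data set to predict whether an insuring user will be at risk (1) or not (0).
# The scikit-learn API and hyper-parameters shown here are assumptions.
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

def train_decision_agent(X, y, max_depth=6, random_state=0):
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=random_state, stratify=y)
    agent = DecisionTreeClassifier(criterion="gini", max_depth=max_depth,
                                   random_state=random_state)  # CART uses the Gini impurity
    agent.fit(X_train, y_train)
    print("hold-out accuracy:", agent.score(X_test, y_test))
    return agent  # agent.predict(X_new) yields 0 (no risk) / 1 (risk)
```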
S4, constructing a data-driven scoring mechanism by combining the decision agent model, the current actual pricing strategy of the vehicle insurance enterprise and the historical risk times, and calculating the enterprise-benefit-oriented vehicle insurance enterprise target value score_e and the user-satisfaction-oriented insurance user target value score_c, together with their initial values; the specific process is as follows:
S4-1, forward processing is respectively applied to the evaluation indexes in the sample corresponding to an insuring user, and the driving behavior score of that user is calculated by combining the weight distribution of each evaluation index in the current actual pricing strategy of the vehicle insurance enterprise, namely:
s_i = Σ_{k=1}^{num} w_k·x_k;
wherein s_i represents the driving behavior score of the i-th user, num represents the number of evaluation indexes, x_k represents the k-th (forward-processed) evaluation index, and w_k represents the weight of the k-th evaluation index; the larger w_k is, the larger the influence of that index on whether a risk occurs;
The indexes considered by an insurance company generally fall into extremely-large, extremely-small, intermediate and interval types, which means that a value occupies a more favorable position the closer it is to the maximum/minimum, to some particular number, or to some range of values. Taking night driving time as an example, night driving is generally regarded as a behavior that may bring more danger, so the shorter the night driving time, the smaller the probability that the applicant will be at risk; the smaller this index the better, so the night driving time index is an extremely-small-type index and requires forward processing. Forward processing is an operation that unifies the trend of different indexes so that the final indexes can be summarized and analyzed more conveniently and clearly. Here, forward processing converts the extremely-small, interval and intermediate indexes into extremely-large ones, where the method of converting an extremely-small index into an extremely-large one is max(x_i) − x;
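A small sketch of S4-1 under stated assumptions (the weighted-sum form of s_i and the per-column index types are assumptions consistent with, but not quoted from, the text):

```python
# Illustrative sketch of S4-1: convert "the smaller the better" indexes into
# "the larger the better" ones via max(x) - x, then combine the forward-processed
# indexes with the pricing strategy's weights into a driving behavior score.
import numpy as np

def forward_process(column: np.ndarray, kind: str) -> np.ndarray:
    if kind == "min":                  # extremely-small index -> extremely-large
        return column.max() - column
    return column                      # extremely-large indexes are kept as-is

def driving_behavior_scores(X: np.ndarray, kinds: list, weights: np.ndarray) -> np.ndarray:
    processed = np.column_stack(
        [forward_process(X[:, k], kinds[k]) for k in range(X.shape[1])])
    return processed @ weights         # s_i = sum_k w_k * x_k for each user i
```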
S4-2, the decision agent model is used to predict the risk situation of each insuring user, where 0 represents no risk and 1 represents risk; the number of insuring users predicted not to be at risk is counted and recorded as the decision agent critical value n, i.e. the maximum number of users that can be granted the premium benefit, determined after data analysis of all current insuring users; the driving behavior scores are sorted, the n-th score in descending order is denoted p_n, the smallest driving behavior score is denoted s_min, and the largest driving behavior score is denoted s_max; if s_i < p_n, the applicant cannot obtain the company's premium benefit at this time; if s_i ≥ p_n, the applicant will be included in the consideration of the company's premium policy;
A decision agent (decision agent) generally refers to an autonomous entity that is able to observe the surrounding environment and take action based on its goals. Intelligent agents are a core concept of current artificial intelligence research, governing and linking research in various sub-fields. Basically, the decision agent is an autonomous decision system based on an algorithm, which is based on big data and uses machine learning and deep learning algorithms as cores, so that an autonomous decision system with an algorithm decision function can be formed. Decision making is the core of behavior, and intelligent agents can solve problems through learning and reasoning and make decisions based on their goals and circumstances. They can acquire information by interacting with the environment and use this information to update their internal models and knowledge bases. These models and knowledge bases can be used to guide the agent's behavior and help it make better decisions.
S4-3, the predicted risk situation and the driving behavior score s_i of each insuring user are traversed, and the enterprise-benefit-oriented insurance company target value score_e and the user-satisfaction-oriented insurance user satisfaction target value score_c are calculated; the initial values of score_e and score_c are both 0. Both score_e and score_c pursue score maximization: the larger they are, the better the current actual pricing strategy meets the corresponding demand;
The specific calculation method of the vehicle insurance company target value score e is as follows:
If the user is predicted to be at risk and s_i < p_n, then score_e = score_e + 2 + Reward1, where Reward1 is the first additional score; that is, score_e is increased by 2 for clients predicted to be at risk; the first additional score is granted at the same time: for users predicted to be at risk, the closer the driving behavior score is to the lowest score, the more accurately the at-risk user is identified, so the applicant's rate coefficient can be increased and the company's profit improved;
If the user is predicted to be at risk and s_i ≥ p_n, then score_e = score_e − 2; an at-risk client that is not screened out still receives the premium benefit, so score_e is decreased by 2;
If the user is predicted not to be at risk and s_i < p_n, then score_e = score_e − 1; a client who is not at risk cannot obtain the premium benefit, so score_e is decreased by 1.
The specific calculation method of the target value score c of the satisfaction degree of the user comprises the following steps:
If the user is predicted not to be at risk and s_i < p_n, then score_c = score_c − 2; the insuring client is predicted to have no risk yet receives no premium preference, so score_c is decreased by 2;
If the user is predicted to be at risk and s_i ≥ p_n, then score_c = score_c + 1 + Reward2, where Reward2 is the second additional score; the client still enjoys the premium benefit, so score_c is increased by 1; the second additional score is granted at the same time: the current user's driving behavior score is higher than the decision agent critical score, and the smaller the difference between the two, the larger score_c becomes;
If the user is predicted to be at risk and s_i < p_n, then score_c = score_c + Reward3, where Reward3 is the third additional score; the current user's driving behavior score is lower than the decision agent critical score, and the smaller the difference between the two, the larger score_c becomes;
If the user is predicted not to be at risk and s_i ≥ p_n, then score_c = score_c + 2; the insuring client is predicted to have no risk and can normally enjoy the premium preference, so score_c is increased by 2.
And S5, utilizing a multi-target particle swarm algorithm and combining the methods of the steps S1 to S4 to explore and assign the weights of the evaluation indexes, and obtaining an optimal solution of the pricing strategy of the automobile insurance as the latest actual pricing strategy. As shown in fig. 2, the specific process is as follows:
S5-1, setting the maximum number of evaluations MaxGen and the population size NP; initializing the particle population pop, i.e. randomly generating NP particle codes; each particle code comprises two D-dimensional vectors, a position vector X_i = (X_i1, X_i2, …, X_iD) and a velocity vector V_i = (V_i1, V_i2, …, V_iD), where each value in the position vector corresponds to the weight of one evaluation index; the adaptation values of all particle-coded individuals in pop are evaluated according to the methods of steps S1 to S4; the adaptation values comprise the vehicle insurance company target value score_e and the insuring user satisfaction target value score_c; a set of local optimal solutions pBest is initialized, corresponding respectively to each individual in the current pop;
In the multi-objective particle swarm optimization algorithm, each particle has a D-dimensional decision search space; the particles move at a certain velocity in the search space, and the velocity can be dynamically adjusted according to a particle's own experience and the flight experience of its companions; the historical best position of each particle is recorded as pBest = (pBest_1, pBest_2, …, pBest_NP). S5-2, the non-dominated solutions in pop are computed according to the adaptation values, and the obtained non-dominated solutions are stored in the external archive A; a non-dominated solution is a concept from multi-objective optimization and refers to a solution that is not dominated by any other solution in a set of candidate solutions, where one solution dominates another when it is better than or equal to it on all objective functions and strictly better on at least one. For example, if there are three pricing strategies with adaptation values [80,91], [78,85] and [79,83], then [80,91] is a non-dominated solution while [78,85] and [79,83] are dominated solutions.
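A minimal sketch of the dominance test implied by this example, assuming both target values are to be maximized:

```python
# Illustrative sketch: dominance check and non-dominated filtering for
# two-objective maximization, consistent with the [80,91]/[78,85]/[79,83] example.
def dominates(a, b):
    """True if solution a is at least as good as b on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def non_dominated(solutions):
    return [s for s in solutions
            if not any(dominates(t, s) for t in solutions if t is not s)]

# non_dominated([[80, 91], [78, 85], [79, 83]]) -> [[80, 91]]
```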
S5-3, searching or updating an optimal solution gBest on a single target value;
S5-4, updating the particle population pop by applying a three-direction learning strategy, namely:
V_id = w·V_id + c_1·r_1·(pBest_id − X_id) + c_2·r_2·(gBest_id − X_id) + c_3·r_3·(Arch_d − X_id);   X_id = X_id + V_id;
wherein V_id represents the velocity of the i-th particle in the d-th dimension, X_id represents the position of the i-th particle in the d-th dimension, d ∈ [1, D]; w is a preset inertia weight, c_1, c_2 and c_3 are preset acceleration factors, and r_1, r_2 and r_3 are random factors (uniform random numbers in [0,1] in standard particle swarm practice); Arch is a randomly selected individual in the external archive A and Arch_d is its d-th component;
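A minimal NumPy sketch of this three-direction update for a single particle; treating r_1, r_2 and r_3 as uniform random scalars and the default coefficient values are assumptions:

```python
# Illustrative sketch of S5-4: three-direction learning update of one particle.
# pbest, gbest and arch are D-dimensional vectors; w, c1, c2, c3 are preset;
# treating r1, r2, r3 as uniform random scalars in [0, 1) is an assumption.
import numpy as np

def update_particle(x, v, pbest, gbest, arch, w=0.5, c1=1.5, c2=1.5, c3=1.5, rng=None):
    rng = np.random.default_rng(rng)
    r1, r2, r3 = rng.random(3)
    v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x) + c3 * r3 * (arch - x)
    x = x + v
    return x, v
```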
S5-5, the local optimal solution pBest is updated using a Sharpe-ratio-inspired strategy; in multi-objective optimization, how to determine whether a new solution is better than an old one is worth discussing: the two solutions are often mutually non-dominated, which makes it difficult for conventional strategies to compare them any further. The invention borrows the Sharpe ratio, often used for investment portfolios, to measure the cost performance of generating a new solution; in investment, the Sharpe ratio represents the return obtained for the risk borne by the fund, and the larger its value, the higher the return obtained per unit of risk. The specific steps are as follows:
S5-5-1, respectively comparing the local optimal solution pBest with two target values of individuals at corresponding positions in sequence in the population pop, if the target value of the individuals in the population pop is greater than the target value of the individuals in pBest, the corresponding target value is optimized, otherwise, the corresponding target value is weakened; the optimized or attenuated amount of the target value is the absolute value of the difference of the same target value for individuals with pop and pBest at the same location;
S5-5-2, the number of optimized target values and the number of weakened target values in a single individual are counted, and the maximum optimization amount, minimum optimization amount, maximum weakening amount and minimum weakening amount on each single target are obtained; for example, if the adaptation value of the current particle is (5, 2) and the adaptation value of the corresponding pBest is (1, 3), then only the first target is optimized, with an optimization amount of 5 − 1 = 4, while the second target is weakened, with a weakening amount of 3 − 2 = 1.
S5-5-3, the optimization or weakening amount of the individual on a single target value is normalized to avoid dimensional differences, namely:
normalized optimization amount = (single-target optimization amount − minimum optimization amount)/(maximum optimization amount − minimum optimization amount); for example, assuming that the minimum optimization amount of the current population on the first target is 1 and the maximum optimization amount is 11, the normalized optimization amount of the particle above on the first target is (4 − 1)/(11 − 1) = 0.3; the weakening amount is handled similarly:
normalized weakening amount = (single-target weakening amount − minimum weakening amount)/(maximum weakening amount − minimum weakening amount);
where the normalized optimization amount and the normalized weakening amount are the individual's normalized optimization and weakening values on a single target value, respectively;
if both target values of the individual are optimized, the individual's final normalized optimization amount is the sum of its normalized optimization amounts on the two target values; if both target values of the individual are weakened, the individual's final normalized weakening amount is the sum of its normalized weakening amounts on the two target values;
S5-5-4, the Sharpe-ratio-style value Dominate_i of each individual is calculated in turn; if Dominate_i = 1, pBest is updated, i.e. the i-th individual of pop becomes the new pBest at the same position.
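The exact Sharpe-ratio-style formula for Dominate_i is not reproduced in this text, so the sketch below substitutes an assumed criterion (the particle replaces its pBest when its total normalized optimization amount exceeds its total normalized weakening amount) purely to illustrate the flow of S5-5-1 to S5-5-4:

```python
# Illustrative sketch of S5-5: compare each particle with its pBest on both
# targets (maximization assumed), normalize the optimization / weakening amounts
# per target across the population, and replace pBest when the assumed
# criterion holds. The Dominate_i rule used here is an assumption.
import numpy as np

def update_pbest(pop_fitness, pbest_fitness, pop, pbest):
    diffs = pop_fitness - pbest_fitness            # >0: optimized, <0: weakened, per target
    opt = np.clip(diffs, 0, None)
    weak = np.clip(-diffs, 0, None)

    def normalize(a):                              # per-target min-max normalization
        lo, hi = a.min(axis=0), a.max(axis=0)
        span = np.where(hi > lo, hi - lo, 1.0)
        return (a - lo) / span

    opt_sum = normalize(opt).sum(axis=1)           # total normalized optimization amount
    weak_sum = normalize(weak).sum(axis=1)         # total normalized weakening amount
    dominate = (opt_sum > weak_sum).astype(int)    # assumed Sharpe-style "return vs. risk" test
    for i, d in enumerate(dominate):
        if d == 1:                                 # the i-th particle becomes the new pBest
            pbest[i] = pop[i].copy()
            pbest_fitness[i] = pop_fitness[i].copy()
    return pbest, pbest_fitness
```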
S5-6, updating the individuals in the external archive A by adopting an elite strategy; the method comprises the following specific steps:
S5-6-1, a population E is set, in which each individual E_i equals the individual Arch_i in A; the data in one randomly selected dimension R of each E_i is perturbed with Gaussian noise, namely:
E_{i,R} = E_{i,R} + (X_{max,R} − X_{min,R})·Gaussian(0,1);
where E_{i,R} represents the position data stored by E_i in the R-th dimension, 1 ≤ R ≤ D; X_{max,R} and X_{min,R} are the preset upper and lower bounds of E_{i,R}, respectively; Gaussian(0,1) is a random value generated from a Gaussian distribution with mean 0 and standard deviation 1;
S5-6-2, it is checked whether E_{i,R} after the perturbation lies within the search range [X_{min,R}, X_{max,R}]; if E_{i,R} > X_{max,R}, then E_{i,R} = X_{max,R}; if E_{i,R} < X_{min,R}, then E_{i,R} = X_{min,R};
s5-6-3, calculating an adaptation value of each E i according to the method of the step 1 to the step 4;
S5-6-4, mixing the individuals in the external archive A, the individuals in the local optimal solution pBest and the individuals in the population E to form a population S;
S5-6-5, after non-dominant sorting of the individuals of S, preserving the non-dominant solutions and rejecting the remaining solutions; if the number of individuals in S is then smaller than or equal to the number of individuals in A, directly using all solutions of S as the new archive A; otherwise, calculating the crowding degree of each individual in S and selecting new archive solutions in descending order of crowding degree until the preset capacity of the external archive A is reached; the crowding degree is calculated by the following steps:
sorting all individuals in S according to a single target value, and obtaining the maximum target value f_max and the minimum target value f_min; denoting the target value of the rs-th individual after sorting as f_rs, the crowding degree of that individual on the single target value is (f_rs+1 - f_rs-1)/(f_max - f_min); the crowding degrees of each individual over all target values are accumulated as the individual's final crowding degree. By adding the elite strategy and the archive updating strategy after non-dominant sorting, the exploration capability of the algorithm can be improved in each generation of particle learning, and the archived solutions can in turn guide the update of the new generation of particles, thereby improving the quality of the solutions.
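The crowding-degree computation and archive truncation of step S5-6-5 could be sketched as follows; assigning the two boundary individuals an infinite crowding degree follows the usual crowding-distance convention and is an assumption, since the embodiment does not state how the endpoints are treated.

```python
import numpy as np

def crowding_degree(fitness):
    """fitness: array of shape (n, n_obj). Returns one crowding value per individual."""
    n, n_obj = fitness.shape
    crowd = np.zeros(n)
    for m in range(n_obj):
        order = np.argsort(fitness[:, m])            # sort by the m-th target value
        f = fitness[order, m]
        f_min, f_max = f[0], f[-1]
        span = f_max - f_min if f_max > f_min else 1.0
        d = np.zeros(n)
        d[order[0]] = d[order[-1]] = np.inf          # keep the extreme solutions (assumption)
        d[order[1:-1]] = (f[2:] - f[:-2]) / span     # (f_rs+1 - f_rs-1) / (f_max - f_min)
        crowd += d                                   # accumulate over all targets
    return crowd

def truncate_archive(solutions, fitness, capacity):
    """Select archive members from the largest crowding degree downwards."""
    if len(solutions) <= capacity:
        return solutions
    keep = np.argsort(-crowding_degree(fitness))[:capacity]
    return [solutions[i] for i in keep]
```

Selecting from the largest crowding degree downwards keeps the most widely spaced solutions, matching the "descending order of crowding degree" selection described above.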
S5-7, judging whether the current iteration reaches the maximum evaluation times, if so, ending the iteration, and taking all obtained non-dominant solutions as optimal solutions of the pricing strategies, wherein the optimal solutions are shown in FIG. 3; otherwise, step S5-3 is executed, and iteration is continued.
Compared with the prior art, which optimizes only a single, specific pricing orientation, the present scheme can define differentiated dynamic premiums more reasonably and more accurately, improve customer satisfaction and enterprise competitiveness, and therefore has important practical significance.
In the present scheme, when solving the multi-objective problem of automobile insurance pricing, a particle swarm updating strategy with three learning directions is built on top of the genetic-algorithm framework, and the non-dominant solutions stored in the external archive also participate in guiding the particle update, so that the convergence speed of the algorithm is improved; finally, a set of optimal solutions meeting the requirements of different pricing strategies is obtained. Meanwhile, a local-optimum updating strategy based on the Sharpe-ratio concept is provided: the local optimal solution of the previous generation and the particles of the new generation are compared in terms of the number of optimized targets and the degree of optimization, and these statistics are used to measure the relative quality of the new and old solutions, so that a better optimization result is obtained. This solves the technical problem that a single genetic algorithm converges prematurely and has a low convergence rate.
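For readers who find pseudocode easier to follow, one generation of the three-direction learning update of step S5-4 might look like the sketch below; treating r1, r2 and r3 as uniform random numbers in [0, 1] and clipping the updated positions back into the search range are conventional particle swarm choices assumed here rather than taken from the text.

```python
import numpy as np

def three_direction_update(X, V, pbest, gbest, archive, w, c1, c2, c3,
                           x_min, x_max, rng=np.random.default_rng()):
    """One generation of the three-direction learning update (step S5-4).

    X, V, pbest: arrays of shape (NP, D); gbest: shape (D,);
    archive: list of archived non-dominated positions, each of shape (D,).
    """
    NP, D = X.shape
    arch = archive[rng.integers(0, len(archive))]         # randomly selected archive member
    r1, r2, r3 = (rng.random((NP, D)) for _ in range(3))  # assumed uniform in [0, 1]
    V = (w * V
         + c1 * r1 * (pbest - X)
         + c2 * r2 * (gbest - X)
         + c3 * r3 * (arch - X))
    X = np.clip(X + V, x_min, x_max)                      # keep weights in range (added safeguard)
    return X, V
```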
The embodiment also provides a terminal, which comprises a storage device storing a plurality of instructions and a processor for executing the instructions in the storage device, wherein the instructions are suitable for being loaded by the processor to execute the multi-objective optimization method applied to automobile insurance pricing.
Variations and modifications to the above would be obvious to persons skilled in the art to which the invention pertains from the foregoing description and teachings. Therefore, the invention is not limited to the specific embodiments disclosed and described above, but some modifications and changes of the invention should be also included in the scope of the claims of the invention. In addition, although specific terms are used in the present specification, these terms are for convenience of description only and do not limit the present invention in any way.
Claims (6)
1. A multi-objective optimization method applied to automobile insurance pricing is characterized by comprising the following specific steps:
S1, collecting information of a plurality of insurance users, and extracting driving behavior factors in the information, wherein the driving behavior factors comprise historical risk-giving times of the insurance users; screening the driving behavior factors by using a Lasso regression method, wherein the driving behavior factors obtained by screening are used as evaluation indexes for measuring risks of the insuring users; wherein the historical risk number does not participate in the screening; a plurality of driving behavior factors of one of the insuring users is one sample; the plurality of screened samples form a data set;
s2, setting a sampling rate; dividing the samples in the data set into a class A sample and a class B sample according to a preset label; generating a plurality of analog data samples by using an SMOTE algorithm, and filling the analog data samples into the data set until the ratio of the class A samples to the class B samples is equal to the sampling rate;
S3, training a decision agent model by utilizing the data set and the CART decision tree algorithm, wherein the decision agent model is used for predicting whether each user will be at risk;
S4, constructing a data-driven scoring mechanism by combining the decision agent model, the current actual pricing strategy of the vehicle insurance enterprise and the historical number of claims, and calculating a vehicle insurance enterprise target value score_e and an insurance user satisfaction target value score_c, wherein the initial values of score_e and score_c are both 0; the specific process is as follows:
S4-1, respectively carrying out forward processing on the plurality of evaluation indexes in the sample corresponding to an insurance user, and calculating the driving behavior score of the insurance user by combining the weight distribution of each evaluation index in the current actual pricing strategy of the vehicle insurance enterprise, namely:
s_i = Σ_{k=1}^{num} w_k * x_k;
wherein s_i represents the driving behavior score of the i-th user, num represents the number of evaluation indexes, x_k represents the k-th evaluation index, and w_k represents the weight of the k-th evaluation index;
S4-2, predicting the risk situation of each insuring user with the decision agent model, where 0 represents no risk and 1 represents risk; counting the number of insuring users predicted not to be at risk and recording it as the decision agent critical value n; sorting the driving behavior scores, wherein the n-th score in descending order is denoted p_n, the smallest driving behavior score is denoted s_min, and the largest driving behavior score is denoted s_max;
S4-3, traversing the predicted risk situation and the driving behavior score s_i of each insurance user, and calculating the vehicle insurance company target value score_e, which is oriented to the interests of the enterprise, and the insurance user satisfaction target value score_c, which is oriented to the satisfaction of the insurance user;
The specific calculation method of the vehicle insurance company target value score_e is as follows:
if the predicted risk situation takes the corresponding value and s_i < p_n, then score_e = score_e + 2 + Reward_1, wherein Reward_1 is the first additional score;
if the predicted risk situation takes the corresponding value and s_i ≥ p_n, then score_e = score_e - 2;
if the predicted risk situation takes the corresponding value and s_i < p_n, then score_e = score_e - 1;
The specific calculation method of the insurance user satisfaction target value score_c is as follows:
if the predicted risk situation takes the corresponding value and s_i < p_n, then score_c = score_c - 2;
if the predicted risk situation takes the corresponding value and s_i ≥ p_n, then score_c = score_c + 1 + Reward_2, wherein Reward_2 is the second additional score;
if the predicted risk situation takes the corresponding value and s_i < p_n, then score_c = score_c + Reward_3, wherein Reward_3 is the third additional score;
if the predicted risk situation takes the corresponding value and s_i ≥ p_n, then score_c = score_c + 2;
S5, utilizing a multi-target particle swarm algorithm and combining the methods from the step S1 to the step S4 to explore and assign the weights of the evaluation indexes to obtain an optimal solution of the pricing strategy of the automobile insurance, wherein the optimal solution is used as the latest actual pricing strategy; the specific process is as follows:
S5-1, setting the maximum number of evaluations MaxGen and the population size NP; initializing a particle population pop, namely randomly generating NP particle codes; each particle code comprises two D-dimensional vectors, namely a position vector X_i = (X_i1, X_i2, …, X_iD) and a velocity vector V_i = (V_i1, V_i2, …, V_iD), wherein each value in the position vector corresponds to the weight of one evaluation index; evaluating the adaptation values of all particle-coded individuals in pop according to the methods of steps S1 to S4; the adaptation values include the vehicle insurance company target value score_e and the insurance user satisfaction target value score_c; initializing a plurality of local optimal solutions pBest, respectively corresponding to each individual in the current pop;
S5-2, calculating a non-dominant solution in the pop according to the adaptive value, and storing the obtained non-dominant solution into an external archive A;
s5-3, generating or updating an optimal solution gBest on a single target value;
S5-4, updating the particle population pop by applying a three-direction learning strategy, namely:
V_id = w*V_id + c1*r1*(pBest_id - X_id) + c2*r2*(gBest_id - X_id) + c3*r3*(Arch - X_id); X_id = X_id + V_id;
wherein V_id represents the velocity of the i-th particle in the d-th dimension, X_id represents the position of the i-th particle in the d-th dimension, d ∈ [1, D]; w is a preset inertia weight, and c1, c2, c3 are preset acceleration factors; Arch is a randomly selected individual in the external archive A;
S5-5, updating the local optimal solution pBest by using a Sharpe-ratio-based strategy;
s5-6, updating the individuals in the external archive A;
S5-7, judging whether the current iteration reaches the maximum evaluation times, if so, ending the iteration, and taking all obtained non-dominant solutions as optimal solutions of the pricing strategies; otherwise, step S5-3 is executed, and iteration is continued.
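By way of a hedged illustration of steps S4-1 and S4-2 of claim 1, the driving behavior scores and the decision agent critical value p_n might be computed as below; the array layout and the names are assumptions, and the reward rules of step S4-3 are deliberately omitted because they depend on the predicted-risk conditions of the pricing strategy.

```python
import numpy as np

def behavior_scores_and_threshold(X_eval, weights, agent_predictions):
    """Sketch of steps S4-1 and S4-2.

    X_eval: (n_users, num) forward-processed evaluation indexes x_k;
    weights: (num,) weights w_k explored by the particle swarm;
    agent_predictions: (n_users,) 0 = no risk, 1 = risk from the decision agent.
    Returns the driving behavior scores s_i and the critical score p_n.
    """
    s = X_eval @ weights                          # s_i = sum_k w_k * x_k
    n = int(np.sum(agent_predictions == 0))       # decision agent critical value n
    ranked = np.sort(s)[::-1]                     # scores in descending order
    p_n = ranked[n - 1] if n >= 1 else ranked[0]  # the n-th largest score (fallback assumed)
    return s, p_n
```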
2. The multi-objective optimization method for automobile insurance pricing according to claim 1, wherein in the Lasso regression method of step S1, the loss function of the Lasso regression is:
L(β) = Σ_{i=1}^{m} (label_i - x_i·β)^2 + λ * Σ_{j=1}^{p} |β_j|;
wherein m is the number of samples; p is the number of driving behavior factors in a sample; label_i indicates whether the i-th insured user is at risk, 0 indicating no risk and 1 indicating risk; X is an m × p matrix, where x_i is the vector containing the driving behavior factors of the i-th sample; λ is a non-negative constant representing the regularization parameter; β and β_j are the model coefficients of the Lasso regression, wherein β is a p-dimensional vector representing the coefficients of all driving behavior factors, and β_j is a scalar representing the coefficient of the j-th driving behavior factor; the coefficients are calculated through the loss function, and the num driving behavior factors with the largest coefficients are selected as evaluation indexes.
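A minimal sketch of the screening in claim 2, using scikit-learn's Lasso as one possible implementation (its alpha parameter plays the role of λ here, and ranking by absolute coefficient size is an assumption, since the claim only speaks of the largest coefficients):

```python
import numpy as np
from sklearn.linear_model import Lasso

def screen_driving_behavior_factors(X, labels, lam, num):
    """Fit a Lasso model and keep the num factors with the largest coefficients.

    X: (m, p) matrix of driving behavior factors; labels: (m,) 0/1 risk labels.
    Returns the indexes of the selected factors and their coefficients.
    """
    model = Lasso(alpha=lam)                    # alpha corresponds to the λ of the claim
    model.fit(X, labels)
    beta = model.coef_                          # one coefficient per driving behavior factor
    keep = np.argsort(-np.abs(beta))[:num]      # largest coefficients by absolute value (assumed)
    return keep, beta[keep]
```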
3. The multi-objective optimization method for vehicle insurance pricing according to claim 1, wherein in step S2, the specific steps of generating the simulated data samples are:
S2-1, converting the samples into feature vectors v_i^r;
S2-2, for each feature vector v_i^r, calculating the Euclidean distances between v_i^r and the other feature vectors, and obtaining its nearest-neighbor feature vector v_i^near;
S2-3, creating a new feature vector v_new by random linear interpolation between v_i^r and v_i^near, namely:
v_new = v_i^r + (v_i^r - v_i^near) * rand(0, 1);
S2-4, storing v_new in the data set.
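A minimal sketch of the sample generation in claim 3 is given below; it follows the formula of step S2-3 literally, whereas the classical SMOTE construction interpolates toward the neighbor, i.e. uses (v_near - v_r) * rand(0, 1), so the sign convention here is simply the one printed in the claim. The function and parameter names are illustrative assumptions.

```python
import numpy as np

def generate_simulated_samples(V, n_new, rng=np.random.default_rng()):
    """Generate n_new simulated samples from feature vectors V (steps S2-1 to S2-4).

    V: (n, d) array of feature vectors v_r from the data set.
    """
    new_samples = []
    for _ in range(n_new):
        i = rng.integers(0, len(V))
        dists = np.linalg.norm(V - V[i], axis=1)       # Euclidean distances to all vectors
        dists[i] = np.inf                              # exclude the vector itself
        near = np.argmin(dists)                        # nearest-neighbor vector v_near
        v_new = V[i] + (V[i] - V[near]) * rng.random() # formula of step S2-3, taken literally
        new_samples.append(v_new)
    return np.vstack(new_samples)
```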
4. The multi-objective optimization method for vehicle insurance pricing according to claim 1, wherein the specific steps of step S5-5 are:
S5-5-1, sequentially comparing the two target values of the local optimal solution pBest with those of the individual at the corresponding position in the population pop; if a target value of the individual in the population pop is greater than the corresponding target value of the individual in pBest, it indicates that the corresponding target value is optimized, otherwise it indicates that the corresponding target value is weakened; the optimization or weakening amount of a target value is the absolute value of the difference of that target value between the individuals of pop and pBest at the same position;
S5-5-2, counting the number of optimized target values and the number of weakened target values in a single individual, and obtaining, over the current population, the maximum and minimum optimization amounts and the maximum and minimum weakening amounts on each single target;
S5-5-3, normalizing the optimization amount or weakening amount of the individual on a single target value, namely:
the normalized optimization amount is calculated as (single-target optimization amount - minimum optimization amount)/(maximum optimization amount - minimum optimization amount);
the normalized weakening amount is calculated as (single-target weakening amount - minimum weakening amount)/(maximum weakening amount - minimum weakening amount);
wherein the normalized optimization amount and the normalized weakening amount are the normalized values of the individual's optimization and weakening on a single target value;
if both target values of the individual are optimized, the final optimization value of the individual is the sum of its normalized optimization amounts over the two target values;
if both target values of the individual are weakened, the final weakening value of the individual is the sum of its normalized weakening amounts over the two target values;
S5-5-4, sequentially calculating, from the normalized optimization and weakening values obtained in the above process, a Sharpe-ratio-style dominance value Dominate_i for each individual, namely:
if Dominate_i = 1, then pBest is updated, i.e., the i-th individual of pop becomes the new pBest at the same position.
5. The multi-objective optimization method for vehicle insurance pricing according to claim 1, wherein in step S5-6, the individuals in the external archive A are updated with an elite strategy, specifically comprising the steps of:
S5-6-1, setting a population E, wherein each individual E_i in E is equal to the individual Arch_i in A; applying a Gaussian perturbation to the data in a randomly selected dimension R of each E_i, namely:
E_i,R = E_i,R + (X_max,R - X_min,R) * Gaussian(0, 1);
wherein E_i,R represents the position data stored by E_i in the R-th dimension, 1 ≤ R ≤ D; X_max,R and X_min,R are the preset upper and lower bounds of E_i,R, respectively; Gaussian(0, 1) is a random value drawn from a Gaussian distribution with mean 0 and standard deviation 1;
S5-6-2, checking whether the perturbed E_i,R is within the search range [X_min,R, X_max,R]; if E_i,R > X_max,R, then E_i,R = X_max,R; if E_i,R < X_min,R, then E_i,R = X_min,R;
S5-6-3, calculating the adaptation value of each E_i according to the methods of steps S1 to S4;
S5-6-4, mixing the individuals in the external archive A, the individuals in the local optimal solution pBest and the individuals in the population E to form a population S;
S5-6-5, after non-dominant sorting of the individuals of S, preserving the non-dominant solutions and rejecting the remaining solutions; if the number of individuals in S is then smaller than or equal to the number of individuals in A, directly using all solutions of S as the new archive A; otherwise, calculating the crowding degree of each individual in S and selecting new archive solutions in descending order of crowding degree until the preset capacity of the external archive A is reached; the crowding degree is calculated by the following steps:
sorting all individuals in S according to a single target value, and obtaining the maximum target value f_max and the minimum target value f_min; denoting the target value of the rs-th individual after sorting as f_rs, the crowding degree of that individual on the single target value is (f_rs+1 - f_rs-1)/(f_max - f_min); the crowding degrees of each individual over all target values are accumulated as the individual's final crowding degree.
6. A terminal comprising a storage device storing a plurality of instructions and a processor for executing the instructions in the storage device, wherein the instructions are adapted to be loaded and executed by the processor to perform the multi-objective optimization method applied to automobile insurance pricing according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311868143.XA CN117787497B (en) | 2023-12-29 | 2023-12-29 | Multi-objective optimization method and terminal applied to automobile insurance pricing |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117787497A CN117787497A (en) | 2024-03-29 |
CN117787497B true CN117787497B (en) | 2024-06-25 |
Family
ID=90384910
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311868143.XA Active CN117787497B (en) | 2023-12-29 | 2023-12-29 | Multi-objective optimization method and terminal applied to automobile insurance pricing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117787497B (en) |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130018818A1 (en) * | 2011-07-13 | 2013-01-17 | Tapesh Yadav | Systems And Methods For Investment Portfolio Management |
CN103646178A (en) * | 2013-12-18 | 2014-03-19 | 中国石油大学(华东) | Multi-objective optimization method based on improved gravitation search algorithm |
CN112417770B (en) * | 2020-12-09 | 2022-03-18 | 浙江工业大学 | Site selection optimization method based on multi-mode multi-target particle swarm optimization algorithm |
CN113012449B (en) * | 2021-03-11 | 2022-03-29 | 华南理工大学 | Smart city signal lamp timing optimization method based on multi-sample learning particle swarm |
CN114565131A (en) * | 2022-01-18 | 2022-05-31 | 国网河北省电力有限公司经济技术研究院 | Peak shift phenomenon-oriented multi-price demand response pricing system based on genetic algorithm |
CN117094781B (en) * | 2023-08-25 | 2024-02-09 | 国任财产保险股份有限公司 | Intelligent car insurance claim settlement processing method and system |
Non-Patent Citations (2)
Title |
---|
Extraction of dynamic characteristic parameters of drunk driving behavior in car-following scenarios; Sun Yifan; Zhang Jinglei; Wang Sisi; Science Technology and Engineering; 2017-10-08 (28); full text *
Lei Ruilong; Hou Ligang; Cao Jiangtao. A multi-objective particle swarm optimization algorithm based on multiple strategies. Computer Engineering and Applications. (08), full text. *
Also Published As
Publication number | Publication date |
---|---|
CN117787497A (en) | 2024-03-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111738512B (en) | Short-term power load prediction method based on CNN-IPSO-GRU hybrid model | |
Shao et al. | Nickel price forecast based on the LSTM neural network optimized by the improved PSO algorithm | |
Kamruzzaman et al. | Forecasting of currency exchange rates using ANN: A case study | |
CN107909179B (en) | Method for constructing prediction model of running condition of plug-in hybrid vehicle and vehicle energy management method | |
CN106600059A (en) | Intelligent power grid short-term load predication method based on improved RBF neural network | |
CN106815782A (en) | A kind of real estate estimation method and system based on neutral net statistical models | |
CN109345027B (en) | Micro-grid short-term load prediction method based on independent component analysis and support vector machine | |
Hassan et al. | A hybrid of multiobjective Evolutionary Algorithm and HMM-Fuzzy model for time series prediction | |
CN113408869A (en) | Power distribution network construction target risk assessment method | |
CN113158024A (en) | Causal reasoning method for correcting popularity deviation of recommendation system | |
CN110910180A (en) | Information pushing method and device, electronic equipment and storage medium | |
CN112819256A (en) | Convolution time sequence room price prediction method based on attention mechanism | |
CN108985523A (en) | A kind of wavelet neural network Short-Term Load Forecasting Method improving harmony chess game optimization | |
CN113344589A (en) | Intelligent identification method for collusion behavior of power generation enterprise based on VAEGMM model | |
CN112036598A (en) | Charging pile use information prediction method based on multi-information coupling | |
CN112036651A (en) | Electricity price prediction method based on quantum immune optimization BP neural network algorithm | |
CN117787497B (en) | Multi-objective optimization method and terminal applied to automobile insurance pricing | |
CN115936184B (en) | Load prediction matching method suitable for multi-user types | |
CN116307199A (en) | Load prediction method for power system | |
CN110956528B (en) | Recommendation method and system for e-commerce platform | |
CN111538333B (en) | Dynamic vehicle path optimization method based on fixed integral rolling time domain control strategy | |
Yu et al. | Developing and assessing an intelligent forex rolling forecasting and trading decision support system for online e‐service | |
Shi et al. | Factors Affecting Accuracy of Genotype Imputation Using Neural Networks in Deep Learning | |
Theofilatos et al. | Modelling and Trading the DJIA Financial Index using neural networks optimized with adaptive evolutionary algorithms | |
CN110910164A (en) | Product sales forecasting method, system, computer device and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||