CN111986808B - Health insurance risk assessment and control method, device and medium - Google Patents

Health insurance risk assessment and control method, device and medium

Info

Publication number
CN111986808B
Authority
CN
China
Prior art keywords
data
health insurance
index
health
insurance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010751841.1A
Other languages
Chinese (zh)
Other versions
CN111986808A (en)
Inventor
王涵
杨杰
吴锋
周肖树
黄业坚
刘状
孙嘉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhuhai Institute Of Advanced Technology Chinese Academy Of Sciences Co ltd
Original Assignee
Zhuhai Institute Of Advanced Technology Chinese Academy Of Sciences Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhuhai Institute Of Advanced Technology Chinese Academy Of Sciences Co ltd filed Critical Zhuhai Institute Of Advanced Technology Chinese Academy Of Sciences Co ltd
Priority to CN202010751841.1A priority Critical patent/CN111986808B/en
Publication of CN111986808A publication Critical patent/CN111986808A/en
Application granted granted Critical
Publication of CN111986808B publication Critical patent/CN111986808B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08 Insurance

Abstract

The invention relates to a technical solution comprising a method, a device and a medium for health insurance risk assessment and control, comprising the following steps: collecting, through a big data environment, data in one or more dimensions related to health insurance risk assessment; calculating health insurance actuarial indexes; preprocessing and regularizing the dimension data and the health insurance actuarial indexes to obtain evaluation index data; calculating a health insurance risk assessment index from the evaluation index data and the health insurance actuarial index data, constructing three-dimensional panel data, making predictions, and outputting the optimal prediction result; and using the accuracy, recall, AUC value and the like of the time-series deep learning model as evaluation parameters of the health insurance risk assessment model. The beneficial effects of the invention are as follows: based on deep learning and a big data cloud computing platform, the major indexes of health insurance actuarial work are computed, and the risk of adverse selection faced by the insurance provider is reduced.

Description

Health insurance risk assessment and control method, device and medium
Technical Field
The invention relates to the field of computers, in particular to a method, a device and a medium for evaluating and controlling health insurance risks.
Background
Health insurance takes a person's physical health as its subject matter. It insures against medical expenses caused by disease or accidental injury and against loss of income caused by disease or accidental disability; it also includes insurance that provides financial compensation for the long-term care required because of aging, disease or disability. Social medical insurance and commercial health insurance both belong to health insurance. One of the basic tasks of health insurance management is actuarial work, which is divided into three major parts: rate making, loss ratio determination and reserve provisioning. Rate making is the primary task of actuarial work. Pricing and fairness are the two fundamental principles of health insurance rate making. The loss ratio is one of the main indexes for evaluating actuarial work. Reserve provisioning measures whether a health insurance enterprise can operate normally over a period of time, or the number of policies the enterprise can underwrite over that period.
The prior art offers no effective technical solution for assessing and controlling health insurance risk.
Disclosure of Invention
The invention aims to solve at least one of the technical problems in the prior art, and provides a method, a device and a medium for health insurance risk assessment and control.
The technical scheme of the invention comprises a method for health insurance risk assessment and control, characterized by comprising the following steps: S100, constructing a big data environment; S200, collecting, through the big data environment, data in one or more dimensions related to health insurance risk assessment; S300, calculating the health insurance actuarial indexes of the dimension data; S400, preprocessing and regularizing the dimension data and the health insurance actuarial indexes to obtain evaluation index data, wherein the preprocessing comprises quantifying text data and/or qualitative data; S500, calculating a health insurance risk assessment index from the evaluation index data and the health insurance actuarial index data, and constructing the corresponding three-dimensional panel data; S600, using the three-dimensional panel data as training data and verification data, making predictions with a time-series deep learning model, and outputting the optimal prediction result.
The method for health insurance risk assessment and control described above, wherein the method further comprises: S700, randomly selecting a number of panel data as test data, feeding them to the time-series deep learning model, and using the accuracy, recall, AUC value and the like of the time-series deep learning model as evaluation parameters of the health insurance risk assessment model.
The method for health insurance risk assessment and control described above, wherein S700 includes: S710, expanding the database of training data and verification data, and inputting it into the time-series deep learning model of S600 for repeated training; and S720, re-extracting training and verification data with a changed extraction ratio, inputting them into the time-series deep learning model for the model training and parameter selection of S600, evaluating the model as in S700, and repeating until the evaluation parameters are stable within a set range, at which point the optimal model parameters are determined.
The method for health insurance risk assessment and control described above, wherein S200 includes: calling the interfaces of one or more platforms to obtain the corresponding dimension data, wherein the dimension data comprise evaluation related factor index data and evaluation index data; the evaluation related factor index data comprise internal operation data of the health insurance company, data related to the contracted claim conditions, and data related to the health management of the insurance applicants; the evaluation index data mainly comprise, but are not limited to, the participation rate for the specified insurance type, the fund credit limit, the renewal rate, the fund balance rate and the annual consumption amount of the people insured under that insurance type.
The method for health insurance risk assessment and control described above, wherein S300 includes: performing actuarial calculation for disease insurance covering contracted diseases, medical insurance covering contracted medical treatment, disability income insurance covering disability caused by contracted diseases or accidental injury, and nursing care insurance covering care needs caused by contracted impairment of daily living ability; the corresponding actuarial indexes comprise rate-making indexes, loss ratio indexes and reserve indexes; the rate-making indexes include, but are not limited to, total claim amount, waiting period, deductible, policy renewal rate, policy lapse rate, interest rate and safety margin, expense amount range, deductible, maximum limit and co-payment proportion; the loss ratio indexes include, but are not limited to, the full-term loss ratio, the annual loss ratio and the comprehensive loss ratio; the reserve indexes include, but are not limited to, the incurred but not reported (IBNR) reserve and the incurred but not enough reported (IBNER) reserve.
The method for health insurance risk assessment and control described above, wherein S400 includes: preprocessing and regularizing the dimension data and the health insurance actuarial indexes; the quantification comprises quantifying qualitative data or text into the range [0,1] using one-hot encoding and other quantification methods, where 0 represents the minimum degree of the index and 1 represents the maximum degree of the index, and removing useless data and filling in missing data; and regularizing the preprocessed data in Python.
The method for health insurance risk assessment and control described above, wherein S500 includes: calculating the value of the health risk assessment by the formula, and taking that value as the health insurance risk assessment value; and constructing three-dimensional panel data with time as the X axis, the regularized evaluation related factor index data and health insurance actuarial index data as the Y axis, and the number of evaluation indexes as the Z axis.
The method for health insurance risk assessment and control described above, wherein S600 includes: using the three-dimensional panel data as training data and verification data, randomly extracting the training data and the verification data in a 3:2 ratio, and inputting them into a time-series deep learning prediction model, which may be, but is not limited to, DeepAR; the set of model parameters with the highest accuracy and an AUC closest to 1 is selected automatically as the optimal result.
The technical scheme of the invention also comprises a health insurance risk assessment and control device which comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, and is characterized in that any one of the method steps is realized when the processor executes the computer program.
The technical solution of the present invention further comprises a computer-readable storage medium storing a computer program, characterized in that the computer program realizes any of the method steps when being executed by a processor.
The beneficial effects of the invention are as follows: based on deep learning and a big data cloud computing platform, the major indexes of health insurance actuarial work are computed, and the risk of adverse selection faced by the insurance provider is reduced.
Drawings
The invention is further described below with reference to the drawings and examples;
FIG. 1 is a general flow diagram according to an embodiment of the present invention;
FIG. 2 is a flow chart of health insurance risk assessment according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating classification of data related to health insurance risk assessment according to an embodiment of the present invention;
FIG. 4 is a flow chart of health insurance risk control according to an embodiment of the present invention;
FIG. 5 is a diagram of a device and a medium according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings. The drawings supplement the written description so that each technical feature and the overall technical scheme of the invention can be understood intuitively and visually; they do not limit the scope of the invention.
In the description of the present invention, "several" means one or more and "a plurality" means two or more; "greater than", "less than", "exceeding" and the like are understood to exclude the stated number, while "above", "below", "within" and the like are understood to include it.
In the description of the present invention, the consecutive numbering of the method steps is used to facilitate examination and understanding. In view of the overall technical scheme of the invention and the logical relationships between the steps, adjusting the order in which the steps are carried out does not affect the technical effect achieved by the technical scheme.
In the description of the present invention, unless explicitly defined otherwise, terms such as arrangement and the like should be construed broadly, and those skilled in the art can reasonably determine the specific meaning of the terms in the present invention in combination with the specific contents of the technical scheme.
Fig. 1 shows a general flow chart according to an embodiment of the invention. The process comprises the following steps: S100, constructing a big data environment; S200, collecting, through the big data environment, data in one or more dimensions related to health insurance risk assessment; S300, calculating the health insurance actuarial indexes of the dimension data; S400, preprocessing and regularizing the dimension data and the health insurance actuarial indexes to obtain evaluation index data, wherein the preprocessing comprises quantifying text data and/or qualitative data; S500, calculating a health insurance risk assessment index from the evaluation index data and the health insurance actuarial index data, and constructing the corresponding three-dimensional panel data; S600, using the three-dimensional panel data as training data and verification data, making predictions with a time-series deep learning model, and outputting the optimal prediction result; S700, randomly selecting a number of panel data as test data, feeding them to the time-series deep learning model, and using the accuracy, recall, AUC value and the like of the time-series deep learning model as evaluation parameters of the health insurance risk assessment model.
Fig. 2 is a flowchart of health insurance risk assessment according to an embodiment of the present invention, and the following embodiments are provided in combination with the flowchart of fig. 1 and the flowchart shown in fig. 2:
Big data environment construction; collection of data related to health insurance risk assessment; data preprocessing and regularization; calculation of the health insurance actuarial indexes; calculation of the health risk assessment index; deep-learning-based health risk assessment; and evaluation and optimization of the health insurance risk assessment model.
Step one: big data environment construction. A big data development environment is built on Spark + MapReduce + Hive + PySpark + Python + MySQL and combined with multi-core, multi-threaded, multi-process parallel computing for development and operation, which improves development speed, background processing speed and data throughput.
Step two: collection of data related to health insurance risk assessment. The data are collected through various data collection means such as API (application programming interface) calls. The data set is divided into two main types: evaluation related factor index data, and evaluation index data. The evaluation related factor index data comprise internal operation data of the health insurance company, data related to the contracted claim conditions, and data related to the health management of the insurance applicants; the evaluation index data mainly comprise, but are not limited to, the participation rate for the specified insurance type, the fund credit limit, the renewal rate, the fund balance rate and the annual consumption amount of the people insured under that insurance type.
Among them, the health insurance company internal operation data includes, but is not limited to:
1. operation-related data of the company, such as departmental KPI attainment, the number of current employees, the number of company training sessions per month, and the company's self-assessment scores for management standardization, management strictness and corporate culture building, which can be obtained through surveys, questionnaires, visits, interviews, company financial audits, enterprise investigations, online reputation and the like;
2. product and service related data of the company, such as self-assessment and reputation scores for the company's business processes, for the rationality of its product design, and for the smoothness of its underwriting and claim settlement, which can be obtained through surveys, questionnaires, visits, symposiums, company financial audits, enterprise audits, online reputation and the like.
The data related to the contracted claim conditions include, but are not limited to: self-assessment and reputation scores for the rationality of the contracted claim conditions, the company's loss amount and the number of policies affected because the contracted claim conditions were imprecise, the number of contracted claim conditions for the specified insurance type, the number of past legal disputes caused by the contracted claim conditions, the number of legal disputes caused by the same claim conditions in the market, and the like. The data can be obtained by means of web crawlers, questionnaires, collation of company documents, market analysis and the like.
The data related to the health management of the insurance applicants include, but are not limited to: group feature data of the applicants for the specific insurance type, their health record data, their health service record forms, their social data, data on the health of their family members, and their insurance application and claim data. The data may be obtained through cooperation with third-party institutions, surveys, questionnaires, visits, interviews and the like. See the classification diagram of data related to health insurance risk assessment shown in FIG. 3.
Step three: calculation of the health insurance actuarial indexes. The calculation covers disease insurance for contracted diseases, medical insurance for contracted medical treatment, disability income insurance for disability caused by contracted diseases or accidental injury, and nursing care insurance for care needs caused by contracted impairment of daily living ability. The corresponding actuarial indexes are as follows. The rate-making indexes include, but are not limited to: total claim amount, waiting period, deductible, policy renewal rate, policy lapse rate, interest rate and safety margin, expenses (costs incurred within the contracted scope of coverage), deductible, maximum limit, and co-payment proportion. The loss ratio indexes include, but are not limited to: the full-term loss ratio, the annual loss ratio and the comprehensive loss ratio. The reserve indexes include, but are not limited to: the incurred but not reported (IBNR) reserve and the incurred but not enough reported (IBNER) reserve.
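As a rough illustration of the loss ratio indexes named in step three, the sketch below computes a simple loss ratio as claims paid divided by premiums earned. The patent does not state the formulas it uses, so the function name `loss_ratio`, the choice of formula and the sample figures are assumptions made for illustration only.

```python
def loss_ratio(claims_paid: float, premiums_earned: float) -> float:
    """Simple loss ratio: claims paid divided by premiums earned (assumed definition)."""
    if premiums_earned <= 0:
        raise ValueError("premiums_earned must be positive")
    return claims_paid / premiums_earned


# Hypothetical figures for one insurance type over one year.
annual_ratio = loss_ratio(claims_paid=720_000.0, premiums_earned=1_000_000.0)
print(f"annual loss ratio: {annual_ratio:.2f}")
```

A ratio close to or above 1 would indicate that claims consume most or all of the premium income for that insurance type.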
Step four: data preprocessing and regularization. The evaluation index data and the health insurance actuarial index data from steps two and three are preprocessed: qualitative data or text are quantified into the range [0,1] using one-hot encoding and other quantification methods, where 0 represents the minimum degree of the index (the maximum risk or the minimum company benefit, where risk or company benefit is involved) and 1 represents the maximum degree of the index (the minimum risk or the maximum company benefit, where risk or company benefit is involved); useless data are removed and missing data are filled in. The preprocessed data are then regularized (normalized) using the sklearn.preprocessing package of Python, and the various data are processed by an information entropy method.
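The preprocessing described in step four can be sketched in plain Python as follows. This is a minimal illustration, not the patent's implementation: it uses hand-written helpers instead of sklearn.preprocessing so that it has no dependencies, the mean-based fill rule is an assumption (the patent does not say how missing data are filled), and the sample columns `renewal_rate` and `payment_mode` are hypothetical.

```python
from typing import List, Optional


def one_hot(values: List[str]) -> List[List[int]]:
    """Encode each category as a 0/1 indicator vector (categories sorted by name)."""
    categories = sorted(set(values))
    return [[1 if value == category else 0 for category in categories] for value in values]


def fill_missing(column: List[Optional[float]]) -> List[float]:
    """Replace None with the mean of the observed values (an assumed fill rule)."""
    observed = [x for x in column if x is not None]
    if not observed:
        raise ValueError("column has no observed values")
    mean = sum(observed) / len(observed)
    return [mean if x is None else x for x in column]


def min_max_scale(column: List[float]) -> List[float]:
    """Rescale a numeric column linearly into [0, 1]."""
    low, high = min(column), max(column)
    if high == low:
        return [0.0 for _ in column]  # a constant column carries no ranking information
    return [(x - low) / (high - low) for x in column]


# Hypothetical sample data, not taken from the patent.
renewal_rate = [0.80, None, 0.60, 1.00]
payment_mode = ["capitation", "fee_for_service", "capitation"]

scaled_renewal = min_max_scale(fill_missing(renewal_rate))
encoded_mode = one_hot(payment_mode)
print(scaled_renewal)
print(encoded_mode)
```

Where the index direction is reversed (for example, 1 meaning minimum risk), the scaled value can be replaced by one minus itself after scaling.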
Step five: calculation of the health risk assessment index. The health risk assessment index is computed from the regularized evaluation index data and health insurance actuarial index data of step four, and serves as the health insurance risk assessment value. The calculation formula is as follows. Finally, the data are arranged into panel data with time as the X axis, the regularized evaluation related factor index data and health insurance actuarial index data as the Y axis, and the number of evaluation indexes as the Z axis. The higher the health insurance risk assessment result, the greater the risk and the smaller the benefit to the insurance unit.
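One plausible way to arrange data into the three-axis panel described in step five is sketched below. The patent does not specify what each cell holds, so the layout `panel[x][y][z]` (time period, factor or actuarial index, evaluation index), the record format and the sample values are all assumptions for illustration.

```python
from typing import List, Optional, Tuple

# (period, factor or actuarial index name, evaluation index name, value)
Record = Tuple[str, str, str, float]


def build_panel(records: List[Record]):
    """Arrange flat records into panel[x][y][z]; cells with no record stay None."""
    periods = sorted({r[0] for r in records})   # X axis: time
    factors = sorted({r[1] for r in records})   # Y axis: factor / actuarial indexes
    targets = sorted({r[2] for r in records})   # Z axis: evaluation indexes
    panel: List[List[List[Optional[float]]]] = [
        [[None for _ in targets] for _ in factors] for _ in periods
    ]
    for period, factor, target, value in records:
        x, y, z = periods.index(period), factors.index(factor), targets.index(target)
        panel[x][y][z] = value
    return periods, factors, targets, panel


# Hypothetical regularized values in [0, 1].
records = [
    ("2019", "loss_ratio", "renewal_rate", 0.40),
    ("2019", "waiting_period", "renewal_rate", 0.10),
    ("2020", "loss_ratio", "renewal_rate", 0.55),
]
periods, factors, targets, panel = build_panel(records)
print(len(panel), len(panel[0]), len(panel[0][0]))
```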
Step six, health risk assessment based on deep learning comprises the following steps:
The panel data arranged above serve as training data and verification data, which are randomly extracted in a 3:2 ratio and input into a time-series deep learning prediction model; the prediction model may be, but is not limited to, DeepAR, and the set of model parameters that gives the highest accuracy and an AUC (Area Under the ROC Curve) closest to 1 is selected automatically.
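The 3:2 random extraction and the automatic parameter selection in step six can be sketched as follows. The model itself is left out: the `results` list stands in for the metrics that a trained time-series model would produce for each candidate parameter set, and its values are hypothetical. Ranking by accuracy first and then by closeness of AUC to 1 is an assumption, since the patent does not say how the two criteria are combined.

```python
import random
from typing import Dict, List, Sequence, Tuple


def split_3_to_2(samples: Sequence, seed: int = 0) -> Tuple[list, list]:
    """Shuffle the samples and split them 3:2 into training and verification sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 3 // 5
    return shuffled[:cut], shuffled[cut:]


def pick_best(results: List[Dict]) -> Dict:
    """Pick the parameter set with the highest accuracy, then the AUC closest to 1."""
    return max(results, key=lambda r: (r["accuracy"], -abs(1.0 - r["auc"])))


train, verify = split_3_to_2(range(10))

# Hypothetical metrics for three candidate parameter sets.
results = [
    {"params": {"epochs": 10}, "accuracy": 0.81, "auc": 0.85},
    {"params": {"epochs": 20}, "accuracy": 0.86, "auc": 0.90},
    {"params": {"epochs": 30}, "accuracy": 0.86, "auc": 0.88},
]
best = pick_best(results)
print(len(train), len(verify), best["params"])
```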
Step seven, evaluating and optimizing the health insurance risk assessment model comprises the following steps:
Model evaluation: one third of the panel data is randomly selected as test data, and the model's accuracy, recall, AUC value and the like are used, without limitation, as the evaluation parameters of the health insurance risk assessment model.
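The three evaluation parameters named above can be computed as in the sketch below. It assumes binary labels (for example, 1 for a high-risk case and 0 otherwise), which the patent does not spell out; the label and score lists are hypothetical. The AUC is computed by pair counting: the fraction of positive-negative pairs in which the positive case receives the higher score, with ties counted as one half.

```python
from typing import List


def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Share of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true: List[int], y_pred: List[int]) -> float:
    """True positives divided by all actual positives."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    actual_pos = sum(1 for t in y_true if t == 1)
    return true_pos / actual_pos


def auc(y_true: List[int], scores: List[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical test labels and model scores.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.7, 0.3, 0.2, 0.6]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(accuracy(y_true, y_pred), recall(y_true, y_pred), auc(y_true, scores))
```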
The optimization process comprises the following steps:
1. expanding the training and verification database and inputting it into step six for repeated training;
2. re-extracting training and verification data with a changed extraction ratio, inputting them into the model, carrying out the model training and parameter selection of step six and the model evaluation of step seven, and repeating until the evaluation parameters of the health insurance risk assessment model stabilize within a certain range, at which point the optimal model parameters are determined.
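The repeat-until-stable loop of the optimization process can be sketched as below. The stopping rule (the last few evaluation values differ by no more than a tolerance) is one possible reading of "stabilize within a certain range"; the names `window` and `tolerance`, their default values, and the stand-in function `fake_round` that plays the role of a full train-and-evaluate round are all assumptions.

```python
from typing import Callable, List


def train_until_stable(
    evaluate_round: Callable[[int], float],
    tolerance: float = 0.01,
    window: int = 3,
    max_rounds: int = 50,
) -> List[float]:
    """Repeat training/evaluation until the last `window` metrics vary by <= tolerance."""
    history: List[float] = []
    for round_index in range(max_rounds):
        history.append(evaluate_round(round_index))
        recent = history[-window:]
        if len(recent) == window and max(recent) - min(recent) <= tolerance:
            break
    return history


def fake_round(round_index: int) -> float:
    """Hypothetical stand-in for retraining and returning the AUC of that round."""
    return 0.9 - 0.2 / (round_index + 1)


history = train_until_stable(fake_round)
print(len(history), round(history[-1], 4))
```

In practice each round would re-extract the data with a new ratio, retrain the model and return the chosen evaluation parameter.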
Fig. 4 is a flowchart of health insurance risk control according to an embodiment of the present invention, and in conjunction with fig. 2, the present invention further proposes health insurance risk control, which is implemented as follows:
The construction of the health insurance risk control system is divided into 10 steps: big data environment construction; collection of data related to health insurance risk control; data preprocessing and regularization; deep-learning-based risk control in clause design; deep-learning-based risk control in underwriting; deep-learning-based risk control in claim settlement; deep-learning-based determination of the reinsurance scope; other deep-learning-based health insurance risk control; selection of the health insurance risk control mode; and evaluation and optimization of the health insurance risk control model.
Step one: big data environment builds and includes:
A big data development environment is built on Spark + MapReduce + Hive + PySpark + Python + MySQL and combined with multi-core, multi-threaded, multi-process parallel computing for development and operation, which improves development speed, background processing speed and data throughput.
Step two: the health insurance risk control related data acquisition includes:
Through surveys, questionnaires, interviews, web crawlers, APIs, collation of company documents, insurance actuarial calculation and the like, a database related to health insurance risk control is established in combination with the database of the health insurance risk assessment system. The database comprises risk control data for clause design, risk control data for underwriting, risk control data for claim settlement, data on whether cases are reinsured, and other health insurance risk control data.
The risk control data for clause design comprise the clause designs used for risk management in the cases in the health insurance risk assessment system database, for example: claim amounts, proportions, insurance payment limits, exclusions, waiting periods, etc.
The risk control data for underwriting comprise information on the underwriting risk management measures taken in the cases in the health insurance risk assessment system database, such as whether the case is underwritten as a standard risk, whether it is underwritten as a substandard risk, whether to report, etc.
The risk control data for claim settlement comprise the claim-related risk management measures in the cases in the health insurance risk assessment system database: whether the insured person concealed a prior medical history, whether the medical services and other insured items used by the insured person exceed the range expected by the insurance unit, whether the medical institution or nursing institution involved in the insured items colludes with the insured person, whether costs have been shifted onto the policy, and the like.
The data on whether cases are reinsured indicate whether, in the cases in the health insurance risk assessment system database, the insurance unit avoided risk by taking out reinsurance, the final benefit of the reinsurance, and the like.
Other health insurance risk control data mainly includes:
1. control data for the medical service process, such as: medical service utilization review data, data related to second medical opinions, medical service monitoring data and the like;
2. data on the medical service compensation mode, for example: whether payment is made per service item, or prepaid by disease type, by capitation, or by diagnosis-related group, and the like;
3. complimentary benefits and other profit-sharing measures, such as whether free physical examinations are provided, the types of health services provided, premium refund terms and amounts, discount amounts and terms, profit-sharing amounts, etc.;
4. data related to the health management mechanism, such as: whether regular check-up services are offered, whether free health plan services are offered, whether expert appointment services are offered, whether a health hotline has been set up, whether health knowledge lectures are held, whether health knowledge manuals are compiled, and other health management measures;
5. managed care data, including whether the insured person in the case uses the medical service outlets, attending physicians and the like that the insurer makes available for selection.
Step three: data preprocessing and regularization. The collected data are preprocessed as follows:
1. for the risk control data for clause design, the risk control data for underwriting, the risk control data for claim settlement and the other health insurance risk control data, qualitative data or text are quantified into 0 or 1 using one-hot encoding, thresholding and other quantification methods, where 0 represents the minimum degree and 1 represents the maximum degree; useless data are removed and missing data are filled in. The preprocessed data are then regularized (normalized) using the sklearn.preprocessing package of Python, and the various data are processed by an information entropy method;
2. for the data on whether cases are reinsured, qualitative data or text are quantified into the range [0,1] using one-hot encoding and other quantification methods, where 0 represents the minimum degree and 1 represents the maximum degree; useless data are removed and missing data are filled in. The preprocessed data are then regularized (normalized) using the sklearn.preprocessing package of Python, and the various data are processed by an information entropy method.
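The patent does not say which information entropy method is meant. One common candidate is the entropy weight method, which assigns larger weights to indexes whose values vary more across samples; the sketch below shows that method under this assumption. The function name `entropy_weights` and the sample matrix are hypothetical, and the input is assumed to be already scaled into [0,1].

```python
import math
from typing import List


def entropy_weights(matrix: List[List[float]]) -> List[float]:
    """Entropy weight method: rows are samples, columns are non-negative indexes."""
    n, m = len(matrix), len(matrix[0])
    if n < 2:
        raise ValueError("at least two samples are needed")
    k = 1.0 / math.log(n)
    divergences = []
    for j in range(m):
        column = [row[j] for row in matrix]
        total = sum(column)
        entropy = 1.0  # an all-zero column is treated as carrying no information
        if total > 0:
            entropy = -k * sum((x / total) * math.log(x / total) for x in column if x > 0)
        divergences.append(max(0.0, 1.0 - entropy))  # clamp tiny negative rounding errors
    spread = sum(divergences)
    if spread == 0:
        return [1.0 / m] * m  # all columns equally uninformative
    return [d / spread for d in divergences]


# Hypothetical scaled data: the second index is constant, so it should get ~0 weight.
sample = [[0.2, 0.5], [0.9, 0.5], [0.4, 0.5]]
weights = entropy_weights(sample)
print([round(w, 4) for w in weights])
```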
Step four: risk control in terms design based on deep learning includes:
Establishing a clause design risk control prediction model: the risk control data for clause design from step three are combined with the health risk assessment result data, and panel data are established with time as the X axis, the clause design data related to risk management as the Y axis, and the health risk assessment result data as the Z axis. The panel data arranged above serve as training data and verification data, which are randomly extracted in a 3:2 ratio and input into a time-series deep learning prediction model; the prediction model may be, but is not limited to, DeepAR, and the set of model parameters that gives the highest accuracy and an AUC (Area Under the ROC Curve) closest to 1 is selected automatically.
Establishing a test data set: the risk control index data for clause design are freely combined. For example, for indexes A and B, the test data are as shown in Table 1 below:
Index                      A    B
Test data combination 1    0    1
Test data combination 2    0    0
Test data combination 3    1    0
Test data combination 4    1    1
Table 1 Test data
The test data set is input into the clause design risk control prediction model, and the combination of Y-axis indexes that yields the smallest predicted value of the health risk assessment result data is selected as the optimal risk control measure for clause design.
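The free combination of 0/1 index values in Table 1 and the selection of the combination with the smallest predicted risk can be sketched as follows. The function `toy_predictor` is a hypothetical stand-in for the trained clause design risk control prediction model, and its coefficients are made up for illustration.

```python
from itertools import product
from typing import Callable, Dict, List


def all_combinations(index_names: List[str]) -> List[Dict[str, int]]:
    """Enumerate every 0/1 assignment of the given risk control indexes."""
    return [dict(zip(index_names, bits)) for bits in product((0, 1), repeat=len(index_names))]


def best_measures(
    combinations: List[Dict[str, int]],
    predict_risk: Callable[[Dict[str, int]], float],
) -> Dict[str, int]:
    """Return the combination whose predicted health risk assessment value is smallest."""
    return min(combinations, key=predict_risk)


def toy_predictor(combination: Dict[str, int]) -> float:
    """Hypothetical stand-in for the trained prediction model."""
    return 0.6 - 0.3 * combination["A"] + 0.1 * combination["B"]


combinations = all_combinations(["A", "B"])
optimal = best_measures(combinations, toy_predictor)
print(combinations)
print(optimal)
```

With n indexes the number of combinations is 2 to the power n, so for larger index sets the enumeration may need to be sampled rather than exhaustive.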
Step five: deep learning-based underwriting risk control includes:
Establishing an underwriting risk control prediction model: the underwriting risk control data and the health risk assessment result data from step three are combined, with time as the X axis, the underwriting related data for risk management as the Y axis, and the health risk assessment result data as the Z axis, to establish panel data.
The collated panel data is used as training data and verification data, which are randomly extracted in a training-to-verification ratio of 3:2 and input into a time series deep learning prediction model, wherein the prediction model is selected from, but not limited to, DeepAR; the group of model parameters that gives the highest accuracy and an AUC (area under the ROC curve) closest to 1 is selected automatically.
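The parameter selection rule (highest accuracy, AUC closest to 1) can be sketched as follows. The sketch assumes the verification target has been binarized so that accuracy and AUC are defined, and it ranks candidates by accuracy first and AUC second; both choices are assumptions, since the source does not say how the two criteria are combined. The candidate names and scores are made up.

```python
def auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))

def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == l for p, l in zip(preds, labels)) / len(labels)

def select_parameters(candidates, labels):
    """Pick the parameter set with the highest accuracy, then AUC closest to 1.

    candidates maps a parameter-set name to its scores on the verification data.
    """
    def rank(name):
        scores = candidates[name]
        return (accuracy(labels, scores), -abs(1.0 - auc(labels, scores)))
    return max(candidates, key=rank)

labels = [1, 0, 1, 0]
candidates = {
    "params_1": [0.9, 0.2, 0.8, 0.1],
    "params_2": [0.6, 0.7, 0.4, 0.3],
}
chosen = select_parameters(candidates, labels)
```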
Establishing a test data set: the underwriting risk control index data are freely combined, in the same manner as the test data and its establishment in step four.
The test data set is input into the underwriting risk control prediction model, and the group of Y-axis index combinations with the smallest predicted value of the health risk assessment result data is selected as the optimal risk control measure in underwriting.
Step six: deep learning-based claim settlement risk control includes:
Establishing a claim settlement risk control prediction model: the claim settlement risk control data and the health risk assessment result data from step three are combined, with time as the X axis, the claim settlement related data for risk management as the Y axis, and the health risk assessment result data as the Z axis, to establish panel data. The collated panel data is used as training data and verification data, which are randomly extracted in a training-to-verification ratio of 3:2 and input into a time series deep learning prediction model, wherein the prediction model is selected from, but not limited to, DeepAR; the group of model parameters that gives the highest accuracy and an AUC (area under the ROC curve) closest to 1 is selected automatically.
Establishing a test data set: the claim settlement risk control index data are freely combined, in the same manner as the test data and its establishment in step four.
The test data set is input into the claim settlement risk control prediction model, and the group of Y-axis index combinations with the smallest predicted value of the health risk assessment result data is selected as the optimal risk control measure in claim settlement.
Step seven: deep learning-based determination of the reinsurance range includes:
Establishing a reinsurance risk control prediction model: the regularized evaluation index data of the health insurance evaluation system in step four, the health insurance actuarial index data, and the reinsurance related data for risk management from step three are combined to establish panel data, with time as the X axis, the regularized reinsurance data in the reinsurance related data as the Z axis, and the regularized index data of the health insurance evaluation system together with the index data other than reinsurance as the Y axis. The collated panel data is used as training data and verification data, which are randomly extracted in a training-to-verification ratio of 3:2 and input into a time series deep learning prediction model, wherein the prediction model is selected from, but not limited to, DeepAR; the group of model parameters that gives the highest accuracy and an AUC (area under the ROC curve) closest to 1 is selected automatically.
Establishing a test data set: the Y-axis index data are freely combined, in the same manner as the test data and its establishment in step four.
The test data set is input into the claim risk control prediction model, and the following are selected: 1. one or more groups of Y-axis index combinations, covering a larger range, for which the reinsurance value is 1 (reinsurance behavior exists) and the predicted value of the profit data is greater than 0, as the range requiring reinsurance; under the indexes of these combinations, the insurance company needs to arrange reinsurance; 2. one or more groups of Y-axis index combinations, covering a larger range, for which the reinsurance value is 0 (no reinsurance behavior) and the predicted value of the profit data is greater than 0, as the range not requiring reinsurance; under the indexes of these combinations, the insurance company does not need to arrange reinsurance.
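The two selection rules of step seven can be sketched as a simple filter over predicted rows. The tuple layout (index combination, reinsurance value, predicted profit) and the sample values are assumptions made for illustration, and the "larger range" criterion is not modelled because the source does not define it.

```python
def split_reinsurance_ranges(predictions):
    """Split index combinations into 'reinsurance needed' and 'not needed'.

    predictions is a list of (combination, reinsurance_value, predicted_profit)
    tuples. Only rows with a predicted profit greater than 0 are kept.
    """
    needed = [c for c, flag, profit in predictions if flag == 1 and profit > 0]
    not_needed = [c for c, flag, profit in predictions if flag == 0 and profit > 0]
    return needed, not_needed

# Hypothetical predictions for four index combinations.
rows = [
    ({"A": 0, "B": 1}, 1, 0.4),
    ({"A": 0, "B": 0}, 1, -0.2),
    ({"A": 1, "B": 0}, 0, 0.3),
    ({"A": 1, "B": 1}, 0, -0.1),
]
needed, not_needed = split_reinsurance_ranges(rows)
```

Combinations with a non-positive predicted profit fall into neither range under this reading.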
Step eight: deep learning-based other health insurance risk control includes:
Establishing other risk control prediction models: the other health insurance risk control related data and the health risk assessment result data from step three are combined, with time as the X axis, the other health insurance risk control related data as the Y axis, and the health risk assessment result data as the Z axis, to establish panel data. The collated panel data is used as training data and verification data, which are randomly extracted in a training-to-verification ratio of 3:2 and input into a time series deep learning prediction model, wherein the prediction model is selected from, but not limited to, DeepAR; the group of model parameters that gives the highest accuracy and an AUC (area under the ROC curve) closest to 1 is selected automatically.
Establishing a test data set: the other health insurance risk control related index data are freely combined, in the same manner as the test data and its establishment in step four.
The test data set is input into the other risk control prediction models, and the group of Y-axis index combinations with the smallest predicted value of the health risk assessment result data is selected as the optimal measure for the other risk controls.
Step nine: selection of the health insurance risk control mode comprises the following:
The risk control in clause design, the underwriting risk control, the claim settlement risk control and the other health insurance risk control modes from steps four, five, six and eight are combined to form the optimal health insurance risk control mode; whether a specific situation falls within the reinsurance range is judged according to the result of step seven: if it does, the risk is avoided by means of reinsurance; otherwise, reinsurance is not adopted.
Step ten: evaluation and optimization of the health insurance risk control model comprises the following:
Model evaluation: for each model in steps four to eight, one third of the panel data is randomly selected as test data, and the accuracy, recall rate, AUC value and the like of the model are used, without limitation, as evaluation parameters of the health insurance risk evaluation model.
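The evaluation step can be sketched as drawing one third of the panel at random and computing accuracy and recall on it. The fixed seed, the integer stand-ins for panel records and the binary labels are assumptions for illustration.

```python
import random

def sample_test_third(records, seed=0):
    """Randomly select one third of the panel records as test data."""
    pool = list(records)
    return random.Random(seed).sample(pool, len(pool) // 3)

def accuracy_and_recall(labels, preds):
    """Return (accuracy, recall) for binary labels and predictions."""
    correct = sum(p == l for p, l in zip(preds, labels))
    true_pos = sum(1 for p, l in zip(preds, labels) if p == 1 and l == 1)
    actual_pos = sum(1 for l in labels if l == 1)
    recall = true_pos / actual_pos if actual_pos else 0.0
    return correct / len(labels), recall

test_rows = sample_test_third(range(9))
acc, rec = accuracy_and_recall(labels=[1, 1, 0, 0], preds=[1, 0, 0, 0])
```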
The optimization process comprises the following steps:
1. expanding the training and verification database, and inputting it into step 6 for repeated training;
2. re-extracting the training and verification data with a changed extraction proportion, inputting the data into the models of steps four to eight for model training and parameter selection, performing the model evaluation of step ten on the models, and repeating until the evaluation parameters of the health insurance risk evaluation model are stable within a certain range, at which point the optimal model parameters are determined.
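The stopping rule of the optimization process, repeating until the evaluation parameter stays within a certain range, can be sketched as below. The window of three runs, the tolerance of 0.01 and the metric values are assumptions; `train_and_evaluate` is a hypothetical callback standing in for retraining and evaluating a model at a given extraction proportion.

```python
def is_stable(history, window=3, tolerance=0.01):
    """True when the last `window` metric values differ by at most `tolerance`."""
    if len(history) < window:
        return False
    recent = history[-window:]
    return max(recent) - min(recent) <= tolerance

def optimise(train_and_evaluate, ratios, window=3, tolerance=0.01):
    """Retrain with each extraction ratio until the metric stabilises.

    Returns (ratio at which stability was reached or None, metric history).
    """
    history = []
    for ratio in ratios:
        history.append(train_and_evaluate(ratio))
        if is_stable(history, window, tolerance):
            return ratio, history
    return None, history

# Hypothetical metric values standing in for real training runs.
fake_metric = {0.5: 0.70, 0.6: 0.80, 0.7: 0.805, 0.8: 0.803, 0.9: 0.99}
ratio, history = optimise(lambda r: fake_metric[r], [0.5, 0.6, 0.7, 0.8, 0.9])
```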
Fig. 5 shows a schematic diagram of an apparatus and a medium according to an embodiment of the invention. The apparatus comprises a memory 100 and a processor 200, wherein the processor 200 stores a computer program for executing: collecting one or more pieces of dimension data related to health insurance risk assessment through a big data environment; calculating a health insurance actuarial index; preprocessing and regularizing according to the dimension data and the health insurance actuarial index to obtain evaluation index data; calculating a health insurance risk assessment index according to the evaluation index data and the health insurance actuarial index data, constructing three-dimensional panel data, performing prediction, and outputting an optimal prediction result; and taking the accuracy, recall rate, AUC value and the like of the time series deep learning model as evaluation parameters of the health insurance risk evaluation model. The memory 100 is used for storing data.
The embodiments of the present invention have been described in detail with reference to the accompanying drawings, but the present invention is not limited to the above embodiments, and various changes can be made within the knowledge of one of ordinary skill in the art without departing from the spirit of the present invention.

Claims (3)

1. A method for health insurance risk assessment and control, the method comprising:
S100, constructing a big data environment;
S200, collecting one or more pieces of dimension data related to health insurance risk assessment through the big data environment;
S300, calculating a health insurance actuarial index for the dimension data;
S400, preprocessing and regularizing according to the dimension data and the health insurance actuarial index to obtain evaluation index data, wherein the preprocessing comprises the quantization of text data and/or qualitative data;
S500, calculating a health insurance risk assessment index according to the evaluation index data and the health insurance actuarial index, and constructing corresponding three-dimensional panel data;
S600, inputting the three-dimensional panel data, as training data and verification data, into a time series deep learning model for prediction, and outputting an optimal prediction result;
S700, randomly selecting a plurality of panel data as test data, testing the time series deep learning model, and taking the accuracy, recall rate and AUC value of the time series deep learning model as evaluation parameters of a health insurance risk evaluation model;
the S700 includes: S710, expanding the database of the training data and the verification data, and inputting it into the time series deep learning model of S600 for repeated training; S720, re-extracting the training and verification data with a changed extraction proportion, inputting the data into the time series deep learning model for the model training and parameter selection of S600, and repeating the evaluation of S700 on the time series deep learning model until the evaluation parameters are stable within a set range, and determining the optimal model parameters;
the S200 includes: calling interfaces of one or more platforms to obtain corresponding dimension data, wherein the dimension data comprises evaluation related factor index data and evaluation index data; the evaluation related factor index data comprise internal operation data of a health insurance company, related data of contracted claim conditions and related data of health management of an applicant; the evaluation index data mainly comprises the participation rate, the fund credit limit, the renewal rate, the fund balance rate and the annual consumption amount of people in the insurance type;
the S300 includes: performing actuarial calculation for disease insurance covering contracted diseases, medical insurance covering contracted medical treatment, disability income loss insurance covering incapacity for work caused by contracted diseases or accidental injury, and nursing care insurance covering nursing needs caused by contracted impairment of daily living ability; the corresponding actuarial indexes comprise actuarial indexes for rate making, loss ratio indexes and reserve indexes; the actuarial indexes for rate making comprise sum of claims, waiting period, non-claims, policy renewal rate, policy lapse rate, interest rate and safety margin, expense amount range, non-claims, maximum limit and coinsurance proportion; the loss ratio indexes comprise an earned (expired) loss ratio, an annual loss ratio and a comprehensive loss ratio; the reserve indexes comprise the reserve for incurred but not reported claims and the reserve for incurred but not sufficiently reported claims;
the S400 includes: preprocessing and regularizing according to the dimension data and the health insurance actuarial index; the quantization processing comprises quantizing qualitative data or text into the range [0,1] by one-hot encoding or other quantization methods, removing useless data and filling missing data, wherein 0 represents the lowest index degree and 1 represents the highest index degree; and regularizing the preprocessed data through Python;
the S500 includes: by the formula
Calculating the value of the health risk assessment, and taking the value of the health risk assessment as a health insurance risk assessment value; constructing three-dimensional panel data by taking time as an X axis, regularized evaluation related factor index data and health insurance calculation index as a Y axis and the number of evaluation indexes as a Z axis;
the S600 includes: taking the three-dimensional panel data as training data and verification data, randomly extracting the training data and the verification data in a ratio of 3:2, and inputting them into a time series deep learning prediction model, wherein the time series deep learning prediction model adopts DeepAR, and the group of model parameters with the highest accuracy and an AUC closest to 1 is automatically adopted as the optimal result.
2. An apparatus for health insurance risk assessment and control, the apparatus comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the method steps of claim 1 when executing the computer program.
3. A computer readable storage medium storing a computer program, which when executed by a processor performs the method steps of claim 1.
CN202010751841.1A 2020-07-30 2020-07-30 Health insurance risk assessment and control method, device and medium Active CN111986808B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010751841.1A CN111986808B (en) 2020-07-30 2020-07-30 Health insurance risk assessment and control method, device and medium

Publications (2)

Publication Number Publication Date
CN111986808A CN111986808A (en) 2020-11-24
CN111986808B true CN111986808B (en) 2023-12-12

Family

ID=73445619

Country Status (1)

Country Link
CN (1) CN111986808B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112734582A (en) * 2021-01-15 2021-04-30 深轻(上海)科技有限公司 Method for improving running speed of life insurance actuarial model
CN116313072A (en) * 2023-02-10 2023-06-23 京大(北京)技术有限公司 Old people disability occurrence rate prediction and assessment method based on artificial intelligence big data

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008064334A2 (en) * 2006-11-21 2008-05-29 American International Group, Inc. Method and system for determining rate of insurance
US7392201B1 (en) * 2000-11-15 2008-06-24 Trurisk, Llc Insurance claim forecasting system
US7966203B1 (en) * 2009-02-27 2011-06-21 Millennium Information Services Property insurance risk assessment using application data
US8027850B1 (en) * 2005-11-28 2011-09-27 Millennium Information Services Property insurance risk assessment processing system and method
US8239246B1 (en) * 2009-08-27 2012-08-07 Accenture Global Services Limited Health and life sciences payer high performance capability assessment
CN107910068A (en) * 2017-11-29 2018-04-13 平安健康保险股份有限公司 Insure health risk Forecasting Methodology, device, equipment and the storage medium of user
CN108573752A (en) * 2018-02-09 2018-09-25 上海米因医疗器械科技有限公司 A kind of method and system of the health and fitness information processing based on healthy big data
CN109300040A (en) * 2018-08-29 2019-02-01 中国科学院自动化研究所 Overseas investment methods of risk assessment and system based on full media big data technology
CN109754157A (en) * 2018-11-30 2019-05-14 畅捷通信息技术股份有限公司 A kind of methods of marking and system for reflecting enterprise's health management, financing and increasing letter
CN109801094A (en) * 2018-12-07 2019-05-24 珠海中科先进技术研究院有限公司 The method and system of prediction model are recommended in a kind of business analysis management
CN111353584A (en) * 2020-02-20 2020-06-30 中山大学 Deep learning training task behavior prediction method based on time series analysis
KR20200080466A (en) * 2018-12-26 2020-07-07 가천대학교 산학협력단 System and method for beach risk assessment based on multiple linear regression and computer program for the same

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8359209B2 (en) * 2006-12-19 2013-01-22 Hartford Fire Insurance Company System and method for predicting and responding to likelihood of volatility

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
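Community-based care for chronic wound management: an evidence-based analysis; Medical Advisory Secretariat; Ontario Health Technology Assessment Series; Vol. 9 (No. 18); 1-24 *
Self-service health record management system based on chronic disease risk assessment; Wang Tian; Cao Yuming; Jiang Weiwei; Modern Hospital (No. 08); 1-2 *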

Also Published As

Publication number Publication date
CN111986808A (en) 2020-11-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant