WO2019169826A1 - Risk control method for determining irregular medical insurance behavior by means of data analysis - Google Patents

Info

Publication number
WO2019169826A1
WO2019169826A1 PCT/CN2018/097746 CN2018097746W WO2019169826A1 WO 2019169826 A1 WO2019169826 A1 WO 2019169826A1 CN 2018097746 W CN2018097746 W CN 2018097746W WO 2019169826 A1 WO2019169826 A1 WO 2019169826A1
Authority
WO
WIPO (PCT)
Prior art keywords
medical
control object
risk control
data
behavior
Prior art date
Application number
PCT/CN2018/097746
Other languages
French (fr)
Chinese (zh)
Inventor
程吉安
Original Assignee
平安医疗健康管理股份有限公司
Priority date
Filing date
Publication date
Application filed by 平安医疗健康管理股份有限公司
Publication of WO2019169826A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08 Insurance
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60 ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H40/00 ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices
    • G16H40/20 ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices for the management or administration of healthcare resources or facilities, e.g. managing hospital staff or surgery rooms
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems

Definitions

  • a risk control method for inferring medical insurance violations through data analysis, characterized in that the visit purposes of a risk control object are divided into normal purposes and abnormal purposes, the method comprising the following steps: Step 1, obtain current and historical visit behavior data of the risk control object, together with personal information and public data related to the risk control object; Step 2, extract, from the data obtained in Step 1, features related to the visit behavior of the risk control object; Step 3, according to the extracted features, assign each visit in the current and historical visit behavior of the risk control object to a corresponding visit category, forming a visit category sequence; Step 4, take the visit category sequence formed in Step 3 as the observation sequence and the visit purpose as the hidden state, and compute the most likely hidden state sequence according to a hidden Markov model; Step 5, if one or more hidden states in the most likely hidden state sequence correspond to an abnormal purpose, output medical data related to the current and historical visit behavior of the risk control object.
  • FIG. 1 is a schematic flow chart of a risk control method for inferring medical insurance violations through data analysis according to an embodiment of the present application
  • FIG. 2 is a functional block diagram of a risk control system for inferring medical insurance violations through data analysis according to an embodiment of the present application
  • FIG. 3 shows the overall flow for deciding whether to place a subject under abnormal monitoring after the hidden visit purpose sequence has been computed with the HMM
  • the insured person information includes the current and historical visit behavior data of the insured person and the personal information related to the insured person;
  • the insured person information includes the medical insurance settlement data of the insured person and demographic information such as the insured person's age, gender, education level, and occupation.
  • the public data includes information obtained from public databases, such as a company's business registration data, the scale, geographic location, and grade of medical institutions, and doctors' titles and registration status; the status of medical insurance fund audits and notices gathered from public sources such as news; and behaviors identified as problems in previous medical insurance fund audits.
  • Demographic features: age, gender, occupation, and education level;
  • Geographic features: the insured person's place of origin, the company's location, the location of the medical institution, the insured area, etc.;
  • Time features: time of visit, interval between visits, and time of insurance enrollment;
  • Medical features: diagnosis, treating department, list of medical items consumed, medical institution scale, medical institution grade, the medical institution's previous case labels, doctor's title, etc.;
  • Cost features: cost per visit, total cost over a time period, and cost component ratios (medical service fees, drug costs, examination and test costs, material costs);
  • Step S300: establish the visit category classification, that is, according to the features extracted in step S200, assign each visit in the current and historical visit behavior of the insured person to a corresponding visit category, forming a visit category sequence.
  • the medical insurance settlement data is clustered into different types using an unsupervised learning method.
  • Step S400: establish a hidden Markov model (HMM).
  • the hidden states constructed along the insured-person dimension include: physical examination, initial diagnosis, follow-up, dispensing, hospitalization, and abnormal.
  • the visit category sequence established in step S300 is used as the observation sequence, and the hidden Markov model parameters (the observation probability matrix and the state transition matrix) are estimated using the Baum-Welch algorithm.
  • Step S500: using the HMM established in step S400, output the inference result and the evidence chain according to the current and historical behavior data of the insured person.
  • the visit category sequence formed in step S300 is taken as the observation sequence and the visit purpose as the hidden state, and the most likely hidden state sequence is computed according to the hidden Markov model.
  • the medical insurance settlement data and the demographic features of the insured person are provided by the social security settlement system of the place of implementation;
  • the business registration information of a company can be obtained through the public channel of the website of the State Administration for Industry and Commerce of the People's Republic of China;
  • the scale, geographic location, and grade data of medical institutions can be obtained through public channels such as the websites of local health and family planning commissions and hospital homepages;
  • doctors' titles and registration status can be obtained through the public channels of the websites of local health and family planning commissions;
  • the audit status and notices of local medical insurance funds can be collected and collated by crawling news sites; behaviors identified as problems in previous medical insurance fund audits need to be obtained from social security data of previous years.
  • the extracted features include the following categories.
  • Geographic features: through a network location provider, the insured person's place of origin, the company's location, the location of the medical institution, and the insured area, together with the distances between these locations, are converted into coordinates and numeric values for storage. For example, the place of origin "Shanghai" is passed to the Baidu map API to obtain the GPS coordinates of Shanghai, and a residential community and a hospital are passed to the map API to obtain the geographic distance between the two.
  • Time features: time of visit, interval between visits, and time of insurance enrollment.
  • the time of visit and the time of insurance enrollment are extracted directly from the data, and the interval between visits is the difference between the times of two consecutive visits.
  • Medical features: diagnosis, treating department, list of medical items consumed, medical institution scale, medical institution grade, the medical institution's previous case labels, and doctor's title.
  • Cost features: cost per visit, total cost over a time period, cost component ratios (medical service fees, drug costs, examination and test costs, material costs), etc.
  • in step S300, the features extracted in step S200 may be clustered into different visit categories using a K-means algorithm or an RVM classifier. For example, the visits of cancer patients are divided into an initial-diagnosis state dominated by spending on tests and a chemotherapy state dominated by spending on drugs.
  • a hidden Markov model (HMM) can be established as follows.
  • the different visit categories are taken as the observation sequence O, the purpose of the patient's visit (physical examination, initial diagnosis, follow-up, dispensing, hospitalization, abnormal) is taken as the hidden state, and the Baum-Welch algorithm is used to solve for the hidden Markov model parameters (the observation probability matrix and the state transition matrix).
  • the calculated observation probability matrix and state transition probability matrix are each given as an example (in tabular form for ease of understanding).
  • the data in the tables are probabilities obtained from statistics, which can be derived from known data obtained from medical institutions.
  • in step S500, based on the observation probability matrix and the state transition matrix calculated in step S400, the purpose of each visit can be inferred dynamically from the patient's visit behavior.
  • when the inferred hidden state sequence contains an abnormal state (corresponding to an abnormal visit purpose), the insured person can be included in the abnormal population (the specially monitored population), and the hidden state sequence (visit purpose sequence) and the observation sequence (visit category sequence) are output as evidence for system review or manual audit.
  • the relevant insured persons may also be divided into abnormal populations with different monitoring levels according to the total number of occurrences of the abnormal state (corresponding to an abnormal visit purpose) in the hidden state sequence and the calculated total probability of the hidden state sequences in which the abnormal state occurs.
  • the occurrence probabilities of all possible hidden state sequences that include the abnormal state may also be summed to obtain the total probability of an abnormal state (an abnormal visit purpose); if this total probability is higher than a predetermined threshold, the risk control object is assigned to the specially monitored population.
  • suspected unreasonable visit behavior (including an abnormal visit purpose) may fall into the following two cases:
  • Medical behavior sequence (hidden state sequence) anomaly: the calculated most likely hidden state sequence does not contain an abnormal state, but its occurrence probability is lower than a predetermined threshold.
  • the predetermined threshold may default to 25% of the occurrence probability of the most likely hidden state sequence of equal length, and can be changed as needed.
  • for example, the purpose sequence of four consecutive visits by a risk control object is (physical examination, physical examination, physical examination, physical examination).
  • its occurrence probability is lower than 25% of the occurrence probability of the most common sequence for four consecutive visits, so the medical behavior sequence is considered abnormal.
  • the behavior of the risk control object visiting four times within the set time period for the purpose of physical examination is then treated as abnormal, and the features related to the physical examination visits of the risk control object (age, gender, visits without medication, high examination and test fees, identical examination items across visits, identical total costs, etc.) are output to the back end as an evidence chain.
  • a system for inferring medical insurance violations through data analysis, used to implement the above method, the system mainly comprising:
  • a risk control object data acquisition module, configured to acquire current and historical visit behavior data of the risk control object, together with personal information and public data related to the risk control object;
  • a feature extraction module, configured to extract, from the data acquired by the risk control object data acquisition module, features related to the visit behavior of the risk control object;
  • a visit classification module, configured to assign, according to the extracted features, each visit in the current and historical visit behavior of the risk control object to a corresponding visit category, forming a visit category sequence;
  • a visit purpose inference module, configured to take the visit category sequence as the observation sequence and the visit purpose as the hidden state, and to compute, according to the hidden Markov model, the most likely hidden state sequence, which includes the most likely visit purposes of the risk control object;
  • an abnormality output module, configured to output medical data related to the current and historical visit behavior of the risk control object when a most likely visit purpose of the risk control object included in the hidden state sequence corresponds to an abnormal purpose.
  • a hidden Markov model building module, configured to calculate, using the Baum-Welch algorithm, the observation probability matrix and the state transition probability matrix of the hidden Markov model from big data acquired from medical institutions or public data sources.
  • various embodiments of the present application can also be implemented as software modules or as computer readable instructions stored on one or more computer readable media, where the computer readable instructions, when executed by a processor or device component, perform the embodiments described herein.
  • any combination of software modules, computer readable media, and hardware components is contemplated by the present application.
  • the software modules can be stored on any type of computer readable storage medium such as RAM, EPROM, EEPROM, flash memory, registers, hard disk, CD-ROM, DVD, and the like.
  • the computing device or processor can be, for example, a general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), or other programmable logic device, and the like.
  • the system in which the application is installed is itself installed and runs in an electronic device.
  • the electronic device may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a server.
  • the electronic device can include, but is not limited to, a memory, a processor, and a display.
  • Figure 6 shows only the electronic device having the above components, but it should be understood that not all illustrated components are required, and more or fewer components may be implemented instead.
  • the processor may, in some embodiments, be a central processing unit (CPU), a microprocessor, or another data processing chip for executing program code or processing data stored in the memory, for example to run the system in which the application is installed.
  • the method in the foregoing embodiments can be implemented by software plus a necessary general-purpose hardware platform, or by hardware alone, but in many cases the former is the better implementation.
  • the technical solution of the present application, in essence or in the part that contributes over the prior art, can be embodied as a software product that is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes a plurality of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, etc.) to perform the methods described in the various embodiments of the present application.
  • a computer readable storage medium storing a program for executing a risk control method for inferring medical insurance violations through data analysis, the program, when executed by a processor, implementing the steps of the method.
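
The 25% rule for a sequence anomaly described in the list above can be sketched roughly as follows. This is an illustrative reading, not the applicant's implementation: it assumes the "occurrence probability" of a visit purpose sequence is computed from the initial and transition probabilities of the hidden states alone, it uses only three of the six visit purposes, and every number in `START` and `TRANS` is a made-up placeholder (the source's example tables are not reproduced in this text). The function names are hypothetical.

```python
# Hypothetical initial and transition probabilities over visit purposes.
# All values are placeholders chosen for illustration only.
START = {"physical_exam": 0.2, "initial_diagnosis": 0.6, "follow_up": 0.2}
TRANS = {
    "physical_exam":     {"physical_exam": 0.10, "initial_diagnosis": 0.60, "follow_up": 0.30},
    "initial_diagnosis": {"physical_exam": 0.05, "initial_diagnosis": 0.15, "follow_up": 0.80},
    "follow_up":         {"physical_exam": 0.05, "initial_diagnosis": 0.15, "follow_up": 0.80},
}


def sequence_probability(states):
    """Probability of one visit purpose sequence under the Markov chain."""
    prob = START[states[0]]
    for prev, cur in zip(states, states[1:]):
        prob *= TRANS[prev][cur]
    return prob


def most_likely_probability(length):
    """Probability of the most likely purpose sequence of the given length."""
    # Max-product dynamic programming: best[s] is the probability of the
    # best sequence seen so far that ends in state s.
    best = dict(START)
    for _ in range(length - 1):
        best = {
            cur: max(best[prev] * TRANS[prev][cur] for prev in TRANS)
            for cur in TRANS
        }
    return max(best.values())


def is_sequence_anomalous(states, ratio=0.25):
    """Flag a sequence whose probability is below `ratio` times the best one."""
    return sequence_probability(states) < ratio * most_likely_probability(len(states))


four_exams = ["physical_exam"] * 4
print(is_sequence_anomalous(four_exams))
```

With these placeholder numbers, four consecutive physical examinations have probability 0.2 × 0.1³ = 0.0002, far below 25% of the best four-visit sequence (0.6 × 0.8³ = 0.3072), so the sequence is flagged. The `ratio` argument corresponds to the adjustable default threshold mentioned above.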

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • General Business, Economics & Management (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Health & Medical Sciences (AREA)
  • Development Economics (AREA)
  • Strategic Management (AREA)
  • Primary Health Care (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Public Health (AREA)
  • Technology Law (AREA)
  • Physics & Mathematics (AREA)
  • Epidemiology (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The present application relates to a risk control method for determining irregular medical insurance behavior by means of data analysis. The method comprises: step 1: acquiring current and historical medical treatment behavior data of a person under risk control and personal information and public data related to the person under risk control; step 2: extracting, from the data acquired in step 1, features related to medical treatment behavior of the person under risk control; step 3: classifying, according to the extracted features, each medical treatment behavior of the current and historical medical treatment behavior of the person under risk control into corresponding medical treatment categories, so as to generate a medical treatment category sequence; step 4: using the medical treatment category sequence generated in step 3 as an observation sequence, using a medical treatment purpose as a hidden state, and calculating, according to a Hidden Markov model, a most likely hidden state sequence which comprises a most likely medical treatment purpose; and step 5: if the most likely medical treatment purpose of the person under risk control comprised in the hidden state sequence corresponds to an abnormal purpose, outputting medical data related to the current and historical medical treatment behavior of the person under risk control.

Description

Risk control method for inferring medical insurance violations through data analysis
This application claims priority to Chinese patent application No. 201810191862.5, filed on March 8, 2018 and entitled "Risk control method for inferring medical insurance violations through data analysis", the entire content of which is incorporated herein by reference.
Technical field
The present application relates to the field of Internet data processing technology, and in particular to a risk control method that infers medical insurance violations through data analysis and provides a basis for system review or manual audit.
Background
In the social medical insurance system, tens of thousands of transaction records for outpatient and hospital medical services are generated every day, mainly including transactions between patients and medical institutions and transactions between medical institutions and insurers. At present, existing medical insurance processing systems have difficulty accurately identifying patients' real needs when processing payment transactions, which leaves room for insured persons or medical institutions to seek improper benefits. Medical fraud seriously affects the balance of income and expenditure of medical insurance funds and harms the interests of insured persons and the public interest.
The government and relevant departments have been working to use big data methods to identify medical insurance fraud and control medical insurance risk. However, existing medical insurance risk control schemes mostly rely on preset threshold red lines to monitor violations such as fraud, waste, and abuse. Because violations often change with medical insurance policy, payment methods, and the intensity of supervision, simple threshold division is ill-suited to practical environments with multiple scenarios, varied insured populations, and changing policies.
Summary of the invention
In view of the above shortcomings of the prior art, there is a need to solve the above problems using technical means such as machine learning and classifier modeling.
According to an embodiment of the present application, a risk control method for inferring medical insurance violations through data analysis is provided, characterized in that the visit purposes of a risk control object are divided into normal purposes and abnormal purposes, the method comprising the following steps: Step 1, obtaining current and historical visit behavior data of the risk control object, together with personal information and public data related to the risk control object; Step 2, extracting, from the data obtained in Step 1, features related to the visit behavior of the risk control object; Step 3, assigning, according to the extracted features, each visit in the current and historical visit behavior of the risk control object to a corresponding visit category, forming a visit category sequence; Step 4, taking the visit category sequence formed in Step 3 as the observation sequence and the visit purpose as the hidden state, and computing the most likely hidden state sequence according to a hidden Markov model; Step 5, if one or more hidden states in the most likely hidden state sequence correspond to an abnormal purpose, outputting medical data related to the current and historical visit behavior of the risk control object.
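
Steps 4 and 5 above can be illustrated with a small Viterbi decoder. The sketch below is a hedged illustration under stated assumptions, not the method as filed: it keeps only three hidden visit purposes, the observation labels (`test_heavy`, `drug_heavy`, `duplicate_items`) are hypothetical stand-ins for the visit categories produced in Step 3, and all probabilities in `START`, `TRANS`, and `EMIT` are invented placeholders rather than values estimated with Baum-Welch.

```python
import math

STATES = ["initial_diagnosis", "dispensing", "abnormal"]

# Placeholder HMM parameters (hypothetical, for illustration only).
START = {"initial_diagnosis": 0.60, "dispensing": 0.35, "abnormal": 0.05}
TRANS = {
    "initial_diagnosis": {"initial_diagnosis": 0.30, "dispensing": 0.65, "abnormal": 0.05},
    "dispensing":        {"initial_diagnosis": 0.20, "dispensing": 0.70, "abnormal": 0.10},
    "abnormal":          {"initial_diagnosis": 0.10, "dispensing": 0.10, "abnormal": 0.80},
}
EMIT = {
    "initial_diagnosis": {"test_heavy": 0.75, "drug_heavy": 0.20, "duplicate_items": 0.05},
    "dispensing":        {"test_heavy": 0.08, "drug_heavy": 0.90, "duplicate_items": 0.02},
    "abnormal":          {"test_heavy": 0.30, "drug_heavy": 0.20, "duplicate_items": 0.50},
}


def viterbi(observations):
    """Return the most likely hidden visit purpose sequence (Step 4)."""
    if not observations:
        return []
    # Log probabilities avoid underflow on long visit histories.
    scores = {s: math.log(START[s]) + math.log(EMIT[s][observations[0]]) for s in STATES}
    backpointers = []
    for obs in observations[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            # Best predecessor state for ending in s at this visit.
            prev = max(STATES, key=lambda p: scores[p] + math.log(TRANS[p][s]))
            new_scores[s] = scores[prev] + math.log(TRANS[prev][s]) + math.log(EMIT[s][obs])
            pointers[s] = prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state.
    path = [max(STATES, key=lambda s: scores[s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


def review(observations):
    """Step 5: flag the subject if any decoded purpose is abnormal."""
    path = viterbi(observations)
    evidence = list(zip(path, observations))  # purpose paired with category
    return "abnormal" in path, evidence


print(review(["test_heavy", "duplicate_items", "duplicate_items"]))
```

The `evidence` pairs mirror the idea of outputting the hidden state sequence together with the observation sequence for later review; what the real system outputs as medical data is not specified here.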
According to an embodiment of the present application, a risk control system for carrying out the foregoing method is provided, comprising: a risk control object data acquisition module, configured to acquire current and historical visit behavior data of the risk control object, together with personal information and public data related to the risk control object; a feature extraction module, configured to extract, from the data acquired by the risk control object data acquisition module, features related to the visit behavior of the risk control object; a visit classification module, configured to assign, according to the extracted features, each visit in the current and historical visit behavior of the risk control object to a corresponding visit category, forming a visit category sequence; a visit purpose inference module, configured to take the visit category sequence as the observation sequence and the visit purpose as the hidden state, and to compute the most likely hidden state sequence according to a hidden Markov model; and an abnormality output module, configured to output medical data related to the current and historical visit behavior of the risk control object when one of the hidden states included in the hidden state sequence corresponds to an abnormal purpose.
According to an embodiment of the present application, a computer readable storage medium is provided, storing a program for executing a risk control method for inferring medical insurance violations through data analysis, the program, when executed by a processor, implementing the following steps: Step 1, obtaining current and historical visit behavior data of the risk control object, together with personal information and public data related to the risk control object; Step 2, extracting, from the data obtained in Step 1, features related to the visit behavior of the risk control object; Step 3, assigning, according to the extracted features, each visit in the current and historical visit behavior of the risk control object to a corresponding visit category, forming a visit category sequence; Step 4, taking the visit category sequence formed in Step 3 as the observation sequence and the visit purpose as the hidden state, and computing the most likely hidden state sequence according to a hidden Markov model; Step 5, if one or more hidden states in the most likely hidden state sequence correspond to an abnormal purpose, outputting medical data related to the current and historical visit behavior of the risk control object.
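
Besides the single most likely sequence used in Step 5, the application also mentions summing the probabilities of all hidden state sequences that contain the abnormal state and comparing the total with a threshold. A rough sketch of that idea follows. It rests on several assumptions that are not in the source: the total is normalized by the probability of the observations (so it is a posterior probability), it is obtained with the forward algorithm by subtracting the mass of sequences that never visit the abnormal state, the 0.5 threshold is arbitrary, and all parameters and observation labels are hypothetical placeholders.

```python
STATES = ["initial_diagnosis", "dispensing", "abnormal"]

# Placeholder HMM parameters (hypothetical, for illustration only).
START = {"initial_diagnosis": 0.60, "dispensing": 0.35, "abnormal": 0.05}
TRANS = {
    "initial_diagnosis": {"initial_diagnosis": 0.30, "dispensing": 0.65, "abnormal": 0.05},
    "dispensing":        {"initial_diagnosis": 0.20, "dispensing": 0.70, "abnormal": 0.10},
    "abnormal":          {"initial_diagnosis": 0.10, "dispensing": 0.10, "abnormal": 0.80},
}
EMIT = {
    "initial_diagnosis": {"test_heavy": 0.75, "drug_heavy": 0.20, "duplicate_items": 0.05},
    "dispensing":        {"test_heavy": 0.08, "drug_heavy": 0.90, "duplicate_items": 0.02},
    "abnormal":          {"test_heavy": 0.30, "drug_heavy": 0.20, "duplicate_items": 0.50},
}


def forward_probability(observations, allowed):
    """Joint probability of the observations with all hidden states in `allowed`."""
    alpha = {s: START[s] * EMIT[s][observations[0]] for s in allowed}
    for obs in observations[1:]:
        alpha = {
            s: sum(alpha[p] * TRANS[p][s] for p in allowed) * EMIT[s][obs]
            for s in allowed
        }
    return sum(alpha.values())


def abnormal_probability(observations):
    """Posterior probability that at least one visit had an abnormal purpose."""
    total = forward_probability(observations, STATES)
    normal_only = forward_probability(
        observations, [s for s in STATES if s != "abnormal"]
    )
    # Sequences containing the abnormal state = all sequences minus
    # the sequences that stay in normal states throughout.
    return 1.0 - normal_only / total


def needs_special_monitoring(observations, threshold=0.5):
    """Compare the total abnormal probability with a (hypothetical) threshold."""
    return abnormal_probability(observations) > threshold


print(needs_special_monitoring(["test_heavy", "duplicate_items", "duplicate_items"]))
```

Summing over the complement avoids enumerating every hidden state sequence, which grows exponentially with the number of visits. For long visit histories the plain products here would underflow, so a scaled or log-space forward pass would be the more robust choice.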
The main beneficial effects of the present application are:
1. The flexibility and adaptability of medical insurance fund risk control are improved;
2. Possible violations are anticipated from combinations of existing behaviors;
3. The relevant evidence chain is retained automatically during risk identification and control for subsequent processing.
Brief description of the drawings
FIG. 1 is a schematic flow chart of a risk control method for inferring medical insurance violations through data analysis according to an embodiment of the present application;
FIG. 2 is a functional block diagram of a risk control system for inferring medical insurance violations through data analysis according to an embodiment of the present application;
FIG. 3 shows the overall flow for deciding whether to place a subject under abnormal monitoring after the hidden visit purpose sequence has been computed with the HMM;
FIG. 4 shows the operating environment of a system in which an application is installed, according to an embodiment of the present application.
Detailed Description
The implementation of the technical solution is described in further detail below with reference to the accompanying drawings.
Those skilled in the art will understand that although the following description involves many technical details of embodiments of the present application, these are only examples used to explain the principles of the application and do not imply any limitation. The present application can be applied in situations other than the technical details exemplified below, provided they do not depart from the principles and spirit of the application.
In addition, to keep this specification from becoming unduly lengthy, some technical details that are available in the prior art may have been omitted, simplified, or adapted in the description. This will be understood by those skilled in the art and does not affect the sufficiency of disclosure of this specification.
Hereinafter, embodiments of the present application are described with reference to the drawings.
Note that the description is given in the following order: 1. the risk control method for inferring medical insurance violations through data analysis (FIG. 1); 2. the system for inferring medical insurance violations through data analysis (FIGS. 2 and 3); 3. a system on which an application program implementing embodiments of the present application is installed, and a computer-readable medium storing the application program (FIG. 4).
1. Risk control method for inferring medical insurance violations through data analysis
FIG. 1 is a schematic flowchart of a risk control method for inferring medical insurance violations through data analysis according to an embodiment of the present application.
As shown in FIG. 1, the risk control method for inferring medical insurance violations through data analysis according to an embodiment of the present application mainly includes the following steps:
Step S100: data acquisition, that is, acquiring the various kinds of information relevant to the data analysis, including insured person (risk control object) information and public information.
The insured person (risk control object) information includes the insured person's current and historical medical visit behavior data and personal information related to the insured person.
Specifically, the insured person information includes the insured person's medical insurance settlement data and demographic information such as age, gender, education, and occupation. The public information includes: the scale, location, and grade of medical institutions and the professional titles and registration status of doctors, obtained with the help of public databases such as company business registration data; audit results and public notices concerning local medical insurance funds, collected from public sources such as news; individual case behaviors identified as problematic in past medical insurance fund audits; and so on.
Step S200: feature extraction, that is, extracting features related to the insured person's visit behavior from the data acquired in step S100.
Specifically, data features are extracted from the kinds of data described above for use in the subsequent classification. The extracted features fall mainly into the following categories:
Demographic features: age, gender, occupation, education level, etc.;
Geographic features: the insured person's place of origin, company location, medical institution location, insured region, etc.;
Temporal features: visit time, interval between visits, and enrollment time;
Medical features: diagnosis, department visited, itemized medical charges, medical institution scale, medical institution grade, record of past violations by the medical institution, doctor's professional title, etc.;
Cost features: cost of a single visit, total cost over a time period, cost composition ratio (treatment fees, drug fees, test and examination fees, material fees), etc.
Step S300: visit classification, that is, according to the features extracted in step S200, assigning each visit in the insured person's current and historical visit behavior to a corresponding visit category, thereby forming a visit category sequence.
Specifically, based on the features extracted in step S200, the visits in the medical insurance settlement data are clustered into different visit categories by unsupervised learning.
Step S400: building the hidden Markov model (HMM).
The hidden Markov model is built around the individual insured person.
The hidden states defined along the insured-person dimension are: physical examination, initial visit, follow-up visit, dispensing, hospitalization, and abnormal.
The visit categories established in step S300 are taken as the observation sequence, and the Baum-Welch algorithm is used to derive the parameters of the hidden Markov model (the observation probability matrix and the state transition matrix).
Step S500: using the HMM built in step S400, outputting an inference result and a chain of evidence based on the insured person's current and historical behavior data.
Specifically, the visit category sequence formed in step S300 is taken as the observation sequence and the visit purpose as the hidden state, and the most likely hidden state sequence is computed according to the hidden Markov model.
When inference based on the observation probability matrix and the state transition matrix indicates that an insured person has abnormal visit behavior, that insured person is output as a risk target, and the recorded behavior features and state transitions are output as a chain of evidence for system review or manual audit.
As an example, in step S100, the medical insurance settlement data and the demographic characteristics of insured persons are provided by the social security settlement system of the place of implementation; company business registration information can be obtained through public channels on the website of the State Administration for Industry and Commerce of the People's Republic of China; the scale, location, and grade of medical institutions can be obtained through public channels on the websites of local health and family planning commissions and on hospital home pages; doctors' professional titles and registration status can be obtained through public channels on the websites of local health and family planning commissions; audit results and public notices concerning local medical insurance funds can be collected and compiled with a news web crawler; and individual case behaviors identified as problematic in past medical insurance fund audits are obtained from the historical social security data of the place of implementation.
As an example, in step S200, the extracted features include the following categories.
Demographic features: age, gender, occupation, education level, employer, etc., extracted directly from the social security database of the place of implementation; the employer is linked to the business registration information.
Geographic features: using an online geolocation provider, the insured person's place of origin, company location, medical institution location, and insured region are converted to coordinates, and the distances between locations are converted to numeric values, for storage. For example, the place of origin "Shanghai" is passed to the Baidu Maps API to obtain the GPS coordinates of Shanghai, and a given residential community and a given hospital are passed to the map API to obtain the geographic distance between the two.
Temporal features: visit time, interval between visits, and enrollment time. Visit time and enrollment time are extracted directly from the data; the interval between visits is the difference between the dates of two consecutive visits.
Medical features: diagnosis, department visited, itemized medical charges, medical institution scale, medical institution grade, record of past violations by the medical institution, doctor's professional title.
Cost features: cost of a single visit, total cost over a time period, cost composition ratio (treatment fees, drug fees, test and examination fees, material fees), etc.
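A few of the features listed above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names and the sample values are hypothetical, and the great-circle (haversine) formula stands in for the distance that the description obtains from a map API.

```python
from datetime import date
from math import asin, cos, radians, sin, sqrt


def visit_intervals(visit_dates):
    """Days between each pair of consecutive visits (dates are sorted first)."""
    ordered = sorted(visit_dates)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


def cost_composition(costs):
    """Share of each cost item in the total cost of one visit."""
    total = sum(costs.values())
    if total == 0:
        return {item: 0.0 for item in costs}
    return {item: amount / total for item, amount in costs.items()}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres (ASSUMPTION: replaces the map API call)."""
    earth_radius_km = 6371.0
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * earth_radius_km * asin(sqrt(a))


# Hypothetical sample data for one insured person.
visits = [date(2018, 1, 5), date(2018, 1, 19), date(2018, 3, 2)]
print(visit_intervals(visits))  # [14, 42]
print(cost_composition({"treatment": 50.0, "drugs": 120.0, "tests": 30.0, "materials": 0.0}))
```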
As an example, in step S300, the K-means algorithm or an RVM classifier can be used to cluster the features extracted in step S200 into different visit categories. For instance, the visits of a tumor patient can be divided into an initial-diagnosis state, in which spending is dominated by tests and examinations, and a chemotherapy state, in which spending is dominated by drugs.
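The clustering step can be illustrated with a minimal K-means sketch. It is an assumption-laden toy rather than the implementation of the application: the two features per visit (share of test fees, share of drug fees) and all sample values are hypothetical, and the centroids are simply initialized from the first k points so that the run is deterministic.

```python
def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means on equal-length numeric tuples.

    Centroids start at the first k points so the run is deterministic;
    a production system would more likely use a library implementation.
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids


# Hypothetical per-visit features: (share of test/examination fees, share of drug fees).
visits = [
    (0.80, 0.10), (0.10, 0.85), (0.75, 0.15),
    (0.05, 0.90), (0.85, 0.05), (0.15, 0.80),
]
labels, _ = kmeans(visits, k=2)
print(labels)  # [0, 1, 0, 1, 0, 1]
```

Here label 0 groups the test-dominated visits and label 1 the drug-dominated visits; the resulting label sequence plays the role of the visit category sequence.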
As an example, in step S400, the hidden Markov model (HMM) can be built as follows.
The extracted visit categories are taken as the observation sequence O, and the purpose of the patient's visit (physical examination, initial visit, follow-up visit, dispensing, hospitalization, abnormal) is taken as the hidden state. The Baum-Welch algorithm is used to solve for the parameters of the hidden Markov model (the observation probability matrix and the state transition matrix).
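The parameter estimation described above can be sketched as one Baum-Welch (EM) re-estimation pass for a discrete HMM, written with scaled forward and backward variables. The sketch makes several assumptions that are not in the application: a single training sequence, a toy model with two hidden states and three visit categories, and hand-picked starting values. Baum-Welch by itself does not attach names such as "dispensing" or "abnormal" to the learned states; how states are matched to visit purposes is outside this sketch.

```python
from math import log


def baum_welch_step(obs, pi, A, B):
    """One EM pass; returns (new_pi, new_A, new_B, log-likelihood under the old parameters)."""
    n, m, T = len(pi), len(B[0]), len(obs)
    # Scaled forward pass: alpha[t] is normalized to sum to 1, scale[t] keeps the factor.
    alpha, scale = [], []
    for t in range(T):
        if t == 0:
            row = [pi[i] * B[i][obs[0]] for i in range(n)]
        else:
            row = [sum(alpha[t - 1][i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                   for j in range(n)]
        scale.append(sum(row))
        alpha.append([v / scale[t] for v in row])
    # Scaled backward pass, using the same scale factors.
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(n):
            beta[t][i] = sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                             for j in range(n)) / scale[t + 1]
    # Posterior probabilities of states (gamma) and of transitions (xi).
    gamma = [[alpha[t][i] * beta[t][i] for i in range(n)] for t in range(T)]
    xi = [[[alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / scale[t + 1]
            for j in range(n)] for i in range(n)] for t in range(T - 1)]
    # Re-estimation formulas.
    new_pi = gamma[0][:]
    new_A = [[sum(xi[t][i][j] for t in range(T - 1)) / sum(gamma[t][i] for t in range(T - 1))
              for j in range(n)] for i in range(n)]
    new_B = [[sum(gamma[t][i] for t in range(T) if obs[t] == k) / sum(gamma[t][i] for t in range(T))
              for k in range(m)] for i in range(n)]
    return new_pi, new_A, new_B, sum(log(c) for c in scale)


# Toy data: category indices of ten visits, and hypothetical starting parameters.
obs = [0, 0, 1, 2, 2, 1, 0, 0, 2, 2]
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.3, 0.2], [0.1, 0.3, 0.6]]
history = []
for _ in range(10):
    pi, A, B, loglik = baum_welch_step(obs, pi, A, B)
    history.append(loglik)
print(history[0], history[-1])  # the log-likelihood should not decrease between passes
```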
Assuming that visits are divided into four visit categories in step S300, examples of the computed observation probability matrix and state transition probability matrix are given below (presented as tables for ease of understanding).
Table 1: Observation probability matrix
Figure PCTCN2018097746-appb-000001
Table 2: State transition probability matrix
Rows give the current visit purpose and columns the next visit purpose; each row sums to 1.
                      Physical examination  Initial visit  Follow-up visit  Dispensing  Hospitalization  Abnormal
Physical examination  0.32                  0.01           0.13             0.34        0.19             0.01
Initial visit         0.07                  0.16           0.31             0.02        0.09             0.35
Follow-up visit       0.08                  0.37           0.07             0.26        0.20             0.02
Dispensing            0.07                  0.25           0.03             0.29        0.05             0.31
Hospitalization       0.13                  0.14           0.13             0.29        0.04             0.27
Abnormal              0.07                  0.34           0.27             0.15        0.12             0.05
The values in the tables above (the matrix entries) are probabilities obtained statistically; they can be aggregated from known data obtained from medical institutions.
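One simple way to obtain such probabilities statistically is to count transitions in visit sequences whose purposes are already known and to normalize each row. The sketch below assumes that labelled sequences of this kind are available, which the application does not state explicitly; the function name and the sample sequences are hypothetical.

```python
from collections import Counter


def estimate_transitions(sequences, states):
    """Row-normalized transition frequencies from labelled visit-purpose sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(zip(seq, seq[1:]))  # count each consecutive (from, to) pair
    matrix = {}
    for a in states:
        row_total = sum(counts[(a, b)] for b in states)
        # A purpose never seen as a source gets a uniform row so every row sums to 1.
        matrix[a] = {
            b: (counts[(a, b)] / row_total if row_total else 1 / len(states))
            for b in states
        }
    return matrix


STATES = ["initial visit", "follow-up visit", "dispensing"]
labelled = [
    ["initial visit", "follow-up visit", "dispensing", "dispensing"],
    ["initial visit", "dispensing", "follow-up visit", "dispensing"],
]
P = estimate_transitions(labelled, STATES)
print(P["initial visit"])
# {'initial visit': 0.0, 'follow-up visit': 0.5, 'dispensing': 0.5}
```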
As an example, in step S500, based on the observation probability matrix and the state transition matrix computed in step S400, the purpose of each of a patient's visits can be inferred dynamically from the patient's visit behavior.
For example, if the observation sequence for a patient's previous four visits is O = (visit category 1, visit category 3, visit category 2, visit category 2), the most likely hidden state (visit purpose) sequence inferred by an HMM (not necessarily the model in the example above; it may be any other HMM) might be I = (hospitalization, dispensing, initial visit, initial visit). When the patient then makes a fifth visit whose category is "visit category 2", the observation sequence becomes O = (visit category 1, visit category 3, visit category 2, visit category 2, visit category 2), and the most likely hidden state sequence computed by the HMM becomes I = (hospitalization, dispensing, initial visit, abnormal, initial visit). Because the most likely sequence is recomputed over the whole observation sequence, the purpose inferred for an earlier visit can change once a new visit is observed. Thus, when an abnormal state (corresponding to an abnormal visit purpose) appears in the hidden state sequence, the insured person can be placed in the abnormal population (the specially monitored population), and the hidden state sequence (visit purpose sequence) and the observation sequence (visit category sequence) are output as evidence for system review or manual audit.
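Computing the most likely hidden state sequence is commonly done with the Viterbi algorithm; the application does not name the algorithm, so its use here is an assumption. The sketch takes the transition probabilities from Table 2, while the observation probabilities and the uniform initial distribution are made up for illustration because Table 1 is available only as an image. Its output is therefore not expected to reproduce the sequences in the example above.

```python
PURPOSES = ["physical examination", "initial visit", "follow-up visit",
            "dispensing", "hospitalization", "abnormal"]

# State transition probabilities from Table 2 (row = current purpose).
TRANSITION = [
    [0.32, 0.01, 0.13, 0.34, 0.19, 0.01],
    [0.07, 0.16, 0.31, 0.02, 0.09, 0.35],
    [0.08, 0.37, 0.07, 0.26, 0.20, 0.02],
    [0.07, 0.25, 0.03, 0.29, 0.05, 0.31],
    [0.13, 0.14, 0.13, 0.29, 0.04, 0.27],
    [0.07, 0.34, 0.27, 0.15, 0.12, 0.05],
]

# ASSUMPTION: hypothetical observation probabilities (rows = purposes, columns = 4 categories).
EMISSION = [
    [0.70, 0.10, 0.10, 0.10],
    [0.10, 0.60, 0.20, 0.10],
    [0.10, 0.20, 0.50, 0.20],
    [0.05, 0.15, 0.20, 0.60],
    [0.40, 0.20, 0.30, 0.10],
    [0.10, 0.40, 0.10, 0.40],
]
PI = [1 / 6] * 6  # ASSUMPTION: uniform initial distribution


def viterbi(obs, pi, A, B):
    """Return (most likely hidden state path, its joint probability with obs)."""
    n = len(pi)
    delta = [pi[i] * B[i][obs[0]] for i in range(n)]
    backpointers = []
    for o in obs[1:]:
        prev = delta
        # For each state j, remember the predecessor that maximizes the path probability.
        best_prev = [max(range(n), key=lambda i: prev[i] * A[i][j]) for j in range(n)]
        delta = [prev[best_prev[j]] * A[best_prev[j]][j] * B[j][o] for j in range(n)]
        backpointers.append(best_prev)
    state = max(range(n), key=lambda i: delta[i])
    best_prob = delta[state]
    path = [state]
    for best_prev in reversed(backpointers):  # trace the best path backwards
        state = best_prev[state]
        path.append(state)
    path.reverse()
    return path, best_prob


OBS = [0, 2, 1, 1, 1]  # visit categories 1, 3, 2, 2, 2 as zero-based indices
path, best_prob = viterbi(OBS, PI, TRANSITION, EMISSION)
purposes = [PURPOSES[s] for s in path]
print(purposes)
print("flag for review:", "abnormal" in purposes)
```

For long sequences the products become very small, so a practical version would work with log-probabilities instead.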
Optionally, the insured persons (risk control objects) concerned can also be divided into abnormal populations at different monitoring levels, according to the number of abnormal states (corresponding to abnormal visit purposes) in the hidden state sequence and the computed total probability of the hidden state sequences in which an abnormal state occurs.
Optionally, the occurrence probabilities of all possible hidden state sequences that include an abnormal state can be summed to obtain the total probability of an abnormal state (abnormal visit purpose); if this total probability is higher than a predetermined threshold, the risk control object can be placed in the specially monitored population.
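The sum over all hidden state sequences that contain an abnormal state does not require enumerating the sequences. One possible approach, sketched below, runs the forward recursion twice: once over all states, giving P(O), and once over the normal states only, giving the probability of O with no abnormal state; the difference is the total for sequences with at least one abnormal state. The toy three-state model, the threshold of 0.5, and the choice to normalize by P(O) are assumptions, since the application does not fix these details.

```python
# ASSUMPTION: toy three-state model; index 2 plays the role of the "abnormal" purpose.
PI = [0.5, 0.4, 0.1]
A = [[0.6, 0.3, 0.1],
     [0.3, 0.5, 0.2],
     [0.4, 0.4, 0.2]]
B = [[0.7, 0.3],
     [0.4, 0.6],
     [0.1, 0.9]]
ABNORMAL = 2


def forward_probability(obs, pi, A, B, allowed):
    """P(observations, every hidden state is in `allowed`) via the forward recursion."""
    alpha = {i: pi[i] * B[i][obs[0]] for i in allowed}
    for o in obs[1:]:
        alpha = {j: sum(alpha[i] * A[i][j] for i in allowed) * B[j][o] for j in allowed}
    return sum(alpha.values())


def abnormal_posterior(obs, pi, A, B, abnormal):
    """Probability that at least one visit has the abnormal purpose, given obs."""
    all_states = list(range(len(pi)))
    normal_only = [s for s in all_states if s != abnormal]
    p_obs = forward_probability(obs, pi, A, B, all_states)
    p_obs_no_abnormal = forward_probability(obs, pi, A, B, normal_only)
    return (p_obs - p_obs_no_abnormal) / p_obs


OBS = [0, 1, 1]
THRESHOLD = 0.5  # hypothetical predetermined threshold
p = abnormal_posterior(OBS, PI, A, B, ABNORMAL)
print(p, "-> special monitoring" if p > THRESHOLD else "-> no action")
```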
It should be noted that the data in the tables above, and the specific contents of the observation sequence O and the hidden state sequence I, are only examples that illustrate the principles of the present application and help those skilled in the art understand how to implement it. They do not correspond strictly to real-world applications, and they do not limit the present application in any way.
As an optional embodiment, as shown in FIG. 3, suspected unreasonable visit behavior (including abnormal visit purposes) can be divided into the following two cases:
1. One or more abnormal visit purposes: a state labeled abnormal appears, as in the example above. In this case, by default the risk control object is placed among the specially monitored objects, or is referred for system review or manual audit;
2. Abnormal medical behavior sequence (hidden state sequence): the computed most likely hidden state sequence contains no abnormal state, but its occurrence probability is below a predetermined threshold. For example, the predetermined threshold can be set by default to 25% of the occurrence probability of the most likely hidden state sequence of the same length, and can be changed at any time as needed.
For example, if the occurrence probability of the sequence (physical examination, physical examination, physical examination, physical examination) for four consecutive visits of a risk control object is lower than 25% of the occurrence probability of the most common sequence among people with four consecutive visits, the medical behavior sequence is considered abnormal. Staff are alerted that four visits for the purpose of physical examination by this risk control object within the set time period are abnormal, and the features related to the risk control object's physical examination behavior (age, gender, visits with no drug spending, high test and examination fees, repeated identical examination items, identical total costs, etc.) are output to the back end as a chain of evidence.
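The 25% rule can be sketched as follows. The application does not define precisely how the occurrence probability of a purpose sequence is computed; this sketch assumes it is the Markov-chain probability of the sequence (initial probability times transition probabilities) and finds the reference sequence by brute-force enumeration, which is only practical for short sequences. The two-state model and all names are hypothetical.

```python
from itertools import product


def chain_probability(path, pi, A):
    """Probability of a state sequence under initial distribution pi and transitions A."""
    prob = pi[path[0]]
    for a, b in zip(path, path[1:]):
        prob *= A[a][b]
    return prob


def is_sequence_anomalous(path, pi, A, ratio=0.25):
    """True if `path` is less likely than `ratio` times the most likely same-length path."""
    reference = max(
        chain_probability(candidate, pi, A)
        for candidate in product(range(len(pi)), repeat=len(path))
    )
    return chain_probability(path, pi, A) < ratio * reference


# ASSUMPTION: toy two-state model in which staying in the same state is rare.
PI = [0.5, 0.5]
A = [[0.1, 0.9],
     [0.9, 0.1]]
print(is_sequence_anomalous((0, 0, 0, 0), PI, A))  # True: 0.0005 is below 25% of 0.3645
print(is_sequence_anomalous((0, 1, 0, 1), PI, A))  # False: this is a most likely sequence
```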
2. System for inferring medical insurance violations through data analysis
According to an embodiment of the present application, a system for inferring medical insurance violations through data analysis is provided, which implements the method described above. The system mainly includes:
a risk control object data acquisition module, configured to acquire current and historical medical visit behavior data of the risk control object, together with personal information and public data related to the risk control object;
a feature extraction module, configured to extract, from the data acquired by the risk control object data acquisition module, features related to the visit behavior of the risk control object;
a visit classification module, configured to assign, according to the extracted features, each visit in the current and historical visit behavior of the risk control object to a corresponding visit category, thereby forming a visit category sequence;
a visit purpose inference module, configured to take the visit category sequence as the observation sequence and the visit purpose as the hidden state, and to compute the most likely hidden state sequence according to a hidden Markov model, the sequence containing the most likely visit purposes of the risk control object;
an abnormality output module, configured to output medical data related to the current and historical visit behavior of the risk control object when a most likely visit purpose of the risk control object contained in the hidden state sequence corresponds to an abnormal purpose.
According to an embodiment of the present application, the system may further include:
a hidden Markov model building module, configured to use the Baum-Welch algorithm to compute the observation probability matrix and the state transition probability matrix of the hidden Markov model from big data obtained from medical institutions or public data sources.
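The data flow between the modules listed above can be sketched as a small pipeline object. All names, type signatures, and the stub functions are hypothetical and serve only to show how the output of one module feeds the next; they are not taken from the application.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class RiskControlPipeline:
    acquire: Callable[[str], dict]                    # data acquisition module
    extract: Callable[[dict], List[Sequence[float]]]  # feature extraction module
    classify: Callable[[Sequence[float]], int]        # visit classification module
    infer: Callable[[List[int]], List[str]]           # visit purpose inference module

    def run(self, subject_id: str) -> Optional[dict]:
        data = self.acquire(subject_id)
        features = self.extract(data)
        categories = [self.classify(f) for f in features]
        purposes = self.infer(categories)
        if "abnormal" in purposes:
            # Abnormality output module: return the evidence chain for review.
            return {"subject": subject_id, "categories": categories,
                    "purposes": purposes, "data": data}
        return None


# Stub modules standing in for real implementations.
pipeline = RiskControlPipeline(
    acquire=lambda subject_id: {"visits": [(0.8, 0.1), (0.1, 0.9)]},
    extract=lambda data: data["visits"],
    classify=lambda features: 0 if features[0] > features[1] else 1,
    infer=lambda categories: ["initial visit", "abnormal"],
)
print(pipeline.run("subject-001"))
```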
3. System on which an application program implementing embodiments of the present application is installed, and computer-readable medium storing the application program
In addition, different embodiments of the present application can also be implemented as software modules or as computer-readable instructions stored on one or more computer-readable media, where the computer-readable instructions, when executed by a processor or device component, carry out the embodiments described herein. Likewise, any combination of software modules, computer-readable media, and hardware components is contemplated by the present application. The software modules can be stored on any type of computer-readable storage medium, such as RAM, EPROM, EEPROM, flash memory, registers, hard disks, CD-ROMs, DVDs, and so on.
Specifically, another aspect of the present application relates to implementing the embodiments described above using hardware and/or software. Those skilled in the art will understand that embodiments of the present application can be implemented or executed using a computing device or one or more processors. The computing device or processor can be, for example, a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or another programmable logic device. Different embodiments of the present application may also be executed or embodied by a combination of these devices.
Referring to FIG. 4, the operating environment of a system on which the application program is installed, according to an embodiment of the present application, is shown.
In this embodiment, the system on which the application program is installed is installed and runs in an electronic device. The electronic device can be a computing device such as a desktop computer, a notebook, a palmtop computer, or a server. The electronic device can include, but is not limited to, a memory, a processor, and a display. FIG. 4 shows only an electronic device having these components, but it should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead.
In some embodiments the memory can be an internal storage unit of the electronic device, such as its hard disk or main memory. In other embodiments the memory can be an external storage device of the electronic device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card fitted to the electronic device. Further, the memory can include both an internal storage unit of the electronic device and an external storage device. The memory is used to store the application software installed on the electronic device and various kinds of data, such as the program code of the system on which the application program is installed. The memory can also be used to temporarily store data that has been output or is about to be output.
In some embodiments the processor can be a central processing unit (CPU), a microprocessor, or another data processing chip, used to run the program code stored in the memory or to process data, for example to execute the system on which the application program is installed.
In some embodiments the display can be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch display, or the like. The display is used to show information processed in the electronic device and to present a visual user interface, such as an application menu interface or an application icon interface. The components of the electronic device communicate with one another over a system bus.
From the description of the embodiments above, those skilled in the art will clearly understand that the methods of the embodiments can be implemented by software together with a necessary general-purpose hardware platform, and can of course also be implemented in hardware, although in many cases the former is the better implementation. On this understanding, the part of the technical solution of the present application that is essential, or that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions that cause a terminal device (which can be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods described in the embodiments of the present application.
That is, according to an embodiment of the present application, a computer-readable storage medium is also provided, on which is stored a program for executing the risk control method for inferring medical insurance violations through data analysis; when executed by a processor, the program implements the steps of the method.
From the foregoing, it will be understood that specific embodiments of the present application have been described here for purposes of illustration, but that various modifications can be made without departing from the scope of the application. Those skilled in the art will understand that the operations and routines depicted in the flowchart steps or described here can be varied in many ways. More specifically, the order of the steps can be rearranged, steps can be executed in parallel, steps can be omitted, other steps can be included, and various combinations or omissions of routines can be made. Accordingly, the present application is limited only by the appended claims.

Claims (20)

  1. A risk control method for inferring medical insurance violations through data analysis, characterized in that the visit purposes of a risk control object are divided into normal purposes and abnormal purposes, the method comprising the following steps:
    Step 1: acquiring current and historical medical visit behavior data of the risk control object, together with personal information and public data related to the risk control object;
    Step 2: extracting features related to the visit behavior of the risk control object from the data acquired in step 1;
    Step 3: according to the extracted features, assigning each visit in the current and historical visit behavior of the risk control object to a corresponding visit category, thereby forming a visit category sequence;
    Step 4: taking the visit category sequence formed in step 3 as the observation sequence and the visit purpose as the hidden state, and computing the most likely hidden state sequence according to a hidden Markov model;
    Step 5: if one or more hidden states contained in the most likely hidden state sequence correspond to an abnormal purpose, or the occurrence probability of the most likely hidden state sequence is lower than a predetermined threshold, outputting medical data related to the current and historical visit behavior of the risk control object.
  2. The risk control method for inferring medical insurance violations through data analysis according to claim 1, characterized in that the hidden Markov model in step 4 comprises an observation probability matrix and a state transition probability matrix,
    wherein the observation probability matrix records the probability of each visit category under each visit purpose, and the state transition probability matrix records the probability of moving from one visit purpose to another.
  3. The risk control method for inferring medical insurance violations through data analysis according to claim 1, characterized in that step 5 further comprises:
    if the sum of the occurrence probabilities of all hidden state sequences that contain an abnormal purpose is higher than a predetermined threshold, identifying the risk control object as a risk target and outputting it to a back-end system.
  4. The risk control method for inferring medical insurance violations through data analysis according to claim 2, characterized in that, in step 3, K-means or an RVM classifier is used to assign each visit in the current and historical visit behavior of the risk control object to a corresponding visit category,
    wherein the personal data related to the risk control object includes the medical insurance settlement data and the demographic data of the risk control object,
    and the public data related to the risk control object includes data related to medical institutions and medical insurance fund audit data.
  5. The risk control method for inferring medical insurance violations through data analysis according to claim 2, wherein the normal visit purposes comprise: physical examination, initial visit, follow-up visit, dispensing, and hospitalization, and
    the observation probability matrix and the state transition probability matrix are derived using the Baum-Welch algorithm.
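For readers unfamiliar with Baum-Welch, the sketch below shows one way the re-estimation could look for a discrete HMM. It is a simplified illustration under stated assumptions: it trains on a single observation sequence, uses unscaled probabilities (adequate only for short sequences), and all function names are hypothetical. Real training data would span many risk control objects and would need scaling or log-space arithmetic.

```python
def forward(obs, pi, A, B):
    """alpha[t][i] = P(o_1..o_t, state_t = i)."""
    n = len(pi)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(n)]]
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append([sum(prev[i] * A[i][j] for i in range(n)) * B[j][o] for j in range(n)])
    return alpha

def backward(obs, A, B):
    """beta[t][i] = P(o_{t+1}..o_T | state_t = i)."""
    n = len(A)
    beta = [[1.0] * n]
    for o in reversed(obs[1:]):
        nxt = beta[0]
        beta.insert(0, [sum(A[i][j] * B[j][o] * nxt[j] for j in range(n)) for i in range(n)])
    return beta

def baum_welch_step(obs, pi, A, B):
    """One EM update. Returns new (pi, A, B) and the likelihood under the old parameters."""
    n, m, T = len(pi), len(B[0]), len(obs)
    alpha, beta = forward(obs, pi, A, B), backward(obs, A, B)
    lik = sum(alpha[-1])
    # gamma[t][i]: posterior of being in state i at time t.
    gamma = [[alpha[t][i] * beta[t][i] / lik for i in range(n)] for t in range(T)]
    # xi[t][i][j]: posterior of the transition i -> j between t and t+1.
    xi = [[[alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / lik
            for j in range(n)] for i in range(n)] for t in range(T - 1)]
    new_pi = gamma[0][:]
    new_A = [[sum(xi[t][i][j] for t in range(T - 1)) / sum(gamma[t][i] for t in range(T - 1))
              for j in range(n)] for i in range(n)]
    new_B = [[sum(gamma[t][i] for t in range(T) if obs[t] == k) / sum(gamma[t][i] for t in range(T))
              for k in range(m)] for i in range(n)]
    return new_pi, new_A, new_B, lik

def fit(obs, pi, A, B, iters=10):
    """Repeat the EM update; `history` holds the likelihood before each update."""
    history = []
    for _ in range(iters):
        pi, A, B, lik = baum_welch_step(obs, pi, A, B)
        history.append(lik)
    return pi, A, B, history
```

Because each update is an EM step, the likelihood recorded in `history` should not decrease from one iteration to the next, which makes it a convenient convergence check.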
  6. The risk control method for inferring medical insurance violations through data analysis according to claim 1, wherein in step 2, the features comprise:
    demographic features, including the age, gender, occupation, and education level of the risk control object;
    geographic features, including the native place of the risk control object, the location of the employer, the location of the medical institution, the insured region, and the like;
    temporal features, including the visit time, the interval between visits, and the insurance enrollment time of the risk control object;
    medical features, including the diagnostic data of the risk control object, the department visited, the itemized list of medical expenses, the size of the medical institution, the grade of the medical institution, labels for the prior violation record of the medical institution, and the average professional title of its physicians;
    cost features, including the cost of a single visit, the total cost over a time period, and the cost composition ratio for the risk control object.
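Two of the features listed in this claim, the interval between visits and the cost composition ratio, are simple derived quantities. The helper names and the sample values below are hypothetical; the sketch only shows how such features might be computed from raw visit dates and an itemised bill.

```python
from datetime import date

def visit_intervals(visit_dates):
    """Days between consecutive visits, after sorting the dates chronologically."""
    ordered = sorted(visit_dates)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]

def cost_composition(items):
    """Share of the total cost contributed by each expense item.

    Returns an empty dict when the total is zero to avoid division by zero.
    """
    total = sum(items.values())
    if not total:
        return {}
    return {name: amount / total for name, amount in items.items()}

# Hypothetical usage with made-up values.
intervals = visit_intervals([date(2018, 3, 8), date(2018, 3, 1), date(2018, 3, 20)])
shares = cost_composition({"drugs": 60.0, "tests": 40.0})
```

Features of this kind would be concatenated with the demographic, geographic, and medical attributes to form the per-visit vector passed to the classifier in step 3.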
  7. The risk control method for inferring medical insurance violations through data analysis according to claim 1, wherein in step 5,
    if one or more of the visit purposes of the risk control object contained in the most likely hidden state sequence correspond to an abnormal purpose, the risk control object is identified as a risk target and output to a back-end system.
  8. The risk control method for inferring medical insurance violations through data analysis according to claim 1, wherein in step 5,
    if the most likely hidden state sequence contains no abnormal purpose but the occurrence probability of the most likely hidden state sequence is lower than a predetermined threshold, the risk control object is identified as a risk target and output to a back-end system.
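Claims 7 and 8 both act on the most likely hidden state sequence, which for a hidden Markov model is conventionally obtained with the Viterbi algorithm. The sketch below combines Viterbi decoding with the two flagging rules. The function names are hypothetical, and treating the "occurrence probability" as the joint probability of the decoded path and the observations is an assumption. Since that joint probability shrinks as the sequence grows, a practical threshold would probably need to depend on sequence length.

```python
def viterbi(obs, pi, A, B):
    """Return (most likely state path, joint probability of that path and obs)."""
    n = len(pi)
    delta = [pi[i] * B[i][obs[0]] for i in range(n)]
    back = []  # back[t][j]: best predecessor of state j at step t+1
    for o in obs[1:]:
        ptr, nxt = [], []
        for j in range(n):
            best = max(range(n), key=lambda i: delta[i] * A[i][j])
            ptr.append(best)
            nxt.append(delta[best] * A[best][j] * B[j][o])
        back.append(ptr)
        delta = nxt
    last = max(range(n), key=lambda i: delta[i])
    path = [last]
    for ptr in reversed(back):
        path.insert(0, ptr[path[0]])
    return path, delta[last]

def is_risk_target(obs, pi, A, B, abnormal, min_prob):
    """Flag the subject if the decoded path hits an abnormal state (claim 7),
    or if the path is entirely normal but unusually improbable (claim 8)."""
    path, prob = viterbi(obs, pi, A, B)
    if any(state in abnormal for state in path):
        return True
    return prob < min_prob
```

For example, with two states where state 1 is abnormal, a sequence decoded as all-normal is only flagged when its path probability falls below `min_prob`.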
  9. A risk control system for carrying out a risk control method for inferring medical insurance violations through data analysis, comprising:
    a risk control object data acquisition module configured to acquire current and historical visit behavior data of a risk control object, as well as personal information and public data related to the risk control object;
    a feature extraction module configured to extract, from the data acquired by the risk control object data acquisition module, features related to the visit behavior of the risk control object;
    a visit classification module configured to assign, according to the extracted features, each visit in the current and historical visit behavior of the risk control object to a corresponding visit category, thereby forming a visit category sequence;
    a visit purpose inference module configured to take the visit category sequence as the observation sequence and the visit purposes as the hidden states, and to compute the most likely hidden state sequence according to a hidden Markov model;
    an abnormality output module configured to output medical data related to the current and historical visit behavior of the risk control object when one of the hidden states contained in the hidden state sequence corresponds to an abnormal purpose; and
    a hidden Markov model building module configured to use the Baum-Welch algorithm to compute, from big data obtained from medical institutions or public data sources, the observation probability matrix and the state transition probability matrix of the hidden Markov model.
  10. The risk control system according to claim 9, wherein the hidden Markov model comprises an observation probability matrix and a state transition probability matrix,
    wherein the observation probability matrix records the probability of each visit category under each visit purpose, and the state transition probability matrix records the probability of transitioning from one visit purpose to another.
  11. The risk control system according to claim 9, wherein the features comprise:
    demographic features, including the age, gender, occupation, and education level of the risk control object;
    geographic features, including the native place of the risk control object, the location of the employer, the location of the medical institution, the insured region, and the like;
    temporal features, including the visit time, the interval between visits, and the insurance enrollment time of the risk control object;
    medical features, including the diagnostic data of the risk control object, the department visited, the itemized list of medical expenses, the size of the medical institution, the grade of the medical institution, labels for the prior violation record of the medical institution, and the average professional title of its physicians;
    cost features, including the cost of a single visit, the total cost over a time period, and the cost composition ratio for the risk control object.
  12. The risk control system according to claim 9, wherein the abnormality output module is further configured to identify the risk control object as a risk target and output it to a back-end system when the sum of the occurrence probabilities of all hidden state sequences that contain an abnormal purpose is higher than a predetermined threshold;
    or is further configured to identify the risk control object as a risk target and output it to a back-end system when one or more of the visit purposes of the risk control object contained in the most likely hidden state sequence correspond to an abnormal purpose;
    or is further configured to identify the risk control object as a risk target and output it to a back-end system when the most likely hidden state sequence contains no abnormal purpose but the occurrence probability of the most likely hidden state sequence is lower than a predetermined threshold.
  13. A computer-readable storage medium storing a program for executing a risk control method for inferring medical insurance violations through data analysis, wherein the program, when executed by a processor, carries out the following steps:
    step 1: acquiring current and historical visit behavior data of a risk control object, as well as personal information and public data related to the risk control object;
    step 2: extracting, from the data acquired in step 1, features related to the visit behavior of the risk control object;
    step 3: assigning, according to the extracted features, each visit in the current and historical visit behavior of the risk control object to a corresponding visit category, thereby forming a visit category sequence;
    step 4: taking the visit category sequence formed in step 3 as the observation sequence and the visit purposes as the hidden states, and computing the most likely hidden state sequence according to a hidden Markov model;
    step 5: if one or more hidden states contained in the most likely hidden state sequence correspond to an abnormal purpose, or the occurrence probability of the most likely hidden state sequence is lower than a predetermined threshold, outputting medical data related to the current and historical visit behavior of the risk control object.
  14. The computer-readable storage medium according to claim 13, wherein the hidden Markov model in step 4 comprises an observation probability matrix and a state transition probability matrix,
    wherein the observation probability matrix records the probability of each visit category under each visit purpose, and the state transition probability matrix records the probability of transitioning from one visit purpose to another.
  15. The computer-readable storage medium according to claim 13, wherein step 5 further comprises:
    if the sum of the occurrence probabilities of all hidden state sequences that contain an abnormal purpose is higher than a predetermined threshold, identifying the risk control object as a risk target and outputting it to a back-end system.
  16. The computer-readable storage medium according to claim 14, wherein in step 3, a K-Means or RVM classifier is used to assign each visit in the current and historical visit behavior of the risk control object to a corresponding visit category,
    wherein the personal data related to the risk control object comprises medical insurance settlement data and demographic data of the risk control object, and
    the public data related to the risk control object comprises data related to medical institutions and medical insurance fund audit data.
  17. The computer-readable storage medium according to claim 14, wherein the normal visit purposes comprise: physical examination, initial visit, follow-up visit, dispensing, and hospitalization, and
    the observation probability matrix and the state transition probability matrix are derived using the Baum-Welch algorithm.
  18. The computer-readable storage medium according to claim 13, wherein in step 2, the features comprise:
    demographic features, including the age, gender, occupation, and education level of the risk control object;
    geographic features, including the native place of the risk control object, the location of the employer, the location of the medical institution, the insured region, and the like;
    temporal features, including the visit time, the interval between visits, and the insurance enrollment time of the risk control object;
    medical features, including the diagnostic data of the risk control object, the department visited, the itemized list of medical expenses, the size of the medical institution, the grade of the medical institution, labels for the prior violation record of the medical institution, and the average professional title of its physicians;
    cost features, including the cost of a single visit, the total cost over a time period, and the cost composition ratio for the risk control object.
  19. The computer-readable storage medium according to claim 13, wherein in step 5,
    if one or more of the visit purposes of the risk control object contained in the most likely hidden state sequence correspond to an abnormal purpose, the risk control object is identified as a risk target and output to a back-end system.
  20. The computer-readable storage medium according to claim 13, wherein in step 5,
    if the most likely hidden state sequence contains no abnormal purpose but the occurrence probability of the most likely hidden state sequence is lower than a predetermined threshold, the risk control object is identified as a risk target and output to a back-end system.
PCT/CN2018/097746 2018-03-08 2018-07-30 Risk control method for determining irregular medical insurance behavior by means of data analysis WO2019169826A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810191862.5 2018-03-08
CN201810191862.5A CN108492196B (en) 2018-03-08 2018-03-08 Risk control method for inferring medical insurance violations through data analysis

Publications (1)

Publication Number Publication Date
WO2019169826A1 (en) 2019-09-12

Family

ID=63338027

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/097746 WO2019169826A1 (en) 2018-03-08 2018-07-30 Risk control method for determining irregular medical insurance behavior by means of data analysis

Country Status (2)

Country Link
CN (1) CN108492196B (en)
WO (1) WO2019169826A1 (en)

Families Citing this family (31)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109377388B (en) * 2018-09-13 2023-08-18 深圳平安医疗健康科技服务有限公司 Medical insurance application method, medical insurance application device, computer equipment and storage medium
CN109636623A (en) * 2018-10-19 2019-04-16 平安医疗健康管理股份有限公司 Medical data method for detecting abnormality, device, equipment and storage medium
CN109308793A (en) * 2018-10-22 2019-02-05 平安医疗健康管理股份有限公司 The exceeded method for early warning of drug expenditure and device based on data processing
CN109559090A (en) * 2018-10-27 2019-04-02 平安医疗健康管理股份有限公司 Medical item risk control method, apparatus, server and medium based on data analysis
CN109524098A (en) * 2018-10-27 2019-03-26 平安医疗健康管理股份有限公司 Diagnosis information processing method, device, equipment and medium based on data analysis
CN109523396A (en) * 2018-10-27 2019-03-26 平安医疗健康管理股份有限公司 Medical insurance fund risk control method, apparatus, server and medium based on data analysis
CN109524097A (en) * 2018-10-27 2019-03-26 平安医疗健康管理股份有限公司 Extension bed behavioral value method, apparatus, server and medium based on recognition of face
CN109545387B (en) * 2018-10-30 2024-02-27 平安科技(深圳)有限公司 Abnormal case recognition method and computing equipment based on neural network
CN109377207A (en) * 2018-10-30 2019-02-22 平安医疗健康管理股份有限公司 The abnormal method and Related product that behavior determines of being hospitalized
CN109584086A (en) * 2018-10-30 2019-04-05 平安医疗健康管理股份有限公司 Be hospitalized rational method and Related product are predicted based on prediction model
CN109559806A (en) * 2018-10-30 2019-04-02 平安医疗健康管理股份有限公司 The determination method and Related product of abnormal behavior of being hospitalized
CN109637615B (en) * 2018-11-30 2022-10-14 平安医疗健康管理股份有限公司 Method, device and equipment for judging abnormal medical prescription and readable storage medium
CN109615204B (en) * 2018-11-30 2023-02-03 平安医疗健康管理股份有限公司 Quality evaluation method, device and equipment of medical data and readable storage medium
CN109636627B (en) * 2018-12-04 2020-11-03 泰康保险集团股份有限公司 Insurance product management method, device, medium and electronic equipment based on block chain
CN109559242A (en) * 2018-12-13 2019-04-02 平安医疗健康管理股份有限公司 Processing method, device, equipment and the computer readable storage medium of abnormal data
CN109636421A (en) * 2018-12-13 2019-04-16 平安医疗健康管理股份有限公司 Medical data exception recognition methods, equipment and storage medium based on machine learning
CN109658267A (en) * 2018-12-13 2019-04-19 平安医疗健康管理股份有限公司 Social security violation detection method, device, equipment and computer storage medium
CN109659035A (en) * 2018-12-13 2019-04-19 平安医疗健康管理股份有限公司 Medical data exception recognition methods, equipment and storage medium based on machine learning
CN109636650A (en) * 2018-12-13 2019-04-16 平安医疗健康管理股份有限公司 Recognition methods, device, terminal and the readable storage medium storing program for executing of therapeutic regimen exception
CN109598633A (en) * 2018-12-13 2019-04-09 平安医疗健康管理股份有限公司 Social security violation detection method, device, equipment and computer storage medium
CN109544391A (en) * 2018-12-13 2019-03-29 平安医疗健康管理股份有限公司 Recognition methods, device, terminal and the computer readable storage medium of abnormal purchase medicine
CN109615012A (en) * 2018-12-13 2019-04-12 平安医疗健康管理股份有限公司 Medical data exception recognition methods, equipment and storage medium based on machine learning
CN109635044A (en) * 2018-12-13 2019-04-16 平安医疗健康管理股份有限公司 Hospitalization data method for detecting abnormality, device, equipment and readable storage medium storing program for executing
CN110245960A (en) * 2019-05-21 2019-09-17 何金星 A kind of medical insurance antifraud system and method based on computer control
CN111210356B (en) * 2020-01-14 2023-03-21 平安医疗健康管理股份有限公司 Medical insurance data analysis method and device, computer equipment and storage medium
CN111340641B (en) * 2020-05-22 2020-11-13 浙江工业大学 Abnormal hospitalizing behavior detection method
CN112131277B (en) * 2020-09-28 2023-04-18 深圳平安医疗健康科技服务有限公司 Medical data anomaly analysis method and device based on big data and computer equipment
CN112541831A (en) * 2020-12-16 2021-03-23 中国人寿保险股份有限公司 Medical insurance risk identification method, device, medium and electronic equipment
CN114866351B (en) * 2022-07-06 2022-10-14 湖南创星科技股份有限公司 Regional medical prescription supervision method and system based on block chain
CN116976879B (en) * 2023-09-22 2024-01-09 广州扬盛计算机软件有限公司 Method and system for monitoring abnormality of payment system of self-service equipment
CN117151902B (en) * 2023-10-25 2024-01-23 北京创智和宇科技有限公司 Method for monitoring and early warning DRG and DIP medical insurance payment risk through big data analysis

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102013084A (en) * 2010-12-14 2011-04-13 江苏大学 System and method for detecting fraudulent transactions in medical insurance outpatient services
CN103761748A (en) * 2013-12-31 2014-04-30 北京邮电大学 Method and device for detecting abnormal behaviors
US20160267224A1 (en) * 2015-03-10 2016-09-15 International Business Machines Corporation Detecting outlier prescription behavior using graphical models with latent variables
CN107464115A (en) * 2017-07-20 2017-12-12 北京小米移动软件有限公司 personal characteristic information verification method and device

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160110818A1 (en) * 2014-10-21 2016-04-21 Hartford Fire Insurance Company System for dynamic fraud detection
WO2016210122A1 (en) * 2015-06-24 2016-12-29 IGATE Global Solutions Ltd. Insurance fraud detection and prevention system
CN104952000A (en) * 2015-07-01 2015-09-30 华侨大学 Wind turbine operating state fuzzy synthetic evaluation method based on Markov chain
CN107402921B (en) * 2016-05-18 2021-03-30 创新先进技术有限公司 Event time sequence data processing method, device and system for identifying user behaviors
CN107657536B (en) * 2017-02-20 2018-07-31 平安科技(深圳)有限公司 The recognition methods of social security fraud and device
CN107240024A (en) * 2017-05-22 2017-10-10 中国平安人寿保险股份有限公司 The anti-fraud recognition methods of settlement of insurance claim and device
CN107680602A (en) * 2017-08-24 2018-02-09 平安科技(深圳)有限公司 Voice fraud recognition methods, device, terminal device and storage medium
CN107609980A (en) * 2017-09-07 2018-01-19 平安医疗健康管理股份有限公司 Medical data processing method, device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN108492196B (en) 2020-11-10
CN108492196A (en) 2018-09-04

Similar Documents

Publication Publication Date Title
WO2019169826A1 (en) Risk control method for determining irregular medical insurance behavior by means of data analysis
CN108511059B (en) Chronic disease management method and system
Taylor et al. Prediction of in‐hospital mortality in emergency department patients with sepsis: a local big data–driven, machine learning approach
US20200126011A1 (en) Computer-implemented methods and systems for analyzing healthcare data
Lau et al. Use of electronic medical records (EMR) for oncology outcomes research: assessing the comparability of EMR information to patient registry and health claims data
Tran et al. A framework for feature extraction from hospital medical data with applications in risk prediction
US10430716B2 (en) Data driven featurization and modeling
US20150149215A1 (en) System and method to detect and visualize finding-specific suggestions and pertinent patient information in radiology workflow
US20180210925A1 (en) Reliability measurement in data analysis of altered data sets
CN108898316A (en) Settling fee method for early warning and system
Khanna et al. A risk stratification tool for hospitalisation in Australia using primary care data
Xiao et al. An MCEM framework for drug safety signal detection and combination from heterogeneous real world evidence
Erickson et al. Automatic address validation and health record review to identify homeless Social Security disability applicants
CN109636085A (en) Based on the pre-authorization of data processing from kernel method and system
Kim et al. Weekly ILI patient ratio change prediction using news articles with support vector machine
US20160259896A1 (en) Segmented temporal analysis model used in fraud, waste, and abuse detection
US20210056438A1 (en) Data driven featurization and modeling
King et al. Predicting self-intercepted medication ordering errors using machine learning
US20220319647A1 (en) Systems and methods for an improved healthcare data fabric
Mailloux et al. A decision support tool for identifying abuse of controlled substances by ForwardHealth Medicaid members
Brachmann et al. Cost-of-illness comparison between clinical judgment and molecular point-of-care testing for influenza-like illness patients in Germany
CN113094595A (en) Object recognition method, device, computer system and readable storage medium
Settipalli et al. Provider profiling and labeling of fraudulent health insurance claims using Weighted MultiTree
Zucco et al. Personalized survival probabilities for SARS-CoV-2 positive patients by explainable machine learning
Zakharova et al. Multi-level model for structuring heterogeneous biomedical data in the tasks of socially significant diseases risk evaluation

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
    Ref document number: 18908781
    Country of ref document: EP
    Kind code of ref document: A1
NENP Non-entry into the national phase
    Ref country code: DE
32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established
    Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 22/01/2021)
122 Ep: pct application non-entry in european phase
    Ref document number: 18908781
    Country of ref document: EP
    Kind code of ref document: A1