CN110119991A - Machine learning-based medical compensation audit method, device and storage medium - Google Patents

Machine learning-based medical compensation audit method, device and storage medium

Info

Publication number
CN110119991A
Authority
CN
China
Prior art keywords
medical
medical treatment
compensated
machine learning
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910294783.1A
Other languages
Chinese (zh)
Inventor
贺健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
OneConnect Smart Technology Co Ltd
Original Assignee
OneConnect Smart Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by OneConnect Smart Technology Co Ltd
Priority to CN201910294783.1A
Publication of CN110119991A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/08 Insurance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H70/00 ICT specially adapted for the handling or processing of medical references
    • G16H70/20 ICT specially adapted for the handling or processing of medical references relating to practices or guidelines
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H70/00 ICT specially adapted for the handling or processing of medical references
    • G16H70/40 ICT specially adapted for the handling or processing of medical references relating to drugs, e.g. their side effects or intended usage
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Public Health (AREA)
  • Primary Health Care (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Epidemiology (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Medicinal Chemistry (AREA)
  • Economics (AREA)
  • Chemical & Material Sciences (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Toxicology (AREA)
  • General Business, Economics & Management (AREA)
  • Technology Law (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioethics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Development Economics (AREA)
  • Multimedia (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The present invention relates to the field of artificial intelligence and discloses a machine learning-based medical compensation audit method. The method comprises: extracting medication standards for different conditions from a medical literature database and from the medical diagnostic information texts that doctors issue for various diseases; using the medication standards as training data to train a medical compensation audit model by machine learning, thereby generating the model; and using the model to audit the expense documents of personal injury claims, so as to judge automatically whether the expense documents involve fraud. The invention also provides a machine learning-based medical compensation audit device and a computer-readable storage medium. The invention can automatically audit claims arising from medical malpractice and judge whether overmedication is present, for anti-fraud purposes.

Description

Machine learning-based medical compensation audit method, device and storage medium
Technical field
The present invention relates to the field of artificial intelligence, and in particular to a machine learning-based medical compensation audit method, device, and computer-readable storage medium.
Background
Medication should rest on a comprehensive assessment of the individual patient's genes, condition, constitution, and family medical history, together with the composition of the drug, so that drugs are selected accurately and the remedy truly suits the case. The drug should also be given by a suitable route, at an appropriate dose, and at an appropriate time, with attention to its contraindications, adverse reactions, and interactions. Medication can then be safe, rational, effective, and economical. In practice, however, overmedication often occurs because of mistakes or inexperience on the part of medical staff. In serious cases it can even lead to death, and a medical malpractice incident results. Because there is currently no good compensation scheme for malpractice caused by overmedication, the interests of insurance companies and of victims' families often cannot be reasonably protected.
When settling claims for malpractice caused by overmedication, insurance companies lack professional staff, so the claims process is time-consuming, costly, and overly dependent on outside parties. At the same time, the current market has no anti-fraud scheme directed at overmedication.
Summary of the invention
The present invention provides a machine learning-based medical compensation audit method, device, and computer-readable storage medium. Its main purpose is to audit medical malpractice claims automatically and judge whether overmedication is present, for anti-fraud purposes.
To achieve the above object, the machine learning-based medical compensation audit method provided by the present invention comprises:
searching a medical literature database for the medical literature of each medical discipline and building a medical literature collection database; performing text mining on the medical literature using natural language processing technology to extract medication standards for different conditions; and storing the medication standards in a medical treatment plan library;
obtaining from hospital systems the medical diagnostic information texts that doctors issue for various diseases; recognizing and processing these texts using optical character recognition and natural language processing technology; and capturing the medication details in the diagnostic information texts and adding them to the medical treatment plan library;
establishing a medical compensation audit model and, using the data in the medical treatment plan library as training data, training the parameters of the model by machine learning to generate the trained medical compensation audit model; and
when a personal injury claim case occurs, recognizing the expense documents of the claim using optical character recognition technology, inputting the result into the medical compensation audit model, and judging whether the personal injury claim case involves fraud.
Optionally, performing text mining on the medical literature using natural language processing technology to extract medication standards for different conditions comprises:
building a drug corpus from drug knowledge;
using natural language processing technology and the drug corpus, performing lexical analysis, dependency parsing, and opinion extraction on the medical literature to extract the medication standards for different conditions.
Optionally, the lexical analysis comprises word segmentation, part-of-speech tagging, and named entity recognition on the medical literature, wherein the word segmentation comprises:
(1) scanning the text of the medical literature from left to right: starting at the beginning of a character string, selecting a segment of the maximum word length and matching it against the drug corpus to judge whether the segment is in the drug corpus; if it is, taking the segment as one token; if it is not, dropping one character from the right end and judging whether the shortened segment is in the drug corpus, and repeating this until a single character remains;
(2) segmenting the remaining part of the text of the medical literature by the method of step (1), until the entire text has been segmented.
Optionally, establishing the medical compensation audit model and, using the data in the medical treatment plan library as training data, training the parameters of the model by machine learning to generate the trained medical compensation audit model comprises:
inputting the medication standards in the medical treatment plan library, as training data, into the established medical compensation audit model; training the parameters of the model through iterative computation over the data; and continually adjusting the parameters to obtain the medical compensation audit model.
Optionally, the medical compensation audit model is a support vector machine or a random forest model.
In addition, to achieve the above object, the present invention also provides a machine learning-based medical compensation audit device. The device comprises a memory and a processor. The memory stores a medical compensation audit program that can run on the processor, and the program, when executed by the processor, implements the following steps:
searching a medical literature database for the medical literature of each medical discipline and building a medical literature collection database; performing text mining on the medical literature using natural language processing technology to extract medication standards for different conditions; and storing the medication standards in a medical treatment plan library;
obtaining from hospital systems the medical diagnostic information texts that doctors issue for various diseases; recognizing and processing these texts using optical character recognition and natural language processing technology; and capturing the medication details in the diagnostic information texts and adding them to the medical treatment plan library;
establishing a medical compensation audit model and, using the data in the medical treatment plan library as training data, training the parameters of the model by machine learning to generate the trained medical compensation audit model; and
when a personal injury claim case occurs, recognizing the expense documents of the claim using optical character recognition technology, inputting the result into the medical compensation audit model, and judging whether the personal injury claim case involves fraud.
Optionally, performing text mining on the medical literature using natural language processing technology to extract medication standards for different conditions comprises:
building a drug corpus from drug knowledge;
using natural language processing technology and the drug corpus, performing lexical analysis, dependency parsing, and opinion extraction on the medical literature to extract the medication standards for different conditions.
Optionally, the lexical analysis comprises word segmentation, part-of-speech tagging, and named entity recognition on the medical literature, wherein the word segmentation comprises:
(1) scanning the text of the medical literature from left to right: starting at the beginning of a character string, selecting a segment of the maximum word length and matching it against the drug corpus to judge whether the segment is in the drug corpus; if it is, taking the segment as one token; if it is not, dropping one character from the right end and judging whether the shortened segment is in the drug corpus, and repeating this until a single character remains;
(2) segmenting the remaining part of the text of the medical literature by the method of step (1), until the entire text has been segmented.
Optionally, establishing the medical compensation audit model and, using the data in the medical treatment plan library as training data, training the parameters of the model by machine learning to generate the trained medical compensation audit model comprises:
inputting the medication standards in the medical treatment plan library, as training data, into the established medical compensation audit model; training the parameters of the model through iterative computation over the data; and continually adjusting the parameters to obtain the medical compensation audit model.
In addition, to achieve the above object, the present invention also provides a computer-readable storage medium. A medical compensation audit program is stored on the computer-readable storage medium and can be executed by one or more processors to implement the steps of the machine learning-based medical compensation audit method described above.
The machine learning-based medical compensation audit method, device, and computer-readable storage medium proposed by the present invention extract medication standards for different conditions from a medical literature database and from the medical diagnostic information texts that doctors issue for various diseases, use the medication standards as training data to train the parameters of a medical compensation audit model by machine learning, and generate the model, which audits the expense documents of personal injury claims and automatically judges whether the expense documents involve fraud.
Brief description of the drawings
Fig. 1 is a flow diagram of the machine learning-based medical compensation audit method provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of the internal structure of the machine learning-based medical compensation audit device provided by an embodiment of the present invention;
Fig. 3 is a module diagram of the medical compensation audit program in the machine learning-based medical compensation audit device provided by an embodiment of the present invention.
The realization, functional features, and advantages of the present invention are further described below with reference to the embodiments and the accompanying drawings.
Detailed description
It should be understood that the specific embodiments described here serve only to explain the present invention and are not intended to limit it.
The present invention provides a machine learning-based medical compensation audit method. Fig. 1 is a flow diagram of the method as provided by an embodiment of the present invention. The method may be executed by a device, and the device may be implemented in software and/or hardware.
In this embodiment, the machine learning-based medical compensation audit method comprises:
S10: Search a medical literature database for the medical literature of each medical discipline and build a medical literature collection database; perform text mining on the medical literature using natural language processing technology to extract medication standards for different conditions; and store the medication standards in a medical treatment plan library.
The present invention first collects historical medical literature worldwide, by medical discipline, from the various medical literature databases and builds a medical literature collection database. It then performs text mining on the medical literature using natural language processing technology, extracts the medication standards for different conditions, and builds a medical treatment plan library to store these standards as the data support for the subsequent intelligent audit of medical compensation.
Natural language processing (NLP) is a field of computer science, artificial intelligence, and linguistics concerned with the interaction between computers and human (natural) language. At its core is natural language understanding, in which a computer derives meaning from human or natural language input; other tasks involve natural language generation.
In a preferred embodiment, the present invention first builds a drug corpus from drug knowledge and then, using natural language processing technology and the drug corpus, performs lexical analysis, dependency parsing, opinion extraction, and similar steps on the medical literature. This extracts the medication standards from the medical literature and thereby builds the medical treatment plan library.
Lexical analysis performs operations such as word segmentation, part-of-speech tagging, and named entity recognition on the medical literature.
Word segmentation is the corpus-based process of cutting continuous natural language text into a sequence of words that is semantically reasonable and complete. It comprises the following steps:
(1.1) Scan the text of the medical literature from left to right: starting at the beginning of a character string, select a segment of the maximum word length and match it against the drug corpus to judge whether the segment is in the drug corpus. If it is, take the segment as one token. If it is not, drop one character from the right end and judge whether the shortened segment is in the drug corpus, repeating this until a single character remains.
(1.2) Segment the remaining part of the text by the method of step (1.1), until the entire text has been segmented.
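Steps (1.1) and (1.2) describe what is commonly called forward maximum matching. The following is a minimal Python sketch of that procedure, assuming the drug corpus can be treated as a plain set of known words. The function name `forward_max_match` and the sample lexicon `DRUG_LEXICON` are illustrative and do not come from the patent.

```python
# Hypothetical miniature drug corpus; a real corpus would be far larger.
DRUG_LEXICON = {"阿莫西林", "阿莫西林胶囊", "胶囊", "每日", "三次"}


def forward_max_match(text, lexicon, max_len=None):
    """Segment `text` left to right, always taking the longest lexicon match."""
    if max_len is None:
        # Maximum word length defaults to the longest entry in the lexicon.
        max_len = max((len(word) for word in lexicon), default=1)
    tokens = []
    start = 0
    while start < len(text):
        # Step (1.1): begin with the longest candidate segment.
        end = min(start + max_len, len(text))
        # Drop one character from the right until the segment is in the
        # lexicon or only a single character remains.
        while end > start + 1 and text[start:end] not in lexicon:
            end -= 1
        tokens.append(text[start:end])
        # Step (1.2): continue with the remaining text.
        start = end
    return tokens


print(forward_max_match("阿莫西林胶囊每日三次", DRUG_LEXICON))
# ['阿莫西林胶囊', '每日', '三次']
```

Characters that match no lexicon entry fall through as single-character tokens, which mirrors the "until a single character remains" rule in step (1.1).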
Part-of-speech tagging (POS tagging) is the process of assigning a part of speech to each word in natural language text. In the present invention it can be implemented with a specific part-of-speech tagging algorithm, such as a hidden Markov model (HMM) or conditional random fields (CRFs).
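To illustrate how an HMM assigns tags, the sketch below runs Viterbi decoding over a toy two-tag model. The patent does not give model parameters, so every probability, tag name, and word here is a made-up assumption chosen only to show the mechanics; a real tagger would estimate these values from an annotated corpus.

```python
def viterbi(observations, states, start_p, trans_p, emit_p, unseen=1e-9):
    """Return the most probable state (tag) sequence for the observed words."""
    if not observations:
        return []
    # best[t][s] = (probability of the best path ending in s at time t,
    #               previous state on that path)
    best = [{s: (start_p[s] * emit_p[s].get(observations[0], unseen), None)
             for s in states}]
    for word in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][p][0] * trans_p[p][s] * emit_p[s].get(word, unseen), p)
                for p in states
            )
        best.append(column)
    # Backtrack from the most probable final state.
    path = [max(states, key=lambda s: best[-1][s][0])]
    for t in range(len(best) - 1, 0, -1):
        path.append(best[t][path[-1]][1])
    path.reverse()
    return path


# Toy parameters (assumptions for illustration only).
STATES = ["NOUN", "VERB"]
START_P = {"NOUN": 0.8, "VERB": 0.2}
TRANS_P = {"NOUN": {"NOUN": 0.3, "VERB": 0.7},
           "VERB": {"NOUN": 0.8, "VERB": 0.2}}
EMIT_P = {"NOUN": {"patient": 0.4, "takes": 0.1, "amoxicillin": 0.5},
          "VERB": {"patient": 0.05, "takes": 0.9, "amoxicillin": 0.05}}

print(viterbi(["patient", "takes", "amoxicillin"], STATES, START_P, TRANS_P, EMIT_P))
# ['NOUN', 'VERB', 'NOUN']
```

For long sentences, a practical implementation would work with log probabilities to avoid numeric underflow; plain products are used here to keep the example short.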
Named entity recognition (NER), also called proper name recognition, identifies entities with a specific meaning in natural language text, mainly names of people, places, organizations, products, and drugs, as well as dates and times.
Dependency parsing uses the dependency relations between the words in a sentence to represent the syntactic structure of the words (for example subject-predicate, verb-object, and attributive-head relations) and uses a tree structure to represent the structure of the whole sentence (for example subject, predicate, object, attributive, adverbial, and complement).
Opinion extraction automatically analyzes the drug names, dosages, routes of administration, and similar information in the text and labels these features; the labeled features are stored in the medical treatment plan library. This step mainly uses a single-opinion extraction method: in general a doctor diagnoses and prescribes according to the patient's actual condition, that is, for one specific chief complaint, so the single-opinion method is the more suitable choice.
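The labeling of drug name, dosage, and route of administration can be pictured with a small pattern-based stand-in. The patent does not specify how the extraction is implemented, so the regular expression, the field names, and the English sample sentence below are all assumptions made for illustration; a production system would more likely rely on the named entity recognition and dependency parsing described above.

```python
import re

# Hypothetical pattern: "<drug> <number> <unit> <route>", e.g. "Amoxicillin 500 mg oral".
PRESCRIPTION_PATTERN = re.compile(
    r"(?P<drug>[A-Za-z]+)\s+"
    r"(?P<dose>\d+(?:\.\d+)?)\s*(?P<unit>mg|ml|g)\s+"
    r"(?P<route>oral|intravenous|topical)"
)


def extract_medication_features(text):
    """Return one labeled record per prescription-like phrase found in `text`."""
    records = []
    for match in PRESCRIPTION_PATTERN.finditer(text):
        records.append({
            "drug": match.group("drug").lower(),   # label: drug name
            "dose": float(match.group("dose")),    # label: dosage value
            "unit": match.group("unit"),           # label: dosage unit
            "route": match.group("route"),         # label: route of administration
        })
    return records


print(extract_medication_features("Amoxicillin 500 mg oral; Ibuprofen 0.2 g oral"))
```

Each returned dictionary corresponds to one labeled feature set of the kind that would be stored in the medical treatment plan library.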
S20: Obtain from hospital systems the medical diagnostic information texts that doctors issue for various diseases; recognize and process these texts using OCR and natural language processing technology; and capture the medication details in the diagnostic information texts and add them to the medical treatment plan library.
To keep the training database flexible, up to date, and usable in practice, the present invention adds the medical diagnostic information from real diagnosis and treatment records to the medical treatment plan library. This expands the library, so that the training data is updated dynamically, is more complete, and better reflects real practice.
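One way to picture a library that holds both literature-derived standards and records from real diagnoses is sketched below. The patent does not define a storage schema, so the class names `MedicationRecord` and `TreatmentPlanLibrary`, the fields, and the sample values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MedicationRecord:
    condition: str        # diagnosed condition
    drug: str             # drug name
    daily_dose_mg: float  # daily dose in milligrams
    route: str            # route of administration
    source: str           # "literature" (step S10) or "diagnosis" (step S20)


@dataclass
class TreatmentPlanLibrary:
    # Records grouped by condition so that new entries can be appended at any time.
    records: dict = field(default_factory=dict)

    def add(self, record):
        self.records.setdefault(record.condition, []).append(record)

    def max_daily_dose(self, condition, drug):
        """Highest recorded daily dose for a drug under a condition, or None."""
        doses = [r.daily_dose_mg for r in self.records.get(condition, [])
                 if r.drug == drug]
        return max(doses) if doses else None


library = TreatmentPlanLibrary()
library.add(MedicationRecord("bronchitis", "amoxicillin", 1500.0, "oral", "literature"))
library.add(MedicationRecord("bronchitis", "amoxicillin", 1000.0, "oral", "diagnosis"))
print(library.max_daily_dose("bronchitis", "amoxicillin"))  # 1500.0
```

Keeping the `source` field makes it possible to tell published standards apart from records added later from hospital systems.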
The medical diagnostic information that doctors issue for various diseases is usually either handwritten or electronic. For handwritten records, the present invention first recognizes the text using optical character recognition (OCR) technology and then extracts the medication features using natural language processing technology. For electronic records, it extracts the medication features directly using natural language processing technology.
OCR is a computer input technology: the text in a handwritten medical diagnostic record is converted to image information by an optical input method such as scanning, and character recognition technology then converts the image information into text that a computer can use.
For the method of extracting the medication features using natural language processing technology, refer to the description in step S10 above.
S30: Establish a medical compensation audit model and, using the data in the medical treatment plan library as training data, train the parameters of the model by machine learning to generate the trained medical compensation audit model.
On the basis of the medical treatment plan library built in the two steps above, the present invention uses machine learning: feature data from the library, such as the medication standards, are input as training data into the established medical compensation audit model. The parameters of the model are trained through more than ten thousand iterations over the data and are continually adjusted to obtain the best result, finally yielding a medical compensation audit model of higher quality, practicality, and efficiency.
The medical compensation audit model of the present invention may be a support vector machine (SVM), a random forest, or the like.
S40: When a personal injury claim case occurs, recognize the expense documents of the claim using OCR technology, input the result into the medical compensation audit model, and judge whether the personal injury claim case involves medication fraud.
To further verify the practical usability of the medical compensation audit model, the present invention inputs the expense documents and real diagnosis and treatment records of personal injury claim cases that occurred in practice into the trained model as test data. The model's computation outputs a medication fraud result for each case. Insurance personnel are also assigned to investigate the case and produce an investigation report. The model's judgment is compared with the manual investigation report to determine finally whether the case involves fraud, which in turn demonstrates the usability and efficiency of the model.
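The comparison between the model's judgments and the manual investigation reports can be summarized with simple agreement counts. The sketch below assumes each case is reduced to a boolean verdict (True meaning fraud); the function name `compare_verdicts` and the sample data are illustrative and are not taken from the patent.

```python
from collections import Counter


def compare_verdicts(model_verdicts, manual_verdicts):
    """Count agreements and disagreements between model and investigators."""
    if len(model_verdicts) != len(manual_verdicts):
        raise ValueError("both lists must cover the same cases")
    counts = Counter(zip(model_verdicts, manual_verdicts))
    total = len(model_verdicts)
    agreed = counts[(True, True)] + counts[(False, False)]
    return {
        "both_fraud": counts[(True, True)],
        "both_clean": counts[(False, False)],
        "model_only_fraud": counts[(True, False)],   # model flagged, investigator did not
        "manual_only_fraud": counts[(False, True)],  # investigator flagged, model did not
        "agreement_rate": agreed / total if total else 0.0,
    }


# Made-up verdicts for four cases.
summary = compare_verdicts([True, False, True, False], [True, False, False, False])
print(summary["agreement_rate"])  # 0.75
```

Cases in the two disagreement buckets are the ones that would need a final human decision.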
The present invention also provides a machine learning-based medical compensation audit device. Fig. 2 is a schematic diagram of the internal structure of the device as provided by an embodiment of the present invention.
In this embodiment, the machine learning-based medical compensation audit device 1 may be a personal computer (PC) or a terminal device such as a smartphone, tablet computer, or portable computer. The device 1 comprises at least a memory 11, a processor 12, a communication bus 13, and a network interface 14.
The memory 11 comprises at least one type of readable storage medium, including flash memory, hard disk, multimedia card, card-type memory (for example SD or DX memory), magnetic memory, magnetic disk, optical disc, and the like. In some embodiments the memory 11 may be an internal storage unit of the device 1, such as its hard disk. In other embodiments the memory 11 may be an external storage device of the device 1, such as a plug-in hard disk, smart media card (SMC), secure digital (SD) card, or flash card fitted to the device 1. Further, the memory 11 may comprise both an internal storage unit and an external storage device of the device 1. The memory 11 can be used not only to store the application software installed on the device 1 and various kinds of data, such as the code of the medical compensation audit program 01, but also to temporarily store data that has been output or is to be output.
In some embodiments the processor 12 may be a central processing unit (CPU), controller, microcontroller, microprocessor, or other data processing chip, used to run the program code stored in the memory 11 or to process data, for example to execute the medical compensation audit program 01.
The communication bus 13 is used to connect and enable communication between these components.
The network interface 14 may optionally comprise a standard wired interface and a wireless interface (such as a Wi-Fi interface), and is typically used to establish a communication connection between the device 1 and other electronic devices.
Optionally, the device 1 may also comprise a user interface. The user interface may comprise a display and an input unit such as a keyboard, and may optionally also comprise a standard wired interface and a wireless interface. Optionally, in some embodiments the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (organic light-emitting diode) touch device, or the like. The display may also be called a display screen or display unit, and is used to display the information processed in the device 1 and to display a visual user interface.
Fig. 2 shows only the machine learning-based medical compensation audit device 1 with the components 11-14 and the medical compensation audit program 01. Those skilled in the art will understand that the structure shown in Fig. 2 does not limit the device 1, which may comprise fewer or more components than illustrated, combine certain components, or arrange the components differently.
In the embodiment of the device 1 shown in Fig. 2, the medical compensation audit program 01 is stored in the memory 11, and the processor 12 implements the following steps when executing the medical compensation audit program 01 stored in the memory 11:
Step 1: Search medical literature databases for the medical literature of each medical specialty and establish a medical literature collection database; perform text mining on the medical literature using natural language processing technology to extract the medication standards for different conditions, and store the medication standards in a medical treatment plan library.
The present invention first collects historical medical literature worldwide, by medical specialty, from various medical literature databases to establish the medical literature collection database. Text mining is performed on the medical literature using natural language processing technology to extract the medication standards for different conditions, and a medical treatment plan library is built to store these standards, serving as the data support for the subsequent intelligent auditing of medical claims.
Natural language processing (NLP) is a field of computer science, artificial intelligence, and linguistics concerned with the interaction between computers and human (natural) languages. At the core of natural language processing is natural language understanding, in which a computer derives meaning from human or natural language input; other tasks involve natural language generation.
In a preferred embodiment of the present invention, a drug corpus is first constructed from drug knowledge. Then, based on the drug corpus, natural language processing technology is used to perform steps such as lexical analysis, dependency parsing, and opinion extraction on the medical literature, completing the extraction of medication standards from the medical literature and thereby building the medical treatment plan library.
Lexical analysis is used to perform operations such as word segmentation, part-of-speech tagging, and named entity recognition on the medical literature.
Word segmentation is the process, based on the drug corpus, of cutting continuous natural language text into a sequence of words that is semantically reasonable and complete. It comprises the steps of:
(1.1) Working through the text of the medical literature from left to right, select a candidate fragment of maximum length from the start of a character string and match it against the drug corpus to judge whether the fragment is in the drug corpus. If it is, the fragment is taken as one segmented word; if it is not, remove one character from the right end and judge whether the shortened fragment is in the drug corpus, repeating in turn until only a single character remains.
(1.2) Segment the remaining part of the text in the same way as in step (1.1), until the entire text has been segmented.
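Steps (1.1) and (1.2) describe what is commonly called forward maximum matching. A minimal sketch follows, assuming the drug corpus can be treated as a plain set of known words; the function name `forward_max_match`, the `max_len` parameter, and the sample lexicon are illustrative assumptions and are not taken from the patent.

```python
def forward_max_match(text, lexicon, max_len=None):
    """Segment `text` left to right, preferring the longest lexicon match."""
    if max_len is None:
        # Assumption: the longest lexicon entry bounds the candidate length.
        max_len = max((len(word) for word in lexicon), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Step (1.1): start from the longest candidate fragment at position i.
        end = min(i + max_len, len(text))
        # Drop one character from the right until the fragment is in the
        # lexicon or only a single character remains.
        while end - i > 1 and text[i:end] not in lexicon:
            end -= 1
        tokens.append(text[i:end])
        # Step (1.2): continue with the remaining part of the text.
        i = end
    return tokens


# Hypothetical lexicon: amoxicillin, amoxicillin capsules, oral, daily, three times.
drug_lexicon = {"阿莫西林", "阿莫西林胶囊", "口服", "每日", "三次"}
print(forward_max_match("阿莫西林胶囊口服每日三次", drug_lexicon))
# → ['阿莫西林胶囊', '口服', '每日', '三次']
```

Because the longest candidate is tried first, "阿莫西林胶囊" is kept whole rather than being split after the shorter entry "阿莫西林". Characters that match nothing fall through as single-character tokens.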
Part-of-speech tagging (POS tagging) refers to the process of assigning a part of speech to each word in natural language text. In the present invention, part-of-speech tagging may be implemented with a specific tagging algorithm, such as a hidden Markov model (HMM) or conditional random fields (CRFs).
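As an illustration of HMM-based tagging, the sketch below decodes the most probable tag sequence with the Viterbi algorithm. The two-tag set, all probabilities, and the sample sentence are invented for the example; the patent does not specify a tag set, model parameters, or a decoding procedure.

```python
import math


def viterbi(words, tags, start_p, trans_p, emit_p, unk=1e-6):
    """Return the most probable tag sequence for `words` under a toy HMM."""
    if not words:
        return []
    # best[t][tag] = (log-probability of the best path ending in tag, previous tag)
    best = [{}]
    for tag in tags:
        prob = start_p[tag] * emit_p[tag].get(words[0], unk)
        best[0][tag] = (math.log(prob), None)
    for t in range(1, len(words)):
        best.append({})
        for tag in tags:
            emit = math.log(emit_p[tag].get(words[t], unk))
            best[t][tag] = max(
                (best[t - 1][prev][0] + math.log(trans_p[prev][tag]) + emit, prev)
                for prev in tags
            )
    # Backtrack from the best final tag.
    last = max(tags, key=lambda tag: best[-1][tag][0])
    path = [last]
    for t in range(len(words) - 1, 0, -1):
        last = best[t][last][1]
        path.append(last)
    return path[::-1]


# Hypothetical parameters: "n" = noun, "v" = verb.
TAGS = ["n", "v"]
START_P = {"n": 0.7, "v": 0.3}
TRANS_P = {"n": {"n": 0.4, "v": 0.6}, "v": {"n": 0.8, "v": 0.2}}
EMIT_P = {
    "n": {"patient": 0.4, "aspirin": 0.4, "takes": 0.01},
    "v": {"takes": 0.8, "patient": 0.01},
}

print(viterbi(["patient", "takes", "aspirin"], TAGS, START_P, TRANS_P, EMIT_P))
# → ['n', 'v', 'n']
```

In practice the start, transition, and emission probabilities would be estimated from a tagged corpus rather than written by hand; unseen words receive the small fallback probability `unk`, which is an assumption of this sketch.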
Named entity recognition (NER), also called "proper name recognition", refers to identifying entities with specific meaning in natural language text, mainly including person names, place names, organization names, product names, drug names, dates and times, and so on.
Dependency parsing represents the syntactic structure of a sentence through the dependency relations between its words (such as subject-predicate, verb-object, and attributive-head relations), and represents the structure of the whole sentence as a tree (such as subject, predicate, object, attributive, adverbial, and complement).
Opinion extraction automatically analyzes the drug name, dosage, method of administration, and so on in the text and labels these features; the labeled features are stored in the medical treatment plan library. This step mainly uses a single-opinion extraction method: in general, a doctor diagnoses and prescribes according to the patient's actual condition, that is, a specific chief complaint, so the single-opinion extraction method is the more suitable choice.
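The labeled output of this step can be pictured as one record per prescription sentence. The sketch below is a deliberately simple rule-based stand-in (keyword lookup plus a regular expression), not the single-opinion extraction method itself, whose details the patent does not give; the word lists, field names, and function name are assumptions.

```python
import re

# Hypothetical vocabulary; a real system would draw these from the drug corpus.
DRUG_NAMES = ["amoxicillin", "ibuprofen"]
ROUTES = ["orally", "intravenously", "topically"]
DOSE_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(mg|g|ml)\b", re.IGNORECASE)


def extract_medication(sentence):
    """Label the drug name, dose, and route found in one sentence."""
    lowered = sentence.lower()
    drug = next((name for name in DRUG_NAMES if name in lowered), None)
    route = next((word for word in ROUTES if word in lowered), None)
    match = DOSE_RE.search(sentence)
    dose = f"{match.group(1)} {match.group(2).lower()}" if match else None
    return {"drug": drug, "dose": dose, "route": route}


print(extract_medication("Amoxicillin 500 mg taken orally three times daily"))
# → {'drug': 'amoxicillin', 'dose': '500 mg', 'route': 'orally'}
```

Each returned dictionary corresponds to one labeled feature record that could be stored in the medical treatment plan library. Fields that are not found are left as `None` rather than guessed.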
Step 2: Obtain from the hospital system the medical diagnosis information texts issued by doctors for various diseases; recognize and process the medical diagnosis information texts using OCR and natural language processing technology, capture the medication information in the diagnosis information texts, and add it to the medical treatment plan library.
To make the training database flexible, dynamic, and practically usable, the present invention adds the medical diagnosis information from real treatment records to the medical treatment plan library. This expands the library so that the training data is more dynamically updated, more complete, and more practically usable.
The medical diagnosis information issued by doctors for various diseases usually comes in handwritten and electronic forms. For the handwritten form, the present invention first recognizes it using optical character recognition (OCR) technology and then extracts the medication features using natural language processing technology; for the electronic form, natural language processing technology is used directly to extract the medication features.
OCR is a computer input technology that converts the text of handwritten medical diagnosis information into image information through optical input such as scanning, and then uses character recognition technology to convert the image information into a form the computer can use.
For the method of extracting the medication features with natural language processing technology, refer to the description in Step 1 above.
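The routing between handwritten and electronic records described in Step 2 can be sketched as a small dispatcher. The record layout (`kind`, `image`, `text`) and the injected `ocr` and `nlp_extract` callables are assumptions made for illustration; the stand-in functions below only mimic an OCR engine and an NLP extractor.

```python
def extract_medication_features(record, ocr, nlp_extract):
    """Run OCR first for handwritten records, then NLP extraction in both cases."""
    kind = record["kind"]
    if kind == "handwritten":
        text = ocr(record["image"])      # scanned image -> text
    elif kind == "electronic":
        text = record["text"]            # already machine-readable
    else:
        raise ValueError(f"unknown record kind: {kind!r}")
    return nlp_extract(text)


# Stand-ins for illustration only: a real OCR engine and NLP pipeline go here.
def fake_ocr(image):
    return image.decode("utf-8")


def fake_nlp(text):
    return {"raw": text.strip().lower()}


handwritten = {"kind": "handwritten", "image": b" Amoxicillin 500 mg "}
electronic = {"kind": "electronic", "text": "Ibuprofen 200 mg"}
print(extract_medication_features(handwritten, fake_ocr, fake_nlp))
print(extract_medication_features(electronic, fake_ocr, fake_nlp))
```

Passing the OCR and extraction functions in as arguments keeps the routing logic independent of any particular OCR library, which the patent does not name.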
Step 3: Establish a medical claim audit model; using the data in the medical treatment plan library as training data, train the parameters of the medical claim audit model by a machine learning method to generate the medical claim audit model.
On the basis of the medical treatment plan library established in the two steps above, the present invention uses a machine learning method: feature data such as the medication standards in the medical treatment plan library are input as training data into the established medical claim audit model. The parameters of the model are trained through tens of thousands of iterations over the data and adjusted continuously to obtain the best effect, finally yielding a medical claim audit model that is more practical, more efficient, and of higher quality.
The medical claim audit model of the present invention may be a support vector machine (SVM), a random forest model, or the like.
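To make the idea of iteratively adjusting parameters concrete, the sketch below trains a linear SVM by stochastic subgradient descent on the hinge loss, in pure Python. The patent does not say how features or labels are derived from the medical treatment plan library, so the two features (billed dose divided by standard dose, number of billed drugs outside the standard), the labels, and all hyperparameters are invented for the example.

```python
import random


def train_linear_svm(X, y, epochs=1000, lr=0.01, reg=0.001, seed=0):
    """Fit weights w and bias b by repeated passes over the training data."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            # Shrink the weights slightly (L2 regularization).
            w = [wj * (1.0 - lr * reg) for wj in w]
            # Hinge loss: adjust parameters only when the margin is violated.
            if y[i] * score < 1.0:
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b


def predict(w, b, x):
    """Return +1 (suspicious) or -1 (consistent with the standard)."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Hypothetical training data: [dose ratio, count of non-standard drugs].
X = [[1.0, 0.0], [1.1, 0.0], [0.9, 1.0], [3.0, 4.0], [2.8, 5.0], [3.5, 3.0]]
y = [-1, -1, -1, 1, 1, 1]

w, b = train_linear_svm(X, y)
print([predict(w, b, x) for x in X])
```

A production system would more likely rely on an established library implementation of an SVM or random forest; the hand-written loop is used here only so that the parameter updates are visible.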
Step 4: When a personal injury claim case occurs, the reimbursement documents of the personal injury claim are recognized by OCR technology and then input into the medical claim audit model to judge whether the personal injury claim case involves fraud.
To further verify the practical usability of the medical claim audit model, the present invention inputs the expense reports and real treatment records of personal injury claim cases that occurred in practice, as test data, into the trained medical claim audit model. The model's computation outputs a medication-fraud result for the case. Insurance personnel are also assigned to carry out an actual investigation and produce an investigation report. The model's judgment is compared with the manual investigation report to finally determine whether the case involves fraud, and ultimately to demonstrate the usability and efficiency of the model.
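The comparison between the model's judgments and the manual investigation reports can be summarized with a few standard agreement figures. The sketch below assumes each case is reduced to a boolean fraud flag from each source; the function name and the sample flags are illustrative only and do not come from the patent.

```python
def agreement_report(model_flags, investigator_flags):
    """Compare model fraud flags against manual investigation outcomes."""
    if len(model_flags) != len(investigator_flags):
        raise ValueError("both lists must cover the same cases")
    if not model_flags:
        raise ValueError("at least one case is required")
    pairs = list(zip(model_flags, investigator_flags))
    tp = sum(m and h for m, h in pairs)          # both say fraud
    fp = sum(m and not h for m, h in pairs)      # model only
    fn = sum(not m and h for m, h in pairs)      # investigator only
    tn = sum(not m and not h for m, h in pairs)  # both say no fraud
    return {
        "agreement": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else None,
        "recall": tp / (tp + fn) if tp + fn else None,
    }


# Hypothetical results for five cases.
model_flags = [True, True, False, False, True]
investigator_flags = [True, False, False, False, True]
report = agreement_report(model_flags, investigator_flags)
print(report["agreement"], report["recall"])
# → 0.8 1.0
```

Here the manual investigation is treated as the reference outcome, which matches the text's use of the investigation report to make the final determination.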
Optionally, in other embodiments, the medical claim audit program may also be divided into one or more modules, which are stored in the memory 11 and executed by one or more processors (the processor 12 in this embodiment) to carry out the present invention. A module, as referred to in the present invention, is a series of computer program instruction segments capable of completing a specific function, used to describe the execution process of the medical claim audit program in the machine-learning-based medical claim audit device 1.
For example, Fig. 3 is a schematic diagram of the program modules of the medical claim audit program 01 in an embodiment of the machine-learning-based medical claim audit device of the present invention. In this embodiment, the medical claim audit program 01 may be divided into a database establishment module 10, a model generation module 20, and an audit module 30. Illustratively:
The database establishment module 10 is used to: search medical literature databases for the medical literature of each medical specialty, establish a medical literature collection database, perform text mining on the medical literature using natural language processing technology, extract the medication standards for different conditions, and store the medication standards in a medical treatment plan library.
The database establishment module 10 is also used to: obtain from the hospital system the medical diagnosis information texts issued by doctors for various diseases, recognize and process the medical diagnosis information texts using optical character recognition and natural language processing technology, capture the medication information in the diagnosis information texts, and add it to the medical treatment plan library.
The model generation module 20 is used to: establish a medical claim audit model and, using the data in the medical treatment plan library as training data, train the parameters of the medical claim audit model by a machine learning method to generate the medical claim audit model.
The audit module 30 is used to: when a personal injury claim case occurs, recognize the reimbursement documents of the personal injury claim by optical character recognition technology, input them into the medical claim audit model, and judge whether the personal injury claim case involves fraud.
The functions or operation steps implemented when the above program modules, such as the database establishment module 10, the model generation module 20, and the audit module 30, are executed are substantially the same as those of the above embodiments and are not repeated here.
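The division into three modules can be pictured as three small classes wired together. The sketch below is only a structural illustration: the class and method names are assumptions, and `toy_train` and `toy_ocr` are trivial stand-ins (a set lookup and a string split), not a machine learning model or an OCR engine.

```python
class DatabaseModule:
    """Module 10: builds and extends the medical treatment plan library."""

    def __init__(self):
        self.plan_library = []

    def add_entries(self, entries):
        self.plan_library.extend(entries)


class ModelGenerationModule:
    """Module 20: turns the library into an audit model via a training function."""

    def __init__(self, train_fn):
        self.train_fn = train_fn

    def generate(self, plan_library):
        return self.train_fn(plan_library)


class AuditModule:
    """Module 30: recognizes a reimbursement document, then queries the model."""

    def __init__(self, ocr_fn, model):
        self.ocr_fn = ocr_fn
        self.model = model

    def audit(self, document):
        return self.model(self.ocr_fn(document))


def toy_train(plan_library):
    # Stand-in "model": flags a claim that lists a drug absent from the library.
    known = {entry["drug"] for entry in plan_library}
    return lambda drugs: any(drug not in known for drug in drugs)


def toy_ocr(document):
    # Stand-in "OCR": pretends the document is already comma-separated text.
    return [name.strip().lower() for name in document.split(",")]


db = DatabaseModule()
db.add_entries([{"drug": "amoxicillin"}])   # from literature mining
db.add_entries([{"drug": "ibuprofen"}])     # from hospital diagnosis records
model = ModelGenerationModule(toy_train).generate(db.plan_library)
auditor = AuditModule(toy_ocr, model)
print(auditor.audit("Amoxicillin, Ibuprofen"))  # → False
print(auditor.audit("Amoxicillin, Oxycodone"))  # → True
```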
In addition, an embodiment of the present invention also proposes a computer-readable storage medium on which a medical claim audit program is stored. The medical claim audit program can be executed by one or more processors to implement the following operations:
Search medical literature databases for the medical literature of each medical specialty, establish a medical literature collection database, perform text mining on the medical literature using natural language processing technology, extract the medication standards for different conditions, and store the medication standards in a medical treatment plan library.
Obtain from the hospital system the medical diagnosis information texts issued by doctors for various diseases, recognize and process the medical diagnosis information texts using optical character recognition and natural language processing technology, capture the medication information in the diagnosis information texts, and add it to the medical treatment plan library.
Establish a medical claim audit model and, using the data in the medical treatment plan library as training data, train the parameters of the medical claim audit model by a machine learning method to generate the medical claim audit model.
When a personal injury claim case occurs, recognize the reimbursement documents of the personal injury claim by optical character recognition technology, input them into the medical claim audit model, and judge whether the personal injury claim case involves fraud.
The specific embodiments of the computer-readable storage medium of the present invention are substantially the same as the embodiments of the machine-learning-based medical claim audit device and method described above and are not described again here.
It should be noted that the serial numbers of the above embodiments of the present invention are for description only and do not represent the relative merits of the embodiments. The terms "include", "comprise", or any other variant thereof herein are intended to cover non-exclusive inclusion, so that a process, device, article, or method that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, device, article, or method. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, device, article, or method that includes the element.
Through the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as ROM/RAM, a magnetic disk, or an optical disc) and includes instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to execute the methods described in the embodiments of the present invention.
The above are only preferred embodiments of the present invention and do not thereby limit the patent scope of the present invention. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, or applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of the present invention.

Claims (10)

1. A machine-learning-based medical claim audit method, characterized in that the method comprises:
Searching medical literature databases for the medical literature of each medical specialty, establishing a medical literature collection database, performing text mining on the medical literature using natural language processing technology, extracting the medication standards for different conditions, and storing the medication standards in a medical treatment plan library;
Obtaining from a hospital system the medical diagnosis information texts issued by doctors for various diseases, recognizing and processing the medical diagnosis information texts using optical character recognition and natural language processing technology, capturing the medication information in the diagnosis information texts, and adding it to the medical treatment plan library;
Establishing a medical claim audit model, using the data in the medical treatment plan library as training data, training the parameters of the medical claim audit model by a machine learning method, and generating the medical claim audit model; and
When a personal injury claim case occurs, recognizing the reimbursement documents of the personal injury claim by optical character recognition technology, inputting them into the medical claim audit model, and judging whether the personal injury claim case involves fraud.
2. The machine-learning-based medical claim audit method according to claim 1, characterized in that performing text mining on the medical literature using natural language processing technology and extracting the medication standards for different conditions comprises:
Constructing a drug corpus from drug knowledge;
Using natural language processing technology, based on the drug corpus, performing lexical analysis, dependency parsing, and opinion extraction on the medical literature to extract the medication standards for different conditions.
3. The machine-learning-based medical claim audit method according to claim 2, characterized in that the lexical analysis comprises performing word segmentation, part-of-speech tagging, and named entity recognition operations on the medical literature; wherein the word segmentation comprises:
(1) Working through the text of the medical literature from left to right, selecting a candidate fragment of maximum length from the start of a character string and matching it against the drug corpus to judge whether the fragment is in the drug corpus; if it is, taking the fragment as one segmented word; if it is not, removing one character from the right end and judging whether the shortened fragment is in the drug corpus, repeating in turn until only a single character remains;
(2) Segmenting the remaining part of the text of the medical literature in the same way as in step (1), until the entire text has been segmented.
4. The machine-learning-based medical claim audit method according to any one of claims 1 to 3, characterized in that establishing a medical claim audit model, using the data in the medical treatment plan library as training data, training the parameters of the medical claim audit model by a machine learning method, and generating the medical claim audit model comprises:
Using the medication standards in the medical treatment plan library as training data, inputting them into the established medical claim audit model, training the parameters of the medical claim audit model through iterative calculation over the data, and obtaining the medical claim audit model through continuous adjustment of the parameters.
5. The machine-learning-based medical claim audit method according to claim 4, characterized in that the medical claim audit model is a support vector machine or a random forest model.
6. A machine-learning-based medical claim audit device, characterized in that the device comprises a memory and a processor, the memory storing a machine-learning-based medical claim audit program that can run on the processor, and the machine-learning-based medical claim audit program, when executed by the processor, implementing the following steps:
Searching medical literature databases for the medical literature of each medical specialty, establishing a medical literature collection database, performing text mining on the medical literature using natural language processing technology, extracting the medication standards for different conditions, and storing the medication standards in a medical treatment plan library;
Obtaining from a hospital system the medical diagnosis information texts issued by doctors for various diseases, recognizing and processing the medical diagnosis information texts using optical character recognition and natural language processing technology, capturing the medication information in the diagnosis information texts, and adding it to the medical treatment plan library;
Establishing a medical claim audit model, using the data in the medical treatment plan library as training data, training the parameters of the medical claim audit model by a machine learning method, and generating the medical claim audit model; and
When a personal injury claim case occurs, recognizing the reimbursement documents of the personal injury claim by optical character recognition technology, inputting them into the medical claim audit model, and judging whether the personal injury claim case involves fraud.
7. The machine-learning-based medical claim audit device according to claim 6, characterized in that performing text mining on the medical literature using natural language processing technology and extracting the medication standards for different conditions comprises:
Constructing a drug corpus from drug knowledge;
Using natural language processing technology, based on the drug corpus, performing lexical analysis, dependency parsing, and opinion extraction on the medical literature to extract the medication standards for different conditions.
8. The machine-learning-based medical claim audit device according to claim 7, characterized in that the lexical analysis comprises performing word segmentation, part-of-speech tagging, and named entity recognition operations on the medical literature; wherein the word segmentation comprises:
(1) Working through the text of the medical literature from left to right, selecting a candidate fragment of maximum length from the start of a character string and matching it against the drug corpus to judge whether the fragment is in the drug corpus; if it is, taking the fragment as one segmented word; if it is not, removing one character from the right end and judging whether the shortened fragment is in the drug corpus, repeating in turn until only a single character remains;
(2) Segmenting the remaining part of the text of the medical literature in the same way as in step (1), until the entire text has been segmented.
9. The machine-learning-based medical claim audit device according to any one of claims 6 to 8, characterized in that establishing a medical claim audit model, using the data in the medical treatment plan library as training data, training the parameters of the medical claim audit model by a machine learning method, and generating the medical claim audit model comprises:
Using the medication standards in the medical treatment plan library as training data, inputting them into the established medical claim audit model, training the parameters of the medical claim audit model through iterative calculation over the data, and obtaining the medical claim audit model through continuous adjustment of the parameters.
10. A computer-readable storage medium, characterized in that a machine-learning-based medical claim audit program is stored on the computer-readable storage medium, and the machine-learning-based medical claim audit program can be executed by one or more processors to implement the steps of the machine-learning-based medical claim audit method according to any one of claims 1 to 5.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910294783.1A CN110119991A (en) 2019-04-12 2019-04-12 Checking method, device and storage medium are compensated in medical treatment based on machine learning

Publications (1)

Publication Number Publication Date
CN110119991A true CN110119991A (en) 2019-08-13

Family

ID=67521019

Country Status (1)

Country Link
CN (1) CN110119991A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination