CN110032715A - Method for disease code conversion
- Publication number: CN110032715A
- Application number: CN201910215224.7A
- Authority: CN (China)
- Prior art keywords: disease code, disease, code, test set, conversion
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/151—Transformation
- G06F40/16—Automatic learning of transformation rules, e.g. from examples
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Data Mining & Analysis (AREA)
- Medical Informatics (AREA)
- General Physics & Mathematics (AREA)
- Public Health (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Databases & Information Systems (AREA)
- Primary Health Care (AREA)
- Epidemiology (AREA)
- Pathology (AREA)
- Biomedical Technology (AREA)
- Medical Treatment And Welfare Office Work (AREA)
Abstract
The present invention relates to a method for disease code conversion, comprising the following steps: S01: collecting the standard disease codes and standard diagnosis descriptions of each coding version, and establishing a standard dictionary library; S02: establishing a test set from the disease codes to be converted and their diagnosis descriptions; S03: forming term vectors from the standard dictionary library and the test set; S04: extracting the first N characters of the disease code to be converted to obtain preliminarily selected disease codes; S05: calculating similarity values over the term vectors, and obtaining the preliminarily selected disease code of the specific version that corresponds to the maximum similarity; S06: verifying, according to clinical rules, the mapping between the obtained preliminarily selected disease code of the specific version and the disease code to be converted, and determining the converted disease code. The beneficial effects of the present invention are that the accuracy of the converted disease code is ensured and conversion between disease codes of different versions is achieved.
Description
Technical field
The present invention relates to the fields of medicine and computer application technology, and in particular to a method for disease code conversion.
Background art
The International Statistical Classification of Diseases and Related Health Problems (International Classification of Diseases, ICD) is the internationally unified disease classification method formulated by the WHO (World Health Organization). It classifies diseases according to characteristics such as etiology, pathology, clinical manifestation and anatomical site, arranges them into an ordered system, and represents that system with codes. It is a carrier for recording medical information and the basis for medical data mining, diagnosis-related grouping and performance appraisal, and DRG-based medical insurance payment.
In the practice of domestic medical institutions, different regions have extended the codes in different ways according to local clinical characteristics, and the descriptions of the same disease also differ between versions. For example, "A00.100 Cholera due to Vibrio cholerae O1, biovar El Tor" in the GB-2016 edition of ICD-10 and "A00.101 El Tor biotype cholera" in the BJ-V6.01 edition differ in both code and term description. The resulting inconsistency between versions seriously hinders data interoperability within the industry and the mining and application of medical data.
Summary of the invention
The technical problem to be solved by the present invention is to provide a method for disease code conversion that addresses the above deficiencies of the prior art.
The technical solution of the present invention to the above technical problem is a method for disease code conversion, comprising the following steps:
S01: collecting the standard disease codes and standard diagnosis descriptions of each coding version, establishing a standard dictionary library, and classifying its entries by coding version;
S02: establishing a test set from the disease codes to be converted and their diagnosis descriptions;
S03: forming term vectors from the standard dictionary library and the test set, and establishing a vector space model;
S04: extracting the first N characters of the disease code to be converted, comparing them with the standard disease codes of each version in the standard dictionary library, and obtaining the preliminarily selected disease codes, across multiple versions, whose first N characters are identical;
S05: calculating similarity values over the term vectors, and obtaining the preliminarily selected disease code of the specific version that corresponds to the maximum similarity;
S06: verifying, according to clinical rules, the mapping between the obtained preliminarily selected disease code of the specific version and the disease code to be converted, and determining the converted disease code.
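Steps S01 and S02 can be pictured as building two simple data structures. The following is a minimal sketch only: the `CodeEntry` class and the `build_dictionary_library` helper are hypothetical names chosen for illustration, and the two sample entries are the cholera examples quoted in the background section.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CodeEntry:
    version: str      # coding version, e.g. "GB-2016" or "BJ-V6.01"
    code: str         # disease code, e.g. "A00.100"
    description: str  # diagnosis description text


def build_dictionary_library(entries):
    """Group standard entries by coding version (step S01)."""
    library = defaultdict(list)
    for entry in entries:
        library[entry.version].append(entry)
    return dict(library)


# Step S01: standard codes and descriptions collected for each version.
standard_entries = [
    CodeEntry("GB-2016", "A00.100", "Cholera due to Vibrio cholerae O1, biovar El Tor"),
    CodeEntry("BJ-V6.01", "A00.101", "El Tor biotype cholera"),
]
library = build_dictionary_library(standard_entries)

# Step S02: the test set holds the codes and descriptions to be converted.
test_set = [CodeEntry("BJ-V6.01", "A00.101", "El Tor biotype cholera")]
```

Keeping the library keyed by version makes the later per-version comparison in step S04 a plain dictionary lookup.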
The beneficial effects of the present invention are as follows: term vectors are formed and a vector space model is established from the standard dictionary library and the test set; similarity values are then calculated to obtain the preliminarily selected disease code of the specific version that corresponds to the maximum similarity, which preliminarily determines the converted disease code; the mapping is then verified according to clinical rules, which ensures the accuracy of the converted disease code and achieves conversion between disease codes of different versions.
Based on the above technical solution, the present invention can also be improved as follows.
Further, the standard diagnosis descriptions include standard surgery and operation descriptions.
Further, the test set includes a disease code test set and a diagnosis text test set, wherein the disease code test set corresponds to the disease codes to be converted, and the diagnosis text test set corresponds to the diagnosis descriptions.
Further, step S03 specifically comprises the following steps:
S03.1: preprocessing the standard dictionary library according to medical rules, segmenting the preprocessed data into words according to Chinese part-of-speech rules, and removing stop words and repeated words to generate the standard dictionary library word bag;
S03.2: preprocessing the test set according to medical rules, segmenting the preprocessed data into words according to Chinese part-of-speech rules, removing stop words and repeated words, and unifying any synonyms that appear according to a preconfigured thesaurus to generate the test library word bag;
S03.3: taking the non-repeated words contained in the standard dictionary library word bag and the test library word bag as the term word bag;
S03.4: forming term vectors from the term word bag, and establishing the vector space model.
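Steps S03.1 to S03.3 can be sketched roughly as follows. This is an illustration under loud assumptions: the patent segments Chinese text by part-of-speech rules, whereas the sketch splits English text on whitespace; the stop-word list, the synonym table and the function names are all hypothetical.

```python
STOP_WORDS = {"of", "the", "due", "to", "caused", "by"}  # hypothetical stop-word list
SYNONYMS = {"biovar": "biotype"}                         # hypothetical synonym table


def word_bag(description, synonyms=None):
    """Tokenize, optionally unify synonyms, and drop stop words and repeats."""
    synonyms = synonyms or {}
    bag = []
    for token in description.lower().replace(",", " ").split():
        token = synonyms.get(token, token)
        if token not in STOP_WORDS and token not in bag:
            bag.append(token)
    return bag


def build_vocabulary(*bags):
    """Collect the non-repeated words of all bags (S03.3), in first-seen order."""
    vocabulary = []
    for bag in bags:
        for word in bag:
            if word not in vocabulary:
                vocabulary.append(word)
    return vocabulary


# S03.1: standard dictionary entry, no synonym unification.
standard_bag = word_bag("Cholera due to Vibrio cholerae O1, El Tor biotype")
# S03.2: test-set entry, synonyms unified via the thesaurus.
test_bag = word_bag("El Tor biovar cholera", synonyms=SYNONYMS)
# S03.3: the shared term word bag.
vocabulary = build_vocabulary(standard_bag, test_bag)
```

Synonym unification is applied only to the test-set text here, mirroring the difference between S03.1 and S03.2.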
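Further, the formula for calculating the similarity value is
$$\mathrm{sim}(\vec{A}_i, \vec{B}_j) = \cos\theta = \frac{\vec{A}_i \cdot \vec{B}_j}{\lVert \vec{A}_i \rVert \, \lVert \vec{B}_j \rVert}$$
wherein $\vec{A}_i$ denotes the term vector of the i-th standard dictionary term and $\vec{B}_j$ denotes the term vector of the j-th test set term.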
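Assuming the similarity value is the cosine similarity that this document names elsewhere, the computation for one pair of term vectors can be sketched as below. The function name and the zero-vector convention (returning 0.0) are illustrative choices, not taken from the patent.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumed convention: an empty term matches nothing
    return dot / (norm_a * norm_b)


# Two 0/1 term vectors over the same vocabulary.
score = cosine_similarity([1, 1, 0, 0], [1, 0, 0, 0])
```

For 0/1 vectors the score lies between 0 (no shared words) and 1 (identical word sets).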
The beneficial effect of the above further solution is that algorithms such as cosine similarity enable automatic conversion between different ICD (International Classification of Diseases) coding versions, greatly improving the efficiency and accuracy of code conversion.
Further, the clinical rules include body-site rules, etiology rules and surgical-approach rules.
The beneficial effect of the above further solution is that it improves the accuracy of verifying the mapping between the obtained preliminarily selected disease code of the specific version and the disease code to be converted.
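The patent does not spell out how the clinical rules are applied. One possible reading, sketched here purely as an assumption, is that each code carries body-site, etiology and surgical-approach attributes, and a mapping is rejected when any attribute present on both sides conflicts. The field names and sample values are hypothetical.

```python
# Hypothetical attribute names for the body-site, etiology and
# surgical-approach rules.
RULE_FIELDS = ("site", "etiology", "procedure")


def verify_mapping(source_attrs, candidate_attrs):
    """Accept the mapping only if no rule field conflicts.

    A field missing on either side is treated as not applicable.
    """
    for field in RULE_FIELDS:
        source_value = source_attrs.get(field)
        candidate_value = candidate_attrs.get(field)
        if source_value is None or candidate_value is None:
            continue
        if source_value != candidate_value:
            return False
    return True


source = {"site": "intestine", "etiology": "Vibrio cholerae O1, El Tor"}
same = {"site": "intestine", "etiology": "Vibrio cholerae O1, El Tor"}
other = {"site": "intestine", "etiology": "Vibrio cholerae O1, classical"}

accepted = verify_mapping(source, same)
rejected = verify_mapping(source, other)
```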
Further, in step S04, N is a natural number greater than or equal to 3, and the N characters counted include the decimal point of the disease code.
The beneficial effect of the above further solution is that it improves the matching degree and the matching accuracy.
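The prefix comparison of step S04 can be sketched as a simple string filter. This is a minimal sketch: the function name is hypothetical, and apart from A00.100 and A00.101 the codes in the sample data are placeholders made up for illustration.

```python
def select_candidates(source_code, standard_codes_by_version, n=3):
    """Return, per version, the standard codes sharing the first n characters.

    The decimal point counts as one of the n characters.
    """
    if n < 3:
        raise ValueError("N must be a natural number >= 3")
    prefix = source_code[:n]
    return {
        version: [code for code in codes if code.startswith(prefix)]
        for version, codes in standard_codes_by_version.items()
    }


standard_codes = {
    "GB-2016": ["A00.000", "A00.100", "A01.000"],  # placeholder codes
    "BJ-V6.01": ["A00.101", "A01.001"],            # placeholder codes
}

# With n=5 the prefix is "A00.1" (three characters, the point, one digit).
candidates = select_candidates("A00.101", standard_codes, n=5)
```

A larger N narrows the candidate list before the more expensive similarity calculation runs.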
Further, after the converted disease code is determined, the method further comprises:
sending the converted disease code to a medical expert terminal for review.
The beneficial effect of the above further solution is that it optimizes the code conversion result.
Brief description of the drawings
Fig. 1 is a flow chart of the method for disease code conversion of the present invention.
Detailed description of the embodiments
The principles and features of the present invention are described below with reference to the accompanying drawing. The examples given serve only to explain the present invention and are not intended to limit its scope.
As shown in Fig. 1, a method for disease code conversion comprises the following steps:
S01: collecting the standard disease codes and standard diagnosis descriptions of each coding version, establishing a standard dictionary library, and classifying its entries by coding version;
S02: establishing a test set from the disease codes to be converted and their diagnosis descriptions;
S03: forming term vectors from the standard dictionary library and the test set, and establishing a vector space model;
S04: extracting the first N characters of the disease code to be converted, comparing them with the standard disease codes of each version in the standard dictionary library, and obtaining the preliminarily selected disease codes, across multiple versions, whose first N characters are identical;
S05: calculating similarity values over the term vectors, and obtaining the preliminarily selected disease code of the specific version that corresponds to the maximum similarity;
S06: verifying, according to clinical rules, the mapping between the obtained preliminarily selected disease code of the specific version and the disease code to be converted, and determining the converted disease code.
The clinical rules include body-site rules, etiology rules and surgical-approach rules.
Preferably, in step S01, the standard diagnosis description includes standard surgery and operation descriptions, and is the textual description of the principal diagnosis written by the physician for the patient.
In step S02, the test set includes a disease code test set and a diagnosis text test set, wherein the disease code test set corresponds to the disease codes to be converted, and the diagnosis text test set corresponds to the diagnosis descriptions.
Step S03 specifically comprises the following steps:
S03.1: preprocessing the standard dictionary library according to medical rules, segmenting the preprocessed data into words according to Chinese part-of-speech rules, and removing stop words and repeated words to generate the standard dictionary library word bag;
S03.2: preprocessing the test set according to medical rules, segmenting the preprocessed data into words according to Chinese part-of-speech rules, removing stop words and repeated words, and unifying any synonyms that appear according to a preconfigured thesaurus to generate the test library word bag;
S03.3: taking the non-repeated words contained in the standard dictionary library word bag and the test library word bag as the term word bag;
wherein the term word bag contains a plurality of standard dictionary library terms and a plurality of test terms;
S03.4: forming term vectors from the term word bag, and establishing the vector space model.
In step S04, N is a natural number greater than or equal to 3, and the N characters counted include the decimal point of the disease code.
Each standard dictionary library term corresponds to a standard dictionary library term vector, and each test term corresponds to a test term vector.
The term vectors are formed by one-hot encoding: each standard dictionary library term and each test term is encoded into its corresponding standard dictionary library term vector or test term vector, thereby establishing the vector space model.
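The encoding step can be illustrated with a small sketch. It assumes that each term is represented by a 0/1 vector with one position per word of the term word bag, which is one common way to apply one-hot encoding to multi-word terms; the function name and the sample vocabulary are hypothetical.

```python
def encode_term(bag, vocabulary):
    """Return a 0/1 vector with one position per vocabulary word."""
    words = set(bag)
    return [1 if word in words else 0 for word in vocabulary]


# Hypothetical term word bag shared by both sides.
vocabulary = ["cholera", "vibrio", "el", "tor", "biotype"]

standard_vector = encode_term(["cholera", "vibrio", "el", "tor", "biotype"], vocabulary)
test_vector = encode_term(["el", "tor", "biotype", "cholera"], vocabulary)
```

Because both vectors are built over the same vocabulary, they have equal length and can be compared position by position.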
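Preferably, in step S05, the formula for calculating the similarity value is
$$\mathrm{sim}(\vec{A}_i, \vec{B}_j) = \cos\theta = \frac{\vec{A}_i \cdot \vec{B}_j}{\lVert \vec{A}_i \rVert \, \lVert \vec{B}_j \rVert}$$
wherein $\vec{A}_i$ denotes the term vector of the i-th standard dictionary term and $\vec{B}_j$ denotes the term vector of the j-th test set term.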
The present invention innovatively applies natural language processing (NLP) technology to ICD code conversion. It uses one-hot encoding to construct a text vector space model and combines it with algorithms such as cosine similarity to convert between different coding versions, which improves the efficiency of code conversion and lays a foundation for medical data applications (such as medical research and disease cost-control management).
Specifically, a converter is constructed from the conversion rules configured by domain experts and the similarity algorithm. When a newly arrived textual diagnosis needs code conversion, the converter outputs the target-version disease code of the term to be converted, which achieves one-click transcoding that is simple, convenient and highly accurate.
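A converter of this kind might be sketched as follows. This is an illustrative sketch only, not the patented implementation: it chains the prefix filter and a cosine similarity over 0/1 word vectors, omits the clinical-rule verification of step S06, tokenizes English text on whitespace, and uses a placeholder entry "A00.000" that is made up for the example.

```python
import math


def tokens(text):
    """Lower-case word set of a description (whitespace tokenization assumed)."""
    return set(text.lower().replace(",", " ").split())


def similarity(a, b):
    """Cosine similarity of the 0/1 vectors of two word sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def convert(code, description, target_entries, n=3):
    """Return the best-matching target-version code, or None.

    target_entries: list of (code, description) pairs of the target version.
    """
    prefix = code[:n]
    candidates = [(c, d) for c, d in target_entries if c.startswith(prefix)]
    if not candidates:
        return None
    source = tokens(description)
    best_code, _ = max(candidates, key=lambda item: similarity(source, tokens(item[1])))
    return best_code


target = [
    ("A00.000", "Cholera due to Vibrio cholerae O1, biovar cholerae"),  # placeholder
    ("A00.100", "Cholera due to Vibrio cholerae O1, biovar El Tor"),
]
result = convert("A00.101", "El Tor biotype cholera", target)
```

For 0/1 vectors the dot product is the number of shared words and each norm is the square root of the word count, which is why the set-based formula equals the cosine similarity.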
Preferably, after the converted disease code is determined, the method further comprises:
sending the converted disease code to a medical expert terminal for review, to optimize the code conversion result.
Specifically, the converted disease codes are sent to the medical expert terminal for review; data with obvious problems are corrected, and steps S03 to S06 above are then repeated, so that the conversion result is continuously optimized and working accuracy is improved.
The foregoing are merely preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (8)
1. A method for disease code conversion, characterized by comprising the following steps:
S01: collecting the standard disease codes and standard diagnosis descriptions of each coding version, establishing a standard dictionary library, and classifying its entries by coding version;
S02: establishing a test set from the disease codes to be converted and their diagnosis descriptions;
S03: forming term vectors from the standard dictionary library and the test set, and establishing a vector space model;
S04: extracting the first N characters of the disease code to be converted, comparing them with the standard disease codes of each version in the standard dictionary library, and obtaining the preliminarily selected disease codes, across multiple versions, whose first N characters are identical;
S05: calculating similarity values over the term vectors, and obtaining the preliminarily selected disease code of the specific version that corresponds to the maximum similarity;
S06: verifying, according to clinical rules, the mapping between the obtained preliminarily selected disease code of the specific version and the disease code to be converted, and determining the converted disease code.
2. The method for disease code conversion according to claim 1, characterized in that the standard diagnosis descriptions include standard surgery and operation descriptions.
3. The method for disease code conversion according to claim 1, characterized in that the test set includes a disease code test set and a diagnosis text test set, wherein the disease code test set corresponds to the disease codes to be converted, and the diagnosis text test set corresponds to the diagnosis descriptions.
4. The method for disease code conversion according to claim 1, characterized in that step S03 specifically comprises the following steps:
S03.1: preprocessing the standard dictionary library according to medical rules, segmenting the preprocessed data into words according to Chinese part-of-speech rules, and removing stop words and repeated words to generate the standard dictionary library word bag;
S03.2: preprocessing the test set according to medical rules, segmenting the preprocessed data into words according to Chinese part-of-speech rules, removing stop words and repeated words, and unifying any synonyms that appear according to a preconfigured thesaurus to generate the test library word bag;
S03.3: taking the non-repeated words contained in the standard dictionary library word bag and the test library word bag as the term word bag;
S03.4: forming term vectors from the term word bag, and establishing the vector space model.
5. The method for disease code conversion according to claim 4, characterized in that the formula for calculating the similarity value is
$$\mathrm{sim}(\vec{A}_i, \vec{B}_j) = \cos\theta = \frac{\vec{A}_i \cdot \vec{B}_j}{\lVert \vec{A}_i \rVert \, \lVert \vec{B}_j \rVert}$$
wherein $\vec{A}_i$ denotes the term vector of the i-th standard dictionary term and $\vec{B}_j$ denotes the term vector of the j-th test set term.
6. The method for disease code conversion according to claim 1, characterized in that the clinical rules include body-site rules, etiology rules and surgical-approach rules.
7. The method for disease code conversion according to claim 1, characterized in that in step S04, N is a natural number greater than or equal to 3, and the N characters counted include the decimal point of the disease code.
8. The method for disease code conversion according to claim 1, characterized in that after the converted disease code is determined, the method further comprises:
sending the converted disease code to a medical expert terminal for review.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910215224.7A CN110032715A (en) | 2019-03-21 | 2019-03-21 | A kind of method of disease code conversion |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110032715A (en) | 2019-07-19 |
Family ID: 67236346
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910215224.7A Pending CN110032715A (en) | 2019-03-21 | 2019-03-21 | A kind of method of disease code conversion |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110032715A (en) |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106844308A (en) * | 2017-01-20 | 2017-06-13 | 天津艾登科技有限公司 | A kind of use semantics recognition carries out the method for automating disease code conversion |
CN108182207A (en) * | 2017-12-15 | 2018-06-19 | 上海长江科技发展有限公司 | The intelligent coding method and system of Chinese surgical procedure based on participle network |
CN108446260A (en) * | 2018-02-06 | 2018-08-24 | 天津艾登科技有限公司 | The method and system of automation disease code conversion are carried out based on semantic approximate match algorithm |
Non-Patent Citations (1)
Title |
---|
鲍庆升等: "基于文本分析的自动化疾病编码方法", 《计算机系统应用》 * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110827929A (en) * | 2019-11-05 | 2020-02-21 | 中山大学 | Disease classification code recognition method and device, computer equipment and storage medium |
CN110827929B (en) * | 2019-11-05 | 2022-06-07 | 中山大学 | Disease classification code recognition method and device, computer equipment and storage medium |
CN112233794A (en) * | 2020-10-20 | 2021-01-15 | 吾征智能技术(北京)有限公司 | Disease information matching system based on hematuria information |
CN112632910A (en) * | 2020-12-21 | 2021-04-09 | 北京惠及智医科技有限公司 | Operation encoding method, electronic device and storage device |
CN113705166A (en) * | 2021-07-28 | 2021-11-26 | 浙江太美医疗科技股份有限公司 | Method and device for encoding medical events |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20190719 |