CN110032715A - A kind of method of disease code conversion - Google Patents

Publication number
CN110032715A
Authority
CN
China
Prior art keywords
disease
test set
standard
disease code
rules
Prior art date
Legal status
Pending
Application number
CN201910215224.7A
Other languages
Chinese (zh)
Inventor
孙闯
火立龙
Current Assignee
Wuhan Kindo Medical Data Technology Co Ltd
Original Assignee
Wuhan Kindo Medical Data Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Wuhan Kindo Medical Data Technology Co Ltd filed Critical Wuhan Kindo Medical Data Technology Co Ltd
Priority to CN201910215224.7A priority Critical patent/CN110032715A/en
Publication of CN110032715A publication Critical patent/CN110032715A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/151 Transformation
    • G06F40/16 Automatic learning of transformation rules, e.g. from examples
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Medical Informatics (AREA)
  • General Physics & Mathematics (AREA)
  • Public Health (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Primary Health Care (AREA)
  • Epidemiology (AREA)
  • Pathology (AREA)
  • Biomedical Technology (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention relates to a method of disease code conversion comprising the following steps. S01: collect the coding versions corresponding to standard disease codes and standard diagnosis descriptions, and establish a standard dictionary library. S02: establish a test set from the disease codes to be converted and their diagnosis descriptions. S03: form term vectors from the standard dictionary library and the test set. S04: extract the first N characters of the disease code to be converted and obtain preliminary candidate disease codes. S05: calculate similarity values over the term vectors and obtain the preliminary candidate disease code of the specific version that corresponds to the maximum similarity value. S06: verify, according to clinical rules, the mapping between the obtained candidate disease code of the specific version and the disease code to be converted, and determine the converted disease code. The beneficial effects of the invention are that the accuracy of the converted disease codes is ensured and conversion between disease codes of different versions is achieved.

Description

Method of disease code conversion
Technical Field
The invention relates to the technical fields of medicine and computer applications, and in particular to a method of disease code conversion.
Background
The International Statistical Classification of Diseases and Related Health Problems (ICD) is an internationally unified disease classification system established by the World Health Organization (WHO). It groups diseases according to characteristics such as etiology, pathology, clinical manifestation and anatomical location, arranges them into an ordered system, and represents them with codes. ICD codes are a carrier for recording medical information and a basis for medical data mining, disease diagnosis grouping and performance evaluation, and DRG-based medical insurance charging and payment.
In practice, domestic medical institutions in different regions have extended the codes in different ways according to local clinical characteristics, and the descriptions of the same disease also differ between versions. For example, the GB-2016 ICD-10 edition lists "A00.100 cholera due to Vibrio cholerae O1, biotype El Tor", while the BJ-V6.01 edition lists "A00.101 biotype El Tor"; the two differ in both code and description. The resulting inconsistency between versions seriously hinders data interoperability and medical data mining in the industry.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a method of disease code conversion that addresses the above shortcomings of the prior art.
The technical solution to this problem is as follows: a method of disease code conversion comprising the following steps:
S01: collect the coding versions corresponding to standard disease codes and standard diagnosis descriptions, establish a standard dictionary library, and classify its entries by coding version;
S02: establish a test set from the disease codes to be converted and their diagnosis descriptions;
S03: form term vectors from the standard dictionary library and the test set, and establish a vector space model;
S04: extract the first N characters of the disease code to be converted, compare them with the standard disease codes of each version in the standard dictionary library, and obtain the preliminary candidate disease codes, across the versions, whose first N characters match;
S05: calculate similarity values over the term vectors, and obtain the preliminary candidate disease code of the specific version that corresponds to the maximum similarity value;
S06: verify, according to clinical rules, the mapping between the obtained candidate disease code of the specific version and the disease code to be converted, and determine the converted disease code.
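As an illustration only, steps S01 and S02 can be pictured as building two small data structures. The sketch below is a minimal Python example under stated assumptions: the class names `StandardEntry` and `ConversionItem`, the function `build_dictionary`, and the English wording of the descriptions are hypothetical and do not come from the patent; the two codes and version labels are taken from the Background example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class StandardEntry:
    """One row of the standard dictionary library (hypothetical layout)."""
    version: str      # coding version label, e.g. "GB-2016"
    code: str         # standard disease code
    description: str  # standard diagnosis description


@dataclass(frozen=True)
class ConversionItem:
    """One row of the test set: a code to convert plus its description."""
    code: str
    description: str


def build_dictionary(entries):
    """Step S01 (sketch): group standard entries by coding version."""
    library = defaultdict(list)
    for entry in entries:
        library[entry.version].append(entry)
    return dict(library)


# English glosses of the descriptions are illustrative assumptions.
library = build_dictionary([
    StandardEntry("GB-2016", "A00.100",
                  "cholera due to Vibrio cholerae O1, biotype El Tor"),
    StandardEntry("BJ-V6.01", "A00.101", "biotype El Tor"),
])

# Step S02 (sketch): the test set holds the codes that need converting.
test_set = [ConversionItem("A00.101", "biotype El Tor")]
```

Keeping the library keyed by version makes the later per-version comparison in S04 a simple dictionary lookup; other layouts (for example a database table with a version column) would serve equally well.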
The beneficial effects of the invention are as follows: term vectors are formed and a vector space model is established from the standard dictionary library and the test set; similarity values are then calculated to obtain the preliminary candidate disease code of the specific version that corresponds to the maximum similarity, which gives a preliminary converted disease code; finally, the mapping is verified against clinical rules, which ensures the accuracy of the converted disease codes and achieves conversion between disease codes of all versions.
On the basis of the above technical solution, the invention can be further improved as follows.
Further: the standard diagnosis descriptions include standard surgery and procedure descriptions.
Further: the test set comprises a disease code test set and a diagnostic text test set, where the disease code test set corresponds to the disease codes to be converted and the diagnostic text test set corresponds to the diagnosis descriptions.
Further: step S03 specifically comprises the following steps:
S03.1: preprocess the standard dictionary library according to medical rules, segment the preprocessed data into words according to Chinese part-of-speech rules, remove stop words and repeated words, and generate a standard dictionary library word packet;
S03.2: preprocess the test set according to the medical rules, segment the preprocessed data into words according to Chinese part-of-speech rules, remove stop words and repeated words, unify any synonyms that occur according to a preset synonym library, and generate a test library word packet;
S03.3: collect the distinct words of the standard dictionary library word packet and the test library word packet as a term word packet;
S03.4: form term vectors from the term word packet and establish a vector space model.
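The word-packet construction in S03.1 to S03.3 can be sketched as follows. This is a simplified illustration, not the patented implementation: the patent segments Chinese text by part-of-speech rules, whereas the `segment` function here is a stand-in that splits English glosses on punctuation and whitespace. The stop-word list, the synonym library and all function names are assumptions made for the example.

```python
import re

STOP_WORDS = {"of", "the", "due", "to", "and"}       # assumed stop-word list
SYNONYMS = {"eltor": "el-tor", "biovar": "biotype"}  # assumed synonym library


def segment(text):
    """Stand-in segmenter: lowercase, then split on non-word characters."""
    return [tok for tok in re.split(r"[^\w-]+", text.lower()) if tok]


def word_packet(texts, synonyms=None):
    """Segment texts, drop stop words and repeats, optionally unify synonyms."""
    packet = []
    for text in texts:
        for word in segment(text):
            if synonyms:
                word = synonyms.get(word, word)
            if word not in STOP_WORDS and word not in packet:
                packet.append(word)
    return packet


def term_packet(dictionary_packet, test_packet):
    """S03.3 (sketch): distinct words of both packets, first-seen order."""
    return list(dict.fromkeys(dictionary_packet + test_packet))


# S03.1: the dictionary side is not synonym-normalized in the patent text.
dictionary_words = word_packet(
    ["cholera due to Vibrio cholerae O1, biotype El-Tor"])
# S03.2: the test side is normalized against the synonym library.
test_words = word_packet(["Eltor biovar cholera"], synonyms=SYNONYMS)
vocabulary = term_packet(dictionary_words, test_words)
```

Synonym unification is applied only to the test side, following the wording of S03.2; whether the dictionary side needs the same treatment would depend on how the synonym library is curated.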
Further: the similarity value is calculated by a formula (the formula and its symbols are not reproduced in this text) over two vectors: the term vector of the i-th standard dictionary term and the term vector of the j-th test set term.
The beneficial effects of this further scheme are as follows: by using algorithms such as cosine similarity, automatic conversion between different ICD (International Classification of Diseases) coding versions is achieved, and the efficiency and accuracy of code conversion are greatly improved.
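Because the text names cosine similarity as the algorithm, the similarity value plausibly takes the standard cosine form below. This is offered as an assumption, not as the patent's exact formula: the symbols $\mathbf{a}_i$ (term vector of the i-th standard dictionary term), $\mathbf{b}_j$ (term vector of the j-th test set term) and the index $k$ over the term word packet are chosen here for illustration.

```latex
\mathrm{sim}(\mathbf{a}_i, \mathbf{b}_j)
  = \frac{\mathbf{a}_i \cdot \mathbf{b}_j}
         {\lVert \mathbf{a}_i \rVert \, \lVert \mathbf{b}_j \rVert}
  = \frac{\sum_{k} a_{ik}\, b_{jk}}
         {\sqrt{\sum_{k} a_{ik}^{2}}\; \sqrt{\sum_{k} b_{jk}^{2}}}
```

For 0/1 term vectors the value lies between 0 and 1, and it equals 1 when the two terms contain exactly the same words of the term word packet.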
Further: the clinical rules include anatomical location rules, etiology rules, and surgical rules.
The beneficial effects of this further scheme are as follows: the mapping between the obtained candidate disease code of the specific version and the disease code to be converted is verified more accurately.
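The patent does not spell out the content of the clinical rules, so the following is a purely hypothetical sketch of what such a check might look like. It assumes each rule family (location, etiology, surgery) is a keyword set and that a mapping passes when the source and candidate descriptions mention the same keywords from every family. The keyword sets and function names are invented for illustration.

```python
# Hypothetical keyword sets, one per rule family named in the patent.
LOCATION_TERMS = {"lung", "liver", "kidney"}
ETIOLOGY_TERMS = {"viral", "bacterial", "traumatic"}
SURGICAL_TERMS = {"resection", "biopsy", "repair"}


def _same_mentions(source_words, candidate_words, terms):
    """True when both descriptions mention the same keywords from `terms`."""
    return (set(source_words) & terms) == (set(candidate_words) & terms)


def passes_clinical_rules(source_words, candidate_words):
    """Accept a mapping only if every rule family agrees (assumed policy)."""
    families = (LOCATION_TERMS, ETIOLOGY_TERMS, SURGICAL_TERMS)
    return all(_same_mentions(source_words, candidate_words, terms)
               for terms in families)


ok = passes_clinical_rules(["viral", "pneumonia", "lung"],
                           ["lung", "viral", "pneumonia"])
rejected = passes_clinical_rules(["viral", "hepatitis", "liver"],
                                 ["bacterial", "hepatitis", "liver"])
```

In this sketch a high lexical similarity cannot by itself approve a mapping whose etiology differs, which is the kind of safeguard the verification step appears intended to provide.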
Further: in step S04, N is a natural number greater than or equal to 3, and the count of N characters includes the decimal point of the disease code.
The beneficial effects of this further scheme are as follows: the degree and accuracy of matching are improved.
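The prefix comparison of step S04 can be sketched in a few lines. In this minimal example the function name `prefix_candidates` and the layout of `library` (a mapping from version label to a list of codes) are assumptions; the codes other than A00.100 and A00.101 are made up for the demonstration. Plain string slicing counts the decimal point as one of the N characters, as the text requires.

```python
def prefix_candidates(source_code, library, n=3):
    """S04 (sketch): per version, keep codes sharing the first n characters.

    `library` maps a version label to a list of standard codes. The slice
    counts the decimal point as a character, so n=5 on "A00.101" is "A00.1".
    """
    if n < 3:
        raise ValueError("N must be a natural number >= 3")
    prefix = source_code[:n]
    return {
        version: [code for code in codes if code[:n] == prefix]
        for version, codes in library.items()
    }


library = {
    "GB-2016": ["A00.000", "A00.100", "A01.000"],
    "BJ-V6.01": ["A00.101", "A00.900"],
}
narrow = prefix_candidates("A00.101", library, n=5)
broad = prefix_candidates("A00.101", library, n=3)
```

A larger N narrows the candidate set and reduces the work left for the similarity step, at the cost of missing targets whose extension digits were assigned differently in another version.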
Further: after the converted disease code is determined, the method further comprises
sending the converted disease code to a medical expert terminal for review.
The beneficial effects of this further scheme are as follows: the code conversion results are optimized.
Drawings
FIG. 1 is a flow chart of a method of disease transcoding in accordance with the present invention.
Detailed Description
The principles and features of the invention are described below with reference to the accompanying drawing. The examples given serve only to explain the invention and are not intended to limit its scope.
As shown in FIG. 1, a method of disease code conversion comprises the following steps:
S01: collect the coding versions corresponding to standard disease codes and standard diagnosis descriptions, establish a standard dictionary library, and classify its entries by coding version;
S02: establish a test set from the disease codes to be converted and their diagnosis descriptions;
S03: form term vectors from the standard dictionary library and the test set, and establish a vector space model;
S04: extract the first N characters of the disease code to be converted, compare them with the standard disease codes of each version in the standard dictionary library, and obtain the preliminary candidate disease codes, across the versions, whose first N characters match;
S05: calculate similarity values over the term vectors, and obtain the preliminary candidate disease code of the specific version that corresponds to the maximum similarity value;
S06: verify, according to clinical rules, the mapping between the obtained candidate disease code of the specific version and the disease code to be converted, and determine the converted disease code.
The clinical rules include anatomical location rules, etiology rules, and surgical rules.
Preferably, in step S01, the standard diagnosis descriptions include standard surgery and procedure descriptions; these are the main diagnostic text descriptions written by a doctor for a patient.
In step S02, the test set comprises a disease code test set and a diagnostic text test set, where the disease code test set corresponds to the disease codes to be converted and the diagnostic text test set corresponds to the diagnosis descriptions.
Step S03 specifically comprises the following steps:
S03.1: preprocess the standard dictionary library according to medical rules, segment the preprocessed data into words according to Chinese part-of-speech rules, remove stop words and repeated words, and generate a standard dictionary library word packet;
S03.2: preprocess the test set according to the medical rules, segment the preprocessed data into words according to Chinese part-of-speech rules, remove stop words and repeated words, unify any synonyms that occur according to a preset synonym library, and generate a test library word packet;
S03.3: collect the distinct words of the standard dictionary library word packet and the test library word packet as a term word packet,
where the term word packet comprises a plurality of standard dictionary library terms and a plurality of test terms;
S03.4: form term vectors from the term word packet and establish a vector space model.
In step S04, N is a natural number greater than or equal to 3, and the count of N characters includes the decimal point of the disease code.
Each standard dictionary library term corresponds to one standard dictionary library term vector, and each test term corresponds to one test term vector.
The term vectors are formed by one-hot encoding: a standard dictionary library term vector is formed for each standard dictionary library term and a test term vector for each test term, and together they establish the vector space model.
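One way to read the one-hot construction is sketched below. It assumes that a "term" is a whole diagnosis description that has already been segmented into words, and that its term vector marks which words of the term word packet occur in it (the element-wise OR of the one-hot vectors of its words). This interpretation, the function names and the sample vocabulary are assumptions, not details confirmed by the patent.

```python
def one_hot(word, vocabulary):
    """One-hot vector of a single word over the term word packet."""
    return [1 if word == entry else 0 for entry in vocabulary]


def term_vector(words, vocabulary):
    """0/1 vector of a segmented term: OR of its words' one-hot vectors."""
    present = set(words)
    return [1 if entry in present else 0 for entry in vocabulary]


vocabulary = ["cholera", "vibrio", "biotype", "el-tor"]  # sample word packet
single = one_hot("biotype", vocabulary)
combined = term_vector(["el-tor", "cholera"], vocabulary)
```

Every vector has the same length as the term word packet, so dictionary terms and test terms live in one shared space and can be compared directly.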
Preferably, in step S05, the similarity value is calculated by a formula (the formula and its symbols are not reproduced in this text) over two vectors: the term vector of the i-th standard dictionary term and the term vector of the j-th test set term.
The invention applies natural language processing (NLP) technology to ICD code recognition and conversion. It uses one-hot encoding to construct a text vector space model and combines it with algorithms such as cosine similarity to convert between different coding versions, which improves the efficiency of code conversion and lays a foundation for medical data applications such as medical research and disease cost-control management.
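Step S05 can be illustrated with a small cosine-similarity routine. This is a minimal sketch assuming 0/1 term vectors of equal length; the names `cosine_similarity` and `best_match` and the sample vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if a norm is 0)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_match(test_vector, candidates):
    """S05 (sketch): return (code, score) of the most similar candidate.

    `candidates` maps a candidate code to its term vector. An empty
    mapping raises ValueError, because there is nothing to choose from.
    """
    scored = ((code, cosine_similarity(vector, test_vector))
              for code, vector in candidates.items())
    return max(scored, key=lambda pair: pair[1])


candidates = {"A00.000": [1, 1, 0, 0], "A00.100": [1, 0, 1, 1]}
code, score = best_match([1, 0, 1, 1], candidates)
```

When several candidates tie on the maximum score, `max` returns the first one encountered; a production system would presumably need an explicit tie-breaking policy or would defer such cases to the rule check and expert review.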
Specifically, a converter is constructed from the conversion rules configured by domain experts and the similarity algorithm. When a new textual diagnosis needs to be converted, the converter outputs the target-version disease code of the term to be converted, which achieves one-click code conversion that is simple, convenient, and highly accurate.
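A converter of this kind could chain the prefix filter, the similarity ranking and the rule check. The sketch below is one possible arrangement under stated assumptions: the function `convert`, its parameters, the `rule_check` callback and the sample target entries are all hypothetical, and descriptions are assumed to be segmented into words already.

```python
import math


def _vector(words, vocabulary):
    """0/1 term vector of a segmented description over the vocabulary."""
    present = set(words)
    return [1 if word in present else 0 for word in vocabulary]


def _cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def convert(source_code, source_words, target_entries, n=3, rule_check=None):
    """Return the target-version code for a source code, or None.

    `target_entries` maps each target-version code to the word list of its
    description. `rule_check(source_words, candidate_words)` stands in for
    the expert-configured clinical rules and may veto the best candidate.
    """
    prefix = source_code[:n]                       # S04: first N characters
    candidates = {code: words for code, words in target_entries.items()
                  if code[:n] == prefix}
    if not candidates:
        return None
    vocabulary = sorted(set(source_words).union(*candidates.values()))  # S03
    source_vec = _vector(source_words, vocabulary)
    best = max(candidates,                         # S05: maximum similarity
               key=lambda code: _cosine(_vector(candidates[code], vocabulary),
                                        source_vec))
    if rule_check is not None and not rule_check(source_words, candidates[best]):
        return None                                # S06: mapping rejected
    return best


target = {
    "A00.000": ["cholera", "classical", "biotype"],
    "A00.100": ["cholera", "el-tor", "biotype"],
    "A01.000": ["typhoid", "fever"],
}
result = convert("A00.101", ["el-tor", "biotype"], target, n=3)
```

Returning `None` when no candidate survives leaves room for the expert review described in the text, rather than forcing a low-confidence mapping.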
Preferably, after the converted disease code is determined, the method further comprises
sending the converted disease code to a medical expert terminal for review, so as to optimize the code conversion results.
Specifically, the converted disease codes are sent to medical experts for review, data with obvious problems are corrected, and steps S03 to S06 are repeated, so that the conversion results are continuously optimized and the accuracy of the work is improved.
The above describes only preferred embodiments of the invention and is not to be construed as limiting it; any modification, equivalent substitution or improvement made within the spirit and principle of the invention shall fall within its scope of protection.

Claims (8)

1. A method of disease code conversion, comprising the following steps:
S01: collect the coding versions corresponding to standard disease codes and standard diagnosis descriptions, establish a standard dictionary library, and classify its entries by coding version;
S02: establish a test set from the disease codes to be converted and their diagnosis descriptions;
S03: form term vectors from the standard dictionary library and the test set, and establish a vector space model;
S04: extract the first N characters of the disease code to be converted, compare them with the standard disease codes of each version in the standard dictionary library, and obtain the preliminary candidate disease codes, across the versions, whose first N characters match;
S05: calculate similarity values over the term vectors, and obtain the preliminary candidate disease code of the specific version that corresponds to the maximum similarity value;
S06: verify, according to clinical rules, the mapping between the obtained candidate disease code of the specific version and the disease code to be converted, and determine the converted disease code.
2. The method of disease code conversion of claim 1, wherein: the standard diagnosis descriptions include standard surgery and procedure descriptions.
3. The method of disease code conversion of claim 1, wherein: the test set comprises a disease code test set and a diagnostic text test set, where the disease code test set corresponds to the disease codes to be converted and the diagnostic text test set corresponds to the diagnosis descriptions.
4. The method of disease code conversion of claim 1, wherein: step S03 specifically comprises the following steps:
S03.1: preprocess the standard dictionary library according to medical rules, segment the preprocessed data into words according to Chinese part-of-speech rules, remove stop words and repeated words, and generate a standard dictionary library word packet;
S03.2: preprocess the test set according to the medical rules, segment the preprocessed data into words according to Chinese part-of-speech rules, remove stop words and repeated words, unify any synonyms that occur according to a preset synonym library, and generate a test library word packet;
S03.3: collect the distinct words of the standard dictionary library word packet and the test library word packet as a term word packet;
S03.4: form term vectors from the term word packet and establish a vector space model.
5. The method of disease code conversion of claim 4, wherein: the similarity value is calculated by a formula (the formula and its symbols are not reproduced in this text) over two vectors: the term vector of the i-th standard dictionary term and the term vector of the j-th test set term.
6. The method of disease code conversion of claim 1, wherein: the clinical rules include anatomical location rules, etiology rules, and surgical rules.
7. The method of disease code conversion of claim 1, wherein: in step S04, N is a natural number greater than or equal to 3, and the count of N characters includes the decimal point of the disease code.
8. The method of disease code conversion of claim 1, wherein: after the converted disease code is determined, the method further comprises
sending the converted disease code to a medical expert terminal for review.
CN201910215224.7A 2019-03-21 2019-03-21 A kind of method of disease code conversion Pending CN110032715A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910215224.7A CN110032715A (en) 2019-03-21 2019-03-21 A kind of method of disease code conversion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910215224.7A CN110032715A (en) 2019-03-21 2019-03-21 A kind of method of disease code conversion

Publications (1)

Publication Number Publication Date
CN110032715A (en) 2019-07-19

Family

ID=67236346

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910215224.7A Pending CN110032715A (en) 2019-03-21 2019-03-21 A kind of method of disease code conversion

Country Status (1)

Country Link
CN (1) CN110032715A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106844308A (en) * 2017-01-20 2017-06-13 天津艾登科技有限公司 A kind of use semantics recognition carries out the method for automating disease code conversion
CN108182207A (en) * 2017-12-15 2018-06-19 上海长江科技发展有限公司 The intelligent coding method and system of Chinese surgical procedure based on participle network
CN108446260A (en) * 2018-02-06 2018-08-24 天津艾登科技有限公司 The method and system of automation disease code conversion are carried out based on semantic approximate match algorithm

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
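鲍庆升等 (Bao Qingsheng et al.): "基于文本分析的自动化疾病编码方法" [Automated disease coding method based on text analysis], 《计算机系统应用》 (Computer Systems & Applications) *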

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110827929A (en) * 2019-11-05 2020-02-21 中山大学 Disease classification code recognition method and device, computer equipment and storage medium
CN110827929B (en) * 2019-11-05 2022-06-07 中山大学 Disease classification code recognition method and device, computer equipment and storage medium
CN114077837A (en) * 2020-08-10 2022-02-22 卫宁健康科技集团股份有限公司 Method and system for converting disease codes, electronic device and storage medium
CN112233794A (en) * 2020-10-20 2021-01-15 吾征智能技术(北京)有限公司 Disease information matching system based on hematuria information
CN112632910A (en) * 2020-12-21 2021-04-09 北京惠及智医科技有限公司 Operation encoding method, electronic device and storage device
CN113705166A (en) * 2021-07-28 2021-11-26 浙江太美医疗科技股份有限公司 Method and device for encoding medical events

Similar Documents

Publication Publication Date Title
CN110032715A (en) A kind of method of disease code conversion
CN109920501B (en) Electronic medical record classification method and system based on convolutional neural network and active learning
CN106844308B (en) Method for automatic disease code conversion using semantic recognition
CN109741806B (en) Auxiliary generation method and device for medical image diagnosis report
JP5098559B2 (en) Similar image search device and similar image search program
CN110047584A (en) Hospital distributing diagnosis method, system, device and medium based on deep learning
WO2021046536A1 (en) Automated information extraction and enrichment in pathology report using natural language processing
US20170147753A1 (en) Method for searching for similar case of multi-dimensional health data and apparatus for the same
CN111180062A (en) Disease classification coding intelligent recommendation method based on original diagnosis data
US20130144651A1 (en) Determining one or more probable medical codes using medical claims
CN111814463B (en) International disease classification code recommendation method and system, corresponding equipment and storage medium
CN111177356B (en) Acid-base index medical big data analysis method and system
CN111191415A (en) Operation classification coding method based on original operation data
CN113284572A (en) Multi-modal heterogeneous medical data processing method and related device
WO2014130287A1 (en) Method and system for propagating labels to patient encounter data
CN111259664B (en) Method, device and equipment for determining medical text information and storage medium
CN114358001A (en) Method for standardizing diagnosis result, and related device, equipment and storage medium thereof
CN113823414B (en) Main diagnosis and main operation matching detection method, device, computing equipment and storage medium
Moldwin et al. Empirical findings on the role of structured data, unstructured data, and their combination for automatic clinical phenotyping
US8473314B2 (en) Method and system for determining precursors of health abnormalities from processing medical records
CN109859813B (en) Entity modifier recognition method and device
CN116741358A (en) Inquiry registration recommendation method, inquiry registration recommendation device, inquiry registration recommendation equipment and storage medium
CN112836006B (en) Multi-diagnostic intelligent coding method, system, medium and equipment
CN112992303A (en) Human phenotype standard expression extraction method
CN118016263B (en) Digital medical assistant system based on voice recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20190719