CN111180060B - Disease diagnosis automatic coding method and device - Google Patents
- Publication number
- CN111180060B (application number CN201911168334.9A)
- Authority
- CN
- China
- Prior art keywords
- icd
- codes
- disease
- disease diagnosis
- preset number
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Databases & Information Systems (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Public Health (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Medical Informatics (AREA)
- Computational Linguistics (AREA)
- Pathology (AREA)
- Epidemiology (AREA)
- General Health & Medical Sciences (AREA)
- Primary Health Care (AREA)
- Medical Treatment And Welfare Office Work (AREA)
- Measuring And Recording Apparatus For Diagnosis (AREA)
Abstract
The invention discloses an automatic coding method for disease diagnoses, comprising the following steps: obtaining a disease diagnosis from a target case; performing retrieval based on the disease diagnosis to obtain the ICD disease names and codes of a preset number of candidates ranked highest in similarity to the disease diagnosis; calculating a score between the disease diagnosis and the ICD code of each of the preset number of candidates according to the disease diagnosis, the ICD disease names, and the chapter, section, category and subcategory codes within the ICD codes; and determining the ICD code of the highest-scoring candidate as the code of the disease diagnosis. By coding the disease diagnosis using the hierarchical structure of the codes, namely the chapter, section, category and subcategory codes, the disclosed scheme improves coding accuracy.
Description
Technical Field
The invention relates to the technical field of medical services, in particular to an automatic coding method and device for disease diagnosis.
Background
The International Classification of Diseases (ICD) is an internationally unified disease classification method formulated by the WHO. It groups diseases into an ordered system according to characteristics such as etiology, pathology, clinical manifestation and anatomical location, and represents them by codes. The version in general use worldwide is the 10th revision, the International Statistical Classification of Diseases and Related Health Problems, which retains the abbreviation ICD and is commonly referred to as ICD-10.
In the prior art, a disease data set is obtained from medical records, disease feature words are obtained by word segmentation and vectorized, and a convolutional neural network classifies them into disease types, which are then coded to obtain the coding result. ICD codes have a hierarchical structure, but the prior art does not exploit it, and its accuracy is low. How to use the hierarchical structure of the codes to improve coding accuracy is therefore a technical problem to be solved urgently.
Disclosure of Invention
The invention provides an automatic coding method for disease diagnosis, which is used for improving the coding accuracy by utilizing the hierarchical characteristics of coding.
The invention provides an automatic coding method for disease diagnosis, which comprises the following steps:
obtaining a disease diagnosis from the target case;
performing retrieval based on the disease diagnosis to obtain the ICD disease names and codes of a preset number of candidates ranked highest in similarity to the disease diagnosis;
calculating a score between the disease diagnosis and the ICD code of each of the preset number of candidates according to the disease diagnosis, the ICD disease names, and the chapter, section, category and subcategory codes in the ICD codes;
determining the ICD code of the candidate with the highest score as the code of the disease diagnosis.
In one embodiment, performing retrieval based on the disease diagnosis to obtain the ICD disease names and codes of the preset number of candidates ranked highest in similarity to the disease diagnosis includes:
searching, through a retrieval model, the disease names of the ICD-10 National Clinical Version 2.0 for the disease diagnosis, to obtain the ICD disease names and codes of the preset number of candidates ranked highest in similarity to the disease diagnosis, together with the retrieval score of each candidate;
normalizing the retrieval scores, and denoting the normalized retrieval score as score0.
In one embodiment, the disease diagnosis automatic encoding method further comprises:
acquiring data sets for the chapter, section, category, subcategory and detail levels of ICD-10 disease codes;
inputting the chapter, section, category and subcategory data sets into a pre-trained medical-domain model and training by fine-tuning, to obtain a chapter classification model, a section classification model, a category classification model and a subcategory classification model, together with their respective accuracy indexes P1, P2, P3 and P4;
inputting the detail data set into the pre-trained medical-domain model for training, to obtain a detail alignment model and its accuracy index P5.
In one embodiment, calculating the score between the disease diagnosis and the ICD code of each of the preset number of candidates according to the disease diagnosis, the ICD disease names, and the chapter, section, category and subcategory codes in the ICD codes includes:
extracting the chapter, section, category and subcategory codes from the ICD disease codes of the preset number of candidates;
applying the chapter, section, category and subcategory classification models to the disease diagnosis, to obtain the scores corresponding to the chapter, section, category and subcategory codes of each candidate;
normalizing those scores;
denoting the normalized chapter, section, category and subcategory scores as score1, score2, score3 and score4, respectively.
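The following is a minimal sketch of how score1 to score4 might be read off the four classifiers for each candidate. The probability dictionaries stand in for the outputs of the chapter, section, category and subcategory classification models. Their values, the candidate labels, the function name `level_scores`, and the choice of sum-to-one normalization across candidates are all assumptions, since the text does not specify the normalization.

```python
from typing import Dict, List

LEVELS = ["chapter", "section", "category", "subcategory"]


def level_scores(
    model_probs: Dict[str, Dict[str, float]],
    candidates: List[Dict[str, str]],
) -> List[Dict[str, float]]:
    """Return normalized score1..score4 for each candidate.

    model_probs[level][label] is the (hypothetical) probability that the
    classifier for `level` assigns to `label` given the diagnosis text.
    candidates[j][level] is the label of candidate j at that level.
    """
    scores: List[Dict[str, float]] = [dict() for _ in candidates]
    for idx, level in enumerate(LEVELS, start=1):
        # Raw score: probability of the label the candidate carries at this level.
        raw = [model_probs[level].get(c[level], 0.0) for c in candidates]
        total = sum(raw)
        for j, r in enumerate(raw):
            # Assumed normalization: divide by the sum over all candidates.
            scores[j][f"score{idx}"] = r / total if total > 0 else 0.0
    return scores


# Illustrative classifier outputs for one diagnosis (made-up numbers).
model_probs = {
    "chapter": {"IX": 0.9, "X": 0.1},
    "section": {"I20-I25": 0.8, "I30-I52": 0.2},
    "category": {"I21": 0.6, "I25": 0.2},
    "subcategory": {"I21.0": 0.5, "I25.1": 0.25},
}
candidates = [
    {"chapter": "IX", "section": "I20-I25", "category": "I21", "subcategory": "I21.0"},
    {"chapter": "IX", "section": "I20-I25", "category": "I25", "subcategory": "I25.1"},
]
result = level_scores(model_probs, candidates)
```

Both candidates share a chapter and section here, so score1 and score2 cannot separate them; the category and subcategory levels do.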
In one embodiment, calculating the score between the disease diagnosis and the ICD code of each of the preset number of candidates according to the disease diagnosis, the ICD disease names, and the chapter, section, category and subcategory codes in the ICD codes further includes:
applying the detail alignment model to the disease diagnosis and the ICD disease name of each of the preset number of candidates, to obtain corresponding scores;
normalizing those scores, and denoting the normalized score as score5.
In one embodiment, the score between the disease diagnosis and the ICD code of each of the preset number of candidates is calculated by the following formula:
[formula not preserved in this text]
wherein β_i is a hyperparameter, ξ is a threshold, and i ranges from 0 to 5.
The invention also provides an automatic coding device for disease diagnosis, which comprises:
a first acquisition module, configured to obtain a disease diagnosis from the target case;
a retrieval module, configured to perform retrieval based on the disease diagnosis to obtain the ICD disease names and codes of a preset number of candidates ranked highest in similarity to the disease diagnosis;
a calculation module, configured to calculate a score between the disease diagnosis and the ICD code of each of the preset number of candidates according to the disease diagnosis, the ICD disease names, and the chapter, section, category and subcategory codes in the ICD codes;
a determining module, configured to determine the ICD code of the candidate with the highest score as the code of the disease diagnosis.
In one embodiment, the retrieval module comprises:
a retrieval sub-module, configured to search, through a retrieval model, the disease names of the ICD-10 National Clinical Version 2.0 for the disease diagnosis, to obtain the ICD disease names and codes of the preset number of candidates ranked highest in similarity to the disease diagnosis, together with the retrieval score of each candidate;
a first processing sub-module, configured to normalize the retrieval scores and denote the normalized retrieval score as score0.
In one embodiment, the disease diagnosis automatic encoding apparatus further comprises:
a second acquisition module, configured to acquire data sets for the chapter, section, category, subcategory and detail levels of ICD-10 disease codes;
a first training module, configured to input the chapter, section, category and subcategory data sets into a pre-trained medical-domain model and train by fine-tuning, to obtain a chapter classification model, a section classification model, a category classification model and a subcategory classification model, together with their respective accuracy indexes P1, P2, P3 and P4;
a second training module, configured to input the detail data set into the pre-trained medical-domain model for training, to obtain a detail alignment model and its accuracy index P5.
In one embodiment, the computing module includes:
an extraction sub-module, configured to extract the chapter, section, category and subcategory codes from the ICD disease codes of the preset number of candidates;
a first obtaining sub-module, configured to apply the chapter, section, category and subcategory classification models to the disease diagnosis, to obtain the scores corresponding to the chapter, section, category and subcategory codes of each candidate;
a second processing sub-module, configured to normalize those scores;
a marking sub-module, configured to denote the normalized chapter, section, category and subcategory scores as score1, score2, score3 and score4, respectively;
a second obtaining sub-module, configured to apply the detail alignment model to the disease diagnosis and the ICD disease name of each of the preset number of candidates, to obtain corresponding scores;
a third processing sub-module, configured to normalize those scores and denote the normalized score as score5;
a calculation sub-module, configured to calculate the score between the disease diagnosis and the ICD code of each of the preset number of candidates by the following formula:
[formula not preserved in this text]
wherein β_i is a hyperparameter, ξ is a threshold, and i ranges from 0 to 5.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention may be realized and attained by the structure particularly pointed out in the written description and drawings.
The technical scheme of the invention is further described in detail through the drawings and the embodiments.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate the invention and together with the embodiments of the invention, serve to explain the invention. In the drawings:
FIG. 1 is a flowchart of an automatic encoding method for disease diagnosis according to an embodiment of the present invention;
FIG. 2 is a flowchart of an automatic encoding method for disease diagnosis according to an embodiment of the present invention;
FIG. 3 is a block diagram of an automatic disease diagnosis encoding apparatus according to an embodiment of the present invention;
FIG. 4 is a block diagram of an automatic disease diagnosis encoding apparatus according to an embodiment of the present invention.
Detailed Description
The preferred embodiments of the present invention will be described below with reference to the accompanying drawings, it being understood that the preferred embodiments described herein are for illustration and explanation of the present invention only, and are not intended to limit the present invention.
FIG. 1 is a flowchart of an automatic coding method for disease diagnosis according to an embodiment of the present invention. As shown in FIG. 1, the method may be implemented as steps S11-S14:
in step S11, a disease diagnosis is obtained from the target case;
in step S12, retrieval is performed based on the disease diagnosis to obtain the ICD disease names and codes of a preset number of candidates ranked highest in similarity to the disease diagnosis;
in step S13, a score between the disease diagnosis and the ICD code of each of the preset number of candidates is calculated according to the disease diagnosis, the ICD disease names, and the chapter, section, category and subcategory codes in the ICD codes;
in step S14, the ICD code of the candidate with the highest score is determined as the code of the disease diagnosis.
It should be noted that steps S11-S14 may also be used to automatically assign ICD-9-CM-3 codes to surgical operations.
In this embodiment, a disease diagnosis of a patient is obtained from the target case; retrieval is performed based on the disease diagnosis to obtain the ICD disease names and codes of a preset number of candidates ranked highest in similarity to the disease diagnosis; a score between the disease diagnosis and the ICD code of each candidate is calculated according to the disease diagnosis, the ICD disease names, and the chapter, section, category and subcategory codes in the ICD codes; and the candidate ICD code with the highest score is determined as the code of the disease diagnosis.
The beneficial effects of this embodiment are as follows: the disease diagnosis is coded using the hierarchical structure of the codes, namely the chapter, section, category and subcategory codes, which improves coding accuracy.
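Steps S12 to S14 can be sketched as a retrieve-then-rerank loop. This is only an illustrative skeleton: `code_diagnosis`, `toy_retrieve` and `toy_score` are hypothetical names, the word-overlap scorer is a stand-in for the combined score of step S13, and step S11 (extracting the diagnosis text from the case) is assumed to have already happened.

```python
from typing import Callable, List, Tuple

Candidate = Tuple[str, str]  # (ICD code, ICD disease name)


def code_diagnosis(
    diagnosis: str,
    retrieve: Callable[[str, int], List[Candidate]],
    score: Callable[[str, Candidate], float],
    top_k: int = 10,
) -> str:
    """Retrieve candidates (S12), score each (S13), return the best code (S14)."""
    candidates = retrieve(diagnosis, top_k)
    if not candidates:
        raise ValueError("retrieval returned no candidates")
    scored = [(score(diagnosis, c), c) for c in candidates]
    _, best = max(scored, key=lambda pair: pair[0])
    return best[0]


def toy_retrieve(diagnosis: str, k: int) -> List[Candidate]:
    # Stand-in for the retrieval model: returns a fixed candidate list.
    table = [
        ("I21.9", "acute myocardial infarction, unspecified"),
        ("I10", "essential (primary) hypertension"),
    ]
    return table[:k]


def toy_score(diagnosis: str, cand: Candidate) -> float:
    # Stand-in for step S13: word-level Jaccard overlap with the ICD name.
    d = set(diagnosis.lower().split())
    n = set(cand[1].lower().replace(",", "").split())
    return len(d & n) / len(d | n)


best_code = code_diagnosis("acute myocardial infarction", toy_retrieve, toy_score)
print(best_code)
```

Passing the retriever and scorer as callables keeps the selection logic independent of whichever models are plugged in.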
In one embodiment, as shown in FIG. 2, the above step S12 may be implemented as steps S21-S22 as follows:
in step S21, the disease names of the ICD-10 National Clinical Version 2.0 are searched for the disease diagnosis through a retrieval model, to obtain the ICD disease names and codes of the preset number of candidates ranked highest in similarity to the disease diagnosis, together with the retrieval score of each candidate;
in step S22, the retrieval scores are normalized, and the normalized retrieval score is denoted as score0.
In this embodiment, the disease diagnosis is input into a retrieval model (which may be referred to as model0), and the retrieval model searches the disease names of the ICD-10 National Clinical Version 2.0 to obtain the ICD disease names and codes of the preset number of candidates ranked highest in similarity to the disease diagnosis, together with the retrieval score of each candidate. The retrieval scores are normalized, and the normalized retrieval score is denoted as score0. The preset number of candidates may be 10, but is not limited to 10.
The beneficial effects of this embodiment are as follows: the retrieval model yields the ICD codes of the preset number of candidates most similar to the disease diagnosis, which helps make the coding of the disease diagnosis more accurate.
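Steps S21 and S22 can be illustrated with a small stand-in retriever. The text does not say what the retrieval model is or how scores are normalized, so both choices below are assumptions: character-bigram Jaccard similarity replaces the retrieval model, and min-max scaling is used for normalization. The three ICD names are a toy index, not the national clinical version.

```python
from typing import Dict, List, Set, Tuple


def bigrams(text: str) -> Set[str]:
    """Character bigrams of the text, ignoring case and spaces."""
    t = text.lower().replace(" ", "")
    return {t[i:i + 2] for i in range(len(t) - 1)}


def retrieve_top_k(
    diagnosis: str, names: Dict[str, str], k: int = 10
) -> List[Tuple[str, str, float]]:
    """Return the k (code, name, similarity) triples most similar to the diagnosis."""
    q = bigrams(diagnosis)
    hits = []
    for code, name in names.items():
        n = bigrams(name)
        union = q | n
        sim = len(q & n) / len(union) if union else 0.0
        hits.append((code, name, sim))
    hits.sort(key=lambda h: h[2], reverse=True)
    return hits[:k]


def min_max(values: List[float]) -> List[float]:
    """Assumed normalization: rescale to [0, 1]; constant input maps to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


names = {
    "I21.9": "acute myocardial infarction, unspecified",
    "I25.2": "old myocardial infarction",
    "J18.9": "pneumonia, unspecified",
}
hits = retrieve_top_k("acute myocardial infarction", names, k=2)
score0 = min_max([h[2] for h in hits])
for (code, name, _), s in zip(hits, score0):
    print(code, name, round(s, 3))
```

A surface-similarity retriever like this can rank a near-miss name above the intended one, which is why the retrieval score is only one input to the later scoring step.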
In one embodiment, a disease diagnosis automatic encoding method further comprises:
acquiring data sets for the chapter, section, category, subcategory and detail levels of ICD-10 disease codes;
inputting the chapter, section, category and subcategory data sets into a pre-trained medical-domain model and training by fine-tuning, to obtain a chapter classification model, a section classification model, a category classification model and a subcategory classification model, together with their respective accuracy indexes P1, P2, P3 and P4;
inputting the detail data set into the pre-trained medical-domain model for training, to obtain a detail alignment model and its accuracy index P5.
The pre-trained medical-domain model may be a BERT model, and the chapter classification model, the section classification model, the category classification model, the subcategory classification model and the detail alignment model may be referred to as model1, model2, model3, model4 and model5, respectively.
In this embodiment, the pre-trained medical-domain model is fine-tuned to obtain the chapter, section, category and subcategory classification models and their respective accuracy indexes P1, P2, P3 and P4, and is trained on the detail data set to obtain the detail alignment model and its accuracy index P5.
The beneficial effects of this embodiment are as follows: the detail level contains more than 35,000 classes, a scale at which a classification model performs poorly. Alignment instead converts the task into a classification problem between a diagnosis and a name, so training a detail alignment model rather than a detail classification model preserves effectiveness.
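One way to picture the conversion from a many-class problem to a diagnosis-name problem is in how training examples could be built. The sketch below is an assumption, not the patent's procedure: each labeled diagnosis is paired with its gold ICD name as a positive example and with a few randomly sampled other names as negatives. The function name, the negative-sampling scheme and the three example entries (illustrative placeholders, not detail-level codes) are all hypothetical.

```python
import random
from typing import Dict, List, Tuple


def build_alignment_pairs(
    labeled: List[Tuple[str, str]],   # (diagnosis text, gold code)
    names: Dict[str, str],            # code -> ICD disease name
    negatives_per_positive: int = 2,
    seed: int = 0,
) -> List[Tuple[str, str, int]]:
    """Build (diagnosis, ICD name, label) pairs; label 1 = match, 0 = mismatch."""
    rng = random.Random(seed)
    codes = sorted(names)
    pairs: List[Tuple[str, str, int]] = []
    for diagnosis, gold in labeled:
        pairs.append((diagnosis, names[gold], 1))
        others = [c for c in codes if c != gold]
        # Assumed scheme: sample negatives uniformly from the other codes.
        for neg in rng.sample(others, min(negatives_per_positive, len(others))):
            pairs.append((diagnosis, names[neg], 0))
    return pairs


names = {
    "I21.0": "acute transmural myocardial infarction of anterior wall",
    "I21.1": "acute transmural myocardial infarction of inferior wall",
    "I21.9": "acute myocardial infarction, unspecified",
}
labeled = [("anterior wall acute myocardial infarction", "I21.0")]
pairs = build_alignment_pairs(labeled, names)
for p in pairs:
    print(p)
```

With this framing the model's output size stays fixed at two classes however many detail codes exist, and at inference time it only needs to score the preset number of candidate names.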
In one embodiment, step S13 may be implemented as follows:
extracting the chapter, section, category and subcategory codes from the ICD disease codes of the preset number of candidates;
applying the chapter, section, category and subcategory classification models to the disease diagnosis, to obtain the scores corresponding to the chapter, section, category and subcategory codes of each candidate;
normalizing those scores;
denoting the normalized chapter, section, category and subcategory scores as score1, score2, score3 and score4, respectively.
In this embodiment, the chapter, section, category and subcategory classification models are used to obtain the scores corresponding to the chapter, section, category and subcategory codes of each candidate ICD code, and these scores are normalized to obtain score1, score2, score3 and score4.
The beneficial effects of this embodiment are as follows: similarity scores between the disease diagnosis and the hierarchical codes of each candidate ICD code can be obtained.
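The extraction step relies on the fact that an ICD-10 code carries its own hierarchy: the first three characters are the category, the first character after the dot gives the subcategory, and the category falls inside a section range and a chapter range. A minimal sketch follows, assuming tiny hand-written range tables (a real system would load the full chapter and section lists) and an illustrative extended code `I21.001`; the function and table names are hypothetical.

```python
from typing import Dict, List, Tuple

Range = Tuple[str, str, str]  # (label, first category, last category)

# Illustrative excerpts only, not the complete ICD-10 tables.
CHAPTERS: List[Range] = [("IX", "I00", "I99"), ("X", "J00", "J99")]
SECTIONS: List[Range] = [("I20-I25", "I20", "I25"), ("J09-J18", "J09", "J18")]


def _lookup(category: str, table: List[Range]) -> str:
    """Find the range containing a three-character category code."""
    for label, lo, hi in table:
        # Letter-plus-two-digit codes compare correctly as plain strings.
        if lo <= category <= hi:
            return label
    raise KeyError(category)


def split_icd10(code: str) -> Dict[str, str]:
    """Split an ICD-10 code into chapter, section, category and subcategory."""
    category = code[:3]
    has_sub = len(code) >= 5 and code[3] == "."
    subcategory = code[:5] if has_sub else category
    return {
        "chapter": _lookup(category, CHAPTERS),
        "section": _lookup(category, SECTIONS),
        "category": category,
        "subcategory": subcategory,
    }


parts = split_icd10("I21.001")
print(parts)
```

Each of the four extracted labels can then be looked up in the output of the classifier for its level.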
In one embodiment, step S13 may further include:
applying the detail alignment model to the disease diagnosis and the ICD disease name of each of the preset number of candidates, to obtain corresponding scores;
normalizing those scores, and denoting the normalized score as score5.
In this embodiment, the detail alignment model yields a score for each candidate, and these scores are normalized to obtain score5.
The beneficial effects of this embodiment are as follows: a similarity score between the disease diagnosis and each candidate ICD disease name can be obtained.
In one embodiment, the score between the disease diagnosis and the ICD code of each of the preset number of candidates is calculated by the following formula:
[formula not preserved in this text]
wherein β_i is a hyperparameter, ξ is a threshold, and i ranges from 0 to 5.
In this embodiment, the score between the disease diagnosis and the ICD code of each of the preset number of candidates is calculated by the above formula.
The beneficial effects of this embodiment are as follows: the formula yields a score for each of the preset number of candidate ICD codes, from which the candidate ICD code with the highest similarity score to the disease diagnosis is obtained.
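Because the formula itself is not preserved in this text, the sketch below shows only one plausible reading and should not be taken as the patent's formula. It assumes a weighted sum of score0 to score5 with weights β_0 to β_5, in which a component is dropped when it falls below the threshold ξ. The function name, the role given to ξ, and all numbers are assumptions for illustration.

```python
from typing import Dict, Sequence


def combine(scores: Sequence[float], betas: Sequence[float], xi: float) -> float:
    """Assumed form: sum of beta_i * score_i for i = 0..5, skipping score_i < xi."""
    if len(scores) != 6 or len(betas) != 6:
        raise ValueError("expected six scores and six weights (i = 0..5)")
    return sum(b * s for b, s in zip(betas, scores) if s >= xi)


# Made-up normalized scores [score0, ..., score5] for two candidates.
candidate_scores: Dict[str, Sequence[float]] = {
    "I21.9": [0.0, 0.5, 0.5, 0.75, 0.6, 0.9],
    "I25.2": [1.0, 0.5, 0.5, 0.25, 0.1, 0.1],
}
betas = [1.0] * 6   # hypothetical hyperparameter values
xi = 0.2            # hypothetical threshold

totals = {code: combine(s, betas, xi) for code, s in candidate_scores.items()}
best = max(totals, key=totals.get)
print(totals, best)
```

Under this reading, a candidate that ranks first in retrieval (score0) can still lose to one that the hierarchical and alignment scores favor.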
FIG. 3 is a block diagram of an automatic disease diagnosis encoding apparatus according to an embodiment of the present invention. As shown in FIG. 3, the apparatus may include the following modules:
a first acquisition module 31, configured to obtain a disease diagnosis from a target case;
a retrieval module 32, configured to perform retrieval based on the disease diagnosis to obtain the ICD disease names and codes of a preset number of candidates ranked highest in similarity to the disease diagnosis;
a calculation module 33, configured to calculate a score between the disease diagnosis and the ICD code of each of the preset number of candidates according to the disease diagnosis, the ICD disease names, and the chapter, section, category and subcategory codes in the ICD codes;
a determining module 34, configured to determine the ICD code of the candidate with the highest score as the code of the disease diagnosis.
The beneficial effects of this embodiment are as follows: the disease diagnosis is coded using the hierarchical structure of the codes, namely the chapter, section, category and subcategory codes, which improves coding accuracy.
In one embodiment, as shown in FIG. 4, the retrieval module 32 includes:
a retrieval sub-module 41, configured to search, through a retrieval model, the disease names of the ICD-10 National Clinical Version 2.0 for the disease diagnosis, to obtain the ICD disease names and codes of the preset number of candidates ranked highest in similarity to the disease diagnosis, together with the retrieval score of each candidate;
a first processing sub-module 42, configured to normalize the retrieval scores and denote the normalized retrieval score as score0.
The beneficial effects of this embodiment are as follows: the retrieval model yields the ICD codes of the preset number of candidates most similar to the disease diagnosis, which helps make the coding of the disease diagnosis more accurate.
In one embodiment, an automatic disease diagnosis encoding apparatus further comprises:
a second acquisition module, configured to acquire data sets for the chapter, section, category, subcategory and detail levels of ICD-10 disease codes;
a first training module, configured to input the chapter, section, category and subcategory data sets into the pre-trained medical-domain model and train by fine-tuning, to obtain a chapter classification model, a section classification model, a category classification model and a subcategory classification model, together with their respective accuracy indexes P1, P2, P3 and P4;
a second training module, configured to input the detail data set into the pre-trained medical-domain model for training, to obtain a detail alignment model and its accuracy index P5.
The beneficial effects of this embodiment are as follows: the detail level contains more than 35,000 classes, a scale at which a classification model performs poorly. Alignment instead converts the task into a classification problem between a diagnosis and a name, so training a detail alignment model rather than a detail classification model preserves effectiveness.
In one embodiment, a computing module includes:
an extraction sub-module, configured to extract the chapter, section, category and subcategory codes from the ICD disease codes of the preset number of candidates;
a first obtaining sub-module, configured to apply the chapter, section, category and subcategory classification models to the disease diagnosis, to obtain the scores corresponding to the chapter, section, category and subcategory codes of each candidate;
a second processing sub-module, configured to normalize those scores;
a marking sub-module, configured to denote the normalized chapter, section, category and subcategory scores as score1, score2, score3 and score4, respectively;
a second obtaining sub-module, configured to apply the detail alignment model to the disease diagnosis and the ICD disease name of each of the preset number of candidates, to obtain corresponding scores;
a third processing sub-module, configured to normalize those scores and denote the normalized score as score5;
a calculation sub-module, configured to calculate the score between the disease diagnosis and the ICD code of each of the preset number of candidates by the following formula:
[formula not preserved in this text]
wherein β_i is a hyperparameter, ξ is a threshold, and i ranges from 0 to 5.
The beneficial effects of this embodiment are as follows: the formula yields a score for each of the preset number of candidate ICD codes, from which the candidate ICD code with the highest similarity score to the disease diagnosis is obtained.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, magnetic disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.
Claims (2)
1. A disease diagnosis automatic coding method, comprising:
obtaining a disease diagnosis according to the target case;
searching according to the disease diagnosis to obtain the ICD disease names and ICD codes of a preset number of candidates ranking highest in similarity to the disease diagnosis;
calculating the scores of the disease diagnosis and the ICD codes of the preset number of candidates according to the disease diagnosis, the ICD disease names, and the chapter, section, category and subcategory codes in the ICD codes;
determining the ICD code of the candidate with the highest score as the code of the disease diagnosis;
the searching according to the disease diagnosis to obtain the ICD disease names and ICD codes of the preset number of candidates ranking highest in similarity to the disease diagnosis comprises:
retrieving, through a retrieval model, the standard International Classification of Diseases (ICD) disease names for the disease diagnosis, to obtain the ICD disease names and ICD codes of the preset number of candidates ranking highest in similarity to the disease diagnosis, together with the retrieval scores respectively corresponding to them;
normalizing the retrieval scores, and denoting the normalized retrieval scores as score0;
the calculating the scores of the disease diagnosis and the ICD codes of the preset number of candidates according to the disease diagnosis, the ICD disease names, and the chapter, section, category and subcategory codes in the ICD codes comprises:
extracting the chapter, section, category and subcategory codes from the ICD codes of the preset number of candidates;
applying a chapter classification model, a section classification model, a category classification model and a subcategory classification model, respectively, to the disease diagnosis, to obtain scores respectively corresponding to the chapter, section, category and subcategory codes in the ICD codes of the preset number of candidates;
normalizing the scores respectively corresponding to the chapter, section, category and subcategory codes in the ICD codes of the preset number of candidates;
denoting the normalized scores corresponding to the chapter, section, category and subcategory codes in the ICD codes of the preset number of candidates as score1, score2, score3 and score4, respectively;
the method further comprises:
acquiring data sets of the chapters, sections, categories, subcategories and details in ICD-10 disease coding;
inputting the data sets of the chapters, sections, categories and subcategories into a pre-trained medical-domain model and training it by fine-tuning, to obtain a chapter classification model, a section classification model, a category classification model and a subcategory classification model, together with an accuracy index $P_1$ corresponding to the chapter classification model, an accuracy index $P_2$ corresponding to the section classification model, an accuracy index $P_3$ corresponding to the category classification model, and an accuracy index $P_4$ corresponding to the subcategory classification model;
inputting the data set of the details into the pre-trained medical-domain model for training, to obtain a detail alignment model and an accuracy index $P_5$ corresponding to the detail alignment model;
the calculating the scores of the disease diagnosis and the ICD codes of the preset number of candidates according to the disease diagnosis, the ICD disease names, and the chapter, section, category and subcategory codes in the ICD codes further comprises:
applying the detail alignment model to the disease diagnosis and the ICD disease names of the preset number of candidates to obtain corresponding scores;
normalizing the corresponding scores, and denoting the normalized scores as score5;
the scores of the disease diagnosis and the ICD codes of the preset number of candidates are calculated by the following formula:
wherein $\beta_i$ is a hyperparameter, $\xi$ is a threshold, and i ranges from 0 to 5.
2. An automatic disease diagnosis encoding apparatus, comprising:
a first acquisition module, configured to obtain a disease diagnosis according to the target case;
a retrieval module, configured to search according to the disease diagnosis to obtain the ICD disease names and ICD codes of a preset number of candidates ranking highest in similarity to the disease diagnosis;
a calculation module, configured to calculate the scores of the disease diagnosis and the ICD codes of the preset number of candidates according to the disease diagnosis, the ICD disease names, and the chapter, section, category and subcategory codes in the ICD codes;
a determining module, configured to determine the ICD code of the candidate with the highest score as the code of the disease diagnosis;
the retrieval module comprises:
a retrieval sub-module, configured to retrieve, through a retrieval model, the standard International Classification of Diseases (ICD) disease names for the disease diagnosis, to obtain the ICD disease names and ICD codes of the preset number of candidates ranking highest in similarity to the disease diagnosis, together with the retrieval scores respectively corresponding to them;
a first processing sub-module, configured to normalize the retrieval scores and denote the normalized retrieval scores as score0;
the calculation module comprises:
an extraction sub-module, configured to extract the chapter, section, category and subcategory codes from the ICD codes of the preset number of candidates;
a first obtaining sub-module, configured to apply a chapter classification model, a section classification model, a category classification model and a subcategory classification model, respectively, to the disease diagnosis, to obtain scores respectively corresponding to the chapter, section, category and subcategory codes in the ICD codes of the preset number of candidates;
a second processing sub-module, configured to normalize the scores respectively corresponding to the chapter, section, category and subcategory codes in the ICD codes of the preset number of candidates;
a marking sub-module, configured to denote the normalized scores corresponding to the chapter, section, category and subcategory codes in the ICD codes of the preset number of candidates as score1, score2, score3 and score4, respectively;
an obtaining sub-module, configured to apply a detail alignment model to the disease diagnosis and the ICD disease names of the preset number of candidates, to obtain corresponding scores;
a third processing sub-module, configured to normalize the corresponding scores and denote the normalized scores as score5;
a calculation sub-module, configured to calculate the scores of the disease diagnosis and the ICD codes of the preset number of candidates by the following formula:
wherein $\beta_i$ is a hyperparameter, $\xi$ is a threshold, and i ranges from 0 to 5;
the apparatus further comprises:
a second acquisition module, configured to acquire data sets of the chapters, sections, categories, subcategories and details in ICD-10 disease coding;
a first training module, configured to input the data sets of the chapters, sections, categories and subcategories into a pre-trained medical-domain model and train it by fine-tuning, to obtain a chapter classification model, a section classification model, a category classification model and a subcategory classification model, together with an accuracy index $P_1$ corresponding to the chapter classification model, an accuracy index $P_2$ corresponding to the section classification model, an accuracy index $P_3$ corresponding to the category classification model, and an accuracy index $P_4$ corresponding to the subcategory classification model;
a second training module, configured to input the data set of the details into the pre-trained medical-domain model for training, to obtain a detail alignment model and an accuracy index $P_5$ corresponding to the detail alignment model.
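The extraction of chapter, section, category and subcategory codes from a candidate ICD code can be sketched as follows. This is a minimal Python illustration, not the patented implementation. The `CHAPTERS` and `SECTIONS` tables are hypothetical and truncated to two entries each, the names `IcdLevels` and `split_icd_code` are invented for the example, and codes are assumed to follow the plain ICD-10 pattern of one letter, two digits and an optional numeric extension.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical, heavily truncated lookup tables. A real system would load the
# complete ICD-10 chapter and section (block) ranges from an official release.
CHAPTERS = [("IV", "E00", "E90"), ("IX", "I00", "I99")]
SECTIONS = [("E10", "E14"), ("I20", "I25")]


class IcdLevels(NamedTuple):
    chapter: Optional[str]
    section: Optional[str]
    category: str
    subcategory: Optional[str]


def split_icd_code(code: str) -> IcdLevels:
    """Split an ICD-10 style code such as 'I21.0' into its hierarchy levels."""
    match = re.fullmatch(r"([A-Z]\d{2})(?:\.(\d)\d*)?", code.strip().upper())
    if match is None:
        raise ValueError(f"not an ICD-10 style code: {code!r}")
    category, sub_digit = match.group(1), match.group(2)
    # Three-character categories sort lexicographically, so plain string
    # comparison is enough to test membership in a range.
    chapter = next((n for n, lo, hi in CHAPTERS if lo <= category <= hi), None)
    section = next((f"{lo}-{hi}" for lo, hi in SECTIONS if lo <= category <= hi), None)
    subcategory = f"{category}.{sub_digit}" if sub_digit is not None else None
    return IcdLevels(chapter, section, category, subcategory)


print(split_icd_code("I21.0"))
```

Each of the four extracted levels is what the corresponding classification model would be scored against for a given candidate. Levels missing from the truncated tables come back as `None`.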
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911168334.9A CN111180060B (en) | 2019-11-25 | 2019-11-25 | Disease diagnosis automatic coding method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111180060A (en) | 2020-05-19 |
CN111180060B (en) | 2023-07-25 |
Family
ID=70657292
Family Applications (1)
Application Number | Status | Publication Number | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN201911168334.9A | Active | CN111180060B (en) | 2019-11-25 | 2019-11-25 | Disease diagnosis automatic coding method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111180060B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111785387B (en) * | 2020-07-02 | 2024-06-11 | 朱玮 | Method and system for classifying disease standardization mapping by using Bert |
CN113593711A (en) * | 2021-08-03 | 2021-11-02 | 中电健康云科技有限公司 | Health management information pushing method based on international disease classification coding |
CN115964472A (en) * | 2021-12-03 | 2023-04-14 | 奥码哈(杭州)医疗科技有限公司 | ICD coding method, ICD coding query method, coding system and query system |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109065157A (en) * | 2018-08-01 | 2018-12-21 | 中国人民解放军第二军医大学 | A kind of Disease Diagnosis Standard coded Recommendation list determines method and system |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11093842B2 (en) * | 2018-02-13 | 2021-08-17 | International Business Machines Corporation | Combining chemical structure data with unstructured data for predictive analytics in a cognitive system |
CN109785959A (en) * | 2018-12-14 | 2019-05-21 | 平安医疗健康管理股份有限公司 | A kind of disease code method and apparatus |
CN109994215A (en) * | 2019-04-25 | 2019-07-09 | 清华大学 | Disease automatic coding system, method, equipment and storage medium |
Non-Patent Citations (1)
Title |
---|
Automated disease coding method based on text analysis; Bao Qingsheng; Cheng Shaoyin; Jiang Fan; Computer Systems & Applications (No. 12) *
Also Published As
Publication number | Publication date |
---|---|
CN111180060A (en) | 2020-05-19 |
Similar Documents
Publication | Title |
---|---|
CN111026841B (en) | Automatic coding method and device based on retrieval and deep learning |
CN111180060B (en) | Disease diagnosis automatic coding method and device |
CN110032739B (en) | Method and system for extracting named entities of Chinese electronic medical record |
CN112257422B (en) | Named entity normalization processing method and device, electronic equipment and storage medium |
CN111949759A (en) | Method and system for retrieving medical record text similarity and computer equipment |
CN109993227B (en) | Method, system, apparatus and medium for automatically adding international disease classification code |
CN109994215A (en) | Disease automatic coding system, method, equipment and storage medium |
CN111177375B (en) | Electronic document classification method and device |
CN112183104B (en) | Code recommendation method, system, corresponding equipment and storage medium |
CN116719520B (en) | Code generation method and device |
CN110852076B (en) | Method and device for automatic disease code conversion |
CN112052154A (en) | Test case processing method and device |
CN113012774B (en) | Automatic medical record coding method and device, electronic equipment and storage medium |
CN111462914B (en) | Entity linking method and device |
CN110837494B (en) | Method and device for identifying unspecified diagnosis coding errors of medical record home page |
CN116719840A (en) | Medical information pushing method based on post-medical-record structured processing |
CN115631823A (en) | Similar case recommendation method and system |
CN111063430B (en) | Disease prediction method and device |
CN110851595A (en) | Identification method and device for disease term core vocabulary |
CN117077598B (en) | 3D parasitic parameter optimization method based on Mini-batch gradient descent method |
CN117438028A (en) | Medical death certificate generation method, system, terminal and storage medium |
Chahbandarian et al. | Increasing Alertness while Coding Secondary Diagnostics in the Medical Record. |
CN118471423A (en) | Medical image report quality control method, device, equipment and medium |
CN117273001A (en) | Medical record entity extraction method and device |
CN117893090A (en) | Comprehensive evaluation method, device, equipment and storage medium |
Legal Events
Code | Title |
---|---|
PB01 | Publication |
SE01 | Entry into force of request for substantive examination |
GR01 | Patent grant |