CN105069124B - Automated International Classification of Diseases coding method and system - Google Patents

Info

Publication number
CN105069124B
Authority
CN
China
Prior art keywords
term
ontology
disease
character
coding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510496513.0A
Other languages
Chinese (zh)
Other versions
CN105069124A (en
Inventor
金以东
朱华玲
陈志永
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ebaotech Internet Medical Information Technology (beijing) Co Ltd
Original Assignee
Ebaotech Internet Medical Information Technology (beijing) Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ebaotech Internet Medical Information Technology (beijing) Co Ltd filed Critical Ebaotech Internet Medical Information Technology (beijing) Co Ltd
Priority to CN201510496513.0A priority Critical patent/CN105069124B/en
Publication of CN105069124A publication Critical patent/CN105069124A/en
Application granted granted Critical
Publication of CN105069124B publication Critical patent/CN105069124B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/335Filtering based on additional data, e.g. user or group profiles
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems

Abstract

Embodiments of the present invention provide an automated International Classification of Diseases (ICD) coding method. The method includes: inputting Chinese disease diagnosis information; performing natural language processing on the Chinese disease diagnosis information to obtain one or more names to be encoded; and, based on a standard terminology library and an expansion terminology library, looking up the standard term or expansion term that matches each name to be encoded and taking the code of the matched standard term or expansion term as the code of that name. Here, the standard terms are the disease terms contained in the ICD version to be referenced, and an expansion term is a common name, alias, or abbreviation of a standard term, a subclass disease term of a standard term, or a newly emerged disease term. With this method, ICD coding can be completed automatically without manual participation, with the advantages of high coding speed, low cost, and high accuracy. Embodiments of the present invention also provide an automated ICD coding system.

Description

An automated International Classification of Diseases coding method and system
Technical field
Embodiments of the present invention relate to the field of disease classification, and more particularly to an automated International Classification of Diseases coding method and system.
Background art
The International Classification of Diseases (ICD) is a system that classifies diseases by rule according to certain of their characteristics and represents them with codes; it has been used in China for more than 20 years. The most widely used ICD version worldwide at present is ICD-10, which the World Health Organization (WHO) published in 1992. Under WHO rules, WHO provides only the 4-character ICD-10 codes, and each country or region may extend ICD-10 as needed to form a localized version (for example, adding extension codes to increase the number of diseases covered).
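As a rough illustration of how a 4-character WHO code relates to a localized extension, the following Python sketch splits a code into its parts. The layout it assumes (a letter and two digits, an optional fourth character after a dot, then optional extension digits) and the names `IcdCode`, `parse_code`, and `who_level` are assumptions made for this example; the text does not specify how any particular localized version forms its extension codes.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout: 3-character category, optional 4th character after a dot,
# optional local extension digits appended after the 4th character.
_CODE = re.compile(r"^([A-Z]\d{2})(?:\.(\d)(\d*))?$")


class IcdCode(NamedTuple):
    category: str               # e.g. "J18"
    subcategory: Optional[str]  # 4th character, e.g. "9"
    extension: str              # hypothetical local extension digits, e.g. "00"


def parse_code(code: str) -> IcdCode:
    """Split a code string into category, 4th character and extension."""
    match = _CODE.match(code.strip().upper())
    if match is None:
        raise ValueError(f"not a recognised code: {code!r}")
    category, subcategory, extension = match.groups()
    return IcdCode(category, subcategory, extension or "")


def who_level(code: str) -> str:
    """Drop any local extension and return the 3- or 4-character code."""
    parsed = parse_code(code)
    if parsed.subcategory is None:
        return parsed.category
    return f"{parsed.category}.{parsed.subcategory}"
```

Under these assumptions, an extended code and its WHO-level parent share the same first four characters, so a localized version can add entries without disturbing the codes WHO defines.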
ICD standardizes and formalizes disease terminology. It is the application foundation of medical informatization and medical information management and an important basis for medical insurance settlement, so effective use of ICD plays a very important role in the development of the health care system.
In ICD applications, there are at present two main approaches: manual coding and computer-aided coding. In China, manual coding is still in use today. The medical records departments of large hospitals have dedicated professional coder positions; after academic education and training, coders apply the coding rules to look up in a dictionary, and select, the code that is the same as or closest to the doctor's diagnosis. With the development of networks and informatization, computer-aided coding has become a focus of the field and has strong development potential. At present, domestic systems mostly build disease classification paths and code libraries, configure them in the information system, automatically guide and recommend codes according to the manually entered diagnosis, and leave the final selection and confirmation to a person.
Content of the invention
Both the current manual coding approach and the computer-aided coding approach require human participation to complete. This human participation is inefficient and costly, and different people may produce different coding results, which hinders work such as medical information management and the auditing of medical insurance settlements.
In addition, because the Chinese disease diagnosis information entered by doctors is natural language, its form is complex and varied and follows no unified standard (for example, mixing several languages, using non-standard grammar, containing entry errors, using abbreviations or common names in place of standard terms, or mixing meaningless symbols into the text), which further increases coding difficulty and raises the error rate.
An improved ICD coding approach is therefore much needed.
Against this background, embodiments of the present invention aim to provide an automated International Classification of Diseases coding method and system.
In a first aspect of embodiments of the present invention, an automated International Classification of Diseases coding method is provided, including:
Step 1: input Chinese disease diagnosis information;
Step 2: perform natural language processing on the Chinese disease diagnosis information to obtain one or more names to be encoded;
Step 3: based on a standard terminology library and an expansion terminology library, look up the standard term or expansion term that matches the name to be encoded, and take the code of the matched standard term or expansion term as the code of the name to be encoded;
wherein the standard terminology library is created as follows:
determine the International Classification of Diseases (ICD) version to be referenced;
take each disease term contained in the ICD version to be referenced as a standard term;
determine the code of each standard term according to the ICD version to be referenced;
store the standard terms and their codes to obtain the standard terminology library;
wherein the expansion terminology library is created as follows:
take the following types of terms that are not contained in the ICD version to be referenced as expansion terms: common names, aliases, and abbreviations of the standard terms; subclass disease terms of the standard terms; and disease terms newly emerged after the ICD version to be referenced was published;
when an expansion term is a common name, alias, or abbreviation of any standard term, assign the code of that standard term to the expansion term;
when an expansion term is a subclass disease term of any standard term, or a newly emerged disease term, assign to the expansion term the code of the standard term whose genus-species relationship to it is closest;
store the expansion terms and their codes to obtain the expansion terminology library.
In a second aspect of embodiments of the present invention, an automated International Classification of Diseases coding system is provided, including:
a standard terminology library creation module, configured to take, according to the International Classification of Diseases (ICD) version to be referenced, each disease term contained in that version as a standard term; determine the code of each standard term according to the ICD version to be referenced; and store the standard terms and their codes to obtain a standard terminology library;
an expansion terminology library creation module, configured to take the following types of terms that are not contained in the ICD version to be referenced as expansion terms: common names, aliases, and abbreviations of the standard terms, subclass disease terms of the standard terms, and disease terms newly emerged after the ICD version to be referenced was published; upon determining that an expansion term is a common name, alias, or abbreviation of any standard term, assign the code of that standard term to the expansion term; upon determining that an expansion term is a subclass disease term of any standard term or a newly emerged disease term, assign to the expansion term the code of the standard term whose genus-species relationship to it is closest; and store the expansion terms and their codes to obtain an expansion terminology library;
an import module, configured to input Chinese disease diagnosis information;
a data processing module, configured to perform natural language processing on the Chinese disease diagnosis information to obtain one or more names to be encoded;
a coding module, configured to look up, based on the standard terminology library and the expansion terminology library, the standard term or expansion term that matches the name to be encoded, and to take the code of the matched standard term or expansion term as the code of the name to be encoded.
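The modules listed above can be pictured as a small set of cooperating objects. The Python sketch below is only one possible arrangement under stated assumptions: the class names `TermLibrary` and `CodingSystem` are hypothetical, the two library creation modules are reduced to pre-filled dictionaries, the `split` callable is a placeholder for the data processing module, matching is assumed to be exact string equality, and the standard terminology library is assumed to be consulted before the expansion terminology library.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class TermLibrary:
    """Stores disease terms together with their ICD codes."""
    codes: Dict[str, str] = field(default_factory=dict)

    def lookup(self, name: str) -> Optional[str]:
        # Assumption: a name matches a term only when the strings are equal.
        return self.codes.get(name)


@dataclass
class CodingSystem:
    standard: TermLibrary               # output of the standard library creation module
    expansion: TermLibrary              # output of the expansion library creation module
    split: Callable[[str], List[str]]   # stand-in for the data processing module

    def encode(self, diagnosis: str) -> Dict[str, Optional[str]]:
        """Coding module: map every name to be encoded to a code, or None."""
        result: Dict[str, Optional[str]] = {}
        for name in self.split(diagnosis):
            code = self.standard.lookup(name)
            if code is None:
                code = self.expansion.lookup(name)
            result[name] = code
        return result
```

A caller would construct `CodingSystem` with the two libraries and a splitter, then pass each diagnosis string to `encode`. Names that match neither library come back as `None` in this sketch; how the patented system treats unmatched names is not stated in this section.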
According to the International Classification of Diseases coding method and system of embodiments of the present invention, full account is taken of the fact that the Chinese disease diagnosis information entered by doctors is natural language, complex and varied in form and without a unified standard. The Chinese surgical procedure information character string is matched against several dictionaries established in advance according to ICD-9-CM-3, so that surgical procedure names are identified and encoded automatically, quickly, and accurately. The whole ICD coding process completes automatically without manual participation, which increases coding speed, reduces coding cost, and ensures coding accuracy.
Description of the drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below are evidently only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 schematically shows an application scenario in which embodiments of the present invention can be implemented;
Fig. 2A schematically shows a flow diagram of the ICD coding method in the exemplary method of the present invention;
Fig. 2B schematically shows a flow diagram of creating the standard terminology library in the exemplary method of the present invention;
Fig. 2C schematically shows the standard terminology library in data table form in the exemplary method of the present invention;
Fig. 2D schematically shows a flow diagram of creating the expansion terminology library in the exemplary method of the present invention;
Fig. 2E schematically shows the expansion terminology library in data table form in the exemplary method of the present invention;
Fig. 3A schematically shows a flow diagram of the ICD coding method in Embodiment 1 of the present invention;
Fig. 3B schematically shows a flow diagram of creating the hypothetical classification terminology library in Embodiment 1 of the present invention;
Fig. 3C schematically shows the hypothetical classification terminology library in data table form in Embodiment 1 of the present invention;
Fig. 4A schematically shows a flow diagram of the ICD coding method in Embodiment 2 of the present invention;
Fig. 4B schematically shows a flow diagram of creating the multi-code terminology library in Embodiment 2 of the present invention;
Fig. 4C schematically shows the multi-code terminology library in data table form in Embodiment 2 of the present invention;
Fig. 5A schematically shows a flow diagram of the ICD coding method in Embodiment 3 of the present invention;
Fig. 5B schematically shows a flow diagram of creating the merged terminology library in Embodiment 3 of the present invention;
Fig. 5C schematically shows the merged terminology library in data table form in Embodiment 3 of the present invention;
Fig. 6A schematically shows a flow diagram of the ICD coding method in Embodiment 4 of the present invention;
Fig. 6B schematically shows the no-code description library in data table form in Embodiment 4 of the present invention;
Fig. 7 schematically shows a structural diagram of an ICD coding system in the exemplary device of the present invention;
Fig. 8 schematically shows a structural diagram of another ICD coding system in the exemplary device of the present invention;
Fig. 9 schematically shows a structural diagram of yet another ICD coding system in the exemplary device of the present invention;
Fig. 10 schematically shows a structural diagram of yet another ICD coding system in the exemplary device of the present invention;
Fig. 11 schematically shows a structural diagram of yet another ICD coding system in the exemplary device of the present invention;
Fig. 12A schematically shows a flow chart of performing natural language processing on Chinese disease diagnosis information in Embodiment 5 of the present invention;
Fig. 12B schematically shows some of the disease degree terms contained in the disease degree vocabulary;
Fig. 12C schematically shows some of the disease concurrency terms contained in the disease concurrency vocabulary;
Fig. 12D schematically shows some of the lesion site terms contained in the lesion site vocabulary;
Fig. 12E schematically shows a flow chart of segmenting first-type substrings and second-type substrings in Embodiment 5 of the present invention;
Fig. 12F schematically shows a segmentation rule;
Fig. 12G schematically shows another segmentation rule;
Fig. 12H schematically shows another segmentation rule;
Fig. 12I schematically shows another segmentation rule;
Fig. 12J schematically shows another segmentation rule;
Fig. 12K schematically shows another segmentation rule;
Fig. 13 schematically shows a flow chart of looking up the standard term or expansion term that matches a name to be encoded in Embodiment 6 of the present invention.
In the drawings, identical or corresponding reference numerals denote identical or corresponding parts.
Detailed description of the embodiments
The principle and spirit of the present invention are described below with reference to several exemplary embodiments. It should be understood that these embodiments are given only so that those skilled in the art can better understand and then implement the present invention, and do not limit the scope of the invention in any way. Rather, these embodiments are provided so that this disclosure is thorough and complete and fully conveys its scope to those skilled in the art.
Those skilled in the art know that embodiments of the present invention may be implemented as a system, apparatus, device, method, or computer program product. Accordingly, the present disclosure may take the form of entirely hardware, entirely software (including firmware, resident software, microcode, and the like), or a combination of hardware and software.
According to embodiments of the present invention, an automated International Classification of Diseases coding method and system are proposed.
It should be understood that "clinical" as used herein refers to the doctor personally attending the patient to diagnose and treat disease, that is, the practical work of medical institutions.
In addition, any number of elements in the drawings is for illustration and not limitation, and any naming is used only for distinction and carries no limiting meaning.
The principle and spirit of the present invention are explained in detail below with reference to several representative embodiments.
Summary of the invention
The inventors found that in the medical field, when different regions, institutions, or practitioners use disease terms, the terminology standards they adopt commonly differ (for example, the same disease term has several expressions) or fail to cover all terms (for example, newly emerged terms are not covered). As a result, the Chinese disease diagnosis information that is produced (for example, the information recorded on a basic medical insurance settlement statement) contains a large number of irregular disease terms, which greatly obstructs ICD coding based on that information. In this situation, people have had to identify these irregular disease terms manually, which is the manual coding or computer-aided coding in current use. ICD coding that involves human participation, however, is inefficient and costly, and different people may produce different coding results.
To this end, the present invention provides an automated ICD coding mechanism. The ICD coding process may be: input Chinese disease diagnosis information; perform natural language processing on it to obtain one or more names to be encoded; and, based on the standard terminology library and the expansion terminology library, look up the standard term or expansion term that matches each name to be encoded and take the code of the matched term as the code of that name.
The standard terminology library is created according to the ICD version to be referenced and contains standard terms and their codes. A standard term is a disease term contained in the ICD version to be referenced, and its code is identical to its code in that version. The expansion terminology library contains expansion terms and their codes. An expansion term is one of the following types of term not contained in the ICD version to be referenced: a common name, alias, or abbreviation of a standard term; a subclass disease term of a standard term; or a disease term newly emerged after the ICD version to be referenced was published. The code of an expansion term is the code of the standard term synonymous with it, or the code of the standard term whose genus-species relationship to it is closest.
In the present invention, the standard terminology library covers all disease terms recorded in the ICD version to be referenced together with their codes, while the expansion terminology library covers disease terms not contained in that version. These include the common names, aliases, or abbreviations of diseases habitually used in certain regions or institutions, subclass terms of the disease terms recorded in the ICD version, and disease terms newly emerged as medical technology develops. Together, the two libraries cover the disease terms likely to appear in the great majority of Chinese disease diagnosis information and essentially meet the requirement of automatically identifying disease terms in such information, which makes automated ICD coding achievable. The whole ICD coding process requires no manual participation and has the advantages of high coding speed, low cost, and high accuracy.
Having described the basic principle of the present invention, various non-limiting embodiments of the present invention are introduced in detail below.
Application scenarios overview
Referring first to Fig. 1, it shows an application scenario in which embodiments of the present invention can be implemented.
The scenario shown in Fig. 1 includes a medical information processing terminal 100 and a medical information processing server 200. The medical information processing terminal 100 may be a device used by a doctor, such as a desktop computer, laptop, tablet computer, or personal digital assistant. The medical information processing server 200 may be, for example, a server running a hospital information management system. The terminal 100 and the server 200 may be communicatively connected, for example, through the hospital local area network.
When ICD coding is to be performed on Chinese disease diagnosis information, the information can be entered at the medical information processing terminal 100, more specifically, for example, in the interface of software installed on the terminal, or large batches of it can be imported into the terminal from data storage devices such as USB flash drives or portable hard disks. The medical information processing server 200 receives the Chinese disease diagnosis information and performs natural language processing on it to obtain names to be encoded. The server then queries the standard terminology library and the expansion terminology library for the standard term or expansion term that matches each name to be encoded, and finally takes the code of the matching standard term or expansion term as the code of the name to be encoded.
Illustrative methods
With reference to the application scenario of Fig. 1, the ICD coding method according to an exemplary embodiment of the present invention is described with reference to Figs. 2A to 2E.
It should be noted that the above application scenario is shown only to facilitate understanding of the spirit and principle of the present invention, and embodiments of the present invention are not limited in this respect. Rather, embodiments of the present invention can be applied to any applicable scenario.
For example, Fig. 2A shows a flow chart of an ICD coding method according to an embodiment of the present invention, together with the standard terminology library and the expansion terminology library.
As shown in Fig. 2A, the method may include:
Step S101: input Chinese disease diagnosis information.
Optionally, the Chinese disease diagnosis information may be medical record information entered by medical staff, or the information recorded on a basic medical insurance settlement statement.
Step S102: perform natural language processing on the Chinese disease diagnosis information to obtain one or more names to be encoded.
Specifically, this step may, based on the characteristics of Chinese disease diagnosis information, apply processing such as mechanical word segmentation to the information and thereby parse disease terms out of it. The disease terms parsed out of the Chinese disease diagnosis information are the names to be encoded.
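The text does not say which mechanical word segmentation algorithm is used at this step. A common dictionary-based variant is forward maximum matching, sketched below in Python as an assumption rather than as the patented procedure. The function names `forward_max_match` and `names_to_encode` and any vocabulary passed to them are hypothetical.

```python
from typing import Iterable, List


def forward_max_match(text: str, vocabulary: Iterable[str]) -> List[str]:
    """Greedy dictionary segmentation.

    At each position, take the longest vocabulary entry that matches;
    if none matches, emit a single character and move on.
    """
    vocab = set(vocabulary)
    longest = max([len(word) for word in vocab] + [1])
    tokens: List[str] = []
    i = 0
    while i < len(text):
        for size in range(min(longest, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


def names_to_encode(text: str, vocabulary: Iterable[str]) -> List[str]:
    """Keep only the tokens that are known disease terms."""
    vocab = set(vocabulary)
    return [token for token in forward_max_match(text, vocab) if token in vocab]
```

Because the longest entry wins, a vocabulary that holds both a short term and a longer term beginning with it yields the longer term whenever the text contains it. Connective characters that are not in the vocabulary are emitted as single-character tokens and then filtered out by `names_to_encode`.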
Embodiment 5 below describes one specific way in which this exemplary method performs natural language processing on Chinese disease diagnosis information.
Step S103: based on the standard terminology library and the expansion terminology library, look up the standard term or expansion term that matches the name to be encoded, and take the code of the matched standard term or expansion term as the code of the name to be encoded.
In this embodiment, the standard terminology library is created according to the steps shown in Fig. 2B:
Step A1: determine the International Classification of Diseases (ICD) version to be referenced.
Optionally, the ICD version to be referenced may be an ICD version published by WHO (for example, the ICD-10 that WHO published in 1992) or any localized ICD version that extends a WHO version (for example, the Chinese edition of ICD-10 recommended by the Ministry of Health of China). In a specific implementation, a suitable ICD version can be selected as the reference according to actual needs, and the present invention places no limitation on this.
Step A2: take each disease term contained in the ICD version to be referenced as a standard term.
Step A3: determine the code of each standard term according to the ICD version to be referenced.
Specifically, because the ICD version to be referenced explicitly records the code of each disease term, the code of each standard term can be taken directly from it.
Step A4: store the standard terms and their codes to obtain the standard terminology library.
Optionally, the standard terminology library may store the standard terms and their codes in the form of a data table or a tree structure.
ICD records disease terms according to relationships such as category and genus-species, and these relationships help speed up the lookup of a specific disease term. Accordingly, when the standard terminology library is created, the data table or tree structure can be built according to the category and genus-species relationships of the disease terms in the ICD version to be referenced, so that the standard terms stored in the library are clearly structured and easy to look up, which helps speed up the matching of names to be encoded.
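One way to picture such a tree is a two-level mapping that groups terms under their 3-character category. The Python sketch below is illustrative only: `build_tree` and `SAMPLE` are hypothetical names, the assumption that the first three characters of a code identify its category suits ICD-10 style codes such as `J18.9` but may not suit every localized version, and the sample rows are shortened stand-ins rather than terms quoted from any ICD edition.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def build_tree(entries: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, str]]:
    """Group (code, term) pairs as category -> {term: code}."""
    tree: Dict[str, Dict[str, str]] = defaultdict(dict)
    for code, term in entries:
        category = code[:3]          # assumed: category = first three characters
        tree[category][term] = code
    return dict(tree)


# Illustrative rows only; a real library would be loaded from the chosen ICD version.
SAMPLE = [
    ("J18.0", "支气管肺炎"),
    ("J18.9", "肺炎"),
    ("I10", "特发性(原发性)高血压"),
]
```

With this layout, a lookup that already knows the likely category touches one small branch rather than every stored term. A deeper tree (chapter, block, category, subcategory) follows the same pattern with more levels.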
Optionally, the standard terminology library can be modified in real time. For example, when the referenced ICD version has a newer release, standard terms are added, modified, or deleted according to that release so that the library better meets the needs of ICD coding.
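A minimal Python sketch of such an update, assuming the library is held as a flat term-to-code mapping and that the changes in a newer release have already been worked out by hand; `apply_update` and its parameters are hypothetical names.

```python
from typing import Dict, Iterable


def apply_update(library: Dict[str, str],
                 added_or_changed: Dict[str, str],
                 removed: Iterable[str] = ()) -> Dict[str, str]:
    """Return a new term -> code mapping that reflects a newer ICD release."""
    updated = dict(library)           # leave the original library untouched
    updated.update(added_or_changed)  # add new terms, overwrite changed codes
    for term in removed:
        updated.pop(term, None)       # ignore terms that were never stored
    return updated
```

Returning a copy rather than editing in place lets a running coding service switch to the new library in one step; whether the patented system does so is not stated.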
Fig. 2C shows a standard terminology library of this embodiment in tree structure form.
In the present embodiment, expand terminology bank and created according to the step of such as Fig. 2 D:
The following various types being not included in the ICD versions to be referred to are determined as expanding term by step B1: The standard terminology be commonly called as nickname abbreviation, the subclass disease term of the standard terminology and to be referred to described ICD versions announce after newly generated disease term.
In medical domain, different geographical, non-commensurate or different practitioners may not be when using disease term It is the disease term (i.e. standard terminology) described in ICD versions, but being commonly called as of standard terminology, nickname or abbreviation, either Title (i.e. subclass disease name) that standard terminology more refines etc.;In addition, with the development of medical technology, can constantly there be new disease Sick term occurs, and the ICD versions issued in the past just will appear the phenomenon that not covering newly generated disease term.In view of these Situation can implement the specific region of this method or concrete unit, and the standard terminology used in statistics real work is commonly called as, not Title or abbreviation, and newly generated disease term is counted, expand in terminology bank using these as term deposit is expanded, to meet The needs of ICD codings.
Step B2, when expanding term as being commonly called as of any one standard terminology, nickname or abbreviation, by the standard terminology Coding assign the expansion term;When subclass disease term of the expansion term for any one standard terminology or the new production During raw disease term, the expansion term will be assigned with the coding of the immediate standard terminology of the relation of genus and species of the expansion term.
When to expand term be being commonly called as of standard terminology, nickname or abbreviation, it is synonymy to expand term with standard terminology, It therefore, can be directly using the coding of standard terminology as the coding for expanding term.
When expansion term is the subclass disease term of any one standard terminology, in order to encode needs, it can be passed through according to clinic It tests, the immediate standard terminology of relation of genus and species with subclass disease term is determined, using the coding of the standard terminology as its subclass The coding of disease term.
It, can basis in order to encode needs since the ICD versions issued now cannot cover newly generated disease term in the past Clinical experience searches the immediate standard terminology of relation of genus and species with these newly generated disease terms, the standard that will be found Coding of the coding of term as these newly generated disease terms.
Step B3: store the expansion terms and their codes to obtain the expansion term library.
Optionally, the expansion term library may store the expansion terms and their codes in the form of a data table or a tree structure.
Optionally, the expansion term library can be modified in real time, for example by adding common names, aliases, or abbreviations of standard terms, or newly emerged disease terms, so that the library covers more expansion terms and better meets the needs of ICD coding.
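The code assignment of steps B2 and B3 can be sketched as follows. This is a minimal illustration, not the patented implementation: the dictionary layout, the function name `add_expansion_term`, and the aliases "AF" and "paroxysmal atrial fibrillation" are assumptions made for the example, and the codes are simply those quoted elsewhere in this description.

```python
# Standard term library: standard term -> ICD code (codes as quoted in this description).
standard_terms = {
    "atrial fibrillation": "I487.x01",
    "atrial thrombus": "I51.302",
}

ALLOWED_RELATIONS = {"synonym", "subclass", "new"}


def add_expansion_term(expansion, standard, term, reference_standard_term, relation):
    """Step B2: the expansion term inherits the code of its reference standard term.

    For a synonym (common name, alias, abbreviation) the reference is the
    standard term itself; for a subclass or newly emerged term it is the
    standard term judged closest in genus-species relationship.
    """
    if relation not in ALLOWED_RELATIONS:
        raise ValueError(f"unknown relation: {relation}")
    expansion[term] = {
        "code": standard[reference_standard_term],
        "standard_term": reference_standard_term,
        "relation": relation,
    }


# Step B3: the resulting dictionary plays the role of the expansion term library.
expansion_terms = {}
add_expansion_term(expansion_terms, standard_terms, "AF", "atrial fibrillation", "synonym")
add_expansion_term(
    expansion_terms, standard_terms,
    "paroxysmal atrial fibrillation", "atrial fibrillation", "subclass",
)
```

Keeping the reference standard term next to the code is a design choice of this sketch; it makes it straightforward to convert an expansion term back to its standard term when that is needed later.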
Fig. 2E shows an expansion term library in data-table form according to the present embodiment. The shaded part in Fig. 2E is explanatory content and need not be present in an actual expansion term library.
Optionally, when step S103 is implemented, the standard term library and the expansion term library may be traversed to find the standard term or expansion term that matches the name to be encoded. Because traversing the term libraries may be time-consuming, optionally, the likely genus-species relationship of the name to be encoded can first be judged from its semantics, and the matching standard term or expansion term can then be searched for in the specific data table or tree structure.
Embodiment six below describes a specific implementation of how this exemplary method finds the standard term or expansion term that matches the name to be encoded.
In the present embodiment, the standard term library and the expansion term library cover most of the disease terms that may appear in Chinese disease diagnosis information, which largely satisfies the requirement of automatically recognizing disease terms in such information and thereby makes automated ICD coding achievable. The ICD coding method provided by this embodiment requires no manual participation and has the advantages of fast coding, low cost, and high accuracy.
Embodiment one
Figs. 3A to 3C show the ICD coding method of one embodiment of the present invention.
As shown in Fig. 3A, the method may include:
Step S201: input Chinese disease diagnosis information.
Step S202: perform natural language processing on the Chinese disease diagnosis information to obtain one or more names to be encoded.
Step S203: based on the standard term library, the expansion term library, and the hypothetical classification term library, search for the standard term, expansion term, or hypothetical classification term that matches the name to be encoded, and determine the code of the successfully matched standard term, expansion term, or hypothetical classification term as the code of the name to be encoded.
The present embodiment creates the standard term library and the expansion term library in the same way as the exemplary method; the details are not repeated here.
In the present embodiment, the hypothetical classification term library is created according to the steps shown in Fig. 3B:
Step C1: determine as a hypothetical classification term any disease term that is not included in the referenced ICD version, is related to a standard term, is clinically treated by default as equivalent to that standard term, and is not a common name, alias, or abbreviation of that standard term.
Step C2: assign to the hypothetical classification term the code of the standard term related to it.
The following situation is common in the medical field: a disease has several types, one of which is clinically common while the others are clinically rare. In this case, when filling in or reading medical records, medical staff usually treat the general name of the disease as equivalent by default to the name of the clinically common type, and write out the name of a clinically rare type only when that type is diagnosed. For example, mitral stenosis is divided into rheumatic mitral stenosis and non-rheumatic mitral stenosis. Rheumatic mitral stenosis is clinically common, whereas non-rheumatic mitral stenosis is very rare. When filling in or reading medical records, medical staff usually treat "mitral stenosis" as equivalent by default to "rheumatic mitral stenosis", and write "non-rheumatic mitral stenosis" only when non-rheumatic mitral stenosis is diagnosed, in order to distinguish the two.
However, the ICD may not record the general name of such a disease and may record only its specific types. For example, the ICD does not record the disease term "mitral stenosis" but does record "rheumatic mitral stenosis" and "non-rheumatic mitral stenosis". In this case, when ICD coding is performed on a general disease name appearing in Chinese disease diagnosis information, it is unclear which specific type the name should be classified under.
In the present embodiment, the general name of a disease in the above situation is determined as a hypothetical classification term.
During ICD coding, when such a hypothetical classification term is encountered, it can be assumed to refer to the clinically common type of the disease, and the code of that clinically common type is assigned to the hypothetical classification term.
For example, the hypothetical classification term "mitral stenosis" has the same code as "rheumatic mitral stenosis".
Step C3: store the hypothetical classification terms and their codes to obtain the hypothetical classification term library.
Optionally, the hypothetical classification term library may store the hypothetical classification terms and their codes in the form of a data table or a tree structure.
Optionally, the hypothetical classification term library can also be revised in real time, for example by adding new hypothetical classification terms or deleting existing ones, so that the library better meets the needs of ICD coding.
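Steps C1 to C3, together with the lookup of step S203, can be sketched as below. The sketch assumes each library is a plain dictionary from term to code and that libraries are consulted in a fixed order; the function names and the placeholder codes `CODE-RMS` and `CODE-NRMS` are hypothetical and are not ICD values.

```python
# Placeholder codes; real values would come from the referenced ICD version.
standard_terms = {
    "rheumatic mitral stenosis": "CODE-RMS",
    "non-rheumatic mitral stenosis": "CODE-NRMS",
}

# General disease name -> the clinically common type it is assumed to mean.
clinical_defaults = {"mitral stenosis": "rheumatic mitral stenosis"}


def build_hypothetical_library(standard, defaults):
    """Steps C1-C3: each general name inherits the code of its assumed type."""
    library = {}
    for general_name, assumed_term in defaults.items():
        if general_name in standard:
            continue  # Step C1: only names absent from the ICD version qualify
        library[general_name] = standard[assumed_term]  # Step C2
    return library  # Step C3


def lookup(name, *libraries):
    """Step S203: return the code from the first library containing the name."""
    for library in libraries:
        if name in library:
            return library[name]
    return None


hypothetical_terms = build_hypothetical_library(standard_terms, clinical_defaults)
code = lookup("mitral stenosis", standard_terms, hypothetical_terms)
```

Here `code` equals the code of "rheumatic mitral stenosis", while an explicitly written "non-rheumatic mitral stenosis" still resolves to its own standard code.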
Fig. 3C shows a hypothetical classification term library in data-table form according to the present embodiment. The shaded part in Fig. 3C is explanatory content and need not be present in an actual hypothetical classification term library.
Optionally, when step S203 is implemented, the standard term library, the expansion term library, and the hypothetical classification term library may be traversed to find the standard term, expansion term, or hypothetical classification term that matches the name to be encoded.
Considering the time cost of traversing the term libraries, optionally, the likely genus-species relationship of the name to be encoded can first be judged from its semantics, and the matching standard term, expansion term, or hypothetical classification term can then be searched for in the specific data table or tree structure.
On the basis of the standard term library and the expansion term library, the present embodiment adds the hypothetical classification term library, so that hypothetical classification terms appearing in Chinese disease diagnosis information are taken into account. This covers more of the disease terms that may appear in Chinese disease diagnosis information, provides a more complete basis for automatically recognizing them, and facilitates automated ICD coding. The ICD coding method provided by this embodiment requires no manual participation and has the advantages of fast coding, low cost, and high accuracy.
Embodiment two
Figs. 4A to 4B show the ICD coding method of one embodiment of the present invention.
As shown in Fig. 4A, the method may include:
Step S301: input Chinese disease diagnosis information.
Step S302: perform natural language processing on the Chinese disease diagnosis information to obtain one or more names to be encoded.
Step S303: based on the standard term library, the expansion term library, and the multi-code term library, search for the standard term, expansion term, or multi-code term that matches the name to be encoded, and determine the code of the successfully matched standard term, expansion term, or multi-code term as the code of the name to be encoded.
The present embodiment creates the standard term library and the expansion term library in the same way as the exemplary method; the details are not repeated here.
Optionally, this step may also search, based on the hypothetical classification term library, for the hypothetical classification term that matches the name to be encoded, and determine the code of the successfully matched hypothetical classification term as the code of the name to be encoded. The present embodiment can create the hypothetical classification term library in the same way as embodiment one; the details are not repeated here.
In the present embodiment, the multi-code term library is created according to the steps shown in Fig. 4B:
Step D1: determine as a multi-code term any disease term that is not included in the referenced ICD version and is composed of at least two different standard terms.
Step D2: combine the codes of all the standard terms that make up the multi-code term, and use the combination as the code of the multi-code term.
In the medical field, several diseases often occur concurrently, and the corresponding disease term may be a combination of several standard terms. In view of this, the present embodiment stores such disease terms in the multi-code term library as multi-code terms, and combines the codes of the constituent standard terms, in the order in which those standard terms appear in the multi-code term, to form the code of the multi-code term.
For example, for the multi-code term "mitral stenosis combined with atrial fibrillation with left atrial thrombus", the constituent standard terms are "mitral stenosis", "atrial fibrillation", and "atrial thrombus". The ICD code of "mitral stenosis" is I05.000, the ICD code of "atrial fibrillation" is I487.x01, and the ICD code of "atrial thrombus" is I51.302, so the ICD code of "mitral stenosis combined with atrial fibrillation with left atrial thrombus" is I05.0I487.x01I51.302.
Step D3: store the multi-code terms and their codes to obtain the multi-code term library.
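A minimal sketch of steps D1 to D3 follows. It assumes the constituent standard terms of each multi-code term have already been identified and listed in order; the function name `build_multi_code` and the placeholder codes `A01`, `B02`, and `C03` are hypothetical and chosen only to make the concatenation easy to follow.

```python
# Placeholder codes, not ICD values.
standard_terms = {
    "mitral stenosis": "A01",
    "atrial fibrillation": "B02",
    "atrial thrombus": "C03",
}


def build_multi_code(components, standard):
    """Step D2: join the component codes in the order the components appear."""
    if len(set(components)) < 2:
        # Step D1 requires at least two different standard terms.
        raise ValueError("a multi-code term needs at least two different standard terms")
    return "".join(standard[component] for component in components)


# Step D3: store each multi-code term with its combined code.
multi_code_terms = {
    "mitral stenosis combined with atrial fibrillation with left atrial thrombus":
        build_multi_code(
            ["mitral stenosis", "atrial fibrillation", "atrial thrombus"],
            standard_terms,
        ),
}
```

Because the component order is preserved, the combined code can be read left to right in the same order as the diseases named in the term.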
Optionally, the multi-code term library may store the multi-code terms and their codes in the form of a data table or a tree structure.
Optionally, the multi-code term library can also be revised in real time, for example by adding new multi-code terms or deleting existing ones, so that the library better meets the needs of ICD coding.
Fig. 4C shows a multi-code term library in data-table form according to the present embodiment. The shaded part in Fig. 4C is explanatory content and need not be present in an actual multi-code term library.
Optionally, when step S303 is implemented, the standard term library, the expansion term library, and the multi-code term library may be traversed to find the standard term, expansion term, or multi-code term that matches the name to be encoded. Considering the time cost of traversing the term libraries, optionally, the likely genus-species relationship of the name to be encoded can first be judged from its semantics, and the matching standard term, expansion term, or multi-code term can then be searched for in the specific data table or tree structure.
On the basis of the standard term library and the expansion term library, the present embodiment adds the multi-code term library, so that multi-code terms appearing in Chinese disease diagnosis information are taken into account. This covers more of the disease terms that may appear in Chinese disease diagnosis information, provides a more complete basis for automatically recognizing them, and facilitates automated ICD coding. The ICD coding method provided by this embodiment requires no manual participation and has the advantages of fast coding, low cost, and high accuracy.
Embodiment three
Figs. 5A to 5B show the ICD coding method of one embodiment of the present invention.
As shown in Fig. 5A, the method may include:
Step S401: input Chinese disease diagnosis information.
Step S402: perform natural language processing on the Chinese disease diagnosis information to obtain one or more names to be encoded.
Step S403: based on the merge term library, preprocess the one or more names to be encoded obtained in step S402: judge whether the names to be encoded include all the merge objects of any one or more merge terms and, if so, replace all the merge objects of each such merge term with the corresponding merge term.
In the present embodiment, the merge term library is created according to the steps shown in Fig. 5B:
Step E1: determine as a merge term a single standard term that can replace at least two different standard terms occurring simultaneously, and determine each of those at least two different standard terms as a merge object of the merge term.
Step E2: determine the code of each merge term according to the referenced ICD version.
Step E3: store each merge term, its code, and all of its merge objects to obtain the merge term library.
In the ICD, when several disease terms occur simultaneously, they can in some cases be replaced by another single disease term, and the ICD specifies that only the code of that single disease term is output during coding. In the present embodiment, a single disease term that can replace several simultaneously occurring disease terms in this way is determined as a merge term, and each disease term that it replaces is determined as a merge object.
For example, in disease classification, if "gastric ulcer" and "upper gastrointestinal bleeding" occur simultaneously, they can be replaced by "gastric ulcer with bleeding", and only the code of "gastric ulcer with bleeding" needs to be output during ICD coding.
In view of the above, after performing natural language processing on the Chinese disease diagnosis information to obtain one or more names to be encoded, the present embodiment adds a step of preprocessing these names: it checks whether they contain merge objects that can be replaced and, if they contain all the merge objects of some merge term, replaces those merge objects with that merge term.
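The preprocessing of step S403 can be sketched as follows. The sketch assumes the names to be encoded are held in a list and the merge term library maps each merge term to the set of its merge objects; placing the merge term at the position of the first replaced object is a choice made for this example, and `apply_merge_terms` is a hypothetical name.

```python
# Merge term -> all of its merge objects (example taken from this description).
merge_library = {
    "gastric ulcer with bleeding": frozenset(
        {"gastric ulcer", "upper gastrointestinal bleeding"}
    ),
}


def apply_merge_terms(names, library):
    """Replace the merge objects of a merge term only when ALL of them are present."""
    result = list(names)
    for merge_term, merge_objects in library.items():
        if not merge_objects.issubset(result):
            continue  # some merge object is missing, so nothing is replaced
        replaced, inserted = [], False
        for name in result:
            if name in merge_objects:
                if not inserted:
                    replaced.append(merge_term)  # takes the first object's place
                    inserted = True
            else:
                replaced.append(name)
        result = replaced
    return result


names = ["gastric ulcer", "hypertension", "upper gastrointestinal bleeding"]
merged = apply_merge_terms(names, merge_library)
```

With the input above, `merged` contains "gastric ulcer with bleeding" followed by "hypertension"; a list containing only "gastric ulcer" is returned unchanged.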
Optionally, the merge term library may store the merge terms and their codes in the form of a data table or a tree structure.
Optionally, the merge term library can be modified in real time. For example, when a new updated version of the referenced ICD version is released, merge terms can be added, changed, or deleted according to the updated version, so that the merge term library better meets the needs of ICD coding.
Fig. 5C shows a merge term library in data-table form according to the present embodiment. The shaded part in Fig. 5C is explanatory content and need not be present in an actual merge term library.
Step S404: based on the created standard term library, expansion term library, hypothetical classification term library, and multi-code term library, search for the standard term, expansion term, hypothetical classification term, or multi-code term that matches the name to be encoded as preprocessed in step S403, and determine the code of the successfully matched standard term, expansion term, hypothetical classification term, or multi-code term as the code of the name to be encoded.
The present embodiment creates the standard term library and the expansion term library in the same way as the exemplary method, creates the hypothetical classification term library in the same way as embodiment one, and creates the multi-code term library in the same way as embodiment two; the details are not repeated here.
Optionally, when step S404 is implemented, the standard term library, the expansion term library, the hypothetical classification term library, and the multi-code term library may be traversed to find the standard term, expansion term, hypothetical classification term, or multi-code term that matches the name to be encoded. Considering the time cost of traversing the term libraries, optionally, the likely genus-species relationship of the name to be encoded can first be judged from its semantics, and the matching standard term, expansion term, hypothetical classification term, or multi-code term can then be searched for in the specific data table or tree structure.
On the basis of the standard term library and the expansion term library, the present embodiment adds the merge term library, so that merge terms appearing in Chinese disease diagnosis information are taken into account. This covers more of the disease terms that may appear in Chinese disease diagnosis information, provides a more complete basis for automatically recognizing them, and facilitates automated ICD coding. The ICD coding method provided by this embodiment requires no manual participation and has the advantages of fast coding, low cost, and high accuracy.
Embodiment four
Fig. 6A shows the ICD coding method of one embodiment of the present invention.
As shown in Fig. 6A, the method may include:
Step S501: input Chinese disease diagnosis information.
Step S502: perform natural language processing on the Chinese disease diagnosis information to obtain one or more names to be encoded.
Step S503: based on the merge term library, preprocess the one or more names to be encoded obtained in step S502: judge whether the names to be encoded include all the merge objects of any one or more merge terms and, if so, replace all the merge objects of each such merge term with the corresponding merge term.
Step S504: based on the standard term library, the expansion term library, the hypothetical classification term library, and the multi-code term library, search for the standard term, expansion term, hypothetical classification term, or multi-code term that matches the name to be encoded, and determine the code of the successfully matched term as the code of the name to be encoded; determine any name to be encoded for which no matching standard term, expansion term, hypothetical classification term, or multi-code term is found as a name to be encoded whose code has not been determined.
The present embodiment creates the standard term library and the expansion term library in the same way as the exemplary method, creates the hypothetical classification term library in the same way as embodiment one, and creates the multi-code term library in the same way as embodiment two; the details are not repeated here.
Step S505: match each name to be encoded whose code has not been determined against the no-code terms in the no-code term library. If the match succeeds, perform a preset processing step to indicate that the name cannot be encoded (for example, output an empty result, or display a message such as "no code available"). If the match fails, send the name to a manual processing platform for manual processing.
In the present embodiment, the no-code term library includes a number of no-code terms. These no-code terms include: preset traditional Chinese medicine terms; preset surgical operation terms; preset drug name terms; preset medical consumable terms; and preset examination and laboratory test terms.
Fig. 6B shows a no-code term library in data-table form according to the present embodiment. The shaded part in Fig. 6B is explanatory content and need not be present in an actual no-code term library.
Actual Chinese disease diagnosis information often involves several kinds of medical concepts: not only disease terms, but possibly also surgical operation terms, drug name terms, medical consumable terms, examination and laboratory test terms, and so on. The present invention, however, concerns the classification and coding of diseases, and the ICD versions do not classify or code surgical operation terms, drug name terms, medical consumable terms, or examination and laboratory test terms. Therefore, if such terms appear in Chinese disease diagnosis information, they are not encoded (no code is available). Likewise, the ICD versions do not classify or code traditional Chinese medicine terms, so if such terms appear in Chinese disease diagnosis information, they are also not encoded (no code is available).
For such terms, a preset result (for example, "no code available") can be output to indicate that the term has been recognized as a surgical operation term, drug name term, medical consumable term, examination and laboratory test term, or traditional Chinese medicine term, but that no ICD code can be assigned to it.
In the present embodiment, for a name to be encoded for which no matching standard term, expansion term, hypothetical classification term, or multi-code term is found: if a matching no-code term is found, the name belongs to one of the categories of surgical operation terms, drug name terms, medical consumable terms, examination and laboratory test terms, or traditional Chinese medicine terms, and is not encoded; if no matching no-code term is found, the name does not belong to any of these categories, and the present embodiment sends it to a manual processing platform for further manual handling. The present invention does not limit the specific manual processing procedure.
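The fallback order of steps S504 and S505 can be sketched as below. The sketch assumes each coded library is a dictionary from term to code, the no-code term library is a set, and the manual processing platform is stood in for by a plain list; the names `encode_name`, `NO_CODE`, and `manual_queue`, the sample terms, and the placeholder code are hypothetical.

```python
NO_CODE = "no code available"  # preset result for recognised but uncodable terms


def encode_name(name, coded_libraries, no_code_terms, manual_queue):
    """Return an ICD code, the NO_CODE marker, or None after queueing for manual work."""
    # Step S504: try every coded library in turn.
    for library in coded_libraries:
        if name in library:
            return library[name]
    # Step S505: a recognised non-disease term gets the preset result.
    if name in no_code_terms:
        return NO_CODE
    # Otherwise the name goes to manual processing.
    manual_queue.append(name)
    return None


standard_terms = {"coronary heart disease": "CODE-CHD"}  # placeholder code
no_code_terms = {"appendectomy", "aspirin"}               # illustrative entries
manual_queue = []

results = [
    encode_name(n, [standard_terms], no_code_terms, manual_queue)
    for n in ["coronary heart disease", "aspirin", "unrecognised text"]
]
```

After this run, `results` holds the placeholder code, the `NO_CODE` marker, and `None`, and `manual_queue` contains only "unrecognised text".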
Embodiment five
As shown in Fig. 12, a specific implementation, applicable to the exemplary method of the present invention, of performing natural language processing on Chinese disease diagnosis information to obtain names to be encoded includes:
Step S61: preprocess the Chinese disease diagnosis information character string to obtain a preprocessed Chinese disease diagnosis information character string.
The purpose of this step is to convert the characters in the Chinese disease diagnosis information character string into a unified format to facilitate subsequent processing.
Optionally, this step can be implemented as follows: normalize the format of the non-Chinese characters in the Chinese disease diagnosis information character string (for example, convert all symbols to half-width form or all to full-width form, and convert all English letters to upper case or all to lower case); and delete the non-medical terms in the character string. The non-medical terms are provided by a pre-established non-medical term dictionary and are words or descriptive phrases that serve as remarks (such as "to be checked", "cause", "friendly reminder", "suggestion", and "seek medical attention at any time if symptoms worsen").
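One possible reading of the preprocessing in step S61 is sketched below. It assumes half-width symbols and upper-case letters are chosen as the unified format and that non-medical terms are removed by plain substring deletion; the function names and the sample input are hypothetical.

```python
def to_half_width(text):
    """Map full-width ASCII variants (U+FF01..U+FF5E) and U+3000 to half-width."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp == 0x3000:                 # ideographic space
            out.append(" ")
        elif 0xFF01 <= cp <= 0xFF5E:     # full-width punctuation, digits, letters
            out.append(chr(cp - 0xFEE0))
        else:
            out.append(ch)
    return "".join(out)


def normalize(text):
    """Unified format chosen for this sketch: half-width symbols, upper-case letters."""
    return to_half_width(text).upper()


def preprocess(text, non_medical_terms):
    """Step S61: normalise the string, then delete every non-medical term."""
    text = normalize(text)
    # Delete longer terms first so a short term cannot break up a longer one.
    for term in sorted((normalize(t) for t in non_medical_terms), key=len, reverse=True):
        if term:
            text = text.replace(term, "")
    return text


# The input contains a full-width semicolon (U+FF1B).
cleaned = preprocess("type a thymoma；to be checked", {"to be checked"})
```

The dictionary entries are normalised with the same function as the input so that case or width differences do not prevent a non-medical term from being found.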
Step S62: based on a pre-established ontology dictionary, disease degree term dictionary, disease concurrence term dictionary, and disease site term dictionary, segment the preprocessed Chinese disease diagnosis information character string into first-type substrings and/or second-type substrings.
The first-type substrings and second-type substrings each have independent semantics, that is, the medical information they represent is not affected by the characters before or after them. A first-type substring can be matched directly with an ontology in the ontology dictionary, whereas a second-type substring cannot.
The ontology dictionary includes the aforementioned standard term library and expansion term library, that is, the standard terms and expansion terms and their corresponding codes; the standard terms and expansion terms are regarded as the ontologies in the ontology dictionary.
It should be noted that when the automated International Classification of Diseases coding method provided by the present invention uses the aforementioned hypothetical classification term library and/or multi-code term library, the ontology dictionary should also include the hypothetical classification term library and/or multi-code term library (in which case hypothetical classification terms and/or multi-code terms are also regarded as ontologies in the ontology dictionary), so that a segmented first-type or second-type substring, when used as a name to be encoded, can be matched with a hypothetical classification term or multi-code term.
The disease degree term dictionary includes a number of disease degree terms, which are words that describe the acuteness or chronicity of a disease, its severity, its pathological type, its clinical stage, and the like. Fig. 12B shows some of the disease degree terms included in the disease degree term dictionary.
The disease concurrence term dictionary includes a number of disease concurrence terms, which are words that describe the concurrent occurrence of at least two diseases. Fig. 12C shows some of the disease concurrence terms included in the disease concurrence term dictionary.
The disease site term dictionary includes a number of disease site terms, which are words that describe the site at which a disease occurs. Fig. 12D shows some of the disease site terms included in the disease site term dictionary.
The purpose of this step is to segment the Chinese disease diagnosis information into substrings with independent semantics (first-type or second-type substrings), which effectively avoids the recognition errors that arise when several associated characters are recognized separately.
Step S63: determine the segmented first-type substrings and second-type substrings as names to be encoded.
After the segmented first-type and second-type substrings are determined as names to be encoded, and when the names to be encoded are subsequently preprocessed using the merge term library of embodiment three, the ontologies corresponding to the first-type and second-type substrings may be expansion terms, whereas the merge objects in the merge term library are standard terms. Therefore, the expansion terms corresponding to the first-type and second-type substrings must first be converted into the corresponding standard terms before the merge term library is used for preprocessing.
As shown in Fig. 12E, step S62 specifically includes:
Step S70: judge whether the preprocessed Chinese disease diagnosis information character string contains symbols; if it contains symbols, perform step S71; if it does not, perform step S72.
Step S71: match the characters between every two adjacent symbols in the preprocessed Chinese disease diagnosis information character string, as a whole, with the ontologies in the ontology dictionary; if the match succeeds, perform step S711; if the match fails, perform step S712.
Step S711: cut out the characters between the two adjacent symbols as a first-type substring.
Step S712: determine the two adjacent symbols and the characters between them as a temporarily unsegmented character string, and then perform step S73.
The processing rule underlying steps S71, S711, and S712 is: all the characters between adjacent symbols are matched with the ontologies as a whole and are cut out only when a match is found; otherwise they are temporarily left unsegmented.
For example, Fig. 12F shows the segmentation of "severe arthritis, and hematocele; type A thymoma; coronary heart disease". Here "severe arthritis, and hematocele", "type A thymoma", and "coronary heart disease" are each the full set of characters between symbols, and a matching ontology can be found for each, so each is cut out separately.
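Steps S71, S711, and S712 can be sketched as follows. The sketch treats only the semicolon as a delimiting symbol, which is an assumption made so that the Fig. 12F example reads naturally in translation; the function name and the return of two lists (cut-out first-type substrings and temporarily unsegmented strings) are likewise hypothetical.

```python
def split_between_symbols(text, ontology, delimiter=";"):
    """Match each whole run of characters between delimiters against the ontology.

    Returns (first_type, deferred): runs that match an ontology entry are cut
    out as first-type substrings (step S711); the rest are left unsegmented
    for later processing (step S712).
    """
    first_type, deferred = [], []
    for chunk in text.split(delimiter):
        chunk = chunk.strip()
        if not chunk:
            continue  # ignore empty runs such as a trailing delimiter
        if chunk in ontology:
            first_type.append(chunk)
        else:
            deferred.append(chunk)
    return first_type, deferred


ontology = {
    "severe arthritis, and hematocele",
    "type A thymoma",
    "coronary heart disease",
}
text = "severe arthritis, and hematocele; type A thymoma; coronary heart disease"
first_type, deferred = split_between_symbols(text, ontology)
```

For the example string all three runs match, so `deferred` is empty; a run with no ontology match would instead be returned in `deferred` for the later steps to handle.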
Step S72: using a mechanical word segmentation method, match the preprocessed Chinese disease diagnosis information string against the ontologies in the ontology dictionary. If all characters in the preprocessed string can be matched to ontologies, perform step S721; if the preprocessed string contains a single character or a run of consecutive characters that fails to match any ontology, perform step S722.
Step S721: cut the characters of the preprocessed Chinese disease diagnosis information string out according to the matched ontologies, as first-type substrings.
Step S722: determine whether the single character or run of consecutive characters that failed to match an ontology is a disease degree term, a disease concurrence term or a lesion site term. If it is, perform step S7221; if it is not, perform step S7222.
Steps S72, S721 and S722 follow this processing rule: the characters of the preprocessed Chinese disease diagnosis information string are matched against the ontologies using a mechanical word segmentation method; the string is segmented only when a matching ontology can be found for every character, and otherwise it is left unsegmented for the time being.
For example, Figure 12G shows the segmentation of "hypertension coronary heart disease". The mechanical word segmentation method finds ontologies matching "hypertension" and "coronary heart disease" respectively, so the two are cut out separately.
The mechanical word segmentation method used in step S72 may be forward maximum matching, reverse maximum matching or minimum segmentation. The segmentation procedure itself is not described further in this embodiment.
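As an illustration of the forward maximum matching variant named above, the following is a minimal sketch and not the patented implementation. The function name, the `(piece, matched)` return shape and the dictionary entries are assumptions made for the example; consecutive unmatched characters are grouped into one run, mirroring the "single character or run of consecutive characters" wording of step S72.

```python
def forward_max_match(text, dictionary):
    """Segment text by forward maximum matching.

    Returns a list of (piece, matched) pairs, where matched is True when the
    piece is an entry of the (hypothetical) ontology dictionary.
    """
    max_len = max(len(term) for term in dictionary)
    pieces = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink one character at a time.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in dictionary:
                pieces.append((candidate, True))
                i += size
                break
        else:
            # No entry starts here: extend the current unmatched run, or open one.
            if pieces and not pieces[-1][1]:
                pieces[-1] = (pieces[-1][0] + text[i], False)
            else:
                pieces.append((text[i], False))
            i += 1
    return pieces


# Hypothetical dictionary entries: hypertension, coronary heart disease.
ontology_dictionary = {"高血压", "冠心病"}
print(forward_max_match("高血压冠心病", ontology_dictionary))
```

When every piece comes back with `matched` set to `True`, the string corresponds to the case handled by step S721; any `False` piece corresponds to the case handed to step S722.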
Step S7221: according to the position, within the preprocessed Chinese disease diagnosis information string, of the single character or run of consecutive characters that failed to match an ontology, merge that character or run with the ontology-matched characters immediately before or after it and cut the merged text out as a second-type substring; cut the remaining ontology-matched characters out as first-type substrings.
Step S7222: cut the preprocessed Chinese disease diagnosis information string out as a whole, as a second-type substring.
Steps S7221 and S7222 follow this processing rule: when the single character or run of consecutive characters that failed to match an ontology is a disease degree term, a disease concurrence term or a lesion site term, segmentation is performed, and during segmentation that character or run is merged with the characters before or after it and cut out together with them.
For example, Figure 12H shows the segmentation of "prostatic hyperplasia with acute urinary retention diabetes". The mechanical word segmentation method finds ontologies matching "prostatic hyperplasia", "acute urinary retention" and "diabetes" respectively, and "with" is a disease concurrence term. Therefore "prostatic hyperplasia with acute urinary retention" is merged and cut out as one piece, and "diabetes" is cut out on its own.
For example, Figure 12I shows the segmentation of "prostatic hyperplasia acute renal anemia". The mechanical word segmentation method finds ontologies matching "prostatic hyperplasia" and "renal anemia" respectively, and "acute" is a disease degree term. Therefore "prostatic hyperplasia" is cut out on its own, and "acute" and "renal anemia" are merged and cut out as one piece.
For example, Figure 12J shows the segmentation of "subacute bronchitis prostatic hyperplasia". The mechanical word segmentation method finds ontologies matching "bronchitis" and "prostatic hyperplasia" respectively; "subacute" is a disease degree term and sits at the beginning of the preprocessed string. Therefore "subacute" and "bronchitis" are merged and cut out as one piece, and "prostatic hyperplasia" is cut out on its own.
For example, Figure 12K shows the segmentation of "bronchitis prostate cancer late stage". The mechanical word segmentation method finds ontologies matching "bronchitis" and "prostate cancer" respectively; "late stage" is a disease degree term and sits at the end of the preprocessed string. Therefore "bronchitis" is cut out on its own, and "prostate cancer" and "late stage" are merged and cut out as one piece.
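The merging behaviour illustrated by Figures 12H to 12K can be sketched as follows. This is only one reading of the examples, under stated assumptions: a concurrence term is merged with both neighbours, a degree term is merged with the following piece unless it is at the end of the string (then with the preceding piece), and lesion site terms are not modelled. The term sets, the function name and the input shape (a list of `(piece, matched)` pairs) are hypothetical.

```python
DEGREE_TERMS = {"急性", "亚急性", "晚期"}   # hypothetical: acute, subacute, late stage
CONCURRENCE_TERMS = {"伴"}                  # hypothetical: "with"


def merge_modifiers(pieces):
    """pieces: list of (text, matched) pairs in their original order.

    Returns a list of [text, substring_type], where substring_type is 1 for a
    first-type substring and 2 for a second-type substring.
    """
    out = []
    i = 0
    while i < len(pieces):
        text, matched = pieces[i]
        has_next = i + 1 < len(pieces)
        if matched:
            out.append([text, 1])
        elif text in CONCURRENCE_TERMS and out and has_next:
            # Assumed rule: a concurrence term joins the pieces on both sides.
            previous = out.pop()[0]
            out.append([previous + text + pieces[i + 1][0], 2])
            i += 1
        elif text in DEGREE_TERMS and has_next:
            # Assumed rule: a degree term attaches to the piece that follows it.
            out.append([text + pieces[i + 1][0], 2])
            i += 1
        elif text in DEGREE_TERMS and out:
            # A degree term at the end attaches to the preceding piece.
            previous = out.pop()[0]
            out.append([previous + text, 2])
        else:
            # Step S7222: anything else makes the whole string second-type.
            return [["".join(p[0] for p in pieces), 2]]
        i += 1
    return out


pieces = [("前列腺增生", True), ("伴", False), ("急性尿潴留", True), ("糖尿病", True)]
print(merge_modifiers(pieces))
```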
Step S73: determine whether the temporarily unsegmented string contains a preset special character. If it does, perform step S731; if it does not, perform step S733.
Step S731: look up the character model to which the temporarily unsegmented string belongs, and segment the string according to the segmentation rule corresponding to that character model. The character models are provided by a pre-established character model library, and each character model has exactly one corresponding segmentation rule.
Step S732: match each cut-out piece against the ontologies in the ontology dictionary; if the match succeeds, the piece is determined to be a first-type substring, and if the match fails, the piece is determined to be a second-type substring.
Step S733: determine the temporarily unsegmented string directly to be a second-type substring.
Steps S73, S731, S732 and S733 follow this processing rule: when the temporarily unsegmented string contains a preset special character, it is segmented according to the character model to which it belongs; otherwise it is cut out directly as a whole. The pieces obtained from the character model are then matched against the ontologies again: those that match an ontology directly become first-type substrings, and those that do not become second-type substrings.
For example, the preset special characters may include, but are not limited to, the comma, enumeration comma, period, colon, plus sign, semicolon and slash.
For example, the character model library may contain the following character models and segmentation rules:
(1) Character model: type XABY, where A is a number and B is a comma, enumeration comma or period;
Segmentation rule: cut out X and Y separately;
(2) Character model: type CDE, where one of C and E consists of Chinese characters and D is a colon;
Segmentation rule: cut out the Chinese characters in C and E;
(3) Character model: type FGH, where F and H are Chinese characters and G is a plus sign;
Segmentation rule: cut out FGH as a whole;
(4) Character model: type IJK, where I and K are Chinese characters and J is a semicolon, period, question mark or exclamation mark;
Segmentation rule: cut out I and K separately;
(5) Character model: type LOP, where L and P are Chinese characters and O is a colon;
Segmentation rule: cut out LOP as a whole;
(6) Character model: type STU, where S and/or U is a single Chinese character and T is a slash;
Segmentation rule: cut out STU as a whole.
For example, to segment "abdominal pain:", the character model library is searched and the string is found to be of type CDE, so "abdominal pain" is cut out on its own.
For example, to segment "congenital heart disease: ventricular septal defect", the character model library is searched and the string is found to be of type LOP, so "congenital heart disease: ventricular septal defect" is cut out as a whole.
For example, to segment "mycoplasma/chlamydia infection", the character model library is searched and the string is found to be of type STU, so "mycoplasma/chlamydia infection" is cut out as a whole.
For example, to segment "abdominal pain; prostatitis", the character model library is searched and the string is found to be of type IJK, so it is segmented into "abdominal pain" and "prostatitis".
For example, to segment "1, cervical spondylosis 2, lumbar intervertebral disc bulge 3, pregnancy 24+3 weeks 4, uterine prolapse, degree II; 5, mycoplasma/chlamydia infection", the character model library is searched and the string is found to involve several character models. The pieces finally cut out are "cervical spondylosis", "lumbar intervertebral disc bulge", "pregnancy 24+3 weeks", "uterine prolapse, degree II" and "mycoplasma/chlamydia infection". These pieces are then matched against the ontologies: "cervical spondylosis" and "lumbar intervertebral disc bulge" match an ontology directly and become first-type substrings, whereas "pregnancy 24+3 weeks", "uterine prolapse, degree II" and "mycoplasma/chlamydia infection" do not match an ontology directly and become second-type substrings.
When performing natural language processing on Chinese disease diagnosis information, this embodiment takes full account of the fact that such information is natural language, is complex and varied in form, and follows no unified standard. It segments and matches the Chinese disease diagnosis information string using several pre-established dictionaries, and thereby identifies disease diagnosis names as the names to be encoded.
Embodiment six
As shown in Figure 13, a specific embodiment, suitable for an exemplary method of the present invention, of searching for the standard term or expansion term that matches a name to be encoded includes:
Step S80: if the name to be encoded is a first-type substring, determine the ontology matched by that first-type substring to be the standard term or expansion term matching the name to be encoded; if the name to be encoded is a second-type substring, perform first-dimension parsing on the second-type substring and on each ontology in the ontology dictionary, obtaining several first-dimension parsing results for the second-type substring and several first-dimension parsing results for each ontology.
Optionally, in this step the second-type substring and the ontology are each taken as the parsing object, and the first-dimension parsing of a parsing object may include, but is not limited to:
(1) determining the letters at the beginning of the parsing object; if the beginning is not a letter, this parsing result is empty;
(2) determining the disease degree term contained in the parsing object; if no disease degree term is contained, this parsing result is empty;
(3) determining the characters after the comma in the parsing object; if no comma is contained, this parsing result is empty;
(4) determining the characters inside brackets in the parsing object; if no brackets are contained, this parsing result is empty; and
(5) determining the characters of the parsing object other than the leading letters, the disease degree term, the characters after the comma and the characters inside brackets (hereinafter referred to as the remaining characters), which are generally the core stem of the parsing object.
When the parsing object is a second-type substring, the first-dimension parsing results may include, but are not limited to: the leading letters of the second-type substring, the disease degree term contained in the second-type substring, the characters after the comma in the second-type substring, the characters inside brackets in the second-type substring, and the remaining characters.
When the parsing object is an ontology, the first-dimension parsing results may include, but are not limited to: the leading letters of the ontology, the disease degree term contained in the ontology, the characters after the comma in the ontology, the characters inside brackets in the ontology, and the remaining characters.
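The five first-dimension parsing results described above can be sketched with a few regular expressions. This is a minimal illustration under stated assumptions: the list of disease degree terms is hypothetical, the dictionary keys and the extraction order (brackets, then comma, then leading letters, then degree term) are choices made for the example, and both full-width and ASCII punctuation are accepted.

```python
import re

# Hypothetical degree terms; longer terms first so "亚急性" wins over "急性".
DEGREE_TERMS = ("亚急性", "急性", "慢性", "重度", "晚期")


def parse_first_dimension(text):
    """Return the five first-dimension parsing results as a dict of strings."""
    result = {"leading_letters": "", "degree": "", "after_comma": "",
              "in_brackets": "", "remainder": ""}
    rest = text
    match = re.search(r"[((](.*?)[))]", rest)        # characters inside brackets
    if match:
        result["in_brackets"] = match.group(1)
        rest = rest[:match.start()] + rest[match.end():]
    match = re.search(r"[,,](.*)$", rest)              # characters after the comma
    if match:
        result["after_comma"] = match.group(1)
        rest = rest[:match.start()]
    match = re.match(r"[A-Za-z]+", rest)                # letters at the beginning
    if match:
        result["leading_letters"] = match.group(0)
        rest = rest[match.end():]
    for term in DEGREE_TERMS:                           # disease degree term
        if term in rest:
            result["degree"] = term
            rest = rest.replace(term, "", 1)
            break
    result["remainder"] = rest                          # the core stem
    return result


print(parse_first_dimension("急性支气管炎(喘息型),伴感染"))
```

An empty string in the returned dictionary plays the role of the "empty" parsing result mentioned in items (1) to (4).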
Step S81: match the first-dimension parsing results of the second-type substring against the first-dimension parsing results of each ontology in the ontology dictionary, and look for an ontology all of whose first-dimension parsing results match those of the second-type substring. If such an ontology exists, perform step S82; if no such ontology exists, perform step S83.
Step S82: determine the ontology found to be the ontology matching the second-type substring.
Step S83: select part of the first-dimension parsing results of the second-type substring and match them against the corresponding part of the first-dimension parsing results of each ontology in the ontology dictionary, looking for an ontology whose selected first-dimension parsing results match those of the second-type substring. If such an ontology exists, perform step S831; if no such ontology exists, perform step S832.
Step S831: determine the ontology found to be the ontology matching the second-type substring.
Specifically, the leading letters of the second-type substring are matched against the leading letters of the ontology; the disease degree term contained in the second-type substring is matched against the disease degree term contained in the ontology; the characters after the comma in the second-type substring are matched against the characters after the comma in the ontology; the characters inside brackets in the second-type substring are matched against the characters inside brackets in the ontology; and the remaining characters of the second-type substring are matched against the remaining characters of the ontology.
If all first-dimension parsing results match, the ontology is determined to be the ontology matching the second-type substring.
If some first-dimension parsing results do not match, a subset of the first-dimension parsing results is selected and matched dimension by dimension.
Because the remaining characters of a second-type substring are usually its core stem, in a specific implementation the selected subset preferably includes at least the remaining characters of the second-type substring and of the ontology. For example, only the remaining characters and the disease degree term of the parsing objects may be matched; or only the remaining characters may be matched; or the remaining characters may be matched together with the leading letters, the disease degree term, the characters after the comma or the characters inside brackets.
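The two-stage lookup described above (all first-dimension results first, then a chosen subset that always includes the remaining characters) can be sketched as follows. The dictionary keys, the function name and the default subset are assumptions made for this illustration; the parsing results are supplied as plain dictionaries so that the sketch does not depend on any particular parser.

```python
DIMENSIONS = ("leading_letters", "degree", "after_comma", "in_brackets", "remainder")


def find_matching_ontologies(query, ontologies, partial_dims=("remainder", "degree")):
    """query: parsing results of the second-type substring (dict).
    ontologies: mapping of ontology name to its parsing results (dict).
    Missing keys are treated as empty parsing results.
    Returns (matching names, "full" | "partial" | "none").
    """
    def agrees(parsed, dims):
        return all(parsed.get(d, "") == query.get(d, "") for d in dims)

    full = [name for name, parsed in ontologies.items() if agrees(parsed, DIMENSIONS)]
    if full:
        return full, "full"          # steps S81/S82: every dimension matches
    partial = [name for name, parsed in ontologies.items() if agrees(parsed, partial_dims)]
    if partial:
        return partial, "partial"    # steps S83/S831: the selected subset matches
    return [], "none"                # fall through to second-dimension parsing


query = {"degree": "acute", "in_brackets": "type X", "remainder": "bronchitis"}
ontologies = {
    "acute bronchitis": {"degree": "acute", "remainder": "bronchitis"},
    "chronic bronchitis": {"degree": "chronic", "remainder": "bronchitis"},
}
print(find_matching_ontologies(query, ontologies))
```

A result of `"none"` corresponds to the case in which processing continues with step S832.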
For example, suppose a second-type substring is "type 4 mucopolysaccharidosis". First-dimension parsing is performed on it and the parsing results obtained are shown in Table 1; the ontology that matches this second-type substring, together with its first-dimension parsing results, is shown in Table 2.
Table 1
Table 2
Step S832: perform second-dimension parsing on the second-type substring and on each ontology in the ontology dictionary, obtaining the second-dimension parsing results of the second-type substring and the second-dimension parsing results of each ontology in the ontology dictionary.
Optionally, in this step the second-type substring and the ontology are each taken as the parsing object, and the second-dimension parsing of a parsing object may include, but is not limited to:
(1) determining each Chinese character in the parsing object;
(2) determining the pinyin initial of each Chinese character in the parsing object;
(3) determining the pinyin final of each Chinese character in the parsing object;
(4) determining the first character of the parsing object;
(5) determining the pinyin of the first character of the parsing object; and
(6) determining the non-Chinese characters in the parsing object; if no non-Chinese characters are contained, this parsing result is empty.
When the parsing object is a second-type substring, the parsing results of the dimensions may include, but are not limited to: each Chinese character in the second-type substring, the pinyin initial of each Chinese character in the second-type substring, the pinyin final of each Chinese character in the second-type substring, the first character of the second-type substring, the pinyin of the first character of the second-type substring, and the non-Chinese characters in the second-type substring.
When the parsing object is an ontology entry, the parsing results may include, but are not limited to: each Chinese character in the entry, the pinyin initial of each Chinese character in the entry, the pinyin final of each Chinese character in the entry, the first character of the entry, the pinyin of the first character of the entry, and the non-Chinese characters in the entry.
For example, Table 3 shows the second-dimension parsing results of the second-type substring "hypertension".
Table 3
Step S833: based on the several second-dimension parsing results of the second-type substring and the several second-dimension parsing results of each ontology, calculate the degree of match between the second-type substring and each ontology.
Specifically, this step may calculate the similarity between the second-type substring and each ontology, or it may calculate the total confidence between the second-type substring and each ontology. Compared with the similarity, the total confidence reflects the degree of match between the second-type substring and each ontology more faithfully, but it is also more complex to calculate. When step S833 is implemented, the similarity calculation may be chosen if faster processing is required, and the total confidence calculation may be chosen if a more accurate matching result is required.
One implementation of step S833 calculates the similarity between the second-type substring and each ontology, as follows:
The similarity between the second-type substring and each ontology is calculated according to the following formula, and the calculated similarity is taken as the degree of match between the second-type substring and each ontology:
where M denotes the similarity;
t denotes each second-dimension parsing result of the second-type substring;
q denotes the second-type substring;
t in q denotes each second dimension of the second-type substring;
d denotes the ontology;
tf(t in d) denotes, within the same second dimension, the frequency with which the second-dimension parsing result of the second-type substring matches the second-dimension parsing result of the ontology;
where T denotes the total number of ontologies in the ontology dictionary, and T(t) denotes the total number of ontologies whose second-dimension parsing results match the corresponding second-dimension parsing results of the second-type substring;
t.getBoost() denotes the preset weight of each second dimension;
norm(t, d) denotes the length normalization factor of the ontology.
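The similarity formula itself is not reproduced in this text. The symbols defined above (tf, a ratio of T to T(t), t.getBoost() and norm(t, d)) coincide with those of the classic TF-IDF scoring formula used by the Apache Lucene search library. Purely as an assumption, if the omitted formula follows that shape, it would read as below; the actual formula may differ, for example in the exponent on the idf factor, in the exact form of idf, or in additional normalization factors.

```latex
% Assumed shape only, modelled on Lucene's classic TF-IDF scoring;
% the source text does not show the formula.
M = \sum_{t \,\in\, q} \mathrm{tf}(t \text{ in } d)\,\cdot\,\mathrm{idf}(t)^{2}\,\cdot\,t.\mathrm{getBoost}()\,\cdot\,\mathrm{norm}(t, d),
\qquad
\mathrm{idf}(t) = 1 + \ln\!\left(\frac{T}{T(t) + 1}\right)
```

Under this reading, a second dimension whose parsing result is shared by few ontologies (small T(t)) contributes more to M than one shared by many.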
Another implementation of step S833 calculates the total confidence between the second-type substring and each ontology, as follows:
The total confidence between the second-type substring and each ontology is calculated by the following steps, and the calculated total confidence is taken as the degree of match between the second-type substring and each ontology:
1) Determine each Chinese character in the second-type substring.
2) Calculate the cosine confidence of each ontology matched against the second-type substring according to the following formula:
where N denotes the cosine confidence;
V denotes the total number of Chinese characters contained in the second-type substring and the ontology matched against it;
Q denotes the second-type substring;
d′ denotes the ontology matched against the second-type substring;
w_{Q,j} denotes the frequency with which each Chinese character occurs in the second-type substring;
w_{d′,j} denotes the frequency with which each Chinese character of the second-type substring occurs in the matched ontology;
j denotes the index of a Chinese character among those contained in the second-type substring and the ontology matched against it.
3) Calculate the total confidence of each ontology matched against the second-type substring according to the following formula:
S = M × a + N × b
where S denotes the total confidence;
M denotes the similarity;
a denotes the preset weight corresponding to the similarity M;
b denotes the preset weight corresponding to the cosine confidence N;
and the similarity M is calculated according to the following formula:
where t denotes each second-dimension parsing result of the second-type substring;
q denotes the second-type substring;
t in q denotes each second dimension of the second-type substring;
d denotes the ontology;
tf(t in d) denotes, within the same second dimension, the frequency with which the second-dimension parsing result of the second-type substring matches the second-dimension parsing result of the ontology;
where T denotes the total number of ontologies in the ontology dictionary, and T(t) denotes the total number of ontologies whose second-dimension parsing results match the corresponding second-dimension parsing results of the second-type substring;
t.getBoost() denotes the preset weight of each second dimension;
norm(t, d) denotes the length normalization factor of the ontology.
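Steps 2) and 3) can be sketched as follows. The cosine formula is not reproduced in this text; the sketch assumes it is the standard cosine similarity over per-character frequency vectors, which is what the definitions of V, w_{Q,j} and w_{d′,j} suggest. The function names and the default weights a = b = 0.5 are hypothetical, and the similarity M is passed in as a number because its formula is not shown.

```python
import math
from collections import Counter


def chinese_chars(text):
    """Keep only characters in the basic CJK Unified Ideographs block."""
    return [c for c in text if "\u4e00" <= c <= "\u9fff"]


def cosine_confidence(substring, ontology):
    """Assumed form of N: cosine of the two character-frequency vectors."""
    q = Counter(chinese_chars(substring))
    d = Counter(chinese_chars(ontology))
    dot = sum(q[c] * d[c] for c in set(q) | set(d))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    norm_d = math.sqrt(sum(v * v for v in d.values()))
    if norm_q == 0 or norm_d == 0:
        return 0.0
    return dot / (norm_q * norm_d)


def total_confidence(m, n, a=0.5, b=0.5):
    """S = M x a + N x b, with a and b as preset weights (values assumed)."""
    return m * a + n * b


n = cosine_confidence("高血压病", "高血压")
print(n, total_confidence(0.8, n))
```

Because the cosine term depends only on shared characters, it rewards ontologies that overlap heavily with the substring regardless of character order, which complements the dimension-based similarity M.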
Step S834: according to the degree of match between the second-type substring and each ontology, determine one or more ontologies to be the ontologies matching the second-type substring.
Optionally, this step may be implemented as follows: sort all ontologies by their degree of match with the second-type substring and determine a preset number of the top-ranked ontologies (for example the top 2) to be the ontologies matching the second-type substring; or determine the one or more ontologies whose degree of match with the second-type substring reaches a preset threshold to be the ontologies matching the second-type substring.
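Both selection strategies described for step S834 fit in one small helper. This is a minimal sketch; the function name, the parameter names and the sample scores are hypothetical.

```python
def select_matches(scores, top_n=None, threshold=None):
    """scores: mapping of ontology name to degree of match.

    Returns (name, score) pairs sorted from best to worst, optionally keeping
    only scores at or above the threshold and/or only the first top_n entries.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if threshold is not None:
        ranked = [(name, score) for name, score in ranked if score >= threshold]
    if top_n is not None:
        ranked = ranked[:top_n]
    return ranked


scores = {"ontology A": 0.9, "ontology B": 0.4, "ontology C": 0.7}
print(select_matches(scores, top_n=2))
print(select_matches(scores, threshold=0.8))
```

Returning the scores alongside the names also supports the variant in which the degree of match is output so that a person can make the final choice.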
When implementing the present invention, in order to make explicit, and make use of, the degree of match between the second-type substring and each ontology matching it, the final output may also include that degree of match. For example, the degree of match between the second-type substring and each matching ontology is output, and one of these ontologies is then selected manually, according to the degree of match, as the ontology matching the second-type substring.
Step S84: determine the ontology matching the second-type substring, or the one or more ontologies that satisfy the preset matching condition with respect to the second-type substring, to be the standard term or expansion term matching the name to be encoded.
When performing natural language processing on Chinese disease diagnosis information, this embodiment takes full account of the fact that such information is natural language, is complex and varied in form, and follows no unified standard. It segments and matches the Chinese disease diagnosis information string using several pre-established dictionaries, and thereby finds the standard term or expansion term matching the name to be encoded.
Example devices
After the method for exemplary embodiment of the invention is described, next, with reference to figure 7 to the exemplary reality of the present invention The ICD coded systems for applying mode are introduced.
The implementation of ICD coded systems may refer to the implementation of the above method, and overlaps will not be repeated.It is used below Term " module " can be the combination of the software and/or hardware of realizing predetermined function.Although the described system of following embodiment It is preferably realized with software, but the realization of the combination of hardware or software and hardware is also what may and be contemplated.
As shown in fig. 7, ICD coded systems can include:Standard terminology library creation module 61 expands terminology bank creation module 62nd, import modul 63, data processing module 64, coding module 65.
Standard terminology library creation module 61, for according to the ICD versions to be referred to, by the ICD the to be referred to versions The each disease term included in this, is determined as standard terminology;According to the ICD versions to be referred to, each mark is determined The coding of quasi- term;The standard terminology and its coding are stored, obtains standard terminology library.
Optionally, the CD versions to be referred to can be that (such as WHO was announced the ICD versions announced of WHO in 1992 years ICD-10 various localization ICD versions (such as the China Health) or to the WHO ICD versions announced extended The ICD-10 Chinese editions that portion recommends).When it is implemented, suitable ICD versions can be selected according to actual needs as reference, This is not limited by the present invention.
Expand terminology bank creation module 62, it is following various in the ICD versions to be referred to for that will be not included in Type is determined as expanding term:The standard terminology be commonly called as nickname abbreviation, the subclass disease term of the standard terminology, And the newly generated disease term after the ICD versions to be referred to are announced;Judge the expansion term to be any one A standard terminology be commonly called as nickname abbreviation when, assign the coding of the standard terminology to the expansion term;Described in judgement It, will be with the expansion when expanding subclass disease term or the newly generated disease term of the term for any one of standard terminology The coding for filling the immediate standard terminology of relation of genus and species of term assigns the expansion term;Store the expansion term and its volume Code obtains expanding terminology bank.
Import modul 63, for inputting Chinese medical diagnosis on disease information.
Optionally, Chinese medical diagnosis on disease information can be the medical record information or basic medical of medical worker's input Insure the information described in advice of settlement.
Data processing module 64, for carrying out natural language processing to the Chinese medical diagnosis on disease information, obtain one or Multiple titles to be encoded.
Specifically, data processing module 64 can be based on the characteristics of Chinese medical diagnosis on disease information, and Chinese medical diagnosis on disease is believed Breath is segmented, takes out the processing such as word, and then parse disease term from Chinese medical diagnosis on disease information, these are from the Chinese disease The disease term parsed in diagnostic message is title to be encoded.
Coding module 65 for being based on the standard terminology library and the expansion terminology bank, is searched and the name to be encoded Claim the standard terminology that matches or expand term, and by the standard terminology of successful match or expand the coding of term, be determined as institute State the coding of title to be encoded.
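The lookup performed by the coding module can be sketched with two dictionaries. This is a minimal illustration, not the system's actual data model: the function name and library contents are hypothetical, English names stand in for the Chinese terms, and the entries are chosen only to show an alias inheriting the code of its standard term.

```python
def encode(name, standard_library, expansion_library):
    """Return the code of the first library entry equal to name, or None."""
    for library in (standard_library, expansion_library):
        if name in library:
            return library[name]
    return None


# Illustrative entries only.
standard_library = {"essential (primary) hypertension": "I10"}
expansion_library = {"high blood pressure": "I10"}   # alias carrying the same code

for name in ("high blood pressure", "unknown diagnosis"):
    print(name, "->", encode(name, standard_library, expansion_library))
```

A name that returns `None` here is one for which no code has been determined, which is the case the optional libraries and modules described below are meant to handle.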
Optionally, as shown in Figure 8, in addition to the standard terminology library creation module 61, expansion terminology library creation module 62, import module 63, data processing module 64 and coding module 65 described above, the ICD coding system may further include a hypothetical classification terminology library creation module 71.
The hypothetical classification terminology library creation module 71 is configured to: determine to be a hypothetical classification term any disease term that is not included in the ICD version to be referenced, is related to some standard term, is by default treated clinically as equivalent to that standard term, and is not a common name, alias or abbreviation of that standard term; assign the code of the standard term related to the hypothetical classification term to the hypothetical classification term; and store the hypothetical classification terms and their codes to obtain the hypothetical classification terminology library.
In the ICD coding system shown in Figure 8, the coding module 65 is further configured to search, based on the hypothetical classification terminology library, for the hypothetical classification term that matches the name to be encoded, and to determine the code of the successfully matched hypothetical classification term to be the code of the name to be encoded.
Optionally, as shown in Figure 9, in addition to the standard terminology library creation module 61, expansion terminology library creation module 62, import module 63, data processing module 64 and coding module 65 described above, the ICD coding system may further include a multi-code terminology library creation module 81.
The multi-code terminology library creation module 81 is configured to: determine to be a multi-code term any disease term that is not included in the ICD version to be referenced and is composed of at least two different standard terms; combine the codes of all the standard terms that make up the multi-code term and use the combination as the code of the multi-code term; and store the multi-code terms and their codes to obtain the multi-code terminology library.
In the ICD coding system shown in Figure 9, the coding module 65 is further configured to search, based on the multi-code terminology library, for the multi-code term that matches the name to be encoded, and to determine the code of the successfully matched multi-code term to be the code of the name to be encoded.
Optionally, as shown in Figure 10, in addition to the standard terminology library creation module 61, expansion terminology library creation module 62, import module 63, data processing module 64 and coding module 65 described above, the ICD coding system may further include a merged terminology library creation module 91 and a preprocessing module 92.
The merged terminology library creation module 91 is configured to: determine to be a merged term a single standard term that can replace at least two standard terms occurring together; determine each of those at least two co-occurring standard terms to be a merging object of the merged term; determine the code of each merged term according to the ICD version to be referenced; and store the merged terms, their codes and all the merging objects of each merged term to obtain the merged terminology library.
The preprocessing module 92 is configured to preprocess the one or more names to be encoded obtained by the data processing module 64: it determines whether the one or more names to be encoded include all the merging objects of any one or more merged terms and, if so, replaces all the merging objects of each such merged term with the corresponding merged term; it then sends the preprocessed names to be encoded to the coding module 65.
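The substitution carried out by the preprocessing module can be sketched as a set-containment check. The function name, the shape of the merged terminology library (merged term mapped to the set of its merging objects) and the placeholder term names are assumptions made for this illustration.

```python
def apply_merged_terms(names, merged_library):
    """names: list of names to be encoded.
    merged_library: mapping of merged term to a frozenset of its merging objects.
    Replaces a complete set of merging objects with the merged term.
    """
    result = list(names)
    for merged_term, merging_objects in merged_library.items():
        # Substitute only when every merging object is present.
        if merging_objects.issubset(result):
            result = [name for name in result if name not in merging_objects]
            result.append(merged_term)
    return result


merged_library = {"term C": frozenset({"term A", "term B"})}   # placeholder terms
print(apply_merged_terms(["term A", "term X", "term B"], merged_library))
print(apply_merged_terms(["term A", "term X"], merged_library))
```

In this sketch the merged term is appended at the end of the list; the description does not state where the replacement should be placed.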
Optionally, ICD coded systems, which are removed, includes above-mentioned standard terminology bank creation module 61, expands terminology bank creation module 62nd, except import modul 63, data processing module 64, coding module 65, can also include:Revision module in real time, for real-time Standard terminology library, expansion terminology bank, Hypothetical classification terminology bank, odd encoder terminology bank, merging terminology bank are revised.
Optionally, as shown in figure 11, ICD coded systems, which are removed, includes above-mentioned standard terminology bank creation module 61, expands term Except library creation module 62, import modul 63, data processing module 64, coding module 65, it can also include:Without coded treatment mould Block 101.
Without coded treatment module 101, for will not determine the title to be encoded of coding in no encryption description library without volume Code term is matched, if successful match, it is default not determine that the title to be encoded encoded is encoded and/or exported to this As a result, if it fails to match, this not determined to, title to be encoded of coding is sent to artificial treatment platform and carries out artificial treatment. Wherein, no encryption description library includes several no encryption descriptions.These several no encryption descriptions include:Preset traditional Chinese medical science class term; The preset terms of surgery operation;Preset nomenclature of drug term;Preset medical treatment consumptive materials term;And preset check examines art Language.
ICD coded systems provided in an embodiment of the present invention cover the Chinese diseases of the overwhelming majority by creating multiple terminology banks The disease term being likely to occur in sick diagnostic message meets wanting for the disease term in the Chinese medical diagnosis on disease information of automatic resolution It asks so that the ICD codings of automation are achieved, and ICD codings are carried out using ICD coded systems provided in an embodiment of the present invention, Without manually participating in, have many advantages, such as that coding rate is fast, at low cost, accuracy is high.
It should be noted that although several modules of the ICD coding system are mentioned in the detailed description above, this division is merely exemplary and not mandatory. In fact, according to embodiments of the invention, the features and functions of two or more of the modules described above may be embodied in a single module. Conversely, the features and functions of one module described above may be further divided among multiple modules.
In addition, although the operations of the method of the invention are depicted in the drawings in a particular order, this does not require or imply that the operations must be performed in that order, or that all of the operations shown must be performed, to achieve the desired result. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step, and/or one step may be split into multiple steps.
Although the spirit and principles of the invention have been described with reference to several specific embodiments, it should be understood that the invention is not limited to the specific embodiments disclosed, and that the division into aspects does not mean that features in those aspects cannot be combined to advantage; the division is made only for convenience of description. The invention is intended to cover the various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (18)

1. An automated International Classification of Diseases coding method, comprising:
Step 1: inputting Chinese disease diagnosis information;
Step 2: performing natural language processing on the Chinese disease diagnosis information to obtain one or more names to be encoded;
Step 3: based on a standard terminology library and an expansion terminology library, searching for a standard term or expansion term that matches the name to be encoded, and determining the code of the successfully matched standard term or expansion term as the code of the name to be encoded;
Wherein the standard terminology library is created as follows:
Determining the International Classification of Diseases (ICD) version to be referenced;
Determining each disease term contained in the ICD version to be referenced as a standard term;
Determining the code of each standard term according to the ICD version to be referenced;
Storing the standard terms and their codes to obtain the standard terminology library;
Wherein the expansion terminology library is created as follows:
Determining the following types of terms, none of which is contained in the ICD version to be referenced, as expansion terms: common names, aliases, and abbreviations of the standard terms; subclass disease terms of the standard terms; and disease terms newly created after the ICD version to be referenced was published;
When the expansion term is a common name, alias, or abbreviation of any standard term, assigning the code of that standard term to the expansion term;
When the expansion term is a subclass disease term of any standard term or a newly created disease term, assigning to the expansion term the code of the standard term that is closest to it in genus-species relationship;
Storing the expansion terms and their codes to obtain the expansion terminology library;
Wherein step 2 comprises:
Step 21: preprocessing the Chinese disease diagnosis information string to obtain a preprocessed Chinese disease diagnosis information string;
Step 22: based on a pre-established ontology dictionary, disease degree terminology library, disease concurrency terminology library, and lesion site terminology library, segmenting the preprocessed Chinese disease diagnosis information string into a number of first-type substrings and/or second-type substrings;
Wherein the ontology dictionary comprises the standard terminology library and the expansion terminology library, and the standard terms and the expansion terms are ontologies;
The disease degree terminology library contains a number of disease degree terms, a disease degree term being a word that describes the acuteness or chronicity, the severity, the pathological type, or the clinical stage of a disease;
The disease concurrency terminology library contains a number of disease concurrency terms, a disease concurrency term being a word that describes at least two diseases occurring concurrently;
The lesion site terminology library contains a number of lesion site terms, a lesion site term being a word that describes the site at which a disease occurs;
A first-type substring can be matched directly with an ontology in the ontology dictionary, whereas a second-type substring cannot be matched directly with an ontology in the ontology dictionary;
Step 23: determining the segmented first-type substrings and second-type substrings as names to be encoded;
Wherein step 21 comprises:
Normalizing the format of the non-Chinese characters in the Chinese disease diagnosis information string and deleting the non-medical terms from the string, to obtain the preprocessed Chinese disease diagnosis information string, wherein the non-medical terms are provided by a pre-established non-medical term dictionary, and a non-medical term is a word that serves as a remark;
Wherein step 22 comprises:
Judging whether the preprocessed Chinese disease diagnosis information string contains symbols;
If the preprocessed string contains symbols, matching the characters between every two adjacent symbols, as a whole, against the ontologies in the ontology dictionary; if the match succeeds, segmenting out the characters between the two adjacent symbols as a first-type substring; if the match fails, determining the two adjacent symbols and the characters between them as a temporarily unsegmented string, and judging whether the temporarily unsegmented string contains a preset special character;
If the temporarily unsegmented string contains a special character, looking up the character model to which the temporarily unsegmented string belongs, segmenting the string according to the segmentation rule corresponding to that character model, and matching the segmented characters against the ontologies in the ontology dictionary; if the match succeeds, taking the segmented characters as a first-type substring; if the match fails, taking the segmented characters as a second-type substring; wherein the character model is provided by a pre-established character model library, and each character model has its own one-to-one corresponding segmentation rule;
If the temporarily unsegmented string contains no special character, determining the temporarily unsegmented string directly as a second-type substring;
If the preprocessed string contains no symbols, using a mechanical word segmentation method to match single characters or runs of consecutive characters in the preprocessed string against the ontologies in the ontology dictionary;
If all characters in the preprocessed string can be matched with ontologies, segmenting out the single characters or runs of consecutive characters as first-type substrings according to the matched ontologies;
If the preprocessed string contains a single character or run of consecutive characters that fails to match any ontology, judging whether the unmatched character or run is a disease degree term, a disease concurrency term, or a lesion site term;
When the unmatched character or run is a disease degree term, a disease concurrency term, or a lesion site term, merging it, according to its position in the preprocessed string, with the ontology-matched character or run immediately before or after it and segmenting out the merged result as a second-type substring, and segmenting out the remaining ontology-matched characters or runs in the preprocessed string as first-type substrings;
When the unmatched character or run is not a disease degree term, a disease concurrency term, or a lesion site term, segmenting out the preprocessed string as a whole as a second-type substring.
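The symbol-free branch of step 22 can be illustrated with a minimal sketch. It assumes forward maximum matching as the mechanical segmentation method, which the claim does not fix. The library contents, the codes, and the names `tokenize` and `segment_plain` are hypothetical. Merging a modifier with the following ontology span when one exists, and with the preceding span otherwise, is also an assumption.

```python
# Toy libraries; entries and codes are illustrative only.
ONTOLOGY = {"高血压": "I10", "糖尿病": "E14", "肺炎": "J18.9"}
MODIFIERS = {"重度", "慢性", "伴", "左侧"}  # degree / concurrency / lesion-site terms


def tokenize(text, ontology, max_len=8):
    """Forward maximum matching; returns a list of (span, is_ontology) pairs."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + size] in ontology:
                tokens.append((text[i:i + size], True))
                i += size
                break
        else:
            # No ontology starts here: extend (or open) an unmatched run.
            if tokens and not tokens[-1][1]:
                tokens[-1] = (tokens[-1][0] + text[i], False)
            else:
                tokens.append((text[i], False))
            i += 1
    return tokens


def segment_plain(text, ontology=ONTOLOGY, modifiers=MODIFIERS):
    """Return (first_type, second_type) substrings for a symbol-free string."""
    tokens = tokenize(text, ontology)
    unmatched = [span for span, ok in tokens if not ok]
    if not unmatched:
        return [span for span, _ in tokens], []
    if any(span not in modifiers for span in unmatched):
        # Unmatched text that is not a known modifier: keep the string whole.
        return [], [text]
    first, second = [], []
    prev_free = False  # True when the previous token is an unmerged ontology span
    i = 0
    while i < len(tokens):
        span, ok = tokens[i]
        if ok:
            first.append(span)
            prev_free = True
            i += 1
        elif i + 1 < len(tokens):
            # Unmatched runs are never adjacent, so the next token is an ontology.
            second.append(span + tokens[i + 1][0])
            prev_free = False
            i += 2
        elif prev_free:
            # Trailing modifier: merge with the preceding ontology span.
            second.append(first.pop() + span)
            i += 1
        else:
            second.append(span)
            i += 1
    return first, second
```

For example, `segment_plain("高血压伴糖尿病")` keeps `高血压` as a first-type substring and returns `伴糖尿病` as a second-type substring that would then go through the dimension-based matching described in the later claims.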
2. The automated International Classification of Diseases coding method according to claim 1, wherein
Step 3 further comprises: based on a hypothetical classification terminology library, searching for a hypothetical classification term that matches the name to be encoded; and determining the code of the successfully matched hypothetical classification term as the code of the name to be encoded;
Wherein the hypothetical classification terminology library is created as follows:
Determining as a hypothetical classification term any disease term that is not contained in the ICD version to be referenced, is related to a standard term, is clinically treated by default as equivalent to that standard term, and is not a common name, alias, or abbreviation of that standard term;
Assigning the code of the standard term related to the hypothetical classification term to the hypothetical classification term;
Storing the hypothetical classification terms and their codes to obtain the hypothetical classification terminology library.
3. The automated International Classification of Diseases coding method according to claim 1, wherein
Step 3 further comprises: based on a multi-code terminology library, searching for a multi-code term that matches the name to be encoded; and determining the code of the successfully matched multi-code term as the code of the name to be encoded;
Wherein the multi-code terminology library is created as follows:
Determining as a multi-code term any disease term that is not contained in the ICD version to be referenced and that is composed of at least two different standard terms;
Combining the codes of all the standard terms that make up the multi-code term, and taking the combination as the code of the multi-code term;
Storing the multi-code terms and their codes to obtain the multi-code terminology library.
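As a minimal sketch of how a multi-code terminology library could be derived from the standard terminology library, the following builds each multi-code entry by combining the codes of its component standard terms. The dictionaries, the sample terms and codes, and the function name `build_multi_code_library` are hypothetical, and representing the combined code as a list is an assumption.

```python
# Toy standard terminology library; entries and codes are illustrative only.
STANDARD = {"高血压": "I10", "糖尿病": "E14"}


def build_multi_code_library(multi_terms, standard=STANDARD):
    """Map each multi-code term to the combined codes of its standard terms.

    multi_terms maps a multi-code term to the ordered list of the
    standard terms it is composed of.
    """
    return {
        term: [standard[part] for part in parts]
        for term, parts in multi_terms.items()
    }


MULTI = build_multi_code_library({"高血压合并糖尿病": ["高血压", "糖尿病"]})
```

A lookup of `高血压合并糖尿病` in `MULTI` then yields both component codes in one step.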
4. The automated International Classification of Diseases coding method according to claim 1, wherein
Before step 3, the method further comprises: preprocessing the one or more names to be encoded based on a merged terminology library;
The merged terminology library is created as follows:
Determining as a merged term a single standard term that can substitute for at least two simultaneously occurring standard terms; and determining each of the at least two simultaneously occurring standard terms as a merge object of the merged term;
Determining the code of each merged term according to the ICD version to be referenced;
Storing the merged terms, their codes, and all the merge objects of each merged term to obtain the merged terminology library;
The step of preprocessing the one or more names to be encoded based on the created merged terminology library comprises:
Judging whether the one or more names to be encoded contain all the merge objects of any one or more merged terms; and if so, replacing all the merge objects of each such merged term with the corresponding merged term.
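The merge preprocessing can be sketched as a set-containment check followed by a substitution. The library contents, the code, and the function name `apply_merge_terms` are hypothetical, and placing the merged term at the position of the earliest merge object is an assumption.

```python
# Toy merged terminology library: merged term -> (code, merge objects).
MERGE_LIBRARY = {
    "高血压性心脏病": ("I11", {"高血压", "心脏病"}),
}


def apply_merge_terms(names, merge_library=MERGE_LIBRARY):
    """Replace every complete set of merge objects with its merged term."""
    result = list(names)
    for merged, (_code, objects) in merge_library.items():
        if objects.issubset(result):
            # Put the merged term where the earliest merge object stood.
            pos = min(result.index(obj) for obj in objects)
            result = [name for name in result if name not in objects]
            result.insert(pos, merged)
    return result
```

A list that contains only some of the merge objects is returned unchanged, because the claim requires all of them to be present.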
5. The automated International Classification of Diseases coding method according to any one of claims 1 to 4, wherein after step 3 the method further comprises:
Step 4: matching each name to be encoded for which no code has been determined against the no-code terms in a no-code terminology library; if the match succeeds, performing a preset processing step that represents the coding of that name; if the match fails, sending that name to a manual processing platform for manual handling;
Wherein the no-code terminology library contains a number of no-code terms;
The no-code terms include:
Preset traditional Chinese medicine terms;
Preset surgical operation terms;
Preset drug name terms;
Preset medical consumables terms; and
Preset examination and laboratory test terms.
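Step 4 amounts to a final lookup with a manual fallback. In the sketch below, the no-code terminology library is a dictionary from term to category. The sample terms, the category labels, the returned record layout, and the function name `handle_uncoded` are all hypothetical, and the "preset processing step" is represented simply by tagging the name.

```python
# Toy no-code terminology library; one illustrative entry per category.
NO_CODE_TERMS = {
    "针灸": "tcm",
    "阑尾切除术": "surgery",
    "阿司匹林": "drug",
    "纱布": "consumable",
    "血常规": "examination",
}


def handle_uncoded(name, no_code_terms=NO_CODE_TERMS):
    """Tag a name that received no code, or route it to manual review."""
    category = no_code_terms.get(name)
    if category is not None:
        # Stand-in for the preset processing step.
        return {"name": name, "status": "no_code", "category": category}
    return {"name": name, "status": "manual_review"}
```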
6. The automated International Classification of Diseases coding method according to claim 1, wherein the ICD version to be referenced is an ICD version published by the World Health Organization (WHO), or any localized ICD version extended from an ICD version published by the WHO.
7. The automated International Classification of Diseases coding method according to claim 1, wherein the step in step 3 of searching for a standard term or expansion term that matches the name to be encoded comprises:
If the name to be encoded is a first-type substring, determining the ontology that the first-type substring matches as the standard term or expansion term that matches the name to be encoded;
If the name to be encoded is a second-type substring:
Performing first-dimension parsing on the second-type substring and on each ontology in the ontology dictionary, to obtain a number of first-dimension parsing results of the second-type substring and a number of first-dimension parsing results of each ontology in the ontology dictionary;
Matching each first-dimension parsing result of the second-type substring against the corresponding first-dimension parsing result of each ontology in the ontology dictionary, and judging whether there is an ontology all of whose first-dimension parsing results match the respective first-dimension parsing results of the second-type substring;
If such an ontology exists, determining that ontology as the ontology that the second-type substring matches;
If no such ontology exists, selecting a subset of the first-dimension parsing results of the second-type substring, matching it against the corresponding subset of the first-dimension parsing results of each ontology in the ontology dictionary, and judging whether there is an ontology whose subset of first-dimension parsing results matches the subset of first-dimension parsing results of the second-type substring;
If such an ontology exists, determining that ontology as the ontology that the second-type substring matches;
If no such ontology exists, performing second-dimension parsing on the second-type substring and on each ontology in the ontology dictionary, to obtain a number of second-dimension parsing results of the second-type substring and a number of second-dimension parsing results of each ontology in the ontology dictionary;
Calculating the matching degree between the second-type substring and each ontology based on the second-dimension parsing results of the second-type substring and the second-dimension parsing results of the ontology;
Determining, according to the matching degree between the second-type substring and each ontology, one or more ontologies as the ontologies that the second-type substring matches;
Determining the ontology that the second-type substring matches as the standard term or expansion term that matches the name to be encoded.
8. The automated International Classification of Diseases coding method according to claim 7, wherein the first-dimension parsing results of the second-type substring or of the ontology are, respectively:
The directional terms in the second-type substring or the ontology;
The grade terms in the second-type substring or the ontology;
The characters inside brackets in the second-type substring or the ontology;
The characters after a dash in the second-type substring or the ontology; and
The characters in the second-type substring or the ontology other than the directional terms, the grade terms, the characters inside brackets, and the characters after a dash;
The subset of the first-dimension parsing results of the second-type substring or of the ontology comprises: the characters in the second-type substring or the ontology other than the directional terms, the grade terms, the characters inside brackets, and the characters after a dash; and one or more of the following:
The directional terms and grade terms in the second-type substring or the ontology;
The characters inside brackets in the second-type substring or the ontology;
The characters after a dash in the second-type substring or the ontology.
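The first-dimension parsing and the two-pass comparison of claims 7 and 8 can be sketched as follows. The term lists, the use of an ASCII hyphen as the dash, the order in which component groups are tried in the second pass, and the names `parse_first_dimension` and `match_first_dimension` are all assumptions made for illustration.

```python
import re

# Toy term lists; a real system would load these from its libraries.
DIRECTION_TERMS = ("左侧", "右侧", "双侧")
GRADE_TERMS = ("1级", "2级", "3级")
DASH = "-"  # assumption: the dash is written as an ASCII hyphen
BRACKET_RE = re.compile(r"[（(]([^（）()]*)[）)]")


def parse_first_dimension(text):
    """Split a term into the five first-dimension components."""
    bracket = "".join(BRACKET_RE.findall(text))
    rest = BRACKET_RE.sub("", text)
    rest, _, after_dash = rest.partition(DASH)
    direction = [t for t in DIRECTION_TERMS if t in rest]
    grade = [t for t in GRADE_TERMS if t in rest]
    for term in direction + grade:
        rest = rest.replace(term, "")
    return {"direction": direction, "grade": grade, "bracket": bracket,
            "after_dash": after_dash, "core": rest}


def match_first_dimension(name, ontologies):
    """Pass 1: all components equal. Pass 2: core plus one component group."""
    target = parse_first_dimension(name)
    parsed = {o: parse_first_dimension(o) for o in ontologies}
    full = [o for o, p in parsed.items() if p == target]
    if full:
        return full
    # An empty component on both sides counts as equal in this sketch.
    for keys in (("direction", "grade"), ("bracket",), ("after_dash",)):
        partial = [o for o, p in parsed.items()
                   if p["core"] == target["core"]
                   and all(p[k] == target[k] for k in keys)]
        if partial:
            return partial
    return []
```

An empty result from `match_first_dimension` corresponds to the case in which the method falls back to second-dimension parsing and the matching-degree calculation.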
9. The automated International Classification of Diseases coding method according to claim 7, wherein the second-dimension parsing results of the second-type substring or of the ontology are, respectively:
Each Chinese character of the second-type substring or the ontology;
The initial (shengmu) of each Chinese character of the second-type substring or the ontology;
The final (yunmu) of each Chinese character of the second-type substring or the ontology;
The first character of the second-type substring or the ontology;
The pinyin of the first character of the second-type substring or the ontology; and
The non-Chinese characters in the second-type substring or the ontology.
10. The automated International Classification of Diseases coding method according to claim 7, wherein the step of calculating the matching degree between the second-type substring and each ontology based on the second-dimension parsing results of the second-type substring and the second-dimension parsing results of the ontology comprises:
Calculating the similarity between the second-type substring and each ontology according to the following formula:
Wherein M represents the similarity;
t represents each second-dimension parsing result of the second-type substring;
q represents the second-type substring;
t in q represents each second dimension of the second-type substring;
d represents the ontology;
tf(t in d) represents, within the same second dimension, the frequency with which the second-dimension parsing result of the second-type substring matches the second-dimension parsing result of the ontology;
Wherein T represents the total number of ontologies in the ontology dictionary, and T(t) represents the number of ontologies whose second-dimension parsing results match the respective second-dimension parsing result of the second-type substring;
t.getBoost() represents the preset weight of each second dimension;
norm(t, d) represents the length normalization factor of the ontology;
Determining the calculated similarity as the matching degree between the second-type substring and each ontology.
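The symbols defined in this claim (tf, a count ratio of T to T(t), `getBoost`, and `norm`) correspond closely to the classic TF-IDF practical scoring function of Apache Lucene. Assuming that this is the intended form, and leaving out any coordination or query-normalization factor because the claim defines none, the similarity could be written as the following sketch. The exact shape of the idf term is an assumption taken from Lucene's classic similarity, not from the claim text.

```latex
M \;=\; \sum_{t \,\in\, q}
      \mathrm{tf}(t \text{ in } d)\,\cdot\,
      \mathrm{idf}(t)^{2}\,\cdot\,
      t.\mathrm{getBoost}()\,\cdot\,
      \mathrm{norm}(t, d),
\qquad
\mathrm{idf}(t) \;=\; 1 + \ln\!\frac{T}{T(t) + 1}
```

Under this reading, a second-dimension parsing result that matches few ontologies (small T(t)) contributes more to M than one that matches many.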
11. The automated International Classification of Diseases coding method according to claim 7, wherein the step of calculating the matching degree between the second-type substring and each ontology based on the second-dimension parsing results of the second-type substring and the second-dimension parsing results of the ontology comprises:
Determining each Chinese character in the second-type substring;
Calculating the cosine confidence of each ontology matched by the second-type substring according to the following formula:
Calculating the total confidence of each ontology matched by the second-type substring according to the following formula:
S = M × a + N × b
Wherein N represents the cosine confidence;
V represents the total number of Chinese characters contained in the second-type substring and its matched ontology;
Q represents the second-type substring;
d' represents the ontology matched by the second-type substring;
w_{Q,j} represents the frequency with which each Chinese character occurs in the second-type substring;
w_{d',j} represents the frequency with which each Chinese character of the second-type substring occurs in the matched ontology;
j represents the index of a Chinese character contained in the second-type substring and its matched ontology;
S represents the total confidence;
M represents the similarity;
a represents the preset weight corresponding to the similarity M;
b represents the preset weight corresponding to the cosine confidence N;
And the similarity M is calculated according to the following formula:
Wherein t represents each second-dimension parsing result of the second-type substring;
q represents the second-type substring;
t in q represents each second dimension of the second-type substring;
d represents the ontology;
tf(t in d) represents, within the same second dimension, the frequency with which the second-dimension parsing result of the second-type substring matches the second-dimension parsing result of the ontology;
Wherein T represents the total number of ontologies in the ontology dictionary, and T(t) represents the number of ontologies whose second-dimension parsing results match the respective second-dimension parsing result of the second-type substring;
t.getBoost() represents the preset weight of each second dimension;
norm(t, d) represents the length normalization factor of the ontology;
Determining the calculated total confidence as the matching degree between the second-type substring and each ontology.
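The definitions of V, j, w_{Q,j}, and w_{d',j} are consistent with the standard cosine of two character-frequency vectors, N = Σ_j w_{Q,j}·w_{d',j} / (√(Σ_j w_{Q,j}²) · √(Σ_j w_{d',j}²)), with j running from 1 to V. The sketch below assumes that form. The function names, the restriction to the basic CJK Unified Ideographs range, and the default weights are hypothetical; the similarity M is taken as an input.

```python
import math
from collections import Counter


def _chinese_counts(text):
    """Frequency of each Chinese character (basic CJK block only)."""
    return Counter(c for c in text if "\u4e00" <= c <= "\u9fff")


def cosine_confidence(query, ontology):
    """Cosine of the character-frequency vectors of the two strings (N)."""
    wq, wd = _chinese_counts(query), _chinese_counts(ontology)
    chars = set(wq) | set(wd)  # the V distinct characters, indexed by j
    dot = sum(wq[c] * wd[c] for c in chars)
    norm = (math.sqrt(sum(v * v for v in wq.values()))
            * math.sqrt(sum(v * v for v in wd.values())))
    return dot / norm if norm else 0.0


def total_confidence(m, n, a=0.5, b=0.5):
    """S = M * a + N * b, with preset weights a and b."""
    return m * a + n * b
```

Because `Counter` returns 0 for missing keys, characters that occur in only one of the two strings contribute nothing to the dot product, as the formula requires.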
12. The automated International Classification of Diseases coding method according to claim 7, wherein the step of determining, according to the matching degree between the second-type substring and each ontology, one or more ontologies as the ontologies that the second-type substring matches comprises:
Sorting all the ontologies by their matching degree with the second-type substring, and determining a preset number of the top-ranked ontologies as the ontologies that the second-type substring matches;
Or,
Determining one or more ontologies whose matching degree with the second-type substring reaches a preset threshold as the ontologies that the second-type substring matches.
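Both selection strategies can be sketched in a few lines. The function names and the sample scores are hypothetical; the scores stand for matching degrees that have already been computed.

```python
def select_top_k(scores, k):
    """Return the k ontologies with the highest matching degree."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [ontology for ontology, _ in ranked[:k]]


def select_by_threshold(scores, threshold):
    """Return every ontology whose matching degree reaches the threshold."""
    return [ontology for ontology, score in scores.items() if score >= threshold]
```

The top-k variant always returns candidates when any exist, whereas the threshold variant may return none, which would leave the name without a determined code.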
13. An automated International Classification of Diseases coding system, comprising:
A standard terminology library creation module, configured to: determine, according to the ICD version to be referenced, each disease term contained in the ICD version to be referenced as a standard term; determine the code of each standard term according to the ICD version to be referenced; and store the standard terms and their codes to obtain a standard terminology library;
An expansion terminology library creation module, configured to: determine the following types of terms, none of which is contained in the ICD version to be referenced, as expansion terms: common names, aliases, and abbreviations of the standard terms; subclass disease terms of the standard terms; and disease terms newly created after the ICD version to be referenced was published; when the expansion term is judged to be a common name, alias, or abbreviation of any standard term, assign the code of that standard term to the expansion term; when the expansion term is judged to be a subclass disease term of any standard term or a newly created disease term, assign to the expansion term the code of the standard term that is closest to it in genus-species relationship; and store the expansion terms and their codes to obtain an expansion terminology library;
An input module, configured to input Chinese disease diagnosis information;
A data processing module, configured to perform natural language processing on the Chinese disease diagnosis information to obtain one or more names to be encoded;
A coding module, configured to search, based on the standard terminology library and the expansion terminology library, for a standard term or expansion term that matches the name to be encoded, and to determine the code of the successfully matched standard term or expansion term as the code of the name to be encoded;
The data processing module for carrying out natural language processing to the Chinese medical diagnosis on disease information, obtains one or more A title to be encoded, employing mode are:
Step 21, the Chinese medical diagnosis on disease information character string is pre-processed, obtains pretreated Chinese medical diagnosis on disease Information character string;
Step 22, based on ontology dictionary, disease degree glossary, the concurrent glossary of disease, the site of pathological change pre-established Glossary, by the pretreated Chinese medical diagnosis on disease information character string be cut into several first kind substrings and/ Or Second Type substring;
Wherein, the ontology dictionary includes the standard terminology library and the expansion terminology bank, the standard terminology and the expansion It is ontology to fill term;
The disease degree glossary includes several disease degree terms, and the disease degree term is for describing disease urgency The word of chronic degree or disease severity or histological type or clinical stages;
The concurrent glossary of disease includes the concurrent term of several diseases, and the concurrent term of disease is for describing at least two The word that kind disease concurrently occurs;
The site of pathological change glossary includes several site of pathological change terms, and the site of pathological change term is for describing disease hair The word at sick position;
The first kind substring can directly be matched with the ontology in the ontology dictionary, the sub- character of Second Type String can not directly be matched with the ontology in the ontology dictionary;
Step 23, the first kind substring being syncopated as and Second Type substring are determined as title to be encoded;
Wherein, the step 21 includes:
To the non-Chinese character in the Chinese medical diagnosis on disease information character string into row format normalized, and delete the Chinese disease Non-medical term in sick diagnostic message character string obtains pretreated Chinese medical diagnosis on disease information character string, wherein described The non-medical term dictionary that non-medical term is pre-established by one provides, and the word that the non-medical term has been remarks effect Language;
Wherein, the step 22 includes:
Judge the pretreated Chinese medical diagnosis on disease information character string whether comprising symbol;
If the pretreated Chinese medical diagnosis on disease information character string includes symbol, by the pretreated Chinese disease Character in sick diagnostic message character string between every adjacent two symbols is matched as a whole with the ontology in ontology dictionary; If successful match, using the character cutting between the adjacent two symbols out as first kind substring;If matching is lost Lose, then by the adjacent two symbols and its between character be determined as wouldn't cutting character string, and judge described in wouldn't cutting word Whether preset additional character is included in symbol string;
If it is described wouldn't in cutting character string comprising additional character, search described in wouldn't be belonging to cutting character string character mould Type, and the corresponding segmentation rules of character model according to belonging to this to it is described wouldn't cutting character string carry out cutting, will be syncopated as The character come is matched with the ontology in ontology dictionary, if successful match, using the character cut out as the first kind Type substring, if it fails to match, using the character cut out as Second Type substring;Wherein, the character The character model library that model is pre-established by one provides, and the character model has one-to-one segmentation rules;
If described wouldn't not include additional character in cutting character string, by it is described wouldn't cutting character string be determined directly as second Type substring;
If the preprocessed Chinese disease diagnosis information character string contains no symbols, using a mechanical Chinese word segmentation method to match single characters or multiple consecutive characters in the preprocessed Chinese disease diagnosis information character string against the ontologies in the ontology dictionary;
If all characters in the preprocessed Chinese disease diagnosis information character string can be matched to ontologies, cutting out the single characters or multiple consecutive characters in the preprocessed string, according to the matched ontologies, as first-type substrings;
If the preprocessed Chinese disease diagnosis information character string contains a single character or multiple consecutive characters that fail to match any ontology, judging whether the unmatched single character or multiple consecutive characters are a disease course or degree term, a disease complication term, or a lesion site term;
When the unmatched single character or multiple consecutive characters are a disease course or degree term, a disease complication term, or a lesion site term, merging them, according to their position in the preprocessed Chinese disease diagnosis information character string, with the ontology-matched single character or multiple consecutive characters before or after them and cutting out the merged result as a second-type substring, and cutting out the remaining ontology-matched single characters or multiple consecutive characters in the preprocessed string as first-type substrings;
When the unmatched single character or multiple consecutive characters are not a disease course or degree term, a disease complication term, or a lesion site term, cutting out the preprocessed Chinese disease diagnosis information character string as a whole as a second-type substring.
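The segmentation rules in this claim can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: it assumes forward maximum matching as the mechanical word segmentation method (the claim does not name a specific variant), and it assumes an unmatched modifier is merged with the matched ontology that follows it, falling back to the preceding one when the modifier is the last piece. The names `forward_max_match`, `segment`, `ontology`, and `modifiers`, as well as the sample terms, are hypothetical.

```python
from typing import List, Set, Tuple

FIRST_TYPE = 1   # substring that matches an ontology exactly
SECOND_TYPE = 2  # substring that needs further handling


def forward_max_match(text: str, ontology: Set[str]) -> List[Tuple[str, bool]]:
    """Split text into (piece, matched) runs by forward maximum matching."""
    max_len = max((len(term) for term in ontology), default=1)
    pieces: List[Tuple[str, bool]] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shorter ones.
        for size in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + size] in ontology:
                pieces.append((text[i:i + size], True))
                i += size
                break
        else:
            # No ontology starts here: extend or open an unmatched run.
            if pieces and not pieces[-1][1]:
                pieces[-1] = (pieces[-1][0] + text[i], False)
            else:
                pieces.append((text[i], False))
            i += 1
    return pieces


def segment(text: str, ontology: Set[str], modifiers: Set[str]) -> List[Tuple[str, int]]:
    """Cut a symbol-free diagnosis string into typed substrings."""
    pieces = forward_max_match(text, ontology)
    # Case 1: every character belongs to a matched ontology.
    if all(matched for _, matched in pieces):
        return [(piece, FIRST_TYPE) for piece, _ in pieces]
    # Case 3: some unmatched run is not a known modifier term.
    if any(not matched and piece not in modifiers for piece, matched in pieces):
        return [(text, SECOND_TYPE)]
    # Case 2: every unmatched run is a modifier; merge it with a neighbour.
    result: List[Tuple[str, int]] = []
    i = 0
    while i < len(pieces):
        piece, matched = pieces[i]
        if matched:
            result.append((piece, FIRST_TYPE))
            i += 1
        elif i + 1 < len(pieces):
            # Assumption: prefer the matched ontology that follows.
            result.append((piece + pieces[i + 1][0], SECOND_TYPE))
            i += 2
        else:
            # Trailing modifier: merge with the preceding piece, if any.
            previous = result.pop()[0] if result else ""
            result.append((previous + piece, SECOND_TYPE))
            i += 1
    return result
```

With a hypothetical ontology dictionary `{"贫血", "高血压", "糖尿病"}` and the degree term `"重度"`, the string `"高血压重度贫血"` would be cut into `"高血压"` as a first-type substring and `"重度贫血"` as a second-type substring.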
14. The automated International Classification of Diseases coding system according to claim 13, wherein the system further comprises:
A hypothetical classification term bank creation module, configured to: determine as a hypothetical classification term any disease term that is not included in the International Classification of Diseases (ICD) version to be referenced, is related to one of the standard terms, and is clinically treated by default as equivalent to that standard term without being the standard term itself, such as a common name, alias, or abbreviation; assign the code of the standard term related to the hypothetical classification term to the hypothetical classification term; and store the hypothetical classification terms and their codes to obtain a hypothetical classification term bank;
The coding module is further configured to search, based on the hypothetical classification term bank, for a hypothetical classification term that matches the name to be coded, and to determine the code of the successfully matched hypothetical classification term as the code of the name to be coded.
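The term bank described in this claim behaves like a lookup table from aliases to the codes of their related standard terms. The sketch below illustrates that idea only; the function names, the data layout, and the sample entries are hypothetical and are not taken from the patent.

```python
from typing import Dict, Iterable, Optional, Tuple


def build_hypothetical_bank(
    standard_codes: Dict[str, str],
    aliases: Iterable[Tuple[str, str]],
) -> Dict[str, str]:
    """Give each alias the code of the standard term it stands for.

    standard_codes maps standard term -> code in the referenced ICD version.
    aliases yields (alias, related standard term) pairs.
    """
    bank: Dict[str, str] = {}
    for alias, standard_term in aliases:
        if alias in standard_codes:
            # Already a standard term, so it is not a hypothetical one.
            continue
        bank[alias] = standard_codes[standard_term]
    return bank


def code_by_hypothetical_bank(name: str, bank: Dict[str, str]) -> Optional[str]:
    """Return the code for a name to be coded, or None when nothing matches."""
    return bank.get(name)


# Illustrative sample data only.
standard = {
    "acute myocardial infarction, unspecified": "I21.9",
    "essential (primary) hypertension": "I10",
}
alias_pairs = [
    ("heart attack", "acute myocardial infarction, unspecified"),
    ("high blood pressure", "essential (primary) hypertension"),
]
bank = build_hypothetical_bank(standard, alias_pairs)
```

An exact-match dictionary lookup is the simplest reading of "search for a matching term"; a real system might normalize the name first.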
15. The automated International Classification of Diseases coding system according to claim 13, wherein the system further comprises:
A multi-code term bank creation module, configured to: determine as a multi-code term any disease term that is not included in the International Classification of Diseases (ICD) version to be referenced and that is composed of at least two different standard terms; combine the codes of all standard terms that make up the multi-code term to form the code of the multi-code term; and store the multi-code terms and their codes to obtain a multi-code term bank;
The coding module is further configured to search, based on the multi-code term bank, for a multi-code term that matches the name to be coded, and to determine the code of the successfully matched multi-code term as the code of the name to be coded.
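A multi-code term can be modelled as a term whose code is the ordered combination of its constituent standard terms' codes. The sketch below assumes the combination is simply a tuple of codes in the listed order, which the claim does not specify; all names and sample entries are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def build_multi_code_bank(
    standard_codes: Dict[str, str],
    compositions: Dict[str, List[str]],
) -> Dict[str, Tuple[str, ...]]:
    """Map each multi-code term to the codes of its constituent standard terms.

    compositions maps a candidate term -> the standard terms it is made of.
    """
    bank: Dict[str, Tuple[str, ...]] = {}
    for term, parts in compositions.items():
        # Skip terms that are already standard, and terms that are not
        # built from at least two different standard terms.
        if term in standard_codes or len(set(parts)) < 2:
            continue
        bank[term] = tuple(standard_codes[part] for part in parts)
    return bank


def code_by_multi_code_bank(
    name: str, bank: Dict[str, Tuple[str, ...]]
) -> Optional[Tuple[str, ...]]:
    """Return the combined code for a name to be coded, or None if unmatched."""
    return bank.get(name)


# Illustrative sample data only.
standard = {
    "essential (primary) hypertension": "I10",
    "type 2 diabetes mellitus without complications": "E11.9",
}
compositions = {
    "hypertension with type 2 diabetes": [
        "essential (primary) hypertension",
        "type 2 diabetes mellitus without complications",
    ],
}
multi_bank = build_multi_code_bank(standard, compositions)
```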
16. The automated International Classification of Diseases coding system according to claim 13, wherein the system further comprises:
A merged term bank creation module, configured to: determine as a merged term a single standard term that can replace at least two standard terms occurring simultaneously; determine each of the at least two simultaneously occurring standard terms as a merge object of the merged term; determine the code of each merged term according to the International Classification of Diseases (ICD) version to be referenced; and store the merged terms, their codes, and all merge objects of each merged term to obtain a merged term bank;
A preprocessing module, configured to preprocess the one or more names to be coded obtained by the data processing module by judging whether the one or more names to be coded include all merge objects of any one or more merged terms and, if so, replacing all merge objects of those merged terms with the corresponding merged terms; and then to send the preprocessed names to be coded to the coding module.
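The preprocessing step in this claim replaces a complete set of merge objects with the single merged term that covers them. The sketch below shows one way to do that; the function name, the data layout, and the sample terms are hypothetical, and placing the merged term at the position of the first replaced name is an assumption made here for readability.

```python
from typing import Dict, FrozenSet, List


def merge_names(names: List[str], merge_bank: Dict[str, FrozenSet[str]]) -> List[str]:
    """Replace complete sets of merge objects with their merged term.

    merge_bank maps merged term -> the standard terms it can replace.
    """
    result = list(names)
    for merged_term, objects in merge_bank.items():
        # Merge only when every merge object is present.
        if objects and objects.issubset(result):
            first = min(result.index(obj) for obj in objects)
            result = [name for name in result if name not in objects]
            # No merge object precedes `first`, so the index is still valid.
            result.insert(first, merged_term)
    return result


# Illustrative sample data only.
merge_bank = {
    "calculus of gallbladder with acute cholecystitis": frozenset(
        {"calculus of gallbladder", "acute cholecystitis"}
    ),
}
names = ["acute cholecystitis", "essential hypertension", "calculus of gallbladder"]
merged = merge_names(names, merge_bank)
```

When only some merge objects are present, the names pass through unchanged and are coded individually.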
17. The automated International Classification of Diseases coding system according to any one of claims 13 to 16, further comprising:
A no-code processing module, configured to match a name to be coded whose code has not been determined against the no-code terms in a no-code term bank; if the match succeeds, to leave that name uncoded and/or output a preset result; and if the match fails, to send that name to a manual processing platform for manual processing;
Wherein the no-code term bank includes several no-code terms;
The several no-code terms include:
Preset traditional Chinese medicine terms;
Preset surgical operation terms;
Preset drug name terms;
Preset medical consumables terms; and
Preset examination and laboratory test terms.
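The fallback handling in this claim can be sketched as a final routing step for names that no earlier stage could code. The names `handle_uncoded`, `PRESET_RESULT`, and `MANUAL_REVIEW`, the list used as a stand-in for the manual processing platform, and the sample terms are all hypothetical.

```python
from typing import List, Set

PRESET_RESULT = "NO_CODE"        # hypothetical preset output for no-code terms
MANUAL_REVIEW = "MANUAL_REVIEW"  # hypothetical status for escalated names


def handle_uncoded(name: str, no_code_bank: Set[str], manual_queue: List[str]) -> str:
    """Route a name that could not be coded by the earlier stages."""
    if name in no_code_bank:
        # Known non-disease term: do not code it, emit the preset result.
        return PRESET_RESULT
    # Unknown term: hand it over for manual processing.
    manual_queue.append(name)
    return MANUAL_REVIEW


# Illustrative sample data only: a drug name, a surgical operation,
# and an examination term.
no_code_bank = {"aspirin", "appendectomy", "chest X-ray"}
manual_queue: List[str] = []
status_known = handle_uncoded("aspirin", no_code_bank, manual_queue)
status_unknown = handle_uncoded("unclear entry", no_code_bank, manual_queue)
```

Keeping the queue outside the function makes the escalation path easy to swap for a real work-list service.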
18. The automated International Classification of Diseases coding system according to claim 13, wherein the International Classification of Diseases (ICD) version to be referenced is an ICD version published by the World Health Organization (WHO), or any of the localized ICD versions extended from an ICD version published by the WHO.
CN201510496513.0A 2015-08-13 2015-08-13 A kind of International Classification of Diseases coding method of automation and system Active CN105069124B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510496513.0A CN105069124B (en) 2015-08-13 2015-08-13 A kind of International Classification of Diseases coding method of automation and system

Publications (2)

Publication Number Publication Date
CN105069124A CN105069124A (en) 2015-11-18
CN105069124B true CN105069124B (en) 2018-06-15

Family

ID=54498494

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510496513.0A Active CN105069124B (en) 2015-08-13 2015-08-13 A kind of International Classification of Diseases coding method of automation and system

Country Status (1)

Country Link
CN (1) CN105069124B (en)

Families Citing this family (42)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108257667A (en) * 2016-12-28 2018-07-06 中国科学院深圳先进技术研究院 A kind of data processing method and terminal device
CN108320778A (en) * 2017-01-16 2018-07-24 医渡云(北京)技术有限公司 Medical record ICD coding methods and system
CN106844308B (en) * 2017-01-20 2020-04-03 天津艾登科技有限公司 Method for automatic disease code conversion using semantic recognition
CN107784057B (en) * 2017-03-03 2020-07-28 平安医疗健康管理股份有限公司 Medical data matching method and device
CN107833605A (en) * 2017-03-14 2018-03-23 北京大瑞集思技术有限公司 A kind of coding method, device, server and the system of hospital's medical record information
CN107256344A (en) * 2017-06-20 2017-10-17 上海联影医疗科技有限公司 Data processing method, device and radiotherapy management system
CN107491437B (en) * 2017-08-25 2020-09-29 广州宝荣科技应用有限公司 Chinese medicine syndrome semantic recognition method and device based on natural language
CN107577826B (en) * 2017-10-25 2018-05-15 山东众阳软件有限公司 Classification of diseases coding method and system based on raw diagnostic data
CN107731269B (en) * 2017-10-25 2020-06-26 山东众阳软件有限公司 Disease coding method and system based on original diagnosis data and medical record file data
CN107705839B (en) * 2017-10-25 2020-06-26 山东众阳软件有限公司 Disease automatic coding method and system
CN108182972B (en) * 2017-12-15 2021-07-20 中电科软件信息服务有限公司 Intelligent coding method and system for Chinese disease diagnosis based on word segmentation network
CN108170828B (en) * 2018-01-09 2022-04-29 深圳市第二人民医院 Structured clinical diagnosis term set construction method and system
CN108172265A (en) * 2018-01-09 2018-06-15 深圳市第二人民医院 Clinical diagnosis term set update method and its system
CN108182285B (en) * 2018-01-29 2021-03-02 中国平安人寿保险股份有限公司 Information processing method, terminal and computer readable storage medium
CN108446260A (en) * 2018-02-06 2018-08-24 天津艾登科技有限公司 The method and system of automation disease code conversion are carried out based on semantic approximate match algorithm
CN108564991A (en) * 2018-04-13 2018-09-21 重庆医科大学附属儿童医院 Digitization coding case history wrong identification system based on ICD and its recognition methods
CN108920661B (en) * 2018-07-04 2023-08-08 平安健康保险股份有限公司 International disease classification marking method, device, computer equipment and storage medium
CN109256216B (en) * 2018-08-14 2023-06-27 平安医疗健康管理股份有限公司 Medical data processing method, medical data processing device, computer equipment and storage medium
CN109545297A (en) * 2018-10-30 2019-03-29 平安医疗健康管理股份有限公司 A kind of disease coding method and calculating equipment based on big data
CN110491465B (en) * 2019-08-20 2020-09-15 山东众阳健康科技集团有限公司 Disease classification coding method, system, device and medium based on deep learning
CN110852076B (en) * 2019-10-12 2023-05-30 云知声智能科技股份有限公司 Method and device for automatic disease code conversion
CN111046882B (en) * 2019-12-05 2023-01-24 清华大学 Disease name standardization method and system based on profile hidden Markov model
CN110895580B (en) * 2019-12-12 2020-07-07 山东众阳健康科技集团有限公司 ICD operation and operation code automatic matching method based on deep learning
CN111210916B (en) * 2019-12-23 2021-06-29 望海康信(北京)科技股份公司 Medical record home page coding method and system
CN111259664B (en) * 2020-01-14 2023-03-24 腾讯科技(深圳)有限公司 Method, device and equipment for determining medical text information and storage medium
CN111554369B (en) * 2020-04-29 2023-08-04 杭州依图医疗技术有限公司 Medical data processing method, interaction method and storage medium
CN111626876A (en) * 2020-05-27 2020-09-04 泰康保险集团股份有限公司 Insurance auditing method, insurance auditing device, electronic equipment and storage medium
CN112131339A (en) * 2020-09-28 2020-12-25 上海梅斯医药科技有限公司 Name standardization standard processing method, device, computer and storage medium
CN112445917A (en) * 2020-11-05 2021-03-05 中国中医科学院中医药信息研究所 Method and device for constructing traditional medical disease body
CN112562818B (en) * 2020-12-02 2022-06-24 薛蕴菁 System and method for designing and realizing diagnosis logic based on structured report sub-template
CN112668280A (en) * 2020-12-29 2021-04-16 杭州依图医疗技术有限公司 Medical data processing method and device and storage medium
CN112735544A (en) * 2020-12-30 2021-04-30 杭州依图医疗技术有限公司 Medical record data processing method and device and storage medium
CN112700826A (en) * 2020-12-30 2021-04-23 杭州依图医疗技术有限公司 Medical data processing method and device and storage medium
CN112687397B (en) * 2020-12-31 2023-05-09 四川大学华西医院 Rare disease knowledge base processing method and device and readable storage medium
CN112836006B (en) * 2021-01-12 2022-09-23 山东众阳健康科技集团有限公司 Multi-diagnostic intelligent coding method, system, medium and equipment
CN112818085A (en) * 2021-01-28 2021-05-18 东软集团股份有限公司 Value range data matching method and device, storage medium and electronic equipment
CN113033154B (en) * 2021-05-31 2021-08-20 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Reading understanding-based medical concept coding method and device and storage medium
CN113641714A (en) * 2021-08-31 2021-11-12 平安医疗健康管理股份有限公司 Medical data correction method, device, computer equipment and storage medium
CN115964472A (en) * 2021-12-03 2023-04-14 奥码哈(杭州)医疗科技有限公司 ICD coding method, ICD coding query method, coding system and query system
CN115017326B (en) * 2022-05-12 2023-08-18 青岛普瑞盛医药科技有限公司 Medical coding method and device
CN115130431A (en) * 2022-07-05 2022-09-30 上海妙一生物科技有限公司 Coding method and coding device based on medical diseases and medicines
CN115080751B (en) * 2022-08-16 2022-11-11 之江实验室 Medical standard term management system and method based on general model

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102456100A (en) * 2010-11-03 2012-05-16 通用电气公司 Systems, methods, and apparatus for computer-assisted full medical code scheme to code scheme mapping
CN104156415A (en) * 2014-07-31 2014-11-19 沈阳锐易特软件技术有限公司 Mapping processing system and method for solving problem of standard code control of medical data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
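Research and Implementation of Chinese Word Segmentation Algorithms (中文分词算法的研究与实现); Lin Dongsheng (林冬盛); China Master's Theses Full-text Database, Information Science and Technology (《中国优秀硕士学位论文全文数据库信息科技辑》); 2011-08-15 (No. 08); pp. 23-24 *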

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109993227A (en) * 2019-03-29 2019-07-09 京东方科技集团股份有限公司 Method, system, device and the medium of automatic addition International Classification of Diseases coding
CN109993227B (en) * 2019-03-29 2021-09-24 京东方科技集团股份有限公司 Method, system, apparatus and medium for automatically adding international disease classification code

Also Published As

Publication number Publication date
CN105069124A (en) 2015-11-18

Similar Documents

Publication Publication Date Title
CN105069124B (en) A kind of International Classification of Diseases coding method of automation and system
CN106934220B (en) Disease class entity recognition method and device towards multi-data source
CN105095665B (en) A kind of natural language processing method and system of Chinese medical diagnosis on disease information
CN109299472B (en) Text data processing method and device, electronic equipment and computer readable medium
CN105069123B (en) A kind of automatic coding and system of Chinese surgical procedure information
CN105184053B (en) A kind of automatic coding and system of Chinese medical service item information
CN111090461B (en) Code annotation generation method based on machine translation model
CN111708874A (en) Man-machine interaction question-answering method and system based on intelligent complex intention recognition
US9043206B2 (en) System and methods for matching an utterance to a template hierarchy
CN111401066B (en) Artificial intelligence-based word classification model training method, word processing method and device
CN110442869A (en) A kind of medical treatment text handling method and its device, equipment and storage medium
CN105138829B (en) A kind of natural language processing method and system of Chinese medical information
Kanwal et al. Urdu named entity recognition: Corpus generation and deep learning applications
CN104484411B (en) A kind of construction method of the semantic knowledge-base based on dictionary
CN109918672B (en) Structural processing method of thyroid ultrasound report based on tree structure
CN106844351A (en) A kind of medical institutions towards multi-data source organize class entity recognition method and device
CN103235775B (en) A kind of statistical machine translation method merging translation memory and phrase translation model
CN110287482A (en) Semi-automation participle corpus labeling training device
CN113658720A (en) Method, apparatus, electronic device and storage medium for matching diagnostic name and ICD code
CN106484676A (en) Biological Text protein reference resolution method based on syntax tree and domain features
Popescu-Belis et al. Reference resolution beyond coreference: a conceptual frame and its application
CN110263345A (en) Keyword extracting method, device and storage medium
CN114358021A (en) Task type dialogue statement reply generation method based on deep learning and storage medium
Zhang et al. Constructing covid-19 knowledge graph from a large corpus of scientific articles
Sangiacomo et al. Recreating the Network of Early Modern Natural Philosophy: A Mono-and Multilingual Text Data Vectorization Method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant