CN105069124A - Automatic ICD (International Classification of Diseases) coding method and system - Google Patents

Info

Publication number
CN105069124A
Authority
CN
China
Prior art keywords
term
disease
coding
terminology
character
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510496513.0A
Other languages
Chinese (zh)
Other versions
CN105069124B (en)
Inventor
金以东
朱华玲
陈志永
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ebaotech Internet Medical Information Technology (beijing) Co Ltd
Original Assignee
Ebaotech Internet Medical Information Technology (beijing) Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ebaotech Internet Medical Information Technology (beijing) Co Ltd
Priority to CN201510496513.0A
Publication of CN105069124A
Application granted
Publication of CN105069124B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Machine Translation (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

An embodiment of the present invention provides an automatic ICD (International Classification of Diseases) coding method. The method comprises: inputting Chinese disease diagnosis information; performing natural language processing on the Chinese disease diagnosis information to obtain one or more names to be coded; and, based on a standard terminology database and an expansion terminology database, searching for standard terms or expansion terms that match the names to be coded and determining the codes of the successfully matched standard terms or expansion terms as the codes of the names to be coded. The standard terms are the disease terms contained in the ICD version to be referenced; the expansion terms are colloquial names, aliases, or abbreviations of the standard terms, subclass disease terms of the standard terms, or newly arisen disease terms. With this method, ICD coding can be completed automatically without human participation, with the advantages of high coding speed, low cost, and high accuracy. An embodiment of the present invention also provides an automatic ICD coding system.

Description

An automated International Classification of Diseases coding method and system
Technical field
Embodiments of the present invention relate to the field of disease classification, and in particular to an automated International Classification of Diseases (ICD) coding method and system.
Background
The International Classification of Diseases (ICD) is a system that classifies diseases by rule according to certain of their characteristics and represents them with codes. It has been in use in China for more than twenty years. The most widely used ICD version worldwide at present is ICD-10, published by the World Health Organization (WHO) in 1992. Under WHO rules, WHO provides only the 4-character ICD-10 codes; countries and regions may extend ICD-10 as needed to form localized versions (for example, adding extension codes to increase the number of diseases covered).
ICD standardizes and formats disease terminology. It is the foundation for applying medical informatization and medical information management, and an important basis for medical insurance settlement. Effective use of ICD therefore plays a very important role in the development of the health care system.
ICD is currently applied in two main ways: manual coding and computer-aided coding. In China, manual coding remains in use today. The medical record rooms of large hospitals all have dedicated professional coder posts; coders who have received professional education and training query a dictionary database according to the coding rules and select the code identical or closest to the doctor's diagnosis. With the development of networks and informatization, computer-aided coding has become a focus of the field and has strong development potential. The prevailing domestic approach is to build disease classification paths and a code database and configure them in an information system, which guides the user and automatically recommends codes based on the manually entered diagnosis; a person then confirms the selection.
Summary of the invention
Both the current manual coding approach and the computer-aided coding approach require human participation to complete. This human participation is inefficient and costly, and different people may produce different coding results, which hinders work such as medical information management and the auditing of medical insurance settlements.
In addition, the Chinese disease diagnosis information entered by doctors is natural language: its form is complex and varied and follows no unified standard (for example, it may mix several languages, use non-standard grammar, contain entry errors, replace standard terms with abbreviations or colloquial names, or have meaningless symbols mixed into the text). This further increases the difficulty of coding and raises the error rate.
An improved ICD coding approach is therefore much needed.
In this context, embodiments of the present invention aim to provide an automated International Classification of Diseases coding method and system.
In a first aspect of embodiments of the present invention, an automated International Classification of Diseases coding method is provided, comprising:
Step 1: inputting Chinese disease diagnosis information;
Step 2: performing natural language processing on the Chinese disease diagnosis information to obtain one or more names to be coded;
Step 3: based on a standard terminology database and an expansion terminology database, searching for a standard term or expansion term that matches the name to be coded, and determining the code of the successfully matched standard term or expansion term as the code of the name to be coded;
wherein the standard terminology database is created as follows:
determining the International Classification of Diseases (ICD) version to be referenced;
determining each disease term contained in the ICD version to be referenced as a standard term;
determining the code of each standard term according to the ICD version to be referenced;
storing the standard terms and their codes to obtain the standard terminology database;
wherein the expansion terminology database is created as follows:
determining as expansion terms the following types of terms not contained in the ICD version to be referenced: colloquial names, aliases, or abbreviations of the standard terms; subclass disease terms of the standard terms; and new disease terms that arose after the ICD version to be referenced was published;
when an expansion term is a colloquial name, alias, or abbreviation of any standard term, assigning the code of that standard term to the expansion term;
when an expansion term is a subclass disease term of any standard term, or a newly arisen disease term, assigning to the expansion term the code of the standard term whose genus-species relationship with it is closest;
storing the expansion terms and their codes to obtain the expansion terminology database.
In a second aspect of embodiments of the present invention, an automated International Classification of Diseases coding system is provided, comprising:
a standard terminology database creation module, configured to: determine, according to the ICD version to be referenced, each disease term contained in that version as a standard term; determine the code of each standard term according to that version; and store the standard terms and their codes to obtain the standard terminology database;
an expansion terminology database creation module, configured to: determine as expansion terms the following types of terms not contained in the ICD version to be referenced: colloquial names, aliases, or abbreviations of the standard terms; subclass disease terms of the standard terms; and new disease terms that arose after the ICD version to be referenced was published; upon judging that an expansion term is a colloquial name, alias, or abbreviation of any standard term, assign the code of that standard term to the expansion term; upon judging that an expansion term is a subclass disease term of any standard term, or a newly arisen disease term, assign to the expansion term the code of the standard term whose genus-species relationship with it is closest; and store the expansion terms and their codes to obtain the expansion terminology database;
an input module, configured to input Chinese disease diagnosis information;
a data processing module, configured to perform natural language processing on the Chinese disease diagnosis information to obtain one or more names to be coded;
a coding module, configured to search, based on the standard terminology database and the expansion terminology database, for a standard term or expansion term that matches the name to be coded, and to determine the code of the successfully matched standard term or expansion term as the code of the name to be coded.
The International Classification of Diseases coding method and system according to embodiments of the present invention take full account of the fact that the Chinese disease diagnosis information entered by doctors is natural language, complex and varied in form, and without a unified standard. Multiple dictionaries built in advance according to ICD-9-CM-3 are used to match the character strings of Chinese surgical operation information, so that operation names are identified and coded automatically, quickly, and accurately. The whole process completes ICD coding automatically without human participation, which increases coding speed, reduces coding cost, and ensures coding accuracy.
Overview of the invention
The inventors found that in the medical field, different regions, different institutions, and different practitioners commonly follow different disease terminology standards when using disease terms (for example, the same disease term may have several expressions), and that those standards are incomplete in coverage (for example, they cannot cover newly arisen terms). As a result, the Chinese disease diagnosis information that is produced (for example, the information recorded on basic medical insurance settlement statements) contains a large number of irregular disease terms, which greatly obstructs ICD coding based on that information. In this situation the irregular disease terms have to be resolved by people, that is, by the manual coding or computer-aided coding commonly used today; but ICD coding with human participation is inefficient and costly, and different people may produce different coding results.
To this end, the present invention provides an automated ICD coding mechanism. The ICD coding process may be: inputting Chinese disease diagnosis information; performing natural language processing on it to obtain one or more names to be coded; and, based on a standard terminology database and an expansion terminology database, searching for a standard term or expansion term that matches each name to be coded and determining the code of the successfully matched standard term or expansion term as the code of that name.
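The three-stage process described above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names, the separator-based stand-in for natural language processing, and the sample entries in both dictionaries are all assumptions made for the example.

```python
from typing import Dict, List, Optional

# Illustrative entries only; a real standard terminology database would be
# loaded from the ICD version chosen as the reference.
STANDARD_TERMS: Dict[str, str] = {
    "原发性高血压": "I10",
    "急性阑尾炎": "K35",
}
# Illustrative expansion entry: an alias that inherits the standard term's code.
EXPANSION_TERMS: Dict[str, str] = {
    "高血压病": "I10",
}


def extract_names(diagnosis: str) -> List[str]:
    """Hypothetical stand-in for the natural language processing step:
    split the input on common separators and drop empty pieces."""
    for sep in ",;;、":
        diagnosis = diagnosis.replace(sep, ",")
    return [part.strip() for part in diagnosis.split(",") if part.strip()]


def code_name(name: str) -> Optional[str]:
    """Look the name up in the standard database first, then the expansion one."""
    if name in STANDARD_TERMS:
        return STANDARD_TERMS[name]
    return EXPANSION_TERMS.get(name)


def code_diagnosis(diagnosis: str) -> Dict[str, Optional[str]]:
    """Run the whole pipeline; names with no match map to None."""
    return {name: code_name(name) for name in extract_names(diagnosis)}
```

Unmatched names are returned with `None` here so that a caller can decide how to handle them; the source text does not specify that behaviour at this point.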
The standard terminology database is created according to the ICD version to be referenced and contains standard terms and their codes. A standard term is a disease term contained in the ICD version to be referenced, and its code is identical to its code in that version. The expansion terminology database contains expansion terms and their codes. Expansion terms are terms of the following types that are not contained in the ICD version to be referenced: colloquial names, aliases, or abbreviations of standard terms; subclass disease terms of standard terms; or new disease terms that arose after the ICD version to be referenced was published. The code of an expansion term is the code of the standard term synonymous with it, or the code of the standard term whose genus-species relationship with it is closest.
In the present invention, the standard terminology database covers all the disease terms recorded in the ICD version to be referenced, together with their codes. The expansion terminology database covers disease terms not contained in that version: colloquial names, aliases, or abbreviations of diseases that are commonly used in particular regions or institutions; subclass disease terms of the disease terms recorded in the ICD version; and disease terms newly produced as medical technology develops. Together, the two databases cover the disease terms likely to appear in the great majority of Chinese disease diagnosis information and largely satisfy the requirement of automatically resolving the disease terms in that information, which makes automated ICD coding achievable. The whole ICD coding process requires no human participation and offers high coding speed, low cost, and high accuracy.
Having described the basic principle of the present invention, various non-limiting embodiments of the invention are introduced in detail below.
Overview of the application scenario
Reference is first made to Fig. 1, which shows an application scenario in which embodiments of the present invention can be implemented.
The scenario shown in Fig. 1 comprises a medical information processing terminal 100 and a medical information processing server 200. The medical information processing terminal 100 may be a device used by a doctor, such as a desktop computer, notebook computer, tablet computer, or personal digital assistant. The medical information processing server 200 may be, for example, a server running a hospital information management system. The terminal 100 and the server 200 may be communicatively connected, for example, over the hospital local area network.
When ICD coding is to be performed on Chinese disease diagnosis information, the information may be entered at the medical information processing terminal 100, more specifically, for example, in the interface of software installed on the terminal 100; alternatively, a large batch of Chinese disease diagnosis information may be imported into the terminal 100 from a data storage device such as a USB flash drive or portable hard drive. The medical information processing server 200 receives the Chinese disease diagnosis information and performs natural language processing on it to obtain names to be coded. The server 200 then queries the standard terminology database and the expansion terminology database for a standard term or expansion term that matches each name to be coded, and finally determines the code of the matching standard term or expansion term as the code of the name to be coded.
Exemplary method
The ICD coding method according to an exemplary embodiment of the present invention is described below with reference to Figs. 2A to 2E, in conjunction with the application scenario of Fig. 1.
It should be noted that the above application scenario is shown only to aid understanding of the spirit and principle of the present invention; embodiments of the present invention are not limited in this respect. On the contrary, embodiments of the present invention may be applied to any applicable scenario.
For example, Fig. 2A shows a flowchart of the ICD coding method of one embodiment of the present invention, together with the standard terminology database and the expansion terminology database.
As shown in Fig. 2A, the ICD coding method may comprise:
Step S101: inputting Chinese disease diagnosis information.
Optionally, the Chinese disease diagnosis information may be medical record information entered by medical staff, or information recorded on a basic medical insurance settlement statement.
Step S102: performing natural language processing on the Chinese disease diagnosis information to obtain one or more names to be coded.
Specifically, based on the characteristics of Chinese disease diagnosis information, this step may apply processing such as mechanical Chinese word segmentation to the information and then parse disease terms out of it; the disease terms parsed from the Chinese disease diagnosis information are the names to be coded.
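As one illustration of mechanical (dictionary-based) word segmentation, the sketch below uses forward maximum matching, a common technique of that family. The text does not state which segmentation algorithm is actually used, so the choice of algorithm, the function name `forward_max_match`, and the sample vocabulary are all assumptions.

```python
from typing import List, Optional, Set


def forward_max_match(text: str, vocabulary: Set[str],
                      max_len: Optional[int] = None) -> List[str]:
    """Segment text by repeatedly taking the longest vocabulary word that
    starts at the current position; unknown characters become single tokens."""
    if max_len is None:
        max_len = max((len(word) for word in vocabulary), default=1)
    tokens: List[str] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a word matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocabulary:
                tokens.append(piece)
                i += size
                break
    return tokens


# Hypothetical vocabulary for the example.
SAMPLE_VOCABULARY = {"急性", "阑尾炎", "急性阑尾炎", "伴", "腹膜炎"}
```

With this vocabulary, `forward_max_match("急性阑尾炎伴腹膜炎", SAMPLE_VOCABULARY)` yields three tokens: the full term 急性阑尾炎 is preferred over its shorter parts, followed by 伴 and 腹膜炎.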
One specific implementation of the natural language processing of Chinese disease diagnosis information in this exemplary method is described later in Embodiment Five.
Step S103: based on the standard terminology database and the expansion terminology database, searching for a standard term or expansion term that matches the name to be coded, and determining the code of the successfully matched standard term or expansion term as the code of the name to be coded.
In this embodiment, the standard terminology database is created by the steps shown in Fig. 2B:
Step A1: determining the International Classification of Diseases (ICD) version to be referenced.
Optionally, the ICD version to be referenced may be an ICD version published by WHO (for example, ICD-10, published by WHO in 1992), or any of the localized ICD versions that extend a WHO version (for example, the Chinese edition of ICD-10 recommended by the Ministry of Health of China). In a specific implementation, a suitable ICD version may be selected as the reference according to actual needs; the present invention places no limitation on this.
Step A2: determining each disease term contained in the ICD version to be referenced as a standard term.
Step A3: determining the code of each standard term according to the ICD version to be referenced.
Specifically, since the ICD version to be referenced explicitly records the code of each disease term, the code of each standard term can be determined directly from it.
Step A4: storing the standard terms and their codes to obtain the standard terminology database.
Optionally, the standard terminology database may store the standard terms and their codes in the form of a data table or a tree structure.
ICD records disease terms according to relationships such as category and genus-species, and these relationships between disease terms help speed up the search for a specified disease term. Accordingly, when the standard terminology database is created, the data table or tree structure can be built according to the category, genus-species, and other relationships of each disease term in the ICD version to be referenced, so that the stored standard terms are clearly structured and easy to search, which helps speed up the matching of names to be coded.
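A tree-structured standard terminology database of the kind described above might look like the following sketch. The `TermNode` class, its method names, and the three sample nodes are assumptions for illustration; the actual layout in Fig. 2C is not reproduced here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TermNode:
    """One node of the tree: a chapter, a block, or an individual standard term."""
    term: str
    code: str
    children: List["TermNode"] = field(default_factory=list)

    def add_child(self, term: str, code: str) -> "TermNode":
        node = TermNode(term, code)
        self.children.append(node)
        return node

    def find(self, term: str) -> Optional["TermNode"]:
        """Depth-first search of this node and its descendants."""
        if self.term == term:
            return self
        for child in self.children:
            hit = child.find(term)
            if hit is not None:
                return hit
        return None


# Illustrative fragment: chapter -> block -> standard term.
CHAPTER = TermNode("循环系统疾病", "I00-I99")
BLOCK = CHAPTER.add_child("高血压病", "I10-I15")
BLOCK.add_child("原发性高血压", "I10")
```

Calling `find` on a block such as `BLOCK` rather than on the whole tree restricts the search to that subtree, which is the kind of speed-up the category and genus-species relationships are meant to allow.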
Optionally, the standard terminology database may also be modified in real time. For example, when an updated edition of the referenced ICD version is released, standard terms are added, modified, or deleted according to the update, so that the standard terminology database better meets the needs of ICD coding.
Fig. 2C shows a standard terminology database in tree-structure form in this embodiment.
In this embodiment, the expansion terminology database is created by the steps shown in Fig. 2D:
Step B1: determining as expansion terms the following types of terms not contained in the ICD version to be referenced: colloquial names, aliases, or abbreviations of the standard terms; subclass disease terms of the standard terms; and new disease terms that arose after the ICD version to be referenced was published.
In the medical field, practitioners in different regions or institutions may use not the disease term recorded in the ICD version (the standard term) but a colloquial name, alias, or abbreviation of it, or a more refined name (a subclass disease name). In addition, as medical technology develops, new disease terms keep appearing, and previously issued ICD versions do not cover them. In view of this, for the specific region or institution where the method is implemented, the colloquial names, aliases, and abbreviations of standard terms used in actual work, as well as newly arisen disease terms, can be collected and stored in the expansion terminology database as expansion terms, to meet the needs of ICD coding.
Step B2: when an expansion term is a colloquial name, alias, or abbreviation of any standard term, assigning the code of that standard term to the expansion term; when an expansion term is a subclass disease term of any standard term, or a newly arisen disease term, assigning to the expansion term the code of the standard term whose genus-species relationship with it is closest.
When an expansion term is a colloquial name, alias, or abbreviation of a standard term, the two are synonyms, so the code of the standard term can be used directly as the code of the expansion term.
When an expansion term is a subclass disease term of a standard term, the standard term whose genus-species relationship with the subclass disease term is closest can be determined from clinical experience for coding purposes, and the code of that standard term is used as the code of the subclass disease term.
Because previously issued ICD versions cannot cover newly arisen disease terms, for coding purposes the standard term whose genus-species relationship with each new disease term is closest can be found from clinical experience, and the code of the standard term found is used as the code of the new disease term.
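The code-assignment rule of step B2 can be sketched as below. In every case the expansion term inherits the code of one standard term; what differs is how that standard term is chosen (a synonym, or the closest term by genus-species relationship as judged from clinical experience), and that choice is supplied as input here rather than computed. The names `ExpansionKind` and `build_expansion_db` and all sample entries are hypothetical.

```python
from enum import Enum
from typing import Dict, Iterable, Tuple


class ExpansionKind(Enum):
    SYNONYM = "colloquial name, alias, or abbreviation"
    SUBCLASS = "subclass disease term"
    NEW_TERM = "newly arisen disease term"


# (expansion term, kind, standard term chosen as its reference)
Entry = Tuple[str, ExpansionKind, str]


def build_expansion_db(entries: Iterable[Entry],
                       standard_db: Dict[str, str]
                       ) -> Dict[str, Tuple[str, ExpansionKind]]:
    """Give each expansion term the code of its reference standard term."""
    expansion_db: Dict[str, Tuple[str, ExpansionKind]] = {}
    for term, kind, reference in entries:
        if term in standard_db:
            # Expansion terms are, by definition, not in the reference ICD version.
            raise ValueError(f"{term!r} is already a standard term")
        # A KeyError here means the reference is not a known standard term.
        expansion_db[term] = (standard_db[reference], kind)
    return expansion_db


# Illustrative data only.
SAMPLE_STANDARD_DB = {"原发性高血压": "I10", "急性阑尾炎": "K35"}
SAMPLE_ENTRIES = [
    ("高血压病", ExpansionKind.SYNONYM, "原发性高血压"),
    ("急性化脓性阑尾炎", ExpansionKind.SUBCLASS, "急性阑尾炎"),
]
```

Recording the kind alongside the code is a design choice made for this sketch so that entries can be reviewed later; the source only requires storing the term and its code.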
Step B3: storing the expansion terms and their codes to obtain the expansion terminology database.
Optionally, the expansion terminology database may store the expansion terms and their codes in the form of a data table or a tree structure.
Optionally, the expansion terminology database may also be modified in real time, for example by adding colloquial names, aliases, or abbreviations of standard terms, or by adding newly arisen disease terms, so that the database contains more expansion terms and meets the needs of ICD coding.
Fig. 2E shows an expansion terminology database in data-table form in this embodiment; the shaded part of Fig. 2E is explanatory and would not appear in an actual expansion terminology database.
Optionally, in a specific implementation of step S103, the standard terminology database and the expansion terminology database may be traversed to find the standard term or expansion term matching the name to be coded. Since traversing the terminology databases may be costly in time, optionally, the likely genus-species relationship of the name to be coded may first be judged from its semantics, and the matching standard term or expansion term then searched for in the specific data table or tree structure.
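The two lookup strategies just described, full traversal and a search narrowed by a guessed category, can be contrasted in a short sketch. How the category is judged from semantics is not specified at this point in the text, so the keyword-based `guess_category_by_keyword` is purely a placeholder, and the partition names and sample records are assumptions.

```python
from typing import Callable, Dict, List, Optional, Tuple

Record = Tuple[str, str]  # (term, code)


def lookup_by_traversal(name: str, records: List[Record]) -> Optional[str]:
    """Scan every record until the name is found."""
    for term, code in records:
        if term == name:
            return code
    return None


def lookup_narrowed(name: str,
                    partitions: Dict[str, List[Record]],
                    guess_category: Callable[[str], Optional[str]]) -> Optional[str]:
    """Search only the partition suggested by the name; fall back to a
    full traversal when no category can be guessed."""
    category = guess_category(name)
    if category is None or category not in partitions:
        everything = [rec for bucket in partitions.values() for rec in bucket]
        return lookup_by_traversal(name, everything)
    return lookup_by_traversal(name, partitions[category])


def guess_category_by_keyword(name: str) -> Optional[str]:
    """Placeholder for the semantic judgement: map a keyword to a partition."""
    keywords = {"高血压": "circulatory", "阑尾": "digestive"}
    for keyword, category in keywords.items():
        if keyword in name:
            return category
    return None


# Illustrative partitions mixing standard and expansion terms.
SAMPLE_PARTITIONS: Dict[str, List[Record]] = {
    "circulatory": [("原发性高血压", "I10"), ("高血压病", "I10")],
    "digestive": [("急性阑尾炎", "K35")],
    "respiratory": [("肺炎", "J18.9")],
}
```

The fallback to full traversal is an assumption of this sketch; it keeps a wrong or missing category guess from turning a findable term into a miss.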
One specific implementation of how this exemplary method searches for the standard term or expansion term matching a name to be coded is described later in Embodiment Six.
In this embodiment, the standard terminology database and the expansion terminology database cover the disease terms likely to appear in the great majority of Chinese disease diagnosis information and largely satisfy the requirement of automatically resolving the disease terms in that information, which makes automated ICD coding achievable. The ICD coding method provided by this embodiment requires no human participation and offers high coding speed, low cost, and high accuracy.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings used in describing the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 schematically shows an application scenario in which embodiments of the present invention can be implemented;
Fig. 2A schematically shows a flowchart of the ICD coding method in the exemplary method;
Fig. 2B schematically shows a flowchart of creating the standard terminology database in the exemplary method;
Fig. 2C schematically shows the standard terminology database in data-table form in the exemplary method;
Fig. 2D schematically shows a flowchart of creating the expansion terminology database in the exemplary method;
Fig. 2E schematically shows the expansion terminology database in data-table form in the exemplary method;
Fig. 3A schematically shows a flowchart of the ICD coding method in Embodiment One of the present invention;
Fig. 3B schematically shows a flowchart of creating the hypothetical classification terminology database in Embodiment One of the present invention;
Fig. 3C schematically shows the hypothetical classification terminology database in data-table form in Embodiment One of the present invention;
Fig. 4A schematically shows a flowchart of the ICD coding method in Embodiment Two of the present invention;
Fig. 4B schematically shows a flowchart of creating the multi-code terminology database in Embodiment Two of the present invention;
Fig. 4C schematically shows the multi-code terminology database in data-table form in Embodiment Two of the present invention;
Fig. 5A schematically shows a flowchart of the ICD coding method in Embodiment Three of the present invention;
Fig. 5B schematically shows a flowchart of creating the merged terminology database in Embodiment Three of the present invention;
Fig. 5C schematically shows the merged terminology database in data-table form in Embodiment Three of the present invention;
Fig. 6A schematically shows a flowchart of the ICD coding method in Embodiment Four of the present invention;
Fig. 6B schematically shows the no-code terminology database in data-table form in Embodiment Four of the present invention;
Fig. 7 schematically shows a structural block diagram of an ICD coding system in the exemplary device;
Fig. 8 schematically shows a structural block diagram of another ICD coding system in the exemplary device;
Fig. 9 schematically shows a structural block diagram of a further ICD coding system in the exemplary device;
Fig. 10 schematically shows a structural block diagram of a further ICD coding system in the exemplary device;
Fig. 11 schematically shows a structural block diagram of a further ICD coding system in the exemplary device;
Fig. 12A schematically shows a flowchart of performing natural language processing on Chinese disease diagnosis information in Embodiment Five of the present invention;
Fig. 12B schematically shows some of the disease-degree terms contained in the disease-degree vocabulary;
Fig. 12C schematically shows some of the disease-complication terms contained in the disease-complication vocabulary;
Fig. 12D schematically shows some of the lesion-site terms contained in the lesion-site vocabulary;
Fig. 12E schematically shows a flowchart of splitting out first-type substrings and second-type substrings in Embodiment Five of the present invention;
Fig. 12F schematically shows one segmentation rule;
Fig. 12G schematically shows another segmentation rule;
Fig. 12H schematically shows a further segmentation rule;
Fig. 12I schematically shows a further segmentation rule;
Fig. 12J schematically shows a further segmentation rule;
Fig. 12K schematically shows a further segmentation rule;
Fig. 13 schematically shows a flowchart of searching for the standard term or expansion term matching a name to be coded in Embodiment Six of the present invention.
In the drawings, identical or corresponding reference numerals denote identical or corresponding parts.
Embodiment
The principle and spirit of the present invention are described below with reference to several illustrative embodiments. It should be understood that these embodiments are provided only to enable those skilled in the art to better understand and then implement the present invention, and do not limit the scope of the invention in any way. Rather, these embodiments are provided so that the disclosure is thorough and complete and conveys the scope of the disclosure fully to those skilled in the art.
Those skilled in the art know that embodiments of the present invention can be implemented as a system, device, equipment, method or computer program. Therefore, the disclosure can take the following forms: entirely hardware, entirely software (including firmware, resident software, microcode and the like), or a combination of hardware and software.
According to embodiments of the present invention, an automated International Classification of Diseases coding method and system are proposed.
In this document, it should be understood that "clinical" refers to a doctor personally attending at the bedside to diagnose a patient and treat disease, and more generally to the professional practice of medical institutions.
In addition, any number of elements in the accompanying drawings is given by way of example and not limitation, and any naming is used only for distinction and carries no limiting meaning.
The principle and spirit of the present invention are explained in detail below with reference to several representative embodiments of the present invention.
embodiment one
Figs. 3A to 3C show the ICD coding method of one embodiment of the present invention.
As shown in Fig. 3A, this ICD coding method may comprise:
Step S201, input Chinese disease diagnosis information.
Step S202, perform natural language processing on the Chinese disease diagnosis information to obtain one or more titles to be encoded.
Step S203, based on the standard terminology bank, the expand terminology bank and the Hypothetical classification terminology bank, search for a standard term, expand term or Hypothetical classification term that matches the title to be encoded, and determine the code of the successfully matched standard term, expand term or Hypothetical classification term as the code of the title to be encoded.
The present embodiment creates the standard terminology bank and the expand terminology bank by the same method as the exemplary method, which is not repeated here.
In the present embodiment, the Hypothetical classification terminology bank is created according to the steps shown in Fig. 3B:
Step C1, a disease term that is not contained in the ICD version to be referenced, that is related to a standard term and is clinically taken by default to be equivalent to that standard term, and that is not a colloquial name, alias or abbreviation of that standard term, is defined as a Hypothetical classification term.
Step C2, the code of the standard term related to the Hypothetical classification term is assigned to that Hypothetical classification term.
In the medical field the following situation often arises: a disease is divided into several types, one of which is clinically common while the others are clinically rare. In this situation, medical staff filling in or reading a medical record usually take the general name of the disease, by default, to mean the clinically common type, and write out the name of a rare type explicitly only when that rare type is diagnosed. For example, mitral stenosis is divided into rheumatic mitral stenosis and non-rheumatic mitral stenosis. Rheumatic mitral stenosis is clinically common, whereas non-rheumatic mitral stenosis is very rare. Medical staff therefore usually treat "mitral stenosis" as equivalent to "rheumatic mitral stenosis" when filling in or reading a medical record, and write "non-rheumatic mitral stenosis" only when that condition is diagnosed, in order to distinguish it.
The ICD, however, may not record the general name of such a disease and may record only its specific types. For example, the ICD does not record the disease term "mitral stenosis", but records "rheumatic mitral stenosis" and "non-rheumatic mitral stenosis". In this case, when ICD coding is performed on a general disease name appearing in the Chinese disease diagnosis information, it is unclear under which specific type the name should be classified.
In the present embodiment, the general name of a disease in the above situation is defined as a Hypothetical classification term.
When ICD coding is performed and such a Hypothetical classification term is encountered, it can be assumed to denote the clinically common type of the disease, and the code of that clinically common type is assigned to the Hypothetical classification term.
For example, the Hypothetical classification term "mitral stenosis" has the same code as "rheumatic mitral stenosis".
Step C3, store the Hypothetical classification terms and their codes to obtain the Hypothetical classification terminology bank.
Optionally, the Hypothetical classification terminology bank may store the Hypothetical classification terms and their codes in the form of a data table or a tree structure.
Optionally, the Hypothetical classification terminology bank may also be revised in real time, for example by adding new Hypothetical classification terms or deleting existing ones, so that it better meets the needs of ICD coding.
Fig. 3C shows the Hypothetical classification terminology bank of the present embodiment in data table form. The shaded part in Fig. 3C is explanatory and does not appear in the actual Hypothetical classification terminology bank.
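Steps C1 to C3 can be illustrated with a minimal Python sketch. It assumes each bank is a plain dictionary from term to code; the function name `build_hypothetical_bank`, the mapping `assumed_equivalents` and the placeholder code value are hypothetical and are not taken from the patent.

```python
# Sketch of steps C1-C3. All names and the code value are illustrative.

def build_hypothetical_bank(standard_bank, assumed_equivalents):
    """Give each general disease name the code of its default standard term.

    standard_bank: dict mapping standard term -> ICD code.
    assumed_equivalents: dict mapping general name -> the standard term it
        is clinically assumed to mean.
    """
    bank = {}
    for general_name, standard_term in assumed_equivalents.items():
        # Step C1: a name already present in the referenced ICD version
        # is not a Hypothetical classification term, so it is skipped.
        if general_name in standard_bank:
            continue
        # Step C2: inherit the code of the related standard term.
        bank[general_name] = standard_bank[standard_term]
    # Step C3: the resulting table is the Hypothetical classification bank.
    return bank


# Placeholder code; real codes come from the referenced ICD version.
standard_bank = {"rheumatic mitral stenosis": "CODE-RMS"}
hypothetical_bank = build_hypothetical_bank(
    standard_bank, {"mitral stenosis": "rheumatic mitral stenosis"}
)
```

A dictionary corresponds to the data table form; a tree structure would serve equally well if terms are grouped by category.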
Optionally, when step S203 is implemented, the standard terminology bank, the expand terminology bank and the Hypothetical classification terminology bank may be traversed to search for the standard term, expand term or Hypothetical classification term that matches the title to be encoded.
Considering the time cost of traversing the terminology banks, it is optionally also possible first to judge, according to the semantics of the title to be encoded, the genus and species relations to which it may belong, and then to search the specific data table or tree structure for a matching standard term, expand term or Hypothetical classification term.
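The lookup of step S203 might be sketched as follows. The order in which the banks are consulted is an assumption, since the text does not fix one, and `find_code`, the bank labels and the placeholder code are hypothetical.

```python
# Sketch of step S203: look a title up in several term banks in turn.
# Bank order, names and code values are illustrative assumptions.

def find_code(title, banks):
    """Return (bank_name, code) from the first bank containing the title.

    banks: list of (bank_name, dict mapping term -> code).
    Returns None when no bank contains the title.
    """
    for bank_name, bank in banks:
        code = bank.get(title)
        if code is not None:
            return bank_name, code
    return None


banks = [
    ("standard", {"rheumatic mitral stenosis": "CODE-RMS"}),
    ("expand", {}),  # aliases and abbreviations would be stored here
    ("hypothetical", {"mitral stenosis": "CODE-RMS"}),
]
match = find_code("mitral stenosis", banks)
```

Dictionary lookups avoid a full traversal of each bank; narrowing by genus and species first, as described above, would reduce the number of tables consulted.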
On the basis of the standard terminology bank and the expand terminology bank, the present embodiment adds the Hypothetical classification terminology bank so that Hypothetical classification terms appearing in Chinese disease diagnosis information are taken into account. This covers more broadly the disease terms that may appear in Chinese disease diagnosis information and provides a more complete basis for recognizing them automatically, which is conducive to automated ICD coding. The ICD coding method provided by the present embodiment requires no manual participation and has the advantages of fast coding, low cost and high accuracy.
embodiment two
Figs. 4A to 4B show the ICD coding method of one embodiment of the present invention.
As shown in Fig. 4A, this ICD coding method may comprise:
Step S301, input Chinese disease diagnosis information.
Step S302, perform natural language processing on the Chinese disease diagnosis information to obtain one or more titles to be encoded.
Step S303, based on the standard terminology bank, the expand terminology bank and the odd encoder terminology bank, search for a standard term, expand term or odd encoder term that matches the title to be encoded, and determine the code of the successfully matched standard term, expand term or odd encoder term as the code of the title to be encoded.
The present embodiment creates the standard terminology bank and the expand terminology bank by the same method as the exemplary method, which is not repeated here.
Optionally, this step may also search, based on the Hypothetical classification terminology bank, for a Hypothetical classification term that matches the title to be encoded, and determine the code of the successfully matched Hypothetical classification term as the code of the title to be encoded. The present embodiment may create the Hypothetical classification terminology bank by the same method as embodiment one, which is not repeated here.
In the present embodiment, the odd encoder terminology bank is created according to the steps shown in Fig. 4B:
Step D1, a disease term that is not contained in the ICD version to be referenced and that is composed of at least two different standard terms is defined as an odd encoder term.
Step D2, the codes of all the standard terms composing the odd encoder term are combined together as the code of the odd encoder term.
In the medical field several diseases often occur concurrently, and the corresponding disease term may be the result of combining several standard terms. In view of this, the present embodiment stores such disease terms in the odd encoder terminology bank as odd encoder terms and, following the order of the standard terms composing the odd encoder term, combines the codes of those standard terms in sequence to form the code of the odd encoder term.
For example, the standard terms composing the odd encoder term "mitral stenosis combined with auricular fibrillation with left atrial thrombus" are "mitral stenosis", "auricular fibrillation" and "atrial thrombus". The ICD code of "mitral stenosis" is I05.000, the ICD code of "auricular fibrillation" is I487.x01, and the ICD code of "atrial thrombus" is I51.302, so the ICD code of "mitral stenosis combined with auricular fibrillation with left atrial thrombus" is I05.0I487.x01I51.302.
Step D3, store the odd encoder terms and their codes to obtain the odd encoder terminology bank.
Optionally, the odd encoder terminology bank may store the odd encoder terms and their codes in the form of a data table or a tree structure.
Optionally, the odd encoder terminology bank may also be revised in real time, for example by adding new odd encoder terms or deleting existing ones, so that it better meets the needs of ICD coding.
Fig. 4C shows the odd encoder terminology bank of the present embodiment in data table form. The shaded part in Fig. 4C is explanatory and does not appear in the actual odd encoder terminology bank.
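Steps D1 to D3 amount to concatenating component codes in term order. The sketch below assumes plain string concatenation without a separator, mirroring the example above; `combine_codes`, the term labels and the placeholder codes are hypothetical.

```python
# Sketch of steps D1-D3: the code of an odd encoder term is the codes of
# its component standard terms joined in order. Codes are placeholders.

def combine_codes(component_terms, standard_bank):
    """Concatenate the codes of the component standard terms, in order."""
    return "".join(standard_bank[term] for term in component_terms)


standard_bank = {
    "disease A": "CODE-A",
    "disease B": "CODE-B",
    "disease C": "CODE-C",
}
# Step D1: a term built from at least two different standard terms.
components = ["disease A", "disease B", "disease C"]
# Steps D2 and D3: compute the combined code and store it in the bank.
odd_encoder_bank = {
    "disease A combined with disease B with disease C":
        combine_codes(components, standard_bank),
}
```

Because the order of the components determines the order of the codes, the component list should follow the order in which the standard terms appear in the odd encoder term.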
Optionally, when step S303 is implemented, the standard terminology bank, the expand terminology bank and the odd encoder terminology bank may be traversed to search for the standard term, expand term or odd encoder term that matches the title to be encoded. Considering the time cost of traversing the terminology banks, it is optionally also possible first to judge, according to the semantics of the title to be encoded, the genus and species relations to which it may belong, and then to search the specific data table or tree structure for a matching standard term, expand term or odd encoder term.
On the basis of the standard terminology bank and the expand terminology bank, the present embodiment adds the odd encoder terminology bank so that odd encoder terms appearing in Chinese disease diagnosis information are taken into account. This covers more broadly the disease terms that may appear in Chinese disease diagnosis information and provides a more complete basis for recognizing them automatically, which is conducive to automated ICD coding. The ICD coding method provided by the present embodiment requires no manual participation and has the advantages of fast coding, low cost and high accuracy.
embodiment three
Figs. 5A to 5B show the ICD coding method of one embodiment of the present invention.
As shown in Fig. 5A, this ICD coding method may comprise:
Step S401, input Chinese disease diagnosis information.
Step S402, perform natural language processing on the Chinese disease diagnosis information to obtain one or more titles to be encoded.
Step S403, based on the merging terminology bank, pre-process the one or more titles to be encoded obtained in step S402: judge whether the one or more titles to be encoded include all the combining objects of any one or more merging terms and, if so, replace all the combining objects of each such merging term with the corresponding merging term.
In the present embodiment, the merging terminology bank is created according to the steps shown in Fig. 5B:
Step E1, a single standard term that can substitute for at least two different standard terms occurring together is defined as a merging term, and each of these at least two different standard terms is defined as a combining object of this merging term.
Step E2, determine the code of each merging term according to the ICD version to be referenced.
Step E3, store the merging terms, their codes and all the combining objects of each merging term to obtain the merging terminology bank.
In the ICD, when several disease terms occur together and can be substituted by another single disease term, the ICD specifies that only the code of that single disease term is output during coding. In the present embodiment, a single disease term of this kind, which can substitute for several other disease terms occurring together, is defined as a merging term, and each disease term that can be substituted is defined as a combining object.
For example, among disease categories, if "gastric ulcer" and "upper gastrointestinal bleeding" occur together, they can be substituted by "gastric ulcer with hemorrhage", and only the code of "gastric ulcer with hemorrhage" needs to be output during ICD coding.
In view of this, after natural language processing of the Chinese disease diagnosis information yields one or more titles to be encoded, the present embodiment adds a step of pre-processing these titles: it checks whether they contain combining objects that can be substituted and, if they include all the combining objects of some merging term, substitutes that merging term for all of its combining objects.
Optionally, the merging terminology bank may store the merging terms and their codes in the form of a data table or a tree structure.
Optionally, the merging terminology bank may also be revised in real time. For example, when an updated version of the referenced ICD version is released, merging terms are added, modified or deleted according to the updated version, so that the merging terminology bank better meets the needs of ICD coding.
Fig. 5C shows the merging terminology bank of the present embodiment in data table form. The shaded part in Fig. 5C is explanatory and does not appear in the actual merging terminology bank.
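The pre-processing of step S403 can be sketched as a replacement over the list of titles to be encoded. The sketch assumes the merging terminology bank is a dictionary from merging term to its combining objects and that the merging term takes the position of the first combining object found; `apply_merging_terms` and both assumptions are illustrative only.

```python
# Sketch of step S403: replace a complete set of combining objects with
# the corresponding merging term. Names and placement are assumptions.

def apply_merging_terms(titles, merging_bank):
    """Return a new title list with complete combining-object sets merged.

    merging_bank: dict mapping merging term -> tuple of combining objects.
    A merging term is applied only if ALL of its combining objects occur.
    """
    result = list(titles)
    for merging_term, combining_objects in merging_bank.items():
        if not all(obj in result for obj in combining_objects):
            continue  # an incomplete set is left untouched
        replaced, inserted = [], False
        for title in result:
            if title in combining_objects:
                if not inserted:
                    # The merging term takes the place of the first object.
                    replaced.append(merging_term)
                    inserted = True
            else:
                replaced.append(title)
        result = replaced
    return result


merging_bank = {
    "gastric ulcer with hemorrhage":
        ("gastric ulcer", "upper gastrointestinal bleeding"),
}
titles = ["gastric ulcer", "coronary heart disease",
          "upper gastrointestinal bleeding"]
merged_titles = apply_merging_terms(titles, merging_bank)
```

Requiring the complete set of combining objects before replacing reflects the condition stated in step S403; a title list containing only "gastric ulcer" is returned unchanged.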
Step S404, based on the created standard terminology bank, expand terminology bank, Hypothetical classification terminology bank and odd encoder terminology bank, search for a standard term, expand term, Hypothetical classification term or odd encoder term that matches the title to be encoded pre-processed in step S403, and determine the code of the successfully matched standard term, expand term, Hypothetical classification term or odd encoder term as the code of the title to be encoded.
The present embodiment creates the standard terminology bank and the expand terminology bank by the same method as the exemplary method, creates the Hypothetical classification terminology bank by the same method as embodiment one, and creates the odd encoder terminology bank by the same method as embodiment two; none of these is repeated here.
Optionally, when step S404 is implemented, the standard terminology bank, the expand terminology bank, the Hypothetical classification terminology bank and the odd encoder terminology bank may be traversed to search for the standard term, expand term, Hypothetical classification term or odd encoder term that matches the title to be encoded. Considering the time cost of traversing the terminology banks, it is optionally also possible first to judge, according to the semantics of the title to be encoded, the genus and species relations to which it may belong, and then to search the specific data table or tree structure for a matching standard term, expand term, Hypothetical classification term or odd encoder term.
On the basis of the standard terminology bank and the expand terminology bank, the present embodiment adds the merging terminology bank so that merging terms appearing in Chinese disease diagnosis information are taken into account. This covers more broadly the disease terms that may appear in Chinese disease diagnosis information and provides a more complete basis for recognizing them automatically, which is conducive to automated ICD coding. The ICD coding method provided by the present embodiment requires no manual participation and has the advantages of fast coding, low cost and high accuracy.
embodiment four
Fig. 6A shows the ICD coding method of one embodiment of the present invention.
As shown in Fig. 6A, this ICD coding method may comprise:
Step S501, input Chinese disease diagnosis information.
Step S502, perform natural language processing on the Chinese disease diagnosis information to obtain one or more titles to be encoded.
Step S503, based on the merging terminology bank, pre-process the one or more titles to be encoded obtained in step S502: judge whether the one or more titles to be encoded include all the combining objects of any one or more merging terms and, if so, replace all the combining objects of each such merging term with the corresponding merging term.
Step S504, based on the standard terminology bank, the expand terminology bank, the Hypothetical classification terminology bank and the odd encoder terminology bank, search for a standard term, expand term, Hypothetical classification term or odd encoder term that matches the title to be encoded, and determine the code of the successfully matched standard term, expand term, Hypothetical classification term or odd encoder term as the code of the title to be encoded. A title to be encoded for which no matching standard term, expand term, Hypothetical classification term or odd encoder term is found is determined to be a title to be encoded with no determined code.
The present embodiment creates the standard terminology bank and the expand terminology bank by the same method as the exemplary method, creates the Hypothetical classification terminology bank by the same method as embodiment one, and creates the odd encoder terminology bank by the same method as embodiment two; none of these is repeated here.
Step S505, match the title to be encoded with no determined code against the without encryption descriptions in the without encryption description storehouse. If the match succeeds, perform a preset processing step to indicate that this title to be encoded is not coded (for example, output an empty result, or display text such as "no code can be assigned"). If the match fails, send this title to be encoded to the manual processing platform for manual processing.
In the present embodiment, the without encryption description storehouse comprises a number of without encryption descriptions, that is, terms to which no ICD code is assigned. These include: preset traditional Chinese medicine terms; preset surgical operation terms; preset drug name terms; preset medical consumables terms; and preset examination and test terms.
Fig. 6B shows the without encryption description storehouse of the present embodiment in data table form. The shaded part in Fig. 6B is explanatory and does not appear in the actual without encryption description storehouse.
Actual Chinese disease diagnosis information often involves several kinds of medical concepts: not only disease terms, but possibly also surgical operation terms, drug name terms, medical consumables terms, examination and test terms and the like. The present invention, however, concerns only the classification coding of diseases and does not classify or code, under the International Classification of Diseases ICD version, surgical operation terms, drug name terms, medical consumables terms, or examination and test terms. Therefore, if such terms appear in the Chinese disease diagnosis information, they are not coded (that is, no code can be assigned). In addition, the International Classification of Diseases ICD version does not classify or code traditional Chinese medicine terms either, so if a traditional Chinese medicine term appears in the Chinese disease diagnosis information, it is likewise not coded.
For terms of this kind that are not coded, a preset result may be output (for example, a result such as "no code can be assigned") to show that the term has been recognized as a surgical operation term, drug name term, medical consumables term, examination and test term or traditional Chinese medicine term, and simply has no ICD code to be assigned.
In the present embodiment, for a title to be encoded for which no matching standard term, expand term, Hypothetical classification term or odd encoder term is found, if a matching without encryption description can be found, the title belongs to one of the surgical operation terms, drug name terms, medical consumables terms, examination and test terms or traditional Chinese medicine terms and is not coded. If no matching without encryption description can be found, the title does not belong to any of the above types; the present embodiment sends such a title to be encoded to the manual processing platform, where processing continues manually. The present invention places no limitation on the specific manual processing procedure.
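The routing performed by steps S504 and S505 might look like the following sketch. The three outcome labels, the function name `route_title` and the sample entries are hypothetical; the text only requires that a matched title receives a code, a no-code term yields a preset result, and anything else goes to manual processing.

```python
# Sketch of steps S504-S505: decide what happens to one title to be
# encoded. Labels, names and sample entries are illustrative.

def route_title(title, coding_banks, no_code_terms):
    """Return ("coded", code), ("no_code", None) or ("manual", None)."""
    # Step S504: try every coding bank.
    for bank in coding_banks:
        if title in bank:
            return ("coded", bank[title])
    # Step S505: a known non-disease term receives no ICD code.
    if title in no_code_terms:
        return ("no_code", None)
    # Anything else is passed on for manual processing.
    return ("manual", None)


coding_banks = [{"rheumatic mitral stenosis": "CODE-RMS"}]
no_code_terms = {"sample drug name", "sample surgical operation"}

outcomes = [
    route_title(t, coding_banks, no_code_terms)
    for t in ["rheumatic mitral stenosis", "sample drug name", "unseen text"]
]
```

Keeping the no-code check separate from the coding banks means a recognized drug or operation name is never sent for manual review merely because it lacks a code.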
embodiment five
As shown in Fig. 12A, an embodiment, applicable to the exemplary method, of performing natural language processing on Chinese disease diagnosis information to obtain titles to be encoded comprises:
Step S61, pre-process the Chinese disease diagnosis information character string to obtain a pre-processed Chinese disease diagnosis information character string.
The purpose of this step is to convert the characters in the Chinese disease diagnosis information character string into a unified format to facilitate subsequent processing.
Optionally, this step may be implemented as follows: normalize the format of the non-Chinese characters in the Chinese disease diagnosis information character string (for example, convert all symbols in the string to half-width form or all to full-width form, and convert all English letters to upper case or all to lower case); and delete the non-medical terms in the Chinese disease diagnosis information character string. The non-medical terms are provided by a non-medical term dictionary set up in advance, and are words or descriptive statements that serve as remarks (for example "to be checked", "reason", "warm tip", "suggestion", "please seek medical attention at any time if symptoms worsen" and the like).
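A minimal sketch of the pre-processing in step S61 follows. It assumes the half-width and upper-case options are chosen and that non-medical terms are removed by plain substring deletion; the function names and the sample input are hypothetical.

```python
# Sketch of step S61: normalise non-Chinese characters, then delete
# non-medical terms. Choices and names here are illustrative assumptions.

def to_half_width(text):
    """Convert full-width ASCII variants and the full-width space."""
    chars = []
    for ch in text:
        code = ord(ch)
        if code == 0x3000:                # full-width (ideographic) space
            chars.append(" ")
        elif 0xFF01 <= code <= 0xFF5E:    # full-width forms of ASCII 0x21-0x7E
            chars.append(chr(code - 0xFEE0))
        else:
            chars.append(ch)
    return "".join(chars)


def preprocess(text, non_medical_terms):
    """Half-width and upper-case the text, then strip non-medical terms."""
    text = to_half_width(text).upper()
    # Normalise the dictionary entries the same way, longest first, so a
    # short entry cannot break up a longer one.
    terms = sorted(
        (to_half_width(t).upper() for t in non_medical_terms),
        key=len, reverse=True,
    )
    for term in terms:
        if term:
            text = text.replace(term, "")
    return text


# "高血压" = hypertension, "a型胸腺瘤" = A type thymoma, "待查" = to be checked
cleaned = preprocess("高血压；a型胸腺瘤（待查）", {"待查"})
```

Upper-casing leaves Chinese characters unchanged, so only the Latin letters and symbols are affected by the normalization.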
Step S62, based on the body dictionary, the disease degree glossary, the concurrent glossary of disease and the site of pathological change glossary, all set up in advance, cut the pre-processed Chinese disease diagnosis information character string into first kind substrings and/or Second Type substrings.
Both first kind substrings and Second Type substrings have independent semantics, that is, the medical information they represent is not affected by the characters before or after them. A first kind substring can be matched directly with a body in the body dictionary, whereas a Second Type substring cannot.
The body dictionary comprises the aforesaid standard terminology bank and expand terminology bank, and specifically comprises the standard terms, the expand terms and their corresponding codes. The standard terms and expand terms are regarded as the bodies in the body dictionary.
It should be noted that, when the automated International Classification of Diseases coding method provided by the present invention uses the aforesaid Hypothetical classification terminology bank and/or odd encoder terminology bank, the body dictionary should also comprise the Hypothetical classification terminology bank and/or odd encoder terminology bank (in which case Hypothetical classification terms and/or odd encoder terms are also regarded as bodies in the body dictionary), so that a cut-out first kind substring or Second Type substring, when used as a title to be encoded, can be matched with a Hypothetical classification term or odd encoder term.
The disease degree glossary comprises a number of disease degree terms. A disease degree term is a word describing whether a disease is acute or chronic, its severity, its histological type, its clinical stage and the like. Fig. 12B shows part of the disease degree terms included in the disease degree glossary.
The concurrent glossary of disease comprises a number of concurrent terms of disease. A concurrent term of disease is a word describing the concurrent occurrence of at least two diseases. Fig. 12C shows part of the concurrent terms of disease included in the concurrent glossary of disease.
The site of pathological change glossary comprises a number of site of pathological change terms. A site of pathological change term is a word describing the site at which a disease occurs. Fig. 12D shows part of the site of pathological change terms included in the site of pathological change glossary.
The purpose of this step is to cut the Chinese disease diagnosis information into substrings with independent semantics (first kind substrings or Second Type substrings), which effectively avoids the recognition errors caused by separately recognizing characters that are related to one another.
Step S63, determine the cut-out first kind substrings and Second Type substrings as titles to be encoded.
After the cut-out first kind substrings and Second Type substrings are determined as titles to be encoded, the merging terminology bank of embodiment three may subsequently be used to pre-process them. Because the body corresponding to a first kind substring or Second Type substring may be an expand term, whereas the combining objects in the merging terminology bank are standard terms, the expand term corresponding to the first kind substring or Second Type substring needs to be converted into the corresponding standard term before the merging terminology bank is used for pre-processing.
As shown in Fig. 12E, step S62 specifically comprises:
Step S70, judge whether the pre-processed Chinese disease diagnosis information character string contains symbols. If it contains symbols, perform step S71; if it does not, perform step S72.
Step S71, match the characters between every two adjacent symbols in the pre-processed Chinese disease diagnosis information character string, taken as a whole, against the bodies in the body dictionary. If the match succeeds, perform step S711; if the match fails, perform step S712.
Step S711, cut out the characters between these two adjacent symbols as a first kind substring.
Step S712, determine these two adjacent symbols and the characters between them to be a character string that is temporarily not cut, then perform step S73.
The processing rule behind steps S71, S711 and S712 is: all the characters between adjacent symbols are matched against the bodies as a whole and are cut out only when they match; otherwise cutting is deferred.
For example, Fig. 12F shows the cutting of "severe arthritis, and hematocele; A type thymoma; Coronary heart disease". Here "severe arthritis, and hematocele", "A type thymoma" and "coronary heart disease" are each the whole of the characters between symbols, and a matching body can be found for each, so they are cut out separately.
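Steps S70 to S712 can be sketched as a split on symbols followed by whole-segment matching. The sketch assumes that the semicolon is the only delimiter and that the start and end of the string also act as boundaries, neither of which is stated in the text; `cut_by_symbols` and the sample input are hypothetical.

```python
# Sketch of steps S70-S712: whole-segment matching between symbols.
# The delimiter set and the treatment of string ends are assumptions.
import re


def cut_by_symbols(text, ontology, symbols=";"):
    """Split text on symbols and match each whole segment.

    Returns None when the text contains no symbol (step S70 then hands the
    string to step S72). Otherwise returns (first_kind, deferred):
    segments that match a body, and segments left uncut for now.
    """
    pattern = "|".join(re.escape(s) for s in symbols)
    if not re.search(pattern, text):
        return None
    first_kind, deferred = [], []
    for segment in re.split(pattern, text):
        if not segment:
            continue  # ignore empty pieces from leading or doubled symbols
        if segment in ontology:
            first_kind.append(segment)   # step S711
        else:
            deferred.append(segment)     # step S712
    return first_kind, deferred


# "A型胸腺瘤" = A type thymoma, "冠心病" = coronary heart disease
ontology = {"A型胸腺瘤", "冠心病"}
result = cut_by_symbols("A型胸腺瘤;冠心病;重度冠心病", ontology)
```

The third segment is not itself a body, so it is deferred rather than split further, which is the behaviour the rule above describes.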
Step S72: match the preprocessed Chinese disease diagnosis information string against the bodies in the body dictionary using a mechanical Chinese word segmentation method. If every character in the string can be matched to a body, perform step S721; if the string contains a single character or a run of consecutive characters that cannot be matched to any body, perform step S722.
Step S721: according to the matched bodies, cut out the characters in the preprocessed Chinese disease diagnosis information string as first-type substrings.
Step S722: determine whether the unmatched single character or run of consecutive characters is a disease degree term, a disease complication term or a lesion site term. If it is, perform step S7221; if it is not, perform step S7222.
The rule behind steps S72, S721 and S722 is: match the characters in the preprocessed Chinese disease diagnosis information string against the bodies using a mechanical Chinese word segmentation method; cut only when every character finds a matching body, and otherwise defer the cutting.
For example, Figure 12G shows the cutting of "coronary heart disease of hypertension": the mechanical Chinese word segmentation method finds matching bodies for "hypertension" and "coronary heart disease", so each is cut out separately.
The mechanical Chinese word segmentation method used in step S72 may be forward maximum matching, reverse maximum matching, or minimum segmentation. The detailed segmentation process is not repeated in this embodiment.
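As a non-limiting illustration of the forward maximum matching variant named above, the following Python sketch scans the string from left to right and always takes the longest body that starts at the current position. The function name and the two Chinese sample bodies (assumed source forms of "hypertension" and "coronary heart disease") are assumptions of this sketch, not part of the embodiment.

```python
def forward_max_match(text, bodies):
    """Greedy longest-match scan; returns a list of (piece, matched) pairs."""
    max_len = max(map(len, bodies))
    pieces, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + size] in bodies:
                pieces.append((text[i:i + size], True))
                i += size
                break
        else:
            # No body starts here: record one unmatched character (input to step S722).
            pieces.append((text[i], False))
            i += 1
    return pieces

# Assumed Chinese forms of "hypertension" and "coronary heart disease".
bodies = {"高血压", "冠心病"}
pieces = forward_max_match("高血压冠心病", bodies)
```

If every pair in `pieces` is marked as matched, step S721 applies; any unmatched pair leads to step S722.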
Step S7221: according to the position of the unmatched single character or run of consecutive characters in the preprocessed Chinese disease diagnosis information string, merge it with the matched character or run of consecutive characters before or after it and cut the merged text out as a second-type substring; cut out the remaining matched characters as first-type substrings.
Step S7222: cut out the entire preprocessed Chinese disease diagnosis information string as a second-type substring.
The rule behind steps S7221 and S7222 is: if the unmatched single character or run of consecutive characters is a disease degree term, a disease complication term or a lesion site term, perform the cutting, and when cutting, merge the term with the characters before or after it and cut them out together.
For example, Figure 12H shows the cutting of "hyperplasia of prostate companion AUR diabetes". The mechanical Chinese word segmentation method finds matching bodies for "hyperplasia of prostate", "AUR" and "diabetes", and "companion" is a disease complication term. Therefore "hyperplasia of prostate" and "AUR" are merged with it and cut out together, and "diabetes" is cut out separately.
For example, Figure 12I shows the cutting of "the acute renal anemia of hyperplasia of prostate". The mechanical Chinese word segmentation method finds matching bodies for "hyperplasia of prostate" and "renal anemia", and "acute" is a disease degree term. Therefore "hyperplasia of prostate" is cut out separately, and "acute" and "renal anemia" are merged and cut out together.
For example, Figure 12J shows the cutting of "subacute bronchitis hyperplasia of prostate". The mechanical Chinese word segmentation method finds matching bodies for "bronchitis" and "hyperplasia of prostate"; "subacute" is a disease degree term and sits at the beginning of the preprocessed Chinese disease diagnosis information string. Therefore "subacute" and "bronchitis" are merged and cut out together, and "hyperplasia of prostate" is cut out separately.
For example, Figure 12K shows the cutting of "bronchitis prostate cancer late period". The mechanical Chinese word segmentation method finds matching bodies for "bronchitis" and "prostate cancer"; "late period" is a disease degree term and sits at the end of the preprocessed Chinese disease diagnosis information string. Therefore "bronchitis" is cut out separately, and "prostate cancer" and "late period" are merged and cut out together.
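As a non-limiting illustration, the merging behaviour shown in Figures 12H to 12K can be sketched in Python as follows. The table `MODIFIERS`, its three merge directions and the function name are assumptions inferred from those four examples only; the embodiment does not prescribe a direction table, and the sketch handles only the patterns that appear in the figures.

```python
# Assumed merge direction per unmatched term: "both", "next" or "prev".
MODIFIERS = {"companion": "both", "acute": "next",
             "subacute": "next", "late period": "prev"}

def merge_modifiers(pieces):
    """pieces: ordered strings, each a matched body or a modifier term.
    Returns (text, kind) pairs; kind is 1 (first type) or 2 (second type)."""
    out, i = [], 0
    while i < len(pieces):
        piece = pieces[i]
        nxt = pieces[i + 1] if i + 1 < len(pieces) else None
        if MODIFIERS.get(piece) == "next" and nxt is not None:
            out.append((piece + " " + nxt, 2))      # term merged with what follows
            i += 2
        elif MODIFIERS.get(nxt) == "prev":
            out.append((piece + " " + nxt, 2))      # term merged with what precedes
            i += 2
        elif MODIFIERS.get(nxt) == "both" and i + 2 < len(pieces):
            out.append((" ".join(pieces[i:i + 3]), 2))  # term joins both neighbours
            i += 3
        else:
            out.append((piece, 1))
            i += 1
    return out

fig_12h = merge_modifiers(["hyperplasia of prostate", "companion", "AUR", "diabetes"])
```

With the input of Figure 12H, the first three pieces come back as one second-type substring and "diabetes" as a first-type substring.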
Step S73: determine whether the temporarily unsegmented string contains a preset special symbol. If it does, perform step S731; if it does not, perform step S733.
Step S731: look up the character model to which the temporarily unsegmented string belongs, and cut the string according to the segmentation rule that corresponds to that character model. The character models are provided by a character model library set up in advance, and each character model has exactly one segmentation rule.
Step S732: match the cut-out characters against the bodies in the body dictionary. If the match succeeds, define the cut-out characters as a first-type substring; if it fails, define them as a second-type substring.
Step S733: define the temporarily unsegmented string directly as a second-type substring.
The rule behind steps S73, S731, S732 and S733 is: when the temporarily unsegmented string contains a preset special symbol, cut it according to the character model to which it belongs; otherwise cut it out directly. The characters cut out on the basis of a character model are then matched against the bodies again: those that match a body directly become first-type substrings, and those that do not become second-type substrings.
For example, the preset special symbols may include, but are not limited to, the comma, enumeration comma, period, colon, plus sign, semicolon and slash.
For example, some of the character models in the character model library and their segmentation rules are as follows:
(1) Character model: XABY type, where A is a numeral and B is a comma, enumeration comma or period;
Segmentation rule: cut out X and Y separately;
(2) Character model: CDE type, where one of C and E is Chinese characters and D is a colon;
Segmentation rule: cut out the Chinese characters in C or E;
(3) Character model: FGH type, where F and H are Chinese characters and G is a plus sign;
Segmentation rule: cut out FGH as a whole;
(4) Character model: IJK type, where I and K are Chinese characters and J is a semicolon, period, question mark or exclamation mark;
Segmentation rule: cut out I and K separately;
(5) Character model: LOP type, where L and P are Chinese characters and O is a colon;
Segmentation rule: cut out LOP as a whole;
(6) Character model: STU type, where S and/or U is a single Chinese character and T is a slash;
Segmentation rule: cut out STU as a whole.
For example, when cutting "stomachache:?", a lookup in the character model library shows that it belongs to the CDE type, so "stomachache" is cut out on its own.
For example, when cutting "congenital heart disease: ventricular septal defect", a lookup in the character model library shows that it belongs to the LOP type, so "congenital heart disease: ventricular septal defect" is cut out as a whole.
For example, when cutting "mycoplasma/chlamydia infection" (written in the source with a single Chinese character before the slash), a lookup in the character model library shows that it belongs to the STU type, so "mycoplasma/chlamydia infection" is cut out as a whole.
For example, when cutting "stomachache; prostatitis", a lookup in the character model library shows that it belongs to the IJK type, so it is cut into "stomachache" and "prostatitis".
For example, when cutting "1, cervical spondylopathy 2, lumbar intervertebral disc bulge 3, pregnant 24+3 weeks 4, prolapse of uterus, II degree, 5, mycoplasma/chlamydia infection", a lookup in the character model library shows that this string involves several character models. The characters finally cut out are "cervical spondylopathy", "lumbar intervertebral disc bulge", "pregnant 24+3 weeks", "prolapse of uterus, II degree" and "mycoplasma/chlamydia infection". These are then matched against the bodies: "cervical spondylopathy" and "lumbar intervertebral disc bulge" match a body directly and become first-type substrings, whereas "pregnant 24+3 weeks", "prolapse of uterus, II degree" and "mycoplasma/chlamydia infection" do not match a body directly and become second-type substrings.
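As a non-limiting illustration, three of the character models listed above (IJK, LOP and CDE) can be sketched in Python with regular expressions. The function name, the order in which the models are tried, and the Chinese sample strings (assumed source forms of the translated examples) are assumptions of this sketch; a full character model library would hold many more models.

```python
import re

HAN = r"[\u4e00-\u9fff]+"      # one or more Chinese characters
SEMI = "[;\uff1b]"             # half- or full-width semicolon
COLON = "[:\uff1a]"            # half- or full-width colon

def apply_character_model(s):
    """Return the pieces cut from s under the IJK, LOP and CDE models."""
    # IJK type: Chinese text on both sides of a semicolon -> cut out both sides.
    if re.fullmatch(HAN + SEMI + HAN, s):
        return re.split(SEMI, s)
    # LOP type: Chinese text on both sides of a colon -> keep the whole string.
    if re.fullmatch(HAN + COLON + HAN, s):
        return [s]
    # CDE type: only one side of the colon is Chinese -> keep the Chinese side.
    if re.search(COLON, s):
        return [p for p in re.split(COLON, s) if re.fullmatch(HAN, p)]
    # No listed model applies: leave the string whole.
    return [s]

ijk = apply_character_model("腹痛;前列腺炎")   # assumed form of "stomachache; prostatitis"
```

Each returned piece would then be matched against the body dictionary as in step S732.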
In performing natural language processing on Chinese disease diagnosis information, this embodiment takes full account of the fact that such information is natural language, takes complex and varied forms, and follows no unified standard. It uses several dictionaries set up in advance to cut and match the Chinese disease diagnosis information string, and in this way identifies disease diagnosis names as titles to be encoded.
embodiment six
As shown in Figure 13, an embodiment, applicable to the exemplary method, of looking up the standard term or expansion term that matches a title to be encoded comprises:
Step S80: if the title to be encoded is a first-type substring, the body that this first-type substring matches is determined as the standard term or expansion term that matches the title to be encoded. If the title to be encoded is a second-type substring, first-dimension parsing is performed on the second-type substring and on each body in the body dictionary, yielding several first-dimension analysis results for the second-type substring and several first-dimension analysis results for each body.
This step takes the second-type substring and the bodies as analysis objects. Optionally, first-dimension parsing of an analysis object may include, but is not limited to:
(1) determining the letters at the beginning of the analysis object; if the analysis object does not begin with a letter, this analysis result is empty;
(2) determining the disease degree term contained in the analysis object; if it contains no disease degree term, this analysis result is empty;
(3) determining the characters after the comma in the analysis object; if it contains no comma, this analysis result is empty;
(4) determining the characters inside the brackets in the analysis object; if it contains no brackets, this analysis result is empty; and
(5) determining the characters in the analysis object other than the leading letters, the disease degree term, the characters after the comma and the characters inside the brackets (hereinafter the residue characters), which are generally the core stem of the analysis object.
When the analysis object is a second-type substring, its first-dimension analysis results may include, but are not limited to: the leading letters of the second-type substring, the disease degree term it contains, the characters after its comma, the characters inside its brackets, and its residue characters.
When the analysis object is a body, its first-dimension analysis results may include, but are not limited to: the leading letters of the body, the disease degree term it contains, the characters after its comma, the characters inside its brackets, and its residue characters.
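As a non-limiting illustration, first-dimension parsing as listed in items (1) to (5) can be sketched in Python as follows. The degree-term list, the result keys and the Chinese sample inputs (assumed to mean "acute bronchitis (left side)" and "A type thymoma") are assumptions of this sketch.

```python
import re

# Assumed degree terms (subacute, acute, severe); longer terms are listed first.
DEGREE_TERMS = ["亚急性", "急性", "重度"]

def parse_first_dimension(text):
    """Return the five first-dimension results; a missing part is ''."""
    m = re.match(r"[A-Za-z]+", text)                  # (1) leading letters
    letter = m.group() if m else ""
    rest = text[len(letter):]
    m = re.search(r"[(\uff08](.*?)[)\uff09]", rest)   # (4) text inside brackets
    in_brackets = m.group(1) if m else ""
    if m:
        rest = rest[:m.start()] + rest[m.end():]
    parts = re.split(r"[,\uff0c]", rest, maxsplit=1)  # (3) text after the comma
    after_comma = parts[1] if len(parts) > 1 else ""
    rest = parts[0]
    degree = next((d for d in DEGREE_TERMS if d in rest), "")  # (2) degree term
    if degree:
        rest = rest.replace(degree, "", 1)
    return {"letter": letter, "degree": degree, "after_comma": after_comma,
            "in_brackets": in_brackets, "remainder": rest}    # (5) residue

parsed = parse_first_dimension("急性支气管炎(左侧)")
```

The same function would be applied to the second-type substring and to every body, so that their results can be compared dimension by dimension.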
Step S81: match each first-dimension analysis result of the second-type substring against the corresponding first-dimension analysis results of each body in the body dictionary, and look for a body whose first-dimension analysis results all match those of the second-type substring. If such a body exists, perform step S82; if no such body exists, perform step S83.
Step S82: determine the body found as the body that matches the second-type substring.
Step S83: select a subset of the first-dimension analysis results of the second-type substring and match it against the same subset of the first-dimension analysis results of each body in the body dictionary, and look for a body whose selected results match the selected results of the second-type substring. If such a body exists, perform step S831; if no such body exists, perform step S832.
Step S831: determine the body found as the body that matches the second-type substring.
In this matching, the leading letters of the second-type substring are matched against the leading letters of the body, the disease degree term in the second-type substring against the disease degree term in the body, the characters after the comma in the second-type substring against the characters after the comma in the body, the characters inside the brackets of the second-type substring against the characters inside the brackets of the body, and the residue characters of the second-type substring against the residue characters of the body.
If all first-dimension analysis results match, the body is determined as the body that matches the second-type substring.
If some first-dimension analysis results do not match, a subset of the first-dimension analysis results is selected and matched.
Because the residue characters of a second-type substring are usually its core stem, in a specific implementation the selected subset preferably includes at least the residue characters of the second-type substring and the residue characters of the body. For example, only the residue characters and the disease degree term of the analysis objects may be matched; or only the residue characters may be matched; or the residue characters may be matched together with the leading letters, the disease degree term, the characters after the comma, or the characters inside the brackets.
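As a non-limiting illustration, the two-stage comparison of steps S81 to S831 can be sketched in Python as follows. The dictionary keys, the default subset `("remainder", "degree")` and the sample entries are assumptions of this sketch.

```python
def match_first_dimension(query, bodies, partial_keys=("remainder", "degree")):
    """query: first-dimension results of a second-type substring.
    bodies: {body name: first-dimension results}.
    Try all dimensions first (S81), then only partial_keys (S83)."""
    for name, parsed in bodies.items():
        if parsed == query:
            return name, "full"
    for name, parsed in bodies.items():
        if all(parsed.get(k, "") == query.get(k, "") for k in partial_keys):
            return name, "partial"
    return None, "none"          # falls through to step S832

# Hypothetical parsed entries used only to exercise the function.
bodies = {
    "acute bronchitis": {"letter": "", "degree": "acute", "after_comma": "",
                         "in_brackets": "", "remainder": "bronchitis"},
}
query = {"letter": "", "degree": "acute", "after_comma": "",
         "in_brackets": "left side", "remainder": "bronchitis"}
result = match_first_dimension(query, bodies)
```

In this example the bracket content differs, so the full comparison fails and the body is found only through the subset that contains the residue characters.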
For example, for a second-type substring "type 4 mucopolysaccharide storage disease", the results of first-dimension parsing are shown in Table 1, and Table 2 shows the body that matches this second-type substring together with its first-dimension analysis results.
Table 1
Table 2
Step S832: perform second-dimension parsing on the second-type substring and on each body in the body dictionary, obtaining the second-dimension analysis results of the second-type substring and the second-dimension analysis results of each body in the body dictionary.
This step takes the second-type substring and the bodies as analysis objects. Optionally, second-dimension parsing of an analysis object may include, but is not limited to:
(1) determining each Chinese character in the analysis object;
(2) determining the pinyin initial of each Chinese character in the analysis object;
(3) determining the pinyin final of each Chinese character in the analysis object;
(4) determining the first character of the analysis object;
(5) determining the pinyin of the first character of the analysis object; and
(6) determining the non-Chinese characters in the analysis object; if it contains no non-Chinese characters, this analysis result is empty.
When the analysis object is a second-type substring, its second-dimension analysis results may include, but are not limited to: each Chinese character in the second-type substring, the pinyin initial of each Chinese character, the pinyin final of each Chinese character, the first character of the second-type substring, the pinyin of that first character, and the non-Chinese characters in the second-type substring.
When the analysis object is a body, its second-dimension analysis results may include, but are not limited to: each Chinese character in the body, the pinyin initial of each Chinese character, the pinyin final of each Chinese character, the first character of the body, the pinyin of that first character, and the non-Chinese characters in the body.
For example, Table 3 lists the second-dimension analysis results of the second-type substring "hypertension".
Table 3
Step S833: based on the second-dimension analysis results of the second-type substring and the second-dimension analysis results of the bodies, calculate the matching degree between the second-type substring and each body.
Specifically, this step may calculate either the similarity or the total confidence between the second-type substring and each body. Compared with the similarity, the total confidence reflects the matching degree more faithfully, but it is also more complex to compute. In a specific implementation of step S833, the similarity calculation may be chosen when faster processing is needed, and the total confidence calculation when more accurate matching results are needed.
One embodiment of step S833 calculates the similarity between the second-type substring and each body, as follows:
The similarity between the second-type substring and each body is calculated according to the following formula, and the calculated similarity is taken as the matching degree between the second-type substring and each body:
$$M = \sum_{t \in q} \Big( \mathrm{tf}(t \in d) \cdot \mathrm{idf}(t)^{2} \cdot t.\mathrm{getBoost}() \cdot \mathrm{norm}(t, d) \Big)$$
Wherein, M represents the similarity;
t represents each second-dimension analysis result of the second-type substring;
q represents the second-type substring;
t in q ranges over the second dimensions of the second-type substring;
d represents a body;
tf(t in d) represents, within the same second dimension, the frequency with which the second-dimension analysis result of the second-type substring matches the second-dimension analysis result of the body;
in idf(t), T represents the total number of bodies in the body dictionary, and T(t) represents the number of bodies whose second-dimension analysis results all match the second-dimension analysis results of the second-type substring;
t.getBoost() represents the preset weight of each second dimension;
norm(t, d) represents the length normalization factor of the body.
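As a non-limiting illustration, the similarity M can be sketched in Python as follows. The text above does not give the expression for idf(t) or norm(t, d), so the forms used here (`1 + ln(T / (T(t) + 1))` and `1 / sqrt(length)`) are assumptions of this sketch, as are the function name, the per-dimension token lists and the way T(t) is counted.

```python
import math

def similarity(query, body, all_bodies, boosts):
    """query, body: {second dimension: list of tokens}.
    boosts: {second dimension: preset weight}."""
    total = len(all_bodies)                      # T
    score = 0.0
    for dim, q_tokens in query.items():
        d_tokens = body.get(dim, [])
        # tf: how many query tokens of this dimension also occur in the body.
        tf = sum(1 for tok in q_tokens if tok in d_tokens)
        # T(t), assumed here: bodies sharing at least one token in this dimension.
        df = sum(1 for b in all_bodies if set(q_tokens) & set(b.get(dim, [])))
        idf = 1.0 + math.log(total / (df + 1))   # assumed form of idf(t)
        norm = 1.0 / math.sqrt(len(d_tokens)) if d_tokens else 0.0  # assumed norm
        score += tf * idf ** 2 * boosts.get(dim, 1.0) * norm
    return score

# Hypothetical two-body dictionary using a single dimension ("chars").
b1 = {"chars": list("高血压")}
b2 = {"chars": list("低血压")}
q = {"chars": list("高血压")}
m1 = similarity(q, b1, [b1, b2], {"chars": 1.0})
m2 = similarity(q, b2, [b1, b2], {"chars": 1.0})
```

With these inputs the body that shares all three characters with the query scores higher than the body that shares two.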
Another embodiment of step S833 calculates the total confidence between the second-type substring and each body, as follows:
The total confidence between the second-type substring and each body is calculated by the following process, and the calculated total confidence is taken as the matching degree between the second-type substring and each body:
1) Determine each Chinese character in the second-type substring.
2) Calculate the cosine confidence between the second-type substring and each body that matches it according to the following formula:
$$N = \frac{\sum_{j=1}^{V} w_{Q,j} \times w_{d',j}}{\sqrt{\sum_{j=1}^{V} w_{Q,j}^{2} \times \sum_{j=1}^{V} w_{d',j}^{2}}}$$
Wherein, N represents the cosine confidence;
V represents the total number of Chinese characters contained in the second-type substring and the body that matches it;
Q represents the second-type substring;
d' represents the body that matches the second-type substring;
w_{Q,j} represents the frequency with which each Chinese character occurs in the second-type substring;
w_{d',j} represents the frequency with which each Chinese character occurs in the body that matches the second-type substring;
j represents the index of a Chinese character contained in the second-type substring and the body that matches it.
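As a non-limiting illustration, the cosine confidence N can be computed over character counts as in the Python sketch below. The function name and sample strings are assumptions of this sketch; the weights w are taken to be raw character frequencies.

```python
import math
from collections import Counter

def cosine_confidence(query, body):
    """Cosine of the angle between the character-frequency vectors of two strings."""
    q, d = Counter(query), Counter(body)
    vocab = set(q) | set(d)                       # the V distinct characters
    dot = sum(q[c] * d[c] for c in vocab)         # Counter returns 0 for absent keys
    denom = math.sqrt(sum(v * v for v in q.values()) *
                      sum(v * v for v in d.values()))
    return dot / denom if denom else 0.0

n_same = cosine_confidence("高血压", "高血压")
n_close = cosine_confidence("高血压", "低血压")
```

Identical strings give a confidence of 1, and strings with no character in common give 0.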
3) Calculate the total confidence between the second-type substring and each body that matches it according to the following formula:
$$S = M \times a + N \times b$$
Wherein, S represents the total confidence;
M represents the similarity;
a represents the preset weight corresponding to the similarity M;
b represents the preset weight corresponding to the cosine confidence N.
Further, the similarity M is calculated according to the following formula:
$$M = \sum_{t \in q} \Big( \mathrm{tf}(t \in d) \cdot \mathrm{idf}(t)^{2} \cdot t.\mathrm{getBoost}() \cdot \mathrm{norm}(t, d) \Big)$$
Wherein, t represents each second-dimension analysis result of the second-type substring;
q represents the second-type substring;
t in q ranges over the second dimensions of the second-type substring;
d represents a body;
tf(t in d) represents, within the same second dimension, the frequency with which the second-dimension analysis result of the second-type substring matches the second-dimension analysis result of the body;
in idf(t), T represents the total number of bodies in the body dictionary, and T(t) represents the number of bodies whose second-dimension analysis results all match the second-dimension analysis results of the second-type substring;
t.getBoost() represents the preset weight of each second dimension;
norm(t, d) represents the length normalization factor of the body.
Step S834: according to the matching degree between the second-type substring and each body, determine one or more bodies as the bodies that match the second-type substring.
Optionally, this step may be implemented as follows: rank all bodies by their matching degree with the second-type substring, and determine a preset number of top-ranked bodies (for example, the top 2) as the bodies that match the second-type substring; or determine the one or more bodies whose matching degree with the second-type substring reaches a preset threshold as the bodies that match the second-type substring.
In a specific implementation of the invention, so that the matching degree between the second-type substring and each matching body is explicit and can be used, the final output may also include the matching degree of each body that matches the second-type substring. For example, the matching degree between the second-type substring and each matching body is output, and one body is then selected manually, according to the matching degrees, as the body that matches the second-type substring.
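As a non-limiting illustration, the weighted total confidence S and the two selection strategies of step S834 can be sketched in Python as follows. The function names, the default weights and the sample scores are assumptions of this sketch.

```python
def total_confidence(m, n, a=0.5, b=0.5):
    """S = M * a + N * b, with a and b as preset weights (defaults are placeholders)."""
    return m * a + n * b

def select_bodies(scores, top_k=None, threshold=None):
    """scores: {body name: matching degree}.
    Returns (name, score) pairs sorted from best to worst, optionally
    filtered by a threshold and truncated to the top_k best."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    if threshold is not None:
        ranked = [kv for kv in ranked if kv[1] >= threshold]
    return ranked[:top_k] if top_k is not None else ranked

scores = {"body A": 0.9, "body B": 0.4, "body C": 0.7}   # hypothetical matching degrees
best_two = select_bodies(scores, top_k=2)
```

Because the scores are returned alongside the names, the result can be shown to a reviewer for manual selection as described above.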
Step S84: determine the body that matches the second-type substring, or the one or more bodies that meet the preset matching condition with the second-type substring, as the standard term or expansion term that matches the title to be encoded.
In performing natural language processing on Chinese disease diagnosis information, this embodiment takes full account of the fact that such information is natural language, takes complex and varied forms, and follows no unified standard. It uses several dictionaries set up in advance to cut and match the Chinese disease diagnosis information string, and in this way finds the standard term or expansion term that matches the title to be encoded.
example devices
Having described the method of the exemplary embodiment of the invention, the ICD coding system of the exemplary embodiment is introduced next with reference to Figure 7.
For the implementation of the ICD coding system, reference may be made to the implementation of the method above; repeated parts are not described again. The term "module" used below may be a combination of software and/or hardware that realizes a predetermined function. Although the system described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
As shown in Figure 7, the ICD coding system may comprise: a standard terminology bank creation module 61, an expansion terminology bank creation module 62, an import module 63, a data processing module 64 and a coding module 65.
The standard terminology bank creation module 61 is configured to: according to the ICD version to be referenced, determine each disease term contained in that ICD version as a standard term; determine the code of each standard term according to that ICD version; and store the standard terms and their codes to obtain the standard terminology bank.
Optionally, the ICD version to be referenced may be an ICD version published by the WHO (for example, ICD-10, published by the WHO in 1992), or any of the localized ICD versions that extend an ICD version published by the WHO (for example, the Chinese edition of ICD-10 recommended by the Ministry of Health of China). In a specific implementation, a suitable ICD version may be chosen as the reference according to actual needs; the invention places no limitation on this.
The expansion terminology bank creation module 62 is configured to: define the following types of disease terms, in relation to the ICD version to be referenced, as expansion terms: common names, aliases and abbreviations of the standard terms; subclass disease terms of the standard terms; and new disease terms that arise after the ICD version to be referenced is published. When an expansion term is judged to be a common name, alias or abbreviation of a standard term, the code of that standard term is assigned to the expansion term. When an expansion term is judged to be a subclass disease term of a standard term or a newly arising disease term, the code of the standard term closest to it in genus-species relation is assigned to the expansion term. The expansion terms and their codes are stored to obtain the expansion terminology bank.
The import module 63 is configured to input Chinese disease diagnosis information.
Optionally, the Chinese disease diagnosis information may be medical record information entered by medical staff, or information recorded on a basic medical insurance settlement statement.
The data processing module 64 is configured to perform natural language processing on the Chinese disease diagnosis information to obtain one or more titles to be encoded.
Specifically, based on the characteristics of Chinese disease diagnosis information, the data processing module 64 may perform word segmentation, word extraction and similar processing on the Chinese disease diagnosis information and thereby parse disease terms out of it; the disease terms parsed from the Chinese disease diagnosis information are the titles to be encoded.
The coding module 65 is configured to look up, based on the standard terminology bank and the expansion terminology bank, the standard term or expansion term that matches a title to be encoded, and to determine the code of the successfully matched standard term or expansion term as the code of the title to be encoded.
Optionally, as shown in Figure 8, in addition to the standard terminology bank creation module 61, the expansion terminology bank creation module 62, the import module 63, the data processing module 64 and the coding module 65, the ICD coding system may further comprise a hypothetical classification terminology bank creation module 71.
The hypothetical classification terminology bank creation module 71 is configured to: define as a hypothetical classification term any disease term that is not contained in the ICD version to be referenced, is related to one of the standard terms, is by default treated clinically as equivalent to that standard term, and is not a common name, alias or abbreviation of that standard term; assign the code of the related standard term to the hypothetical classification term; and store the hypothetical classification terms and their codes to obtain the hypothetical classification terminology bank.
In the ICD coding system shown in Figure 8, the coding module 65 is further configured to look up, based on the hypothetical classification terminology bank, the hypothetical classification term that matches a title to be encoded, and to determine the code of the successfully matched hypothetical classification term as the code of the title to be encoded.
Optionally, as shown in Figure 9, in addition to the standard terminology bank creation module 61, the expansion terminology bank creation module 62, the import module 63, the data processing module 64 and the coding module 65, the ICD coding system may further comprise a multi-code terminology bank creation module 81.
The multi-code terminology bank creation module 81 is configured to: define as a multi-code term any disease term that is not contained in the ICD version to be referenced and is composed of at least two different standard terms; combine the codes of all standard terms that make up the multi-code term and use the combination as the code of the multi-code term; and store the multi-code terms and their codes to obtain the multi-code terminology bank.
In the ICD coding system shown in Figure 9, the coding module 65 is further configured to look up, based on the multi-code terminology bank, the multi-code term that matches a title to be encoded, and to determine the code of the successfully matched multi-code term as the code of the title to be encoded.
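As a non-limiting illustration, building a multi-code terminology bank and looking up a title across several banks can be sketched in Python as follows. The function names, the "+" separator used to combine codes, and the placeholder terms and codes (which are not real ICD entries) are assumptions of this sketch.

```python
def build_multi_code_bank(compound_terms, standard_bank, sep="+"):
    """compound_terms: {multi-code term: list of the standard terms composing it}.
    The code of each multi-code term is the combination of its parts' codes."""
    return {term: sep.join(standard_bank[p] for p in parts)
            for term, parts in compound_terms.items()}

def lookup_code(name, banks):
    """Return the code from the first bank that contains name, else None."""
    for bank in banks:
        if name in bank:
            return bank[name]
    return None

# Placeholder terms and codes, not real ICD entries.
standard = {"term A": "CODE1", "term B": "CODE2"}
multi = build_multi_code_bank({"term A with term B": ["term A", "term B"]}, standard)
code = lookup_code("term A with term B", [standard, multi])
```

A title that is found in none of the banks yields `None` and would be passed on for further handling.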
Optionally, as shown in Figure 10, in addition to the standard terminology bank creation module 61, the expansion terminology bank creation module 62, the import module 63, the data processing module 64 and the coding module 65, the ICD coding system may further comprise a merge terminology bank creation module 91 and a preprocessing module 92.
The merge terminology bank creation module 91 is configured to: define as a merge term a single standard term that can replace at least two standard terms occurring together; define each of those at least two co-occurring standard terms as a merge object of the merge term; determine the code of each merge term according to the ICD version to be referenced; and store the merge terms, their codes and all merge objects of each merge term to obtain the merge terminology bank.
The preprocessing module 92 is configured to preprocess the one or more titles to be encoded obtained by the data processing module 64: it determines whether the one or more titles to be encoded include all merge objects of any one or more merge terms and, if so, replaces all merge objects of each such merge term with the corresponding merge term; it then sends the preprocessed titles to be encoded to the coding module 65.
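As a non-limiting illustration, the replacement performed by the preprocessing module 92 can be sketched in Python as follows. The function name, the choice to put the merge term at the position of the first merge object, and the sample merge bank are assumptions of this sketch and are not taken from any ICD version.

```python
def apply_merge_terms(names, merge_bank):
    """names: titles to be encoded, in order.
    merge_bank: {merge term: set of its merge objects}."""
    names = list(names)
    for term, objects in merge_bank.items():
        if objects <= set(names):                 # all merge objects are present
            first = min(names.index(o) for o in objects)
            names[first] = term                   # merge term takes the first slot
            names = [n for i, n in enumerate(names)
                     if i == first or n not in objects]
    return names

# Hypothetical merge bank used only to exercise the function.
bank = {"hypertensive heart disease": {"hypertension", "heart disease"}}
merged = apply_merge_terms(["hypertension", "diabetes", "heart disease"], bank)
```

If only some of the merge objects are present, the list of titles is returned unchanged.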
Optionally, in addition to the standard terminology bank creation module 61, the expansion terminology bank creation module 62, the import module 63, the data processing module 64 and the coding module 65, the ICD coding system may further comprise a real-time revision module, configured to revise the standard terminology bank, the expansion terminology bank, the hypothetical classification terminology bank, the multi-code terminology bank and the merge terminology bank in real time.
Optionally, as shown in Figure 11, in addition to the standard terminology bank creation module 61, the expansion terminology bank creation module 62, the import module 63, the data processing module 64 and the coding module 65, the ICD coding system may further comprise a no-code processing module 101.
The no-code processing module 101 is configured to match a title to be encoded for which no code has been determined against the no-code descriptions in a no-code description bank. If the match succeeds, the title is coded and/or a preset result is output; if the match fails, the title is sent to a manual processing platform for manual handling. The no-code description bank contains several no-code descriptions, which comprise: preset traditional Chinese medicine terms; preset surgical operation terms; preset drug name terms; preset medical consumables terms; and preset examination and laboratory test terms.
The ICD coded system that the embodiment of the present invention provides, by creating multiple terminology bank to contain the disease term that may occur in most Chinese medical diagnosis on disease information, meet the requirement of automatically differentiating the disease term in Chinese medical diagnosis on disease information, the ICD of robotization is encoded be achieved, the ICD coded system utilizing the embodiment of the present invention to provide carries out ICD coding, without the need to artificial participation, have that coding rate is fast, cost is low, accuracy advantages of higher.
Although it should be noted that the some modules being referred to ICD coded system in above-detailed, this division is only exemplary not enforceable.In fact, according to the embodiment of the present invention, the Characteristic and function of two or more modules above-described can be specialized in a module.Otherwise, the Characteristic and function of an above-described module can Further Division for be specialized by multiple module.
In addition, although describe the operation of the inventive method in the accompanying drawings with particular order, this is not that requirement or hint must perform these operations according to this particular order, or must perform the result that all shown operation could realize expectation.Additionally or alternatively, some step can be omitted, multiple step be merged into a step and perform, and/or a step is decomposed into multiple step and perform.
Although describe spirit of the present invention and principle with reference to some embodiments, but should be appreciated that, the present invention is not limited to disclosed embodiment, can not combine to be benefited to the feature that the division of each side does not mean that in these aspects yet, this division is only the convenience in order to state.The present invention is intended to contain the interior included various amendment of spirit and scope and the equivalent arrangements of claims.

Claims (21)

1. An automated International Classification of Diseases (ICD) coding method, comprising:
Step 1: inputting Chinese disease diagnosis information;
Step 2: performing natural language processing on the Chinese disease diagnosis information to obtain one or more names to be coded;
Step 3: based on a standard terminology bank and an expansion terminology bank, searching for a standard term or expansion term that matches the name to be coded, and determining the code of the successfully matched standard term or expansion term as the code of the name to be coded;
wherein the standard terminology bank is created as follows:
determining the ICD version to be referenced;
determining each disease term contained in the ICD version to be referenced as a standard term;
determining the code of each standard term according to the ICD version to be referenced;
storing the standard terms and their codes to obtain the standard terminology bank;
wherein the expansion terminology bank is created as follows:
determining the following types of terms not contained in the ICD version to be referenced as expansion terms: common names, aliases, and abbreviations of the standard terms; subclass disease terms of the standard terms; and new disease terms that arose after the ICD version to be referenced was published;
when the expansion term is a common name, alias, or abbreviation of any one of the standard terms, assigning the code of that standard term to the expansion term;
when the expansion term is a subclass disease term of any one of the standard terms or is one of the new disease terms, assigning to the expansion term the code of the standard term closest to it in the genus-species relation;
storing the expansion terms and their codes to obtain the expansion terminology bank.
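As an illustration of the bank construction and lookup recited in claim 1, the following is a minimal Python sketch. The bank contents, the dictionary layout, and the function name `lookup_code` are assumptions made for this example; the claim does not prescribe a storage format, and only exact-match lookup is shown.

```python
from typing import Optional

# Hypothetical miniature standard terminology bank: standard term -> ICD code.
# The two entries are illustrative samples, not a full ICD version.
STANDARD_BANK = {
    "Essential (primary) hypertension": "I10",
    "Acute myocardial infarction": "I21",
}

# Hypothetical expansion terminology bank. An alias inherits the code of the
# standard term it names; a subclass or new term would inherit the code of
# the closest standard term in the hierarchy.
EXPANSION_BANK = {
    "high blood pressure": STANDARD_BANK["Essential (primary) hypertension"],
    "heart attack": STANDARD_BANK["Acute myocardial infarction"],
}


def lookup_code(name: str) -> Optional[str]:
    """Return the code of the matching standard or expansion term, else None."""
    if name in STANDARD_BANK:
        return STANDARD_BANK[name]
    return EXPANSION_BANK.get(name)
```

Names that return `None` here would fall through to the further banks and fuzzy matching described in the dependent claims.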
2. The automated ICD coding method according to claim 1, wherein
step 3 further comprises: based on a hypothetical-classification terminology bank, searching for a hypothetical-classification term that matches the name to be coded; and determining the code of the successfully matched hypothetical-classification term as the code of the name to be coded;
wherein the hypothetical-classification terminology bank is created as follows:
determining as a hypothetical-classification term any disease term that is not contained in the ICD version to be referenced, is related to any one of the standard terms, is by default treated clinically as equivalent to that standard term, and is not a common name, alias, or abbreviation of that standard term;
assigning the code of the standard term related to the hypothetical-classification term to that hypothetical-classification term;
storing the hypothetical-classification terms and their codes to obtain the hypothetical-classification terminology bank.
3. The automated ICD coding method according to claim 1, wherein
step 3 further comprises: based on a multi-code terminology bank, searching for a multi-code term that matches the name to be coded; and determining the code of the successfully matched multi-code term as the code of the name to be coded;
wherein the multi-code terminology bank is created as follows:
determining as a multi-code term any disease term that is not contained in the ICD version to be referenced and is composed of at least two different standard terms;
combining the codes of all standard terms that make up the multi-code term to form the code of the multi-code term;
storing the multi-code terms and their codes to obtain the multi-code terminology bank.
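The multi-code construction in claim 3 can be sketched as follows. The sample terms, the `+` separator, and the function name `make_multi_code_entry` are assumptions for illustration; the claim only states that the component codes are combined.

```python
# Hypothetical standard terminology bank (illustrative entries only).
STANDARD_BANK = {
    "Type 2 diabetes mellitus": "E11",
    "Essential (primary) hypertension": "I10",
}


def make_multi_code_entry(term, components, sep="+"):
    """Build one (multi-code term, combined code) pair.

    `components` lists the standard terms that make up `term`; at least two
    different standard terms are required. The separator is an assumption.
    """
    if len(set(components)) < 2:
        raise ValueError("a multi-code term needs at least two different standard terms")
    return term, sep.join(STANDARD_BANK[c] for c in components)


# Example: register one composite diagnosis in a multi-code bank.
MULTI_CODE_BANK = dict([
    make_multi_code_entry(
        "Type 2 diabetes mellitus with hypertension",
        ["Type 2 diabetes mellitus", "Essential (primary) hypertension"],
    )
])
```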
4. The automated ICD coding method according to claim 1, wherein
before step 3, the method further comprises: preprocessing the one or more names to be coded based on a merging terminology bank;
the merging terminology bank is created as follows:
determining as a merged term a single standard term that can replace at least two standard terms occurring together, and determining each of those at least two co-occurring standard terms as a merge object of the merged term;
determining the code of each merged term according to the ICD version to be referenced;
storing the merged terms, their codes, and all merge objects of each merged term to obtain the merging terminology bank;
the step of preprocessing the one or more names to be coded based on the created merging terminology bank comprises:
determining whether the one or more names to be coded contain all merge objects of any one or more merged terms and, if so, replacing all merge objects of each such merged term with the corresponding merged term.
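The merge preprocessing of claim 4 can be sketched in Python as below. The bank layout (merged term mapped to the set of its merge objects), the sample entry, and the choice to place the merged term at the position of the first merge object are assumptions for this example.

```python
# Hypothetical merging terminology bank: merged term -> its merge objects.
MERGING_BANK = {
    "Hypertensive heart and renal disease": frozenset(
        {"Hypertensive heart disease", "Hypertensive renal disease"}
    ),
}


def merge_names(names):
    """Replace every complete set of merge objects with its merged term."""
    result = list(names)
    for merged_term, merge_objects in MERGING_BANK.items():
        # Only merge when ALL merge objects of the merged term are present.
        if merge_objects.issubset(result):
            first = min(result.index(obj) for obj in merge_objects)
            result[first] = merged_term
            result = [n for n in result if n not in merge_objects]
    return result
```

A list that contains only some of the merge objects is returned unchanged, which mirrors the "all merge objects" condition in the claim.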
5. The automated ICD coding method according to any one of claims 1 to 4, further comprising, after step 3:
Step 4: matching any name to be coded whose code has not been determined against the no-code terms in a no-code terminology bank; if the match succeeds, performing a preset processing step and representing the name by a code; if the match fails, sending the name to a manual processing platform for manual processing;
wherein the no-code terminology bank contains a number of no-code terms;
the no-code terms include:
preset traditional Chinese medicine terms;
preset surgical operation terms;
preset drug name terms;
preset medical consumables terms; and
preset examination and laboratory test terms.
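The fallback in step 4 of claim 5 amounts to a lookup with two outcomes. The sketch below assumes a dictionary that maps each no-code term to its category; the sample terms, the category labels, and the `("preset", ...)` / `("manual", ...)` return convention are hypothetical.

```python
# Hypothetical no-code terminology bank: term -> category of no-code term.
NO_CODE_BANK = {
    "针灸": "traditional Chinese medicine",
    "阑尾切除术": "surgical operation",
    "阿司匹林": "drug name",
    "注射器": "medical consumable",
    "血常规": "examination or laboratory test",
}


def handle_uncoded(name: str):
    """Route a name that received no code in step 3.

    Returns ("preset", category) when the name is a known no-code term,
    otherwise ("manual", name) to signal hand-off to manual processing.
    """
    category = NO_CODE_BANK.get(name)
    if category is not None:
        return ("preset", category)
    return ("manual", name)
```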
6. The automated ICD coding method according to claim 1, wherein the ICD version to be referenced is an ICD version published by the World Health Organization (WHO), or any localized ICD version that extends an ICD version published by the WHO.
7. The automated ICD coding method according to claim 1, wherein step 2 comprises:
Step 21: preprocessing the Chinese disease diagnosis information string to obtain a preprocessed Chinese disease diagnosis information string;
Step 22: based on a pre-established ontology dictionary, disease degree vocabulary, disease co-occurrence vocabulary, and lesion site vocabulary, segmenting the preprocessed Chinese disease diagnosis information string into a number of first-type substrings and/or second-type substrings;
wherein the ontology dictionary comprises the standard terminology bank and the expansion terminology bank, and the standard terms and expansion terms are ontology entries;
the disease degree vocabulary contains a number of disease degree terms, a disease degree term being a word that describes how acute or chronic a disease is, its severity, its pathological type, or its clinical stage;
the disease co-occurrence vocabulary contains a number of disease co-occurrence terms, a disease co-occurrence term being a word that describes at least two diseases occurring concurrently;
the lesion site vocabulary contains a number of lesion site terms, a lesion site term being a word that describes the site where a disease occurs;
a first-type substring can be matched directly with an ontology entry in the ontology dictionary, and a second-type substring cannot be matched directly with an ontology entry in the ontology dictionary;
Step 23: determining the first-type substrings and second-type substrings obtained by segmentation as names to be coded.
8. The automated ICD coding method according to claim 7, wherein step 21 comprises:
normalizing the format of the non-Chinese characters in the Chinese disease diagnosis information string and deleting the non-medical terms in the string, to obtain the preprocessed Chinese disease diagnosis information string, wherein the non-medical terms are provided by a pre-established non-medical term dictionary, a non-medical term being a word that serves only as a remark.
9. The automated ICD coding method according to claim 8, wherein step 22 comprises:
determining whether the preprocessed Chinese disease diagnosis information string contains symbols;
if the preprocessed string contains symbols, matching the characters between every two adjacent symbols in the string, taken as a whole, against the ontology entries in the ontology dictionary; if the match succeeds, cutting out the characters between the two adjacent symbols as a first-type substring; if the match fails, determining the two adjacent symbols together with the characters between them as a temporarily unsegmented string, and determining whether the temporarily unsegmented string contains a preset special symbol;
if the temporarily unsegmented string contains a special symbol, looking up the character model to which the temporarily unsegmented string belongs, segmenting the string according to the segmentation rule corresponding to that character model, and matching each resulting piece against the ontology entries in the ontology dictionary; if the match succeeds, taking the piece as a first-type substring; if the match fails, taking the piece as a second-type substring; wherein the character models are provided by a pre-established character model library, and each character model has a one-to-one corresponding segmentation rule;
if the temporarily unsegmented string does not contain a special symbol, determining the temporarily unsegmented string directly as a second-type substring;
if the preprocessed string does not contain symbols, using a mechanical word segmentation method to match single characters or runs of consecutive characters in the preprocessed string against the ontology entries in the ontology dictionary;
if all characters in the preprocessed string can be matched with ontology entries, cutting out the single characters or runs of consecutive characters in the preprocessed string as first-type substrings according to the matched ontology entries;
if the preprocessed string contains a single character or run of consecutive characters that cannot be matched with an ontology entry, determining whether that unmatched character or run is a disease degree term, a disease co-occurrence term, or a lesion site term;
when the unmatched character or run is a disease degree term, a disease co-occurrence term, or a lesion site term, merging it, according to its position in the preprocessed string, with the matchable character or run immediately before or after it and cutting out the merged result as a second-type substring, and cutting out the remaining matchable characters or runs in the preprocessed string as first-type substrings;
when the unmatched character or run is not a disease degree term, a disease co-occurrence term, or a lesion site term, cutting out the preprocessed string as a whole as a second-type substring.
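The symbol-free branch of claim 9 can be illustrated with forward maximum matching, one common form of mechanical word segmentation; the claim does not name a specific variant, so this choice is an assumption. The vocabularies are hypothetical samples, the three modifier vocabularies are merged into one set for brevity, and a modifier is attached to the following ontology entry when one exists (otherwise to the preceding one), which is a simplification of the position-based rule in the claim.

```python
# Hypothetical ontology dictionary and modifier vocabulary (degree,
# co-occurrence, and lesion site terms merged into one set for brevity).
ONTOLOGY = {"高血压", "糖尿病", "肺炎"}
MODIFIERS = {"重度", "急性", "左侧", "伴"}


def forward_max_match(text, vocab):
    """Greedy longest-match segmentation; unknown characters are emitted singly."""
    max_len = max(len(word) for word in vocab)
    pieces, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + size] in vocab:
                pieces.append(text[i:i + size])
                i += size
                break
        else:
            pieces.append(text[i])  # no vocabulary word starts here
            i += 1
    return pieces


def segment(text):
    """Split a symbol-free string into (substring, "first" | "second") pairs."""
    pieces = forward_max_match(text, ONTOLOGY | MODIFIERS)
    # Any piece that is neither an ontology entry nor a modifier makes the
    # whole string a second-type substring.
    if any(p not in ONTOLOGY and p not in MODIFIERS for p in pieces):
        return [(text, "second")]
    out, pending = [], ""
    for piece in pieces:
        if piece in MODIFIERS:
            pending += piece
        elif pending:
            out.append((pending + piece, "second"))
            pending = ""
        else:
            out.append((piece, "first"))
    if pending:  # trailing modifier: attach to the preceding entry if any
        previous = out.pop()[0] if out else ""
        out.append((previous + pending, "second"))
    return out
```

For example, `segment("重度肺炎高血压")` keeps the modifier with the entry it precedes and leaves the other entry as a directly matchable first-type substring.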
10. The automated ICD coding method according to claim 7, wherein the step in step 3 of searching for a standard term or expansion term that matches the name to be coded comprises:
if the name to be coded is a first-type substring, determining the ontology entry that the first-type substring matches as the standard term or expansion term that matches the name to be coded;
if the name to be coded is a second-type substring:
parsing the second-type substring and each ontology entry in the ontology dictionary along the first dimensions, to obtain a number of first-dimension parse results for the second-type substring and a number of first-dimension parse results for each ontology entry in the ontology dictionary;
matching each first-dimension parse result of the second-type substring against the corresponding first-dimension parse result of each ontology entry in the ontology dictionary, and determining whether there is an ontology entry all of whose first-dimension parse results match those of the second-type substring;
if there is an ontology entry all of whose first-dimension parse results match those of the second-type substring, determining that ontology entry as the ontology entry that the second-type substring matches;
if there is no ontology entry all of whose first-dimension parse results match those of the second-type substring, selecting a subset of the first-dimension parse results of the second-type substring and matching it against the same subset of the first-dimension parse results of each ontology entry in the ontology dictionary, and determining whether there is an ontology entry whose selected first-dimension parse results match those of the second-type substring;
if there is an ontology entry whose selected first-dimension parse results match those of the second-type substring, determining that ontology entry as the ontology entry that the second-type substring matches;
if there is no ontology entry whose selected first-dimension parse results match those of the second-type substring, parsing the second-type substring and each ontology entry in the ontology dictionary along the second dimensions, to obtain a number of second-dimension parse results for the second-type substring and a number of second-dimension parse results for each ontology entry in the ontology dictionary;
calculating the matching degree between the second-type substring and each ontology entry based on the second-dimension parse results of the second-type substring and the second-dimension parse results of the ontology entry;
determining one or more ontology entries as the ontology entries that the second-type substring matches, according to the matching degree between the second-type substring and each ontology entry;
determining the ontology entry that the second-type substring matches as the standard term or expansion term that matches the name to be coded.
11. The automated ICD coding method according to claim 10, wherein the first-dimension parse results of the second-type substring or of the ontology entry are, respectively:
the direction term in the second-type substring or the ontology entry;
the grade term in the second-type substring or the ontology entry;
the characters inside brackets in the second-type substring or the ontology entry;
the characters after a dash in the second-type substring or the ontology entry; and
the characters in the second-type substring or the ontology entry other than the direction term, the grade term, the characters inside brackets, and the characters after a dash;
the subset of the first-dimension parse results of the second-type substring or of the ontology entry comprises: the characters in the second-type substring or the ontology entry other than the direction term, the grade term, the characters inside brackets, and the characters after a dash; and one or more of the following:
the direction term and the grade term in the second-type substring or the ontology entry;
the characters inside brackets in the second-type substring or the ontology entry;
the characters after a dash in the second-type substring or the ontology entry.
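The first-dimension parsing of claim 11 and the full and partial matching of claim 10 can be sketched as below. The direction and grade vocabularies, the field names, and the bracket and dash patterns are assumptions for illustration; a real system would load the vocabularies from the pre-established tables.

```python
import re

DIRECTION_TERMS = ("左侧", "右侧", "双侧")  # hypothetical direction vocabulary
GRADE_TERMS = ("1级", "2级", "3级")         # hypothetical grade vocabulary
FIELDS = ("direction", "grade", "bracket", "after_dash", "core")


def parse_first_dimensions(text: str) -> dict:
    """Split a string into the five first-dimension parse results."""
    result = dict.fromkeys(FIELDS, "")
    # Characters inside half-width or full-width brackets.
    bracket = re.search(r"[(（]([^)）]*)[)）]", text)
    if bracket:
        result["bracket"] = bracket.group(1)
        text = text[:bracket.start()] + text[bracket.end():]
    # Characters after the first hyphen, long dash, or full-width hyphen.
    head, *tail = re.split(r"[-\u2014\uFF0D]", text, maxsplit=1)
    result["after_dash"] = tail[0] if tail else ""
    for key, vocab in (("direction", DIRECTION_TERMS), ("grade", GRADE_TERMS)):
        for term in vocab:
            if term in head:
                result[key] = term
                head = head.replace(term, "", 1)
                break
    result["core"] = head  # everything that is left
    return result


def dimensions_match(a: dict, b: dict, keys=FIELDS) -> bool:
    """Full match uses all fields; a partial match passes a subset of keys."""
    return all(a[k] == b[k] for k in keys)
```

A full match compares all five fields; a partial match passes a subset such as `("core", "grade")`, which always includes the remaining core characters as the claim requires.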
12. The automated ICD coding method according to claim 10, wherein the second-dimension parse results of the second-type substring or of the ontology entry are, respectively:
each Chinese character of the second-type substring or the ontology entry;
the pinyin initial of each Chinese character of the second-type substring or the ontology entry;
the pinyin final of each Chinese character of the second-type substring or the ontology entry;
the first character of the second-type substring or the ontology entry;
the pinyin of the first character of the second-type substring or the ontology entry; and
the non-Chinese characters in the second-type substring or the ontology entry.
13. The automated ICD coding method according to claim 10, wherein the step of calculating the matching degree between the second-type substring and each ontology entry based on the second-dimension parse results of the second-type substring and the second-dimension parse results of the ontology entry comprises:
calculating the similarity between the second-type substring and each ontology entry according to the following formula:
$$M = \sum_{t \,\mathrm{in}\, q} \Big( \mathrm{tf}(t \,\mathrm{in}\, d) \cdot \mathrm{idf}(t)^{2} \cdot t.\mathrm{getBoost}() \cdot \mathrm{norm}(t, d) \Big)$$
where M denotes the similarity;
t denotes each second-dimension parse result of the second-type substring;
q denotes the second-type substring;
t in q denotes each second dimension of the second-type substring;
d denotes the ontology entry;
tf(t in d) denotes, within the same second dimension, the frequency with which the second-dimension parse result of the second-type substring matches the second-dimension parse result of the ontology entry;
in idf(t), T denotes the total number of ontology entries in the ontology dictionary, and T(t) denotes the total number of ontology entries whose second-dimension parse results all match the second-dimension parse results of the second-type substring;
t.getBoost() denotes the preset weight of each second dimension;
norm(t, d) denotes the length normalization factor of the ontology entry;
the calculated similarity is determined as the matching degree between the second-type substring and each ontology entry.
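The similarity M of claim 13 has the shape of a TF-IDF score. The sketch below is one possible reading, with several assumptions: the idf form `1 + ln(T / (T(t) + 1))` is borrowed from a common search-engine convention because the text above gives only the meaning of T and T(t); `norm` is taken as one over the square root of the entry's token count; and the names `similarity`, `idf`, `boosts`, and `doc_freq` are hypothetical.

```python
import math


def idf(total_entries: int, matching_entries: int) -> float:
    # Assumed form; only the meaning of T and T(t) is given in the text.
    return 1.0 + math.log(total_entries / (matching_entries + 1))


def similarity(query, entry, boosts, doc_freq, total_entries):
    """Sum tf * idf^2 * boost * norm over every parse result t of the query.

    query / entry: dict mapping a second dimension to its list of tokens.
    boosts: preset weight per dimension (defaults to 1.0).
    doc_freq: dict mapping (dimension, token) to the count of entries with it.
    """
    score = 0.0
    for dim, q_tokens in query.items():
        d_tokens = entry.get(dim, [])
        if not d_tokens:
            continue
        norm = 1.0 / math.sqrt(len(d_tokens))  # assumed length normalization
        for t in set(q_tokens):
            tf = d_tokens.count(t)  # matches of t within the same dimension
            weight = idf(total_entries, doc_freq.get((dim, t), 0)) ** 2
            score += tf * weight * boosts.get(dim, 1.0) * norm
    return score


# Example with a single "chars" dimension and three ontology entries.
DOC_FREQ = {("chars", "肺"): 2, ("chars", "炎"): 1, ("chars", "癌"): 1}
QUERY = {"chars": list("肺炎")}
SCORE_SAME = similarity(QUERY, {"chars": list("肺炎")}, {}, DOC_FREQ, 3)
SCORE_NEAR = similarity(QUERY, {"chars": list("肺癌")}, {}, DOC_FREQ, 3)
SCORE_NONE = similarity(QUERY, {"chars": list("骨折")}, {}, DOC_FREQ, 3)
```

Under these assumptions the identical entry scores highest, the entry sharing one character scores lower, and the entry sharing none scores zero.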
14. The automated ICD coding method according to claim 10, wherein the step of calculating the matching degree between the second-type substring and each ontology entry based on the second-dimension parse results of the second-type substring and the second-dimension parse results of the ontology entry comprises:
determining each Chinese character in the second-type substring;
calculating the cosine confidence between the second-type substring and each ontology entry it matches according to the following formula:
$$N = \frac{\sum_{j=1}^{V} w_{Q,j} \times w_{d',j}}{\sqrt{\sum_{j=1}^{V} w_{Q,j}^{2}} \times \sqrt{\sum_{j=1}^{V} w_{d',j}^{2}}}$$
calculating the total confidence between the second-type substring and each ontology entry it matches according to the following formula:
$$S = M \times a + N \times b$$
where N denotes the cosine confidence;
V denotes the total number of Chinese characters contained in the second-type substring and the ontology entry it matches;
Q denotes the second-type substring;
d' denotes the ontology entry that the second-type substring matches;
$w_{Q,j}$ denotes the frequency with which each Chinese character occurs in the second-type substring;
$w_{d',j}$ denotes the frequency with which each Chinese character occurs in the ontology entry that the second-type substring matches;
j denotes the index of a Chinese character contained in the second-type substring and the ontology entry it matches;
S denotes the total confidence;
M denotes the similarity;
a denotes the preset weight corresponding to the similarity M;
b denotes the preset weight corresponding to the cosine confidence N;
and the similarity M is calculated according to the following formula:
$$M = \sum_{t \,\mathrm{in}\, q} \Big( \mathrm{tf}(t \,\mathrm{in}\, d) \cdot \mathrm{idf}(t)^{2} \cdot t.\mathrm{getBoost}() \cdot \mathrm{norm}(t, d) \Big)$$
where t denotes each second-dimension parse result of the second-type substring;
q denotes the second-type substring;
t in q denotes each second dimension of the second-type substring;
d denotes the ontology entry;
tf(t in d) denotes, within the same second dimension, the frequency with which the second-dimension parse result of the second-type substring matches the second-dimension parse result of the ontology entry;
in idf(t), T denotes the total number of ontology entries in the ontology dictionary, and T(t) denotes the total number of ontology entries whose second-dimension parse results all match the second-dimension parse results of the second-type substring;
t.getBoost() denotes the preset weight of each second dimension;
norm(t, d) denotes the length normalization factor of the ontology entry;
the calculated total confidence is determined as the matching degree between the second-type substring and each ontology entry.
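The cosine confidence N and total confidence S of claim 14 can be sketched as follows. For simplicity the sketch counts every character of both strings rather than filtering to Chinese characters, and the default weights `a` and `b` are placeholders; the function names are hypothetical.

```python
import math
from collections import Counter


def cosine_confidence(substring: str, entry: str) -> float:
    """Cosine of the character-frequency vectors of the two strings."""
    w_q, w_d = Counter(substring), Counter(entry)
    dot = sum(w_q[ch] * w_d[ch] for ch in set(w_q) | set(w_d))
    denom = math.sqrt(sum(v * v for v in w_q.values())) * math.sqrt(
        sum(v * v for v in w_d.values())
    )
    return dot / denom if denom else 0.0


def total_confidence(m: float, n: float, a: float = 0.5, b: float = 0.5) -> float:
    """S = M * a + N * b, with preset weights a and b (placeholder defaults)."""
    return m * a + n * b
```

Because M is not bounded by 1 while N is, the weights a and b also act as a rescaling between the two terms; the source leaves their values to be preset.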
15. The automated ICD coding method according to claim 10, wherein the step of determining one or more ontology entries as the ontology entries that the second-type substring matches, according to the matching degree between the second-type substring and each ontology entry, comprises:
sorting all ontology entries by their matching degree with the second-type substring, and determining a preset number of the top-ranked ontology entries as the ontology entries that the second-type substring matches;
or,
determining one or more ontology entries whose matching degree with the second-type substring reaches a preset threshold as the ontology entries that the second-type substring matches.
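The two selection strategies of claim 15 can be sketched in a few lines. The function name `select_matches` and its parameters are assumptions; exactly one of `top_k` or `threshold` is expected, and `top_k` takes precedence if both are given.

```python
def select_matches(scores, top_k=None, threshold=None):
    """Pick matching ontology entries from a {entry: matching degree} dict.

    top_k: keep the preset number of highest-scoring entries.
    threshold: keep every entry whose matching degree reaches the threshold.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if top_k is not None:
        return [entry for entry, _ in ranked[:top_k]]
    if threshold is not None:
        return [entry for entry, score in ranked if score >= threshold]
    raise ValueError("specify top_k or threshold")
```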
16. An automated International Classification of Diseases (ICD) coding system, comprising:
a standard terminology bank creation module, configured to: according to the ICD version to be referenced, determine each disease term contained in the ICD version to be referenced as a standard term; determine the code of each standard term according to the ICD version to be referenced; and store the standard terms and their codes to obtain a standard terminology bank;
an expansion terminology bank creation module, configured to: determine the following types of terms not contained in the ICD version to be referenced as expansion terms: common names, aliases, and abbreviations of the standard terms; subclass disease terms of the standard terms; and new disease terms that arose after the ICD version to be referenced was published; when it is determined that the expansion term is a common name, alias, or abbreviation of any one of the standard terms, assign the code of that standard term to the expansion term; when it is determined that the expansion term is a subclass disease term of any one of the standard terms or is one of the new disease terms, assign to the expansion term the code of the standard term closest to it in the genus-species relation; and store the expansion terms and their codes to obtain an expansion terminology bank;
an import module, configured to input Chinese disease diagnosis information;
a data processing module, configured to perform natural language processing on the Chinese disease diagnosis information to obtain one or more names to be coded;
a coding module, configured to search, based on the standard terminology bank and the expansion terminology bank, for a standard term or expansion term that matches the name to be coded, and to determine the code of the successfully matched standard term or expansion term as the code of the name to be coded.
17. The automated ICD coding system according to claim 16, wherein the system further comprises:
a hypothetical-classification terminology bank creation module, configured to: determine as a hypothetical-classification term any disease term that is not contained in the ICD version to be referenced, is related to any one of the standard terms, is by default treated clinically as equivalent to that standard term, and is not a common name, alias, or abbreviation of that standard term; assign the code of the standard term related to the hypothetical-classification term to that hypothetical-classification term; and store the hypothetical-classification terms and their codes to obtain a hypothetical-classification terminology bank;
the coding module is further configured to search, based on the hypothetical-classification terminology bank, for a hypothetical-classification term that matches the name to be coded, and to determine the code of the successfully matched hypothetical-classification term as the code of the name to be coded.
18. The automated ICD coding system according to claim 16, wherein the system further comprises:
a multi-code terminology bank creation module, configured to: determine as a multi-code term any disease term that is not contained in the ICD version to be referenced and is composed of at least two different standard terms; combine the codes of all standard terms that make up the multi-code term to form the code of the multi-code term; and store the multi-code terms and their codes to obtain a multi-code terminology bank;
the coding module is further configured to search, based on the multi-code terminology bank, for a multi-code term that matches the name to be coded, and to determine the code of the successfully matched multi-code term as the code of the name to be coded.
19. The automated ICD coding system according to claim 16, wherein the system further comprises:
a merging terminology bank creation module, configured to: determine as a merged term a single standard term that can replace at least two standard terms occurring together, and determine each of those at least two co-occurring standard terms as a merge object of the merged term; determine the code of each merged term according to the ICD version to be referenced; and store the merged terms, their codes, and all merge objects of each merged term to obtain a merging terminology bank;
a preprocessing module, configured to preprocess the one or more names to be coded obtained by the data processing module: determine whether the one or more names to be coded contain all merge objects of any one or more merged terms and, if so, replace all merge objects of each such merged term with the corresponding merged term; and then send the preprocessed names to be coded to the coding module.
20. The automatic ICD coding system according to any one of claims 16 to 19, further comprising:
A no-code processing module, configured to match a name to be coded whose code has not been determined against the no-code descriptions in a no-code description bank; if the match succeeds, to code that name and/or output a preset result; and if the match fails, to send that name to a manual processing platform for manual handling;
Wherein the no-code description bank comprises a number of no-code descriptions;
The no-code descriptions comprise:
Preset traditional Chinese medicine terms;
Preset surgical operation terms;
Preset drug name terms;
Preset medical consumables terms; and
Preset examination and test terms.
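The fallback path of claim 20 can be sketched as a lookup against the no-code description bank with a manual-review queue for misses. Everything named here (`Category`, `NO_CODE_BANK`, `handle_uncoded`, the `NO_ICD_CODE:` result format and the sample entries) is a hypothetical illustration. The claim does not specify how matching is done, so case-insensitive exact matching is an assumption.

```python
from enum import Enum


class Category(Enum):
    """The five kinds of no-code description listed in the claim."""
    TCM = "traditional Chinese medicine term"
    SURGERY = "surgical operation term"
    DRUG = "drug name term"
    CONSUMABLE = "medical consumables term"
    EXAM = "examination and test term"


# Hypothetical bank entries, for illustration only.
NO_CODE_BANK = {
    "qi deficiency": Category.TCM,
    "appendectomy": Category.SURGERY,
    "aspirin": Category.DRUG,
    "gauze": Category.CONSUMABLE,
    "chest x-ray": Category.EXAM,
}


def handle_uncoded(name, bank, manual_queue):
    """Handle a name for which the coding module found no ICD code.

    On a match, return a preset result naming the category; on a miss,
    append the name to the manual-processing queue and return None.
    """
    category = bank.get(name.strip().lower())  # assumed matching rule
    if category is None:
        manual_queue.append(name)
        return None
    return f"NO_ICD_CODE:{category.name}"
```

With an empty list as `manual_queue`, `handle_uncoded("Aspirin", NO_CODE_BANK, manual_queue)` yields the preset result for a drug name, while an unrecognized name ends up in the queue for a human coder.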
21. The automatic ICD coding system according to claim 16, wherein the ICD version to be referenced is an ICD version published by the World Health Organization (WHO), or any localized ICD version that extends an ICD version published by the WHO.
CN201510496513.0A 2015-08-13 2015-08-13 Automatic International Classification of Diseases coding method and system Active CN105069124B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510496513.0A CN105069124B (en) 2015-08-13 2015-08-13 Automatic International Classification of Diseases coding method and system

Publications (2)

Publication Number Publication Date
CN105069124A 2015-11-18
CN105069124B 2018-06-15

Family

ID=54498494

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510496513.0A Active CN105069124B (en) 2015-08-13 2015-08-13 Automatic International Classification of Diseases coding method and system

Country Status (1)

Country Link
CN (1) CN105069124B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102456100A (en) * 2010-11-03 2012-05-16 通用电气公司 Systems, methods, and apparatus for computer-assisted full medical code scheme to code scheme mapping
US20130086069A1 (en) * 2010-11-03 2013-04-04 General Electric Company Systems, methods, and apparatus for computer-assisted full medical code scheme to code scheme mapping
CN104156415A (en) * 2014-07-31 2014-11-19 沈阳锐易特软件技术有限公司 Mapping processing system and method for solving problem of standard code control of medical data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
林冬盛: "中文分词算法的研究与实现", 《中国优秀硕士学位论文全文数据库信息科技辑》 *

Cited By (66)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108257667A (en) * 2016-12-28 2018-07-06 中国科学院深圳先进技术研究院 Data processing method and terminal device
CN108320778A (en) * 2017-01-16 2018-07-24 医渡云(北京)技术有限公司 Medical record ICD coding method and system
CN106844308A (en) * 2017-01-20 2017-06-13 天津艾登科技有限公司 Method for automatic disease code conversion using semantic recognition
CN106844308B (en) * 2017-01-20 2020-04-03 天津艾登科技有限公司 Method for automatic disease code conversion using semantic recognition
CN107784057B (en) * 2017-03-03 2020-07-28 平安医疗健康管理股份有限公司 Medical data matching method and device
CN107784057A (en) * 2017-03-03 2018-03-09 平安医疗健康管理股份有限公司 Medical data matching method and device
CN107833605A (en) * 2017-03-14 2018-03-23 北京大瑞集思技术有限公司 Coding method, device, server and system for hospital medical record information
CN107256344A (en) * 2017-06-20 2017-10-17 上海联影医疗科技有限公司 Data processing method, device and radiotherapy management system
CN107491437A (en) * 2017-08-25 2017-12-19 广州宝荣科技应用有限公司 Natural-language-based TCM syndrome semantic recognition method and device
CN107577826B (en) * 2017-10-25 2018-05-15 山东众阳软件有限公司 Classification of diseases coding method and system based on raw diagnostic data
CN107705839B (en) * 2017-10-25 2020-06-26 山东众阳软件有限公司 Disease automatic coding method and system
CN107731269B (en) * 2017-10-25 2020-06-26 山东众阳软件有限公司 Disease coding method and system based on original diagnosis data and medical record file data
CN107731269A (en) * 2017-10-25 2018-02-23 山东众阳软件有限公司 Disease coding method and system based on original diagnosis data and medical record file data
CN107705839A (en) * 2017-10-25 2018-02-16 山东众阳软件有限公司 Disease automatic coding method and system
CN107577826A (en) * 2017-10-25 2018-01-12 山东众阳软件有限公司 Classification of diseases coding method and system based on raw diagnostic data
CN108182972A (en) * 2017-12-15 2018-06-19 上海长江科技发展有限公司 Intelligent coding method and system for Chinese disease diagnoses based on a word-segmentation network
CN108170828A (en) * 2018-01-09 2018-06-15 深圳市第二人民医院 Structured clinical diagnosis term set construction method and system
CN108172265A (en) * 2018-01-09 2018-06-15 深圳市第二人民医院 Clinical diagnosis term set update method and system
CN108170828B (en) * 2018-01-09 2022-04-29 深圳市第二人民医院 Structured clinical diagnosis term set construction method and system
CN108182285A (en) * 2018-01-29 2018-06-19 中国平安人寿保险股份有限公司 Information processing method, terminal and computer readable storage medium
CN108446260A (en) * 2018-02-06 2018-08-24 天津艾登科技有限公司 Method and system for automatic disease code conversion based on a semantic approximate matching algorithm
CN108564991A (en) * 2018-04-13 2018-09-21 重庆医科大学附属儿童医院 ICD-based error identification system for digitally coded medical records and identification method thereof
CN108920661B (en) * 2018-07-04 2023-08-08 平安健康保险股份有限公司 International disease classification marking method, device, computer equipment and storage medium
CN108920661A (en) * 2018-07-04 2018-11-30 平安健康保险股份有限公司 International Classification of Diseases labeling method, device, computer equipment and storage medium
CN109256216B (en) * 2018-08-14 2023-06-27 平安医疗健康管理股份有限公司 Medical data processing method, medical data processing device, computer equipment and storage medium
CN109256216A (en) * 2018-08-14 2019-01-22 平安医疗健康管理股份有限公司 Medical data processing method, device, computer equipment and storage medium
CN109545297A (en) * 2018-10-30 2019-03-29 平安医疗健康管理股份有限公司 Disease coding method and computing device based on big data
CN109993227B (en) * 2019-03-29 2021-09-24 京东方科技集团股份有限公司 Method, system, apparatus and medium for automatically adding international disease classification code
CN109993227A (en) * 2019-03-29 2019-07-09 京东方科技集团股份有限公司 Method, system, apparatus and medium for automatically adding international disease classification code
WO2021032219A3 (en) * 2019-08-20 2021-04-15 山东众阳健康科技集团有限公司 Method and system for disease classification coding based on deep learning, and device and medium
CN110491465A (en) * 2019-08-20 2019-11-22 山东众阳健康科技集团有限公司 Classification of diseases coding method, system, equipment and medium based on deep learning
CN110852076B (en) * 2019-10-12 2023-05-30 云知声智能科技股份有限公司 Method and device for automatic disease code conversion
CN110852076A (en) * 2019-10-12 2020-02-28 云知声智能科技股份有限公司 Method and device for automatic disease code conversion
CN111046882A (en) * 2019-12-05 2020-04-21 清华大学 Disease name standardization method and system based on profile hidden Markov model
CN111046882B (en) * 2019-12-05 2023-01-24 清华大学 Disease name standardization method and system based on profile hidden Markov model
CN110895580B (en) * 2019-12-12 2020-07-07 山东众阳健康科技集团有限公司 ICD operation and operation code automatic matching method based on deep learning
CN110895580A (en) * 2019-12-12 2020-03-20 山东众阳健康科技集团有限公司 ICD operation and operation code automatic matching method based on deep learning
CN111210916A (en) * 2019-12-23 2020-05-29 望海康信(北京)科技股份公司 Medical record home page coding method and system
CN111259664B (en) * 2020-01-14 2023-03-24 腾讯科技(深圳)有限公司 Method, device and equipment for determining medical text information and storage medium
CN111259664A (en) * 2020-01-14 2020-06-09 腾讯科技(深圳)有限公司 Method, device and equipment for determining medical text information and storage medium
CN111554369A (en) * 2020-04-29 2020-08-18 杭州依图医疗技术有限公司 Medical data processing method, interaction method and storage medium
CN111554369B (en) * 2020-04-29 2023-08-04 杭州依图医疗技术有限公司 Medical data processing method, interaction method and storage medium
CN111626876A (en) * 2020-05-27 2020-09-04 泰康保险集团股份有限公司 Insurance auditing method, insurance auditing device, electronic equipment and storage medium
CN112131339A (en) * 2020-09-28 2020-12-25 上海梅斯医药科技有限公司 Name standardization standard processing method, device, computer and storage medium
CN112632909A (en) * 2020-10-30 2021-04-09 中核核电运行管理有限公司 Data object English coding method and device
CN112632909B (en) * 2020-10-30 2024-06-11 中核核电运行管理有限公司 English coding method and device for data object
CN112445917A (en) * 2020-11-05 2021-03-05 中国中医科学院中医药信息研究所 Method and device for constructing traditional medical disease body
CN112562818B (en) * 2020-12-02 2022-06-24 薛蕴菁 System and method for designing and realizing diagnosis logic based on structured report sub-template
CN112562818A (en) * 2020-12-02 2021-03-26 薛蕴菁 System and method for designing and realizing diagnosis logic based on structured report sub-template
CN112668280A (en) * 2020-12-29 2021-04-16 杭州依图医疗技术有限公司 Medical data processing method and device and storage medium
CN112735544A (en) * 2020-12-30 2021-04-30 杭州依图医疗技术有限公司 Medical record data processing method and device and storage medium
CN112700826A (en) * 2020-12-30 2021-04-23 杭州依图医疗技术有限公司 Medical data processing method and device and storage medium
CN112687397A (en) * 2020-12-31 2021-04-20 四川大学华西医院 Rare disease knowledge base processing method and device and readable storage medium
CN112687397B (en) * 2020-12-31 2023-05-09 四川大学华西医院 Rare disease knowledge base processing method and device and readable storage medium
CN112836006A (en) * 2021-01-12 2021-05-25 山东众阳健康科技集团有限公司 Multi-diagnosis intelligent coding method, system, medium and equipment
CN112836006B (en) * 2021-01-12 2022-09-23 山东众阳健康科技集团有限公司 Multi-diagnostic intelligent coding method, system, medium and equipment
CN112818085A (en) * 2021-01-28 2021-05-18 东软集团股份有限公司 Value range data matching method and device, storage medium and electronic equipment
CN113033154A (en) * 2021-05-31 2021-06-25 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Reading understanding-based medical concept coding method and device and storage medium
CN113033154B (en) * 2021-05-31 2021-08-20 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Reading understanding-based medical concept coding method and device and storage medium
CN113641714A (en) * 2021-08-31 2021-11-12 平安医疗健康管理股份有限公司 Medical data correction method, device, computer equipment and storage medium
CN115964472A (en) * 2021-12-03 2023-04-14 奥码哈(杭州)医疗科技有限公司 ICD coding method, ICD coding query method, coding system and query system
CN115017326A (en) * 2022-05-12 2022-09-06 青岛普瑞盛医药科技有限公司 Medical coding method and device
CN115017326B (en) * 2022-05-12 2023-08-18 青岛普瑞盛医药科技有限公司 Medical coding method and device
WO2024007810A1 (en) * 2022-07-05 2024-01-11 上海妙一生物科技有限公司 Coding method and apparatus based on medical diseases and medicines
CN115080751B (en) * 2022-08-16 2022-11-11 之江实验室 Medical standard term management system and method based on general model
CN115080751A (en) * 2022-08-16 2022-09-20 之江实验室 Medical standard term management system and method based on general model

Also Published As

Publication number Publication date
CN105069124B (en) 2018-06-15

Similar Documents

Publication Publication Date Title
CN105069124A (en) Automatic ICD (International Classification of Diseases) coding method and system
CN109299472B (en) Text data processing method and device, electronic equipment and computer readable medium
CN111090461B (en) Code annotation generation method based on machine translation model
EP3230896B1 (en) Localization complexity of arbitrary language assets and resources
CN105095665B (en) Natural language processing method and system for Chinese disease diagnosis information
CN103189860B (en) Machine translation apparatus and machine translation method combining a syntax transformation model and a vocabulary transformation model
RU2610241C2 (en) Method and system for text synthesis based on information extracted as rdf-graph using templates
CN105069123A (en) Automatic coding method and system for Chinese surgical operation information
JP5586817B2 (en) Extracting treelet translation pairs
CN105184053B (en) A kind of automatic coding and system of Chinese medical service item information
CN103235775B (en) Statistical machine translation method merging translation memory and phrase translation model
CN105138829B (en) Natural language processing method and system for Chinese medical information
CN1726488A (en) Integrated development tool for building a natural language understanding application
CN109829173B (en) English place name translation method and device
CN113076133B (en) Deep learning-based Java program internal annotation generation method and system
CN108665141B (en) Method for automatically extracting emergency response process model from emergency plan
JP4661415B2 (en) Expression fluctuation processing system
White An alphabet-reduction algorithm for chordal n-grams
Beltrachini et al. Semantic parsing for conversational question answering over knowledge graphs
Pakzad et al. An improved joint model: POS tagging and dependency parsing
Kratochvíl et al. Abui Wordnet: Using a Toolbox Dictionary to develop a wordnet for a low-resource language
Passarotti et al. Improvements in parsing the Index Thomisticus treebank. revision, combination and a feature model for medieval Latin
KR20230126578A (en) How to design a data model for data utilization
van Cranenburgh Rich statistical parsing and literary language
Kotzé et al. Large aligned treebanks for syntax-based machine translation

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant