CN105069123A - Automatic coding method and system for Chinese surgical operation information - Google Patents

Info

Publication number
CN105069123A
Authority
CN
China
Prior art keywords
type
term
operation technique
character
substring
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510496500.3A
Other languages
Chinese (zh)
Other versions
CN105069123B (en)
Inventor
金以东
陈志永
朱华玲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ebaotech Internet Medical Information Technology (beijing) Co Ltd
Original Assignee
Ebaotech Internet Medical Information Technology (beijing) Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ebaotech Internet Medical Information Technology (beijing) Co Ltd
Priority to CN201510496500.3A
Publication of CN105069123A
Application granted
Publication of CN105069123B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2452: Query translation
    • G06F16/24522: Translation of natural language queries to structured queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)
  • Document Processing Apparatus (AREA)

Abstract

The embodiment of the present invention provides an automatic coding method and system for Chinese surgical operation information. The method comprises: carrying out natural language processing on the input Chinese surgical operation information to obtain names to be coded; and searching for standard terminologies or expansion terminologies that match the names to be coded, and taking the codes of the successfully matched standard terminologies or expansion terminologies as the codes of the names to be coded. The standard terminologies are surgical operation names specified in the International Classification of Diseases (ICD), and the codes of the standard terminologies are the codes of the corresponding surgical operation names specified in the ICD. The expansion terminologies are words that have a synonymous relationship or a genus-species relationship with the standard terminologies, and each expansion terminology has the same code as the standard terminology with which it has that relationship. According to the present invention, surgical operation names can be identified and coded automatically, rapidly and accurately, without manual participation in the whole process; the automatic coding method and system for Chinese surgical operation information offer high coding speed, low cost and high accuracy.

Description

Automatic coding method and system for Chinese surgical operation information
Technical field
Embodiments of the present invention relate to the field of medical information, and more specifically to an automatic coding method and system for Chinese surgical operation information.
Background
This section is intended to provide background or context for the embodiments of the invention recited in the claims. The description here is not admitted to be prior art merely because it is included in this section.
At present, the medical and health field generally uses a graded surgery management system to manage the performance of surgical operations. In graded management, writing and coding surgical operation names is very important. Typically, the surgeon writes the surgical operation name and a medical record administrator then codes it. A correctly written name and an accurate code are the basis for carrying out a surgical operation smoothly; they help standardize surgical operations and reduce medical risk.
The Ministry of Health of China requires the medical and health industry to code surgical operations uniformly according to ICD-9-CM-3. ICD-9-CM-3 is Volume 3 of the International Classification of Diseases, 9th Revision, Clinical Modification, a professional reference that provides guidance on medical operations and diseases. ICD-9-CM-3 classifies and codes operations according to the disease targeted, the complexity of the operating process and the technical requirements.
Summary of the invention
At present, most medical and health institutions still rely on medical record administrators to code surgical operations manually, which is inefficient and costly. Moreover, the surgical operation information that surgeons write in medical records is natural language: its form is complex and varied, and it follows no unified standard (for example, it mixes several languages, uses non-standard grammar, contains entry errors, replaces standard terms with abbreviations or colloquial names, or is interspersed with meaningless characters such as symbols). The administrator often has to consult the detailed content of the medical record before the surgical operation name can be determined and coded, which further reduces coding efficiency and frequently leads to a high error rate.
For this reason, the invention provides an automatic coding mechanism for surgical operation names, so that surgical operation names are identified and coded automatically, quickly and accurately.
In this context, embodiments of the present invention are intended to provide an automatic coding method and system for Chinese surgical operation information.
In a first aspect of the embodiments of the present invention, an automatic coding method for Chinese surgical operation information is provided, comprising:
Step 1: inputting Chinese surgical operation information;
Step 2: carrying out natural language processing on the Chinese surgical operation information to obtain one or more names to be coded;
Step 3: based on a standard term library and an expansion term library established in advance, searching for the standard term or expansion term that matches each name to be coded, and taking the code of the successfully matched standard term or expansion term as the code of that name to be coded;
wherein the standard term library comprises a number of standard terms and their codes, each standard term being a surgical operation name specified in the International Classification of Diseases (ICD), and the code of each standard term being the code of the corresponding surgical operation name specified in the ICD;
the expansion term library comprises a number of expansion terms and their codes, each expansion term being a word that has a synonymous relationship or a genus-species relationship with a standard term; and
each expansion term has the same code as the standard term with which it has the synonymous or genus-species relationship.
In a second aspect of the embodiments of the present invention, an automatic coding system (ACOM) for Chinese surgical operation information is provided, comprising:
an input module, for inputting Chinese surgical operation information;
a natural language processing module, for carrying out natural language processing on the Chinese surgical operation information to obtain one or more names to be coded;
a matching and code-assignment module, for searching, based on a standard term library and an expansion term library established in advance, for the standard term or expansion term that matches each name to be coded, and taking the code of the successfully matched standard term or expansion term as the code of that name to be coded;
wherein the standard term library comprises a number of standard terms and their codes, each standard term being a surgical operation name specified in the International Classification of Diseases (ICD), and the code of each standard term being the code of the corresponding surgical operation name specified in the ICD;
the expansion term library comprises a number of expansion terms and their codes, each expansion term being a word that has a synonymous relationship or a genus-species relationship with a standard term; and
each expansion term has the same code as the standard term with which it has the synonymous or genus-species relationship.
With the above technical solution, the present invention takes full account of the fact that the Chinese surgical operation information entered by surgeons is natural language, complex and varied in form and without a unified standard. Multiple dictionaries established in advance according to ICD-9-CM-3 are used to match the character string of the Chinese surgical operation information, so that surgical operation names are identified and coded automatically, quickly and accurately. The whole process requires no manual participation, which increases coding speed, reduces coding cost and ensures coding accuracy.
Brief description of the drawings
The above and other objects, features and advantages of the exemplary embodiments of the invention will become easier to understand by reading the detailed description below with reference to the accompanying drawings. The drawings show several embodiments of the present invention by way of example and not by way of limitation, wherein:
Fig. 1 schematically shows an application scenario in which embodiments of the present invention can be implemented;
Fig. 2 schematically shows a flowchart of the automatic coding method for Chinese surgical operation information according to the exemplary embodiment;
Fig. 3 schematically shows a flowchart of the automatic coding method for Chinese surgical operation information according to embodiment one of the present invention;
Fig. 4 schematically shows a flowchart of the automatic coding method for Chinese surgical operation information according to embodiment two of the present invention;
Fig. 5 schematically shows a flowchart of the automatic coding method for Chinese surgical operation information according to embodiment three of the present invention;
Fig. 6 schematically shows a flowchart of the automatic coding method for Chinese surgical operation information according to embodiment four of the present invention;
Fig. 7 schematically shows a flowchart of the automatic coding method for Chinese surgical operation information according to embodiment five of the present invention;
Fig. 8 schematically shows a flowchart of the natural language processing according to embodiment six of the present invention;
Fig. 9 schematically shows a flowchart of cutting out first-type substrings and second-type substrings according to embodiment six of the present invention;
Fig. 10 schematically shows a flowchart of searching for the standard term or expansion term that matches a name to be coded according to embodiment seven of the present invention;
Fig. 11 schematically shows a block diagram of the automatic coding system (ACOM) for Chinese surgical operation information according to the exemplary embodiment.
In the drawings, identical or corresponding reference numerals denote identical or corresponding parts.
Detailed description of the embodiments
The principle and spirit of the present invention are described below with reference to several exemplary embodiments. It should be understood that these embodiments are provided only to enable those skilled in the art to better understand and then implement the present invention, and do not limit the scope of the invention in any way. Rather, these embodiments are provided so that this disclosure is thorough and complete and fully conveys the scope of the disclosure to those skilled in the art.
Those skilled in the art will appreciate that embodiments of the present invention can be implemented as a system, apparatus, device, method or computer program product. Accordingly, the disclosure can take the following forms: entirely hardware, entirely software (including firmware, resident software, microcode, etc.), or a combination of hardware and software.
According to embodiments of the present invention, a method and an apparatus are proposed.
In this document, any number of elements in the drawings is given by way of example and not limitation, and any naming is used only for distinction and carries no limiting meaning.
The principle and spirit of the present invention are explained in detail below with reference to several representative embodiments.
Application scenario overview
Refer first to Fig. 1, which shows an application scenario in which embodiments of the present invention can be implemented.
The scenario shown in Fig. 1 comprises a medical information platform 100 and a Chinese surgical operation information automatic coding system (ACOM) 200. The medical information platform 100 may be software installed on a device used by the doctor, such as a desktop computer, notebook computer, tablet computer or personal digital assistant. The automatic coding system (ACOM) 200 may be software running on a medical information server. The medical information platform 100 and the automatic coding system (ACOM) 200 may be communicatively connected, for example, over a hospital local area network.
After the surgeon enters Chinese surgical operation information in the medical information platform 100, the information is transferred to the automatic coding system (ACOM) 200, which carries out natural language processing and automatic coding on it and finally outputs the coding result.
Exemplary method
With reference to the application scenario of Fig. 1, the automatic coding method for Chinese surgical operation information according to the exemplary embodiment of the invention is described below with reference to Fig. 2. It should be noted that the above application scenario is shown only to make the spirit and principle of the present invention easier to understand, and embodiments of the present invention are not limited in this respect. On the contrary, embodiments of the present invention can be applied to any suitable scenario.
As shown in Fig. 2, the automatic coding method for Chinese surgical operation information comprises:
Step S101: inputting Chinese surgical operation information.
Step S102: carrying out natural language processing on the Chinese surgical operation information to obtain one or more names to be coded.
Based on the characteristics of surgical operation information, this step may apply processing such as mechanical Chinese word segmentation to the surgical operation information to obtain the names to be coded. Embodiment six below describes one specific way of carrying out natural language processing on Chinese surgical operation information in this exemplary method.
Step S103: based on a standard term library and an expansion term library established in advance, searching for the standard term or expansion term that matches each name to be coded, and taking the code of the successfully matched standard term or expansion term as the code of that name to be coded.
In this exemplary method, the standard term library comprises a number of standard terms and their codes. Each standard term is a surgical operation name specified in the International Classification of Diseases (ICD), and its code is the code of the corresponding surgical operation name specified in the ICD.
The expansion term library comprises a number of expansion terms and their codes. Each expansion term is a word that has a synonymous relationship or a genus-species relationship with a standard term. When an expansion term is synonymous with a standard term, it may be a colloquial name, an alias or an abbreviation of the standard term. When an expansion term has a genus-species relationship with a standard term, it may, in concept or in application, contain the standard term (denoting a higher-level surgical operation type than the standard term) or be contained by the standard term (denoting a lower-level surgical operation type than the standard term). For coding purposes, and based on clinical experience, each expansion term is given the same code as the standard term with which it has the synonymous or genus-species relationship.
In this exemplary method, the standard term library and the expansion term library can also be revised in real time, for example by adding new expansion terms or deleting existing ones, so that the libraries better meet the needs of ICD-9-CM-3 coding.
In this exemplary method, the standard term library and the expansion term library together form an ontology dictionary, and the standard terms and expansion terms are the ontology entries of this dictionary. Table 1 shows some of the standard terms and expansion terms in the ontology dictionary together with their codes.
Table 1
Embodiment seven below describes one specific way of searching for the standard term or expansion term that matches a name to be coded in this exemplary method.
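The lookup in step S103 can be sketched as follows. This is a minimal illustration under stated assumptions, not the matching procedure of embodiment seven: it uses plain exact matching after whitespace and case normalization, the function name `assign_code` is hypothetical, and the expansion term `"partial liver resection"` is an invented synonym. The two codes are the ones listed for these standard terms in Table 2.

```python
from typing import Optional

# Miniature standard term library: standard term -> code.
# The two codes are taken from Table 2; the library itself is a toy example.
STANDARD_TERMS = {
    "partial hepatectomy": "50.22011",
    "excision of lesion of stomach": "43.42004",
}

# Miniature expansion term library: expansion term -> code.
# An expansion term carries the same code as the standard term it is
# synonymous with (or has a genus-species relationship with).
# "partial liver resection" is an assumed synonym, for illustration only.
EXPANSION_TERMS = {
    "partial liver resection": "50.22011",
}


def assign_code(name: str) -> Optional[str]:
    """Return the code of the matching standard or expansion term, or None."""
    key = name.strip().lower()
    if key in STANDARD_TERMS:
        return STANDARD_TERMS[key]
    return EXPANSION_TERMS.get(key)
```

Because an expansion term stores the code of its related standard term directly, a single dictionary lookup is enough once a name matches; no second hop to the standard term library is needed.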
Embodiment one
As shown in Fig. 3, a specific automatic coding method for Chinese surgical operation information comprises:
Step S201: inputting Chinese surgical operation information.
Step S202: carrying out natural language processing on the Chinese surgical operation information to obtain one or more names to be coded.
Step S203: based on a standard term library, an expansion term library and a hypothetical classification term library established in advance, searching for the standard term, expansion term or hypothetical classification term that matches each name to be coded, and taking the code of the successfully matched standard term, expansion term or hypothetical classification term as the code of that name to be coded.
Compared with the exemplary method, this embodiment adds a hypothetical classification term library, which comprises a number of hypothetical classification terms and their codes.
A hypothetical classification term denotes a special type of treatment. Such a treatment corresponds to several types of resection operation, each of which is a standard term.
Each hypothetical classification term corresponds one-to-one to a standard term, and its code is the code of that standard term. According to ICD-9-CM-3, if the disease targeted by the special treatment is a non-malignant tumour at the lesion site, the corresponding standard term is excision of lesion of that site; if the disease targeted is a malignant tumour at the lesion site and organ transplantation is not needed, the corresponding standard term is total resection of that site; if the disease targeted is a malignant tumour at the lesion site, organ transplantation is not performed and total resection is not suitable, the corresponding standard term is ablation of that site.
In this embodiment, the hypothetical classification term library can also be revised in real time, for example by adding new hypothetical classification terms or deleting existing ones, so that the library better meets the needs of ICD-9-CM-3 coding.
For example, Table 2 shows some of the hypothetical classification terms in the library together with their corresponding standard terms and codes.
Table 2
Hypothetical classification term (special treatment) | Standard term | Code
Radical operation for liver cancer | Partial hepatectomy | 50.22011
Hepatic cyst resection | Excision of lesion of liver | 50.29009
Diverticulectomy of stomach | Excision of lesion of stomach | 43.42004
As shown in Table 2, the special treatment "radical operation for liver cancer" targets a malignant tumour of the liver for which organ transplantation is not suitable, so the corresponding standard term is "partial hepatectomy".
Also as shown in Table 2, the special treatment "hepatic cyst resection" targets a non-malignant tumour of the liver, so the corresponding standard term is "excision of lesion of liver".
Again referring to Table 2, the special treatment "resection of gastric carcinoma" targets a malignant tumour of the stomach and requires organ transplantation, so the corresponding standard term is "radical gastrectomy".
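The indirection added by the hypothetical classification term library can be sketched as below: a special treatment resolves to its one-to-one standard term and inherits that term's code. This is only an illustrative sketch; the function name `resolve_hypothetical` is hypothetical, the English wording of the standard terms is an assumed rendering, and the codes are the ones listed in Table 2.

```python
from typing import Optional, Tuple

# Hypothetical classification term -> (standard term, code).
# Codes are from Table 2; the English standard-term strings are an
# assumed rendering used only for this illustration.
HYPOTHETICAL_TERMS = {
    "hepatic cyst resection": ("excision of lesion of liver", "50.29009"),
    "diverticulectomy of stomach": ("excision of lesion of stomach", "43.42004"),
}


def resolve_hypothetical(name: str) -> Optional[Tuple[str, str]]:
    """Return (standard term, code) for a special treatment, or None."""
    return HYPOTHETICAL_TERMS.get(name.strip().lower())
```

In a fuller pipeline this lookup would presumably run alongside the standard and expansion term lookups of step S203; the order in which the three libraries are consulted is not specified here.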
Embodiment two
As shown in Fig. 4, a specific automatic coding method for Chinese surgical operation information comprises:
Step S301: inputting Chinese surgical operation information.
Step S302: carrying out natural language processing on the Chinese surgical operation information to obtain one or more names to be coded.
Step S303: based on a standard term library, an expansion term library and a multi-code term library established in advance, searching for the standard term, expansion term or multi-code term that matches each name to be coded, and taking the code of the successfully matched standard term, expansion term or multi-code term as the code of that name to be coded.
Compared with the exemplary method, this embodiment adds a multi-code term library, which comprises a number of multi-code terms and their codes.
A multi-code term is a special type of surgical operation whose performance presupposes another type of surgical operation. Both the special type of surgical operation and the other type of surgical operation are standard terms or expansion terms.
The code of a multi-code term is the combination of the code of the special type of surgical operation and the code of the other type of surgical operation.
In clinical practice, if a doctor writes a surgical operation name and that operation can only be performed on the premise of another surgical operation, the name is a multi-code term.
In this embodiment, the multi-code term library can also be revised in real time, for example by adding new multi-code terms or deleting existing ones, so that the library better meets the needs of ICD-9-CM-3 coding.
For example, Table 3 shows some of the multi-code terms in the library together with their corresponding standard terms and codes.
Table 3
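A minimal sketch of how a combined code might be produced for a term whose operation presupposes another operation is given below. All names and codes here are placeholders invented for illustration (they are not real ICD-9-CM-3 entries and do not come from Table 3), and the function `codes_for` is hypothetical.

```python
from typing import Dict, List

# Placeholder term -> code library (NOT real ICD-9-CM-3 codes).
TERM_CODES: Dict[str, str] = {
    "operation A": "00.0001",
    "operation B": "00.0002",
}

# Special operation -> the other operation that is its prerequisite.
# "operation A" can only be performed on the premise of "operation B".
PREREQUISITES: Dict[str, str] = {
    "operation A": "operation B",
}


def codes_for(name: str) -> List[str]:
    """Return the code(s) for a name; such a term yields a combination."""
    code = TERM_CODES.get(name)
    if code is None:
        return []
    prerequisite = PREREQUISITES.get(name)
    if prerequisite is None:
        return [code]
    # Combination of the special operation's code and the prerequisite's code.
    return [code, TERM_CODES[prerequisite]]
```

Returning a list keeps the two codes separate; how the combination is actually serialized in the coding result is left open here.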
Embodiment three
As shown in Fig. 5, a specific automatic coding method for Chinese surgical operation information comprises:
Step S401: inputting Chinese surgical operation information.
Step S402: carrying out natural language processing on the Chinese surgical operation information to obtain one or more names to be coded.
Step S403: based on a merge term library established in advance, pre-processing the names to be coded.
The merge term library comprises a number of merge terms and their codes. A merge term is a single standard term, specified by the International Classification of Diseases (ICD), that can replace at least two other standard terms occurring together; those at least two other standard terms are the merge objects of the merge term. The merge term library also contains all merge objects of each merge term. A merge term is different from each of its merge objects.
In clinical practice a doctor may write several surgical operation names in one medical record. According to ICD-9-CM-3, these names can be classified under a single surgical operation name; that is, the several names are in fact several steps of one surgical operation.
In this embodiment, the merge term library can also be revised in real time, for example by adding new merge terms, deleting existing ones or modifying merge objects, so that the library better meets the needs of ICD-9-CM-3 coding.
Table 4 shows the merge terms in the merge term library together with their codes and all of their merge objects.
Table 4
Step S403 specifically comprises: determining whether the one or more names to be coded include all merge objects of any one or more merge terms, and if so, replacing all merge objects of each such merge term with the corresponding merge term.
Step S404: based on a standard term library and an expansion term library established in advance, searching for the standard term or expansion term that matches each name to be coded, and taking the code of the successfully matched standard term or expansion term as the code of that name to be coded.
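The pre-processing of step S403 can be sketched as a set-containment check followed by a replacement. The entries below are invented placeholders (Table 4 is not reproduced here), the function name `apply_merge_terms` is hypothetical, and placing the merge term at the position of the first replaced name is an assumption made for this sketch.

```python
from typing import Dict, FrozenSet, List

# Merge term -> all of its merge objects (placeholder names, for illustration).
MERGE_TERMS: Dict[str, FrozenSet[str]] = {
    "merged operation M": frozenset({"step operation X", "step operation Y"}),
}


def apply_merge_terms(names: List[str]) -> List[str]:
    """Replace every complete group of merge objects by its merge term."""
    result = list(names)
    for merge_term, objects in MERGE_TERMS.items():
        # Only merge when ALL merge objects of the term are present.
        if objects.issubset(result):
            first = min(result.index(obj) for obj in objects)
            result = [n for n in result if n not in objects]
            # Assumption: the merge term takes the place of the first object.
            result.insert(first, merge_term)
    return result
```

The output of this step would then be passed to the library lookup of step S404, so that only the merge term, and not its individual steps, is coded.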
Embodiment four
As shown in Fig. 6, a specific automatic coding method for Chinese surgical operation information comprises:
Step S501: inputting Chinese surgical operation information.
Step S502: carrying out natural language processing on the Chinese surgical operation information to obtain one or more names to be coded.
Step S503: based on an omission term library established in advance, pre-processing the names to be coded.
The omission term library comprises a number of omission terms and their codes. An omission term is a single standard term, specified by ICD-9-CM-3, that can replace at least two standard terms occurring together, and the omission term is itself one of those standard terms. The at least two standard terms occurring together are the omission objects of the omission term. The omission term library also contains all omission objects of each omission term.
When certain surgical operation names occur together in a medical record, some of the operations are preliminary operations for the others, and according to ICD-9-CM-3 the names of some of these operations need not be coded.
In this embodiment, the omission term library can also be revised in real time, for example by adding new omission terms, deleting existing ones or modifying omission objects, so that the library better meets the needs of ICD-9-CM-3 coding.
Table 5 shows the omission terms in the omission term library together with their codes and all of their omission objects.
Table 5
Step S503 specifically comprises: determining whether the one or more names to be coded include all omission objects of any one or more omission terms, and if so, replacing all omission objects of each such omission term with the corresponding omission term.
Step S504: based on a standard term library and an expansion term library established in advance, searching for the standard term or expansion term that matches each name to be coded, and taking the code of the successfully matched standard term or expansion term as the code of that name to be coded.
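Because an omission term is itself one of its omission objects, step S503 amounts to dropping the other objects and keeping the omission term. The sketch below illustrates this under stated assumptions: the entries are invented placeholders (Table 5 is not reproduced here) and the function name `apply_omission_terms` is hypothetical.

```python
from typing import Dict, FrozenSet, List, Set

# Omission term -> all of its omission objects; the term is one of them.
# Placeholder names, for illustration only.
OMISSION_TERMS: Dict[str, FrozenSet[str]] = {
    "main operation P": frozenset({"main operation P", "leading operation Q"}),
}


def apply_omission_terms(names: List[str]) -> List[str]:
    """Keep only the omission term when all its omission objects co-occur."""
    present: Set[str] = set(names)
    dropped: Set[str] = set()
    for term, objects in OMISSION_TERMS.items():
        if objects <= present:
            # Every omission object except the omission term itself is dropped.
            dropped |= objects - {term}
    return [n for n in names if n not in dropped]
```

For example, when a preliminary operation is recorded together with the operation it leads to, only the latter survives pre-processing and reaches step S504.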
Embodiment five
As shown in Fig. 7, a specific automatic coding method for Chinese surgical operation information comprises:
Step S601: inputting Chinese surgical operation information.
Step S602: carrying out natural language processing on the Chinese surgical operation information to obtain one or more names to be coded.
Step S603: based on a standard term library and an expansion term library established in advance, searching for the standard term or expansion term that matches each name to be coded, and taking the code of the successfully matched standard term or expansion term as the code of that name to be coded.
This completes the search for the standard term or expansion term that matches each name to be coded. In this process, a matching standard term or expansion term may not be found for some names. The reason is that the entries of the dictionary formed by the standard term library and the expansion term library (whether standard terms or expansion terms) are all words related to surgical operation names, whereas real Chinese surgical operation information often involves several kinds of medical concept: not only surgical operation names, but also disease names (such as "fracture of sternum Flail chest"), drug names (such as "cetirizine"), medical consumable names (such as "pseudoxanthoma elasticum gum") and so on. The present invention codes only surgical operation names, so if disease names, drug names, medical consumable names and the like appear in the Chinese surgical operation information, the invention may choose not to code them. In addition, real Chinese surgical operation information may contain words that denote surgical operation information but cannot be tied to a specific surgical operation name; for example, some do not fit the ICD-9-CM-3 classification system, so the specific surgical operation name they correspond to cannot be determined. For instance, "attrition" denotes a surgical operation, but the concept is too general to determine which site is operated on: facial attrition, cheekbone attrition or Laser final guidance shell. As another example, "sex change operation" denotes a surgical operation, but it cannot be determined whether it is male-to-female urethral transposition plasty or male-to-female vaginal reconstruction.
In view of the above, the exemplary method also presets a no-code description library containing a number of no-code descriptions, that is, descriptions that are not to be encoded. These include: preset words that represent operation technique information but for which no specific operation technique title can be determined; preset disease names; preset drug names; and preset medical consumable names.
For example, Table 6 shows some of the no-code descriptions contained in the no-code description library.
Table 6
Step S604: match each title to be encoded whose code has not been determined against the no-code descriptions in the no-code description library. If the match succeeds, perform a preset processing step to indicate that this title is not encoded; if the match fails, send this title to a manual processing platform for manual handling.
Here, if a title to be encoded for which no matching standard term or expansion term was found does match a no-code description, this shows that it is one of the following: a word that represents operation technique information but for which no operation technique title can be determined, a disease name, a drug name, or a medical consumable name. Such a title is not encoded. A title to be encoded for which no matching no-code description is found does not belong to any of these types; the present embodiment sends it to the manual processing platform for further manual handling. The present invention does not limit the specific manual processing procedure.
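The decision flow of steps S603 and S604 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `assign_code`, the status labels, and the toy library contents are all assumptions introduced here, and the codes are placeholders rather than real ICD-9-CM-3 codes.

```python
from typing import Dict, Optional, Set, Tuple


def assign_code(
    title: str,
    standard_terms: Dict[str, str],
    expansion_terms: Dict[str, str],
    no_code_descriptions: Set[str],
) -> Tuple[str, Optional[str]]:
    """Return (status, code) for one title to be encoded.

    status is "coded" (step S603 matched), "not_encoded" (step S604
    matched a no-code description) or "manual" (sent to manual handling).
    """
    # Step S603: exact lookup in the standard and expansion libraries.
    for library in (standard_terms, expansion_terms):
        if title in library:
            return ("coded", library[title])
    # Step S604: titles found in the no-code library are deliberately skipped.
    if title in no_code_descriptions:
        return ("not_encoded", None)
    # Everything else goes to the manual processing platform.
    return ("manual", None)


# Hypothetical toy data; the codes are placeholders.
standard = {"term A": "code-1"}
expansion = {"alias of term A": "code-1"}
no_code = {"cetirizine"}
print(assign_code("alias of term A", standard, expansion, no_code))
```

Returning an explicit status keeps "deliberately not encoded" distinguishable from "could not be encoded", which is the distinction step S604 draws.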
Embodiment six
As shown in Figure 8, an embodiment of performing natural language processing on Chinese operation technique information to obtain the titles to be encoded, applicable to the exemplary method, comprises:
Step S71: preprocess the Chinese operation technique information character string to obtain a preprocessed Chinese operation technique information character string.
The purpose of this step is to convert the characters in the Chinese operation technique information character string into a unified encoding format, to facilitate subsequent processing.
Optionally, this step can be implemented as follows: normalize the format of the non-Chinese characters in the Chinese operation technique information character string (for example, convert all symbols in the string to half-width form or all to full-width form, and convert all English letters to uppercase or all to lowercase); and delete the non-medical terms in the string. The non-medical terms are provided by a non-medical term dictionary set up in advance. A non-medical term is a word, phrase, or descriptive statement that serves as a remark (for example "open inspection", "mending emergency treatment book keeping operation", "berth expense exceeds standard at one's own expense", "added more than one month, monthly received less than one month", "paediatrics is added", etc.).
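A minimal sketch of the preprocessing in step S71, assuming the half-width and uppercase options are chosen. The function names and the sample strings are hypothetical; the full-width to half-width mapping relies on the fixed Unicode offset between the two ranges.

```python
from typing import Iterable


def to_half_width(text: str) -> str:
    """Convert full-width ASCII variants and the full-width space to half-width."""
    out = []
    for ch in text:
        cp = ord(ch)
        if cp == 0x3000:                  # ideographic (full-width) space
            out.append(" ")
        elif 0xFF01 <= cp <= 0xFF5E:      # full-width forms of '!' .. '~'
            out.append(chr(cp - 0xFEE0))
        else:
            out.append(ch)
    return "".join(out)


def preprocess(text: str, non_medical_terms: Iterable[str]) -> str:
    """Normalize symbol width and letter case, then delete remark phrases."""
    text = to_half_width(text).upper()
    # Delete longer phrases first so a short phrase cannot break a longer one.
    for term in sorted(non_medical_terms, key=len, reverse=True):
        text = text.replace(to_half_width(term).upper(), "")
    return text


# Hypothetical input: full-width letters and brackets plus a remark word.
print(preprocess("ＣＴ检查（自费）", {"自费"}))
```

The non-medical terms are normalized in the same way as the input so that a dictionary entry written in full-width form still matches.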
Step S72: based on the body dictionary, orientation dictionary, and grade dictionary set up in advance, cut the preprocessed Chinese operation technique information character string into one or more first-type substrings and/or second-type substrings.
Here, a first-type substring can be matched directly with a body in the body dictionary, and a second-type substring cannot. Each first-type or second-type substring that is cut out has independent semantics; that is, the operation technique item information it represents is not affected by the characters before or after it.
The body dictionary comprises the aforementioned standard terminology library and expansion terminology library. As shown in Table 1, it specifically comprises a number of bodies and body codes in one-to-one correspondence; each standard term or expansion term is regarded as a body in the body dictionary.
It should be noted that, when the aforementioned Hypothetical classification terminology library and/or odd encoder terminology library are used in the automatic coding method for Chinese operation technique information provided by the present invention, the body dictionary should also include the Hypothetical classification terminology library and/or the odd encoder terminology library. In that case, Hypothetical classification terms and/or odd encoder terms, as well as omission terms, are also regarded as bodies in the body dictionary, so that a first-type substring, second-type substring, or omission term serving as a title to be encoded can be matched with a Hypothetical classification term or an odd encoder term.
The orientation dictionary comprises a number of directional terms. A directional term is a word describing the orientation to which an operation technique item applies. For example, directional terms can be: unilateral, bilateral, left side, right side, both sides, one side, etc.
The grade dictionary comprises a number of grade terms. A grade term is a word describing the level or type of an operation technique item. For example, grade terms can be: A level, B level, C level, special grade, etc.
The purpose of step S72 is to cut the Chinese operation technique information into substrings with independent semantics (first-type or second-type substrings), which effectively avoids the recognition errors that arise when several associated characters are recognized separately.
After the first-type and second-type substrings that are cut out have been determined as titles to be encoded, these titles are subsequently preprocessed using the merging terminology library of embodiment three or the omission terminology library of embodiment four. The bodies corresponding to first-type and second-type substrings may be expansion terms, whereas the combining objects in the merging terminology library and the omission objects in the omission terminology library are standard terms. Therefore, the expansion terms corresponding to the first-type and second-type substrings must first be converted into the corresponding standard terms, and only then is the merging terminology library or omission terminology library used for preprocessing.
As shown in Figure 9, step S72 specifically comprises:
Step S80: judge whether the preprocessed Chinese operation technique information character string contains symbols. If it does, perform step S81; if it does not, perform step S82.
Step S81: match the characters between every two adjacent symbols in the preprocessed Chinese operation technique information character string, taken as a whole, against the bodies in the body dictionary. If the match succeeds, perform step S811; if it fails, perform step S812.
Step S811: cut out the characters between these two adjacent symbols as a first-type substring.
Step S812: define these two adjacent symbols, together with the characters between them, as a temporarily uncut character string, then perform step S83.
The processing rule underlying steps S81, S811, and S812 is: all characters between adjacent symbols are matched against the bodies as a whole and are cut out only if they match; otherwise cutting is deferred.
For example, Table 7 shows the cutting of "cardiac output is monitored; consume technology with oxygen; ventricular puncture, through implantation catheter". Here, "cardiac output is monitored; consume technology with oxygen" and "ventricular puncture, through implantation catheter" each comprise all the characters between symbols, and a matching body can be found for each; therefore, each is cut out separately.
Table 7
Step S82: use a mechanical Chinese word segmentation method to match the preprocessed Chinese operation technique information character string against the bodies in the body dictionary. If all characters in the preprocessed string can be matched to bodies, perform step S821. If the preprocessed string contains a single character or a run of consecutive characters that fails to match any body, perform step S822.
Step S821: according to the matched bodies, cut out the characters in the preprocessed Chinese operation technique information character string as first-type substrings.
Step S822: judge whether the single character or run of consecutive characters that failed to match a body is a directional term or a grade term. If it is, perform step S8221; if it is not, perform step S8222.
The processing rule underlying steps S82, S821, and S822 is: the characters in the preprocessed Chinese operation technique information character string are matched against the bodies using a mechanical Chinese word segmentation method, and cutting takes place only when a matching body can be found for all characters; otherwise cutting is deferred.
For example, Table 8 shows the cutting of "electroencephalogram 24 hours monitoring of blood pressure". Using a mechanical Chinese word segmentation method, bodies matching "electroencephalogram" and "24 hours monitoring of blood pressure" can each be found; therefore, each is cut out separately.
Table 8
The mechanical Chinese word segmentation method used in step S82 may be forward maximum matching, reverse maximum matching, or minimum segmentation. The specific segmentation process is not described further in this embodiment.
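Since the segmentation process itself is not spelled out, the following is a minimal sketch of forward maximum matching, one of the methods named for step S82. The function name, the `(piece, matched)` output format, and the Chinese sample strings are assumptions made for illustration; the sample strings are guesses at plausible originals and are not taken from the patent.

```python
from typing import List, Optional, Set, Tuple


def forward_max_match(
    text: str, dictionary: Set[str], max_len: Optional[int] = None
) -> List[Tuple[str, bool]]:
    """Greedy left-to-right segmentation.

    At each position the longest dictionary word is taken. A character that
    starts no dictionary word is emitted alone and flagged as unmatched.
    """
    if max_len is None:
        max_len = max((len(word) for word in dictionary), default=1)
    pieces: List[Tuple[str, bool]] = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in dictionary:
                pieces.append((candidate, True))
                i += size
                break
        else:
            # No dictionary word starts here: emit one unmatched character.
            pieces.append((text[i], False))
            i += 1
    return pieces


# Hypothetical body dictionary and input string.
bodies = {"脑电图", "24小时血压监测"}
print(forward_max_match("脑电图24小时血压监测", bodies))
```

If every piece is flagged as matched, the pieces correspond to the first-type substrings of step S821; unmatched pieces are what step S822 inspects.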
Step S8221: according to the position, in the preprocessed Chinese operation technique information character string, of the single character or run of consecutive characters that failed to match a body, merge that character or run with the single character or run of consecutive characters before or after it that does match a body, and cut out the merged result as a second-type substring; cut out the remaining characters that match bodies as first-type substrings.
Step S8222: cut out the entire preprocessed Chinese operation technique information character string as a second-type substring.
The processing rule underlying steps S8221 and S8222 is: if the single character or run of consecutive characters that failed to match a body is a directional term or a grade term, cutting is performed, and during cutting the term is merged with the characters before or after it and cut out together with them.
For example, Table 9 shows the cutting of "lung volume reduction surgery right lung neoplasty". Using a mechanical Chinese word segmentation method, bodies matching "lung volume reduction surgery" and "lung neoplasty" can each be found, and "right" is a directional term. Therefore, "right" and "lung neoplasty" are merged and cut out together, and "lung volume reduction surgery" is cut out on its own.
Table 9
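The rule of steps S822, S8221, and S8222 can be sketched as follows, taking as input a list of `(piece, matched)` pairs such as a dictionary-based segmenter might produce. The function name, the output labels, and the Chinese sample strings are assumptions. The source only says the term is merged with the characters "before or after it" according to its position; this sketch assumes it is merged with the following matched piece when one exists and with the preceding one otherwise.

```python
from typing import Iterable, List, Tuple


def classify_pieces(
    pieces: List[Tuple[str, bool]],
    directional_terms: Iterable[str],
    grade_terms: Iterable[str],
) -> List[Tuple[str, str]]:
    """Turn (piece, matched) pairs into (substring, "first" | "second") pairs."""
    modifiers = set(directional_terms) | set(grade_terms)
    # Join consecutive unmatched characters into one run.
    runs: List[Tuple[str, bool]] = []
    for text, matched in pieces:
        if runs and not matched and not runs[-1][1]:
            runs[-1] = (runs[-1][0] + text, False)
        else:
            runs.append((text, matched))
    # Step S8222: an unmatched run that is not a directional or grade term
    # makes the whole string a single second-type substring.
    if any(not matched and text not in modifiers for text, matched in runs):
        return [("".join(text for text, _ in runs), "second")]
    # Step S8221: merge each term with a neighbouring matched piece.
    result: List[Tuple[str, str]] = []
    i = 0
    while i < len(runs):
        text, matched = runs[i]
        if matched:
            result.append((text, "first"))
            i += 1
        elif i + 1 < len(runs):          # assumed: prefer the following piece
            result.append((text + runs[i + 1][0], "second"))
            i += 2
        elif result:                     # term at the very end: merge backwards
            previous, _ = result.pop()
            result.append((previous + text, "second"))
            i += 1
        else:                            # the string is only the term itself
            result.append((text, "second"))
            i += 1
    return result


pieces = [("肺减容术", True), ("右", False), ("肺修补术", True)]
print(classify_pieces(pieces, {"左", "右"}, {"A级"}))
```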
Step S83: judge whether the temporarily uncut character string (the string defined in step S812) contains a preset special symbol. If it does, perform step S831; if it does not, perform step S833.
Step S831: look up the character model to which the temporarily uncut character string belongs, and cut the string according to the segmentation rule corresponding to that character model. Character models are provided by a character model library set up in advance, and each character model has a one-to-one corresponding segmentation rule.
Step S832: match each cut-out piece against the bodies in the body dictionary. If the match succeeds, define the piece as a first-type substring; if the match fails, define the piece as a second-type substring.
Step S833: directly define the temporarily uncut character string as a second-type substring.
The processing rule underlying steps S83, S831, S832, and S833 is: when the temporarily uncut character string contains a preset special symbol, it is cut according to the character model to which it belongs; otherwise it is cut out directly. The pieces cut out on the basis of the character model are then matched against the bodies again; those that match a body directly become first-type substrings, and those that do not become second-type substrings.
For example, the preset special symbols may include, but are not limited to, the full stop, colon, plus sign, semicolon, and slash.
For example, some of the character models in the character model library and their segmentation rules are as follows:
(1) Character model: XAY type, where A is a plus sign or a colon;
Segmentation rule: cut out XAY as a whole;
(2) Character model: CDE type, where one of C and E is Chinese characters and D is a full stop or a semicolon;
Segmentation rule: cut out the Chinese characters in C and E;
(3) Character model: STU type, where S and/or U is individual Chinese character and T is a slash;
Segmentation rule: cut out STU as a whole.
For example, when cutting "blood fat (P).Renal function detects (P)", a lookup in the character model library shows that it belongs to the CDE type, so "blood fat (P)" and "renal function detects (P)" are each cut out separately.
For example, when cutting "under thoracoscope lung neoplasty+pulmonary belb resection", a lookup in the character model library shows that it belongs to the XAY type, so "under thoracoscope lung neoplasty+pulmonary belb resection" is cut out as a whole.
For example, when cutting "3/4 laryngectomy and laryngeal reconstruction", a lookup in the character model library shows that it belongs to the STU type, so "3/4 laryngectomy and laryngeal reconstruction" is cut out as a whole.
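The three character models listed above can be sketched with regular expressions. This is a simplified illustration under stated assumptions: only the separator symbol is inspected, the conditions on which parts must be Chinese characters are ignored, and the order in which the models are tried is a choice made here, not something the source specifies. The function name and the sample strings are hypothetical.

```python
import re
from typing import List, Optional, Tuple

PLUS_OR_COLON = re.compile(r"[+\uff0b:\uff1a]")        # A in the XAY model
STOP_OR_SEMICOLON = re.compile(r"[.\u3002;\uff1b]")    # D in the CDE model


def cut_by_character_model(text: str) -> Tuple[Optional[str], List[str]]:
    """Return (model name, pieces) for a temporarily uncut character string."""
    if PLUS_OR_COLON.search(text):
        return "XAY", [text]                 # rule: keep the whole string
    if STOP_OR_SEMICOLON.search(text):
        parts = [p.strip() for p in STOP_OR_SEMICOLON.split(text)]
        return "CDE", [p for p in parts if p]  # rule: cut out each side
    if "/" in text:
        return "STU", [text]                 # rule: keep the whole string
    return None, [text]                      # no model applies


# Hypothetical input in the CDE shape.
print(cut_by_character_model("血脂(P)。肾功能检测(P)"))
```

Each returned piece would then be looked up in the body dictionary, as step S832 describes, to decide whether it is a first-type or second-type substring.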
Step S73: define the first-type substrings and second-type substrings that have been cut out as titles to be encoded.
In performing natural language processing on Chinese operation technique information, the present embodiment takes full account of the fact that the Chinese operation technique information entered by surgeons is natural language, is complex and varied in form, and follows no unified standard. It uses several dictionaries set up in advance to cut and match the Chinese operation technique information character string, and in this way identifies the operation technique item names as titles to be encoded.
Embodiment seven
As shown in Figure 10, an embodiment of searching for the standard term or expansion term that matches a title to be encoded, applicable to the exemplary method, comprises:
Step S90: if the title to be encoded is a first-type substring, define the body matched by this first-type substring as the standard term or expansion term matching the title to be encoded. If the title to be encoded is a second-type substring, perform first-dimension parsing on the second-type substring and on each body in the body dictionary, obtaining several first-dimension analysis results for the second-type substring and several first-dimension analysis results for each body.
This step takes the second-type substring and the bodies as analysis objects. Optionally, first-dimension parsing of an analysis object may include, but is not limited to:
(1) determining the directional term contained in the analysis object; if it contains no directional term, this analysis result is empty;
(2) determining the grade term contained in the analysis object; if it contains no grade term, this analysis result is empty;
(3) determining the characters inside brackets in the analysis object; if it contains no brackets, this analysis result is empty;
(4) determining the characters after a dash in the analysis object; if it contains no dash, this analysis result is empty; and
(5) determining the characters in the analysis object other than the directional term, the grade term, the characters inside brackets, and the characters after a dash (hereinafter referred to as the residue characters), which are generally the core stem of the analysis object.
When the analysis object is a second-type substring, its first-dimension analysis results may include, but are not limited to: the directional term in the second-type substring, the grade term in the second-type substring, the characters inside brackets in the second-type substring, the characters after a dash in the second-type substring, and the residue characters in the second-type substring.
When the analysis object is a body, its first-dimension analysis results may include, but are not limited to: the directional term in the body, the grade term in the body, the characters inside brackets in the body, the characters after a dash in the body, and the residue characters in the body.
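First-dimension parsing as described above can be sketched as follows. The function name, the dictionary keys, and the Chinese sample strings are assumptions; in particular, the sample input is an illustrative guess and is not quoted from the patent. The sketch removes only the first bracket group and treats everything after the first dash as the after-dash result.

```python
import re
from typing import Dict, Iterable

BRACKET = re.compile(r"[(\uff08]([^()\uff08\uff09]*)[)\uff09]")  # (...) or full-width
DASH = re.compile(r"[-\u2014\uff0d]+(.*)$")                        # hyphen or dash variants


def parse_first_dimension(
    text: str, directional_terms: Iterable[str], grade_terms: Iterable[str]
) -> Dict[str, str]:
    """Split an analysis object into the five first-dimension results."""
    result = {"direction": "", "grade": "", "bracket": "",
              "after_dash": "", "residue": ""}
    rest = text
    match = BRACKET.search(rest)
    if match:                                   # (3) characters inside brackets
        result["bracket"] = match.group(1)
        rest = rest[:match.start()] + rest[match.end():]
    match = DASH.search(rest)
    if match:                                   # (4) characters after a dash
        result["after_dash"] = match.group(1)
        rest = rest[:match.start()]
    for key, terms in (("direction", directional_terms), ("grade", grade_terms)):
        # (1) and (2): try longer terms first so "右侧" wins over "右".
        for term in sorted(terms, key=len, reverse=True):
            if term in rest:
                result[key] = term
                rest = rest.replace(term, "", 1)
                break
    result["residue"] = rest.strip()            # (5) the core stem
    return result


print(parse_first_dimension("左乳房切开术(大)", {"左", "右"}, {"A级"}))
```

A plain substring test for directional terms can misfire when the term is part of a longer word, so a production system would need a stricter check than this sketch uses.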
Step S91: match each first-dimension analysis result of the second-type substring against the corresponding first-dimension analysis result of each body in the body dictionary, and check whether there is a body all of whose first-dimension analysis results match those of the second-type substring. If such a body exists, perform step S92; if not, perform step S93.
Step S92: define the body found as the body matching the second-type substring.
Step S93: select some of the first-dimension analysis results of the second-type substring and match them against the same subset of first-dimension analysis results of each body in the body dictionary, and check whether there is a body for which this subset matches that of the second-type substring. If such a body exists, perform step S931; if not, perform step S932.
Step S931: define the body found as the body matching the second-type substring.
Specifically, the directional term contained in the second-type substring is matched against the directional term contained in the body, the grade term contained in the second-type substring against the grade term contained in the body, the characters inside brackets in the second-type substring against the characters inside brackets in the body, the characters after a dash in the second-type substring against the characters after a dash in the body, and the residue characters in the second-type substring against the residue characters in the body.
If all first-dimension analysis results match, the body is defined as the body matching the second-type substring.
If some first-dimension analysis results do not match, a subset of the first-dimension analysis results is selected and matched.
Because the residue characters of a second-type substring are usually its core information, in a specific implementation the selected subset of first-dimension analysis results preferably includes at least the residue characters of the second-type substring and the residue characters of the body. For example, only the residue characters and the characters after a dash of the analysis objects are matched; or only the residue characters of the analysis objects are matched; or the residue characters are matched together with the directional term, the grade term, or the characters inside brackets.
For example, a certain second-type substring is "left mastostomy (greatly)". The results of first-dimension parsing of this substring are shown in Table 10, and the body matching this substring, together with its first-dimension analysis results, is shown in Table 11.
Table 10
The first-dimension analysis results of the body "mastostomy", which matches "left mastostomy (greatly)", are shown in Table 11:
Table 11
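The full-then-partial matching of steps S91 to S931 can be sketched as follows, assuming each analysis object has already been parsed into a dictionary with five first-dimension fields. The field names, the function name, and the sample data are hypothetical; the default partial subset uses only the residue characters, which is one of the options the text mentions.

```python
from typing import Dict, List, Sequence, Tuple

FIELDS = ("direction", "grade", "bracket", "after_dash", "residue")
Parsed = Dict[str, str]


def match_by_first_dimension(
    query: Parsed,
    bodies: Dict[str, Parsed],
    partial_fields: Sequence[str] = ("residue",),
) -> Tuple[str, List[str]]:
    """Return ("full" | "partial" | "none", names of matching bodies)."""
    # Steps S91/S92: every first-dimension result must agree.
    full = [name for name, parsed in bodies.items()
            if all(parsed[f] == query[f] for f in FIELDS)]
    if full:
        return "full", full
    # Steps S93/S931: only the selected subset must agree.
    partial = [name for name, parsed in bodies.items()
               if all(parsed[f] == query[f] for f in partial_fields)]
    if partial:
        return "partial", partial
    # Neither worked: the caller would go on to second-dimension parsing.
    return "none", []


query = {"direction": "左", "grade": "", "bracket": "大",
         "after_dash": "", "residue": "乳房切开术"}
bodies = {"乳房切开术": {"direction": "", "grade": "", "bracket": "",
                    "after_dash": "", "residue": "乳房切开术"}}
print(match_by_first_dimension(query, bodies))
```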
Step S932: perform second-dimension parsing on the second-type substring and on each body in the body dictionary, obtaining the second-dimension analysis results of the second-type substring and the second-dimension analysis results of each body in the body dictionary.
This step takes the second-type substring and the bodies as analysis objects. Optionally, second-dimension parsing of an analysis object may include, but is not limited to:
(1) determining each Chinese character in the analysis object;
(2) determining the initial (initial consonant) of each Chinese character in the analysis object;
(3) determining the final (simple or compound vowel) of each Chinese character in the analysis object;
(4) determining the first character of the analysis object;
(5) determining the pinyin of the first character of the analysis object; and
(6) determining the non-Chinese characters in the analysis object; if it contains no non-Chinese characters, this analysis result is empty.
When the analysis object is a second-type substring, its second-dimension analysis results may include, but are not limited to: each Chinese character in the second-type substring, the initial of each Chinese character in the second-type substring, the final of each Chinese character in the second-type substring, the first character of the second-type substring, the pinyin of the first character of the second-type substring, and the non-Chinese characters in the second-type substring.
When the analysis object is a body, its second-dimension analysis results may include, but are not limited to: each Chinese character in the body, the initial of each Chinese character in the body, the final of each Chinese character in the body, the first character of the body, the pinyin of the first character of the body, and the non-Chinese characters in the body.
For example, Table 12 shows the second-dimension analysis results of the second-type substring "deciduous teeth arrachement".
Table 12
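Second-dimension parsing can be sketched as follows. The miniature pinyin table is a hypothetical stand-in for a full pronunciation lexicon, the function names are assumptions, and the Chinese sample string is an illustrative guess rather than a quotation from the patent. The letters y and w are treated as initials here, which is a common simplification.

```python
from typing import Dict, List, Tuple

# Hypothetical miniature pinyin table (toneless); a real system needs a full lexicon.
PINYIN = {"乳": "ru", "牙": "ya", "拔": "ba", "除": "chu", "术": "shu"}
# Two-letter initials come first so "chu" splits as "ch" + "u", not "c" + "hu".
INITIALS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w")


def split_pinyin(syllable: str) -> Tuple[str, str]:
    """Split a toneless pinyin syllable into (initial, final)."""
    for initial in INITIALS:
        if syllable.startswith(initial):
            return initial, syllable[len(initial):]
    return "", syllable            # zero-initial syllable such as "an"


def is_chinese(ch: str) -> bool:
    return "\u4e00" <= ch <= "\u9fff"


def parse_second_dimension(text: str, table: Dict[str, str] = PINYIN) -> dict:
    """Produce the six second-dimension results for one analysis object."""
    hanzi: List[str] = [ch for ch in text if is_chinese(ch)]
    pairs = [split_pinyin(table.get(ch, "")) for ch in hanzi]
    return {
        "characters": hanzi,
        "initials": [initial for initial, _ in pairs],
        "finals": [final for _, final in pairs],
        "first_char": text[:1],
        "first_char_pinyin": table.get(text[:1], ""),
        "non_chinese": "".join(ch for ch in text if not is_chinese(ch)),
    }


print(parse_second_dimension("乳牙拔除术"))
```

Characters missing from the table simply yield empty initials and finals, so the sketch degrades quietly instead of failing on unknown input.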
Step S933: based on the second-dimension analysis results of the second-type substring and the second-dimension analysis results of the bodies, calculate the matching degree between the second-type substring and each body.
Specifically, this step can calculate either the similarity or the total confidence between the second-type substring and each body. Compared with the similarity, the total confidence reflects the matching degree between the second-type substring and each body more faithfully, but it is also more complex to calculate. When implementing step S933, the similarity calculation can be chosen if faster processing is needed, and the total confidence calculation can be chosen if more accurate matching results are needed.
One embodiment of step S933 calculates the similarity between the second-type substring and each body, as follows:
The similarity between the second-type substring and each body is calculated according to the following formula and is taken as the matching degree between the second-type substring and each body:
$$M = \sum_{t \in q} \Big( \mathrm{tf}(t \in d) \cdot \mathrm{idf}(t)^{2} \cdot t.\mathrm{getBoost}() \cdot \mathrm{norm}(t, d) \Big)$$
where M represents the similarity;
t represents each second-dimension analysis result of the second-type substring;
q represents the second-type substring;
t in q ranges over the second dimensions of the second-type substring;
d represents a body;
tf(t in d) represents, within the same second dimension, the frequency with which the second-dimension analysis result of the second-type substring matches the second-dimension analysis result of the body;
in idf(t), T represents the total number of bodies in the body dictionary, and T(t) represents the number of bodies whose second-dimension analysis results each match the corresponding second-dimension analysis results of the second-type substring;
t.getBoost() represents the preset weight of each second dimension;
norm(t, d) represents the length normalization factor of the body.
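The similarity sum can be evaluated as in the following sketch. It assumes that the four factors (tf, idf, boost, norm) have already been computed for each second dimension; how idf is derived from T and T(t) is not reproduced here. The function name, the dictionary keys, and the numbers are made up for illustration.

```python
from typing import Dict, Iterable


def similarity(terms: Iterable[Dict[str, float]]) -> float:
    """M = sum over t in q of tf * idf^2 * boost * norm.

    Each element describes one second-dimension result t with the keys
    "tf", "idf", "boost" and "norm" (all assumed to be precomputed).
    """
    return sum(t["tf"] * t["idf"] ** 2 * t["boost"] * t["norm"] for t in terms)


# Hypothetical factors for two second dimensions of one (q, d) pair.
factors = [
    {"tf": 1.0, "idf": 2.0, "boost": 1.5, "norm": 0.5},   # contributes 3.0
    {"tf": 0.5, "idf": 1.0, "boost": 2.0, "norm": 1.0},   # contributes 1.0
]
print(similarity(factors))
```

Because idf enters squared, a dimension whose result is rare across the body dictionary dominates the score, while the boost lets some dimensions be weighted more heavily than others.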
Another embodiment of step S933 calculates the total confidence between the second-type substring and each body, as follows:
The total confidence between the second-type substring and each body is calculated by the following process and is taken as the matching degree between the second-type substring and each body:
1) Determine each Chinese character in the second-type substring.
2) Calculate the cosine confidence between the second-type substring and each body matched with it according to the following formula:
$$N = \frac{\sum_{j=1}^{V} w_{Q,j} \times w_{d',j}}{\sqrt{\sum_{j=1}^{V} w_{Q,j}^{2} \times \sum_{j=1}^{V} w_{d',j}^{2}}}$$
where N represents the cosine confidence;
V represents the total number of Chinese characters contained in the second-type substring and the body matched with it;
Q represents the second-type substring;
d' represents the body matched with the second-type substring;
w_{Q,j} represents the frequency with which each Chinese character occurs in the second-type substring;
w_{d',j} represents the frequency with which each Chinese character occurs in the body matched with the second-type substring;
j represents the index of the Chinese characters contained in the second-type substring and the body matched with it.
3) Calculate the total confidence between the second-type substring and each body matched with it according to the following formula:
$$S = M \times a + N \times b$$
where S represents the total confidence;
M represents the similarity;
a represents the preset weight corresponding to the similarity M;
b represents the preset weight corresponding to the cosine confidence N.
Further, the similarity M is calculated according to the following formula:
$$M = \sum_{t \in q} \Big( \mathrm{tf}(t \in d) \cdot \mathrm{idf}(t)^{2} \cdot t.\mathrm{getBoost}() \cdot \mathrm{norm}(t, d) \Big)$$
where t represents each second-dimension analysis result of the second-type substring;
q represents the second-type substring;
t in q ranges over the second dimensions of the second-type substring;
d represents a body;
tf(t in d) represents, within the same second dimension, the frequency with which the second-dimension analysis result of the second-type substring matches the second-dimension analysis result of the body;
in idf(t), T represents the total number of bodies in the body dictionary, and T(t) represents the number of bodies whose second-dimension analysis results each match the corresponding second-dimension analysis results of the second-type substring;
t.getBoost() represents the preset weight of each second dimension;
norm(t, d) represents the length normalization factor of the body.
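The cosine confidence N and the total confidence S can be sketched as follows. The sketch assumes character counts as the frequencies and assumes that both inputs contain only Chinese characters; the function names, the sample strings, and the weights are hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine_confidence(substring: str, body: str) -> float:
    """Cosine of the character-frequency vectors of the two strings."""
    w_q, w_d = Counter(substring), Counter(body)
    vocabulary = set(w_q) | set(w_d)        # the V characters indexed by j
    dot = sum(w_q[c] * w_d[c] for c in vocabulary)
    norm = sqrt(sum(w_q[c] ** 2 for c in vocabulary)
                * sum(w_d[c] ** 2 for c in vocabulary))
    return dot / norm if norm else 0.0


def total_confidence(m: float, n: float, a: float, b: float) -> float:
    """S = M * a + N * b with preset weights a and b."""
    return m * a + n * b


n = cosine_confidence("乳牙拔除术", "乳牙拔除")
print(n, total_confidence(m=2.0, n=n, a=0.5, b=0.5))
```

The cosine term depends only on which characters the two strings share and how often, so it complements the similarity M, which is computed per second dimension.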
Step S934: according to the matching degree between the second-type substring and each body, determine one or more bodies as the bodies matching the second-type substring.
Optionally, this step can be implemented as follows: sort all bodies by their matching degree with the second-type substring, and define a preset number of top-ranked bodies (for example, the top two) as the bodies matching the second-type substring; or define the one or more bodies whose matching degree with the second-type substring reaches a preset threshold as the bodies matching the second-type substring.
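Both selection options of step S934 can be sketched in one small function. The function name, parameter names, and scores are assumptions; the source presents the two options as alternatives, and this sketch merely allows either or both to be set.

```python
from typing import Dict, List, Optional, Tuple


def select_matches(
    scores: Dict[str, float],
    top_k: Optional[int] = None,
    threshold: Optional[float] = None,
) -> List[Tuple[str, float]]:
    """Rank bodies by matching degree, then keep the top k and/or those at or above a threshold."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if threshold is not None:
        ranked = [item for item in ranked if item[1] >= threshold]
    if top_k is not None:
        ranked = ranked[:top_k]
    return ranked


# Hypothetical matching degrees for three bodies.
scores = {"body a": 0.9, "body b": 0.5, "body c": 0.7}
print(select_matches(scores, top_k=2))
print(select_matches(scores, threshold=0.6))
```

Returning the scores together with the names keeps the matching degree available in the output, for example for a later manual choice among the candidates.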
In a specific implementation of the present invention, so that the matching degree between the second-type substring and each body matched with it is explicit and can be put to use, the final output may also include the matching degree between the second-type substring and each body matched with it. For example, the matching degrees between the second-type substring and each matched body are output, and one body can then be selected manually, according to the matching degrees, as the body matching the second-type substring.
Step S94: define the body matching the second-type substring, or the one or more bodies that satisfy the preset matching condition with the second-type substring, as the standard term or expansion term matching the title to be encoded.
In performing natural language processing on Chinese operation technique information, the present embodiment takes full account of the fact that the Chinese operation technique information entered by surgeons is natural language, is complex and varied in form, and follows no unified standard. It uses several dictionaries set up in advance to cut and match the Chinese operation technique information character string, and in this way finds the standard term or expansion term matching the title to be encoded.
Example system
Having described the method of the exemplary embodiment of the present invention, the automatic coding system (ACOM) for Chinese operation technique information of the exemplary embodiment is introduced next with reference to Figure 11.
For the implementation of the automatic coding system (ACOM) for Chinese operation technique information, refer to the implementation of the method above; repeated content is not described again. The term "module" used below can be a combination of software and/or hardware that realizes a predetermined function. Although the system described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
As shown in Figure 11, the automatic coding system (ACOM) for Chinese operation technique information can comprise: an import module 111, a natural language processing module 112, and a matching and code assignment module 113.
The import module 111 is used for inputting Chinese operation technique information.
The natural language processing module 112 is used for performing natural language processing on the Chinese operation technique information to obtain one or more titles to be encoded.
The matching and code assignment module 113 is used for searching, based on the standard terminology library and the expansion terminology library set up in advance, for a standard term or expansion term that matches a title to be encoded, and for taking the code of the successfully matched standard term or expansion term as the code of the title to be encoded.
Optionally, as shown in Figure 11, the automatic coding system (ACOM) for Chinese operation technique information can further comprise: a merging processing module 114 and an omission processing module 115.
The merging processing module 114 is used for judging whether the one or more titles to be encoded contain all the combining objects of any one or more merging terms and, if so, replacing all the combining objects of each such merging term with the corresponding merging term.
The omission processing module 115 is used for preprocessing the one or more titles to be encoded, which comprises: judging whether the one or more titles to be encoded contain all the omission objects of any one or more omission terms and, if so, replacing all the omission objects of each such omission term with the corresponding omission term.
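The replacement performed by the merging processing module 114 and the omission processing module 115 has the same shape in both cases: when every object of a term is present among the titles, the objects are replaced by that term. A minimal sketch follows; the function name, the library layout (term mapped to its set of objects), and the placeholder titles are assumptions.

```python
from typing import Dict, List, Set


def replace_object_groups(titles: List[str], library: Dict[str, Set[str]]) -> List[str]:
    """Replace each complete group of objects with its merging or omission term."""
    result = list(titles)
    for term, objects in library.items():
        if objects and all(obj in result for obj in objects):
            # Put the term where the earliest object stood, then drop the objects.
            position = min(result.index(obj) for obj in objects)
            result = [title for title in result if title not in objects]
            result.insert(position, term)
    return result


# Hypothetical merging library: titles "A" and "B" together are coded as "AB".
merging_library = {"AB": {"A", "B"}}
print(replace_object_groups(["A", "B", "C"], merging_library))
print(replace_object_groups(["A", "C"], merging_library))
```

A group that is only partly present is left untouched, which matches the condition that all objects of a term must be contained in the titles before replacement takes place.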
In this example system, for details of the standard terminology library, the expansion terminology library, the Hypothetical classification terminology library, the odd encoder terminology library, and the merging terminology library, refer to the description of the automatic coding method for Chinese operation technique information above; repeated content is not described again.
Above-described specific embodiment; object of the present invention, technical scheme and beneficial effect are further described; be understood that; the foregoing is only specific embodiments of the invention; the protection domain be not intended to limit the present invention; within the spirit and principles in the present invention all, any amendment made, equivalent replacement, improvement etc., all should be included within protection scope of the present invention.
Those skilled in the art can also recognize the various illustrative components, blocks (illustrativelogicalblock) that the embodiment of the present invention is listed, unit, and step can pass through electronic hardware, computer software, or both combinations realize.For the replaceability (interchangeability) of clear displaying hardware and software, above-mentioned various illustrative components (illustrativecomponents), unit and step have universally described their function.Such function is the designing requirement realizing depending on specific application and whole system by hardware or software.Those skilled in the art for often kind of specifically application, can use the function described in the realization of various method, but this realization can should not be understood to the scope exceeding embodiment of the present invention protection.
The various illustrative logical blocks, units, or devices described in the embodiments of the present invention can implement or operate the described functions through a general-purpose processor, a digital signal processor, an application-specific integrated circuit (ASIC), a field programmable gate array or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination of the above. The general-purpose processor can be a microprocessor; alternatively, it can be any conventional processor, controller, microcontroller, or state machine. A processor can also be implemented as a combination of computing devices, such as a digital signal processor and a microprocessor, multiple microprocessors, one or more microprocessors in conjunction with a digital signal processor core, or any other similar configuration.
The steps of the method or algorithm described in the embodiments of the present invention can be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module can be stored in RAM, flash memory, ROM, EPROM, EEPROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. For example, the storage medium can be connected to the processor so that the processor can read information from, and write information to, the storage medium. Alternatively, the storage medium can be integrated into the processor. The processor and the storage medium can be arranged in an ASIC, and the ASIC can be arranged in a user terminal. Alternatively, the processor and the storage medium can be arranged in different components of the user terminal.
In one or more exemplary designs, the functions described in the embodiments of the present invention can be implemented in hardware, software, firmware, or any combination of the three. If implemented in software, these functions can be stored on a computer-readable medium, or transmitted over a computer-readable medium, in the form of one or more instructions or code. Computer-readable media include computer storage media and communication media that facilitate transferring a computer program from one place to another. A storage medium can be any available medium that a general-purpose or special-purpose computer can access. For example, such computer-readable media can include, but are not limited to, RAM, ROM, EEPROM, CD-ROM or other optical disc storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store program code in the form of instructions or data structures and that can be read by a general-purpose or special-purpose computer or processor. In addition, any connection can properly be termed a computer-readable medium; for example, if software is transmitted from a website, server, or other remote source over a coaxial cable, fiber-optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, those media are also included in the definition of computer-readable medium. Disk and disc, as used here, include compact disc, laser disc, optical disc, DVD, floppy disk, and Blu-ray disc; disks usually reproduce data magnetically, while discs usually reproduce data optically with lasers. Combinations of the above can also be included in computer-readable media.

Claims (20)

1. An automatic coding method for Chinese operation technique information, comprising:
Step 1, inputting Chinese operation technique information;
Step 2, carrying out natural language processing on the Chinese operation technique information to obtain one or more titles to be encoded;
Step 3, based on a standard terminology storehouse and an expansion terminology bank set up in advance, searching for the standard terminology or expansion term that matches the title to be encoded, and determining the coding of the successfully matched standard terminology or expansion term as the coding of the title to be encoded;
Wherein the standard terminology storehouse comprises some standard terminologies and their codings; each standard terminology is an operation technique title specified in the International Classification of Diseases (ICD), and its coding is the coding of the corresponding operation technique title specified in the ICD;
The expansion terminology bank comprises some expansion terms and their codings; each expansion term is a word that has a synonymy relation or a genus-species relation with a standard terminology;
The coding of each expansion term is consistent with the coding of the standard terminology with which it has the synonymy or genus-species relation.
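As an illustration of step 3, the lookup can be pictured as two dictionary probes. This is only a minimal sketch: the entries and the placeholder codes `CODE-A` and `CODE-B` are hypothetical and are not taken from the ICD or from the term storehouses of the invention.

```python
from typing import Dict, Optional

# Hypothetical sample entries; the codes are placeholders, not real ICD codes.
STANDARD_TERMS: Dict[str, str] = {"阑尾切除术": "CODE-A", "胆囊切除术": "CODE-B"}
# An expansion term (here a synonym) carries the same coding as its standard terminology.
EXPANSION_TERMS: Dict[str, str] = {"阑尾摘除术": "CODE-A"}


def lookup_code(title: str) -> Optional[str]:
    """Return the coding of the matching standard or expansion term, or None."""
    if title in STANDARD_TERMS:
        return STANDARD_TERMS[title]
    return EXPANSION_TERMS.get(title)
```

A title that matches neither storehouse returns `None`, which corresponds to a title whose coding has not yet been determined.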
2. The automatic coding method of Chinese operation technique information according to claim 1, wherein,
Described step 3 also comprises: based on the Hypothetical classification terminology bank set up in advance, searches the Hypothetical classification term matched with described title to be encoded; And by the coding of the Hypothetical classification term that the match is successful, be defined as the coding of described title to be encoded;
Described Hypothetical classification terminology bank comprises some Hypothetical classification terms and coding thereof;
Described Hypothetical classification term represents ad hoc type treatment means, and described ad hoc type treatment means corresponds to multiple resection operation type, and described multiple resection operation type is described standard terminology;
The coding of described Hypothetical classification term is consistent with the coding of the organ full resection operation type in described multiple resection operation type or Partial Resection type of surgery.
3. The automatic coding method of Chinese operation technique information according to claim 1, wherein,
Described step 3 also comprises: based on the odd encoder terminology bank set up in advance, searches the odd encoder term matched with described title to be encoded; And by the coding of the odd encoder term that the match is successful, be defined as the coding of described title to be encoded;
Described odd encoder terminology bank comprises some odd encoder terms and coding thereof;
Described odd encoder term is ad hoc type operation technique type; The prerequisite that described ad hoc type operation technique type performs is another kind of operation technique type; Described ad hoc type operation technique type and described another kind of operation technique type are described standard terminology or described expansion term;
The combination of the coding being encoded to described ad hoc type operation technique type of described odd encoder term and the coding of described another kind of operation technique type.
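The combined coding of an odd encoder term might be assembled as in the sketch below. The term names, the codes, and the `PREMISE` table are hypothetical placeholders chosen only to show the combination of the two codings.

```python
from typing import Dict, List

# Hypothetical placeholder names and codes, not real ICD entries.
TERM_CODES: Dict[str, str] = {"operation-A": "CODE-A", "operation-B": "CODE-B"}
# operation-B can only be performed on the premise of operation-A.
PREMISE: Dict[str, str] = {"operation-B": "operation-A"}


def odd_encoder_codes(term: str) -> List[str]:
    """Return the coding of the term followed by the coding of its premise, if any."""
    codes = [TERM_CODES[term]]
    premise = PREMISE.get(term)
    if premise is not None:
        codes.append(TERM_CODES[premise])
    return codes
```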
4. The automatic coding method of Chinese operation technique information according to claim 1, wherein,
Before described step 3, also comprise: based on the merging terminology bank set up in advance, pre-service is carried out to described one or more title to be encoded;
Described merging terminology bank comprises some merging terms and coding thereof; Wherein, described merging term is the single standard terminology that can substitute the other standards term that at least two occur simultaneously that International Classification of Diseases ICD specifies; Described at least two other standards terms simultaneously occurred are the combining objects of this merging term; Described merging terminology bank also comprises each whole combining objects merging term;
Described based on merging terminology bank, pretreated step is carried out to described one or more title to be encoded, comprise: judge in described one or more title to be encoded, whether comprise whole combining objects of any one or more merging term, if comprise, then whole combining objects of described any one or more merging term are replaced to corresponding merging term.
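The merging pretreatment can be sketched as a set-containment check followed by a replacement. The library layout (merging term mapped to the set of its combining objects) and the sample names are assumptions made for illustration.

```python
from typing import Dict, List, Set


def apply_merges(titles: List[str], merge_library: Dict[str, Set[str]]) -> List[str]:
    """Replace every complete group of combining objects with its merging term.

    merge_library maps a merging term to the set of all its combining objects
    (a hypothetical layout; the source does not fix a data structure).
    """
    result = list(titles)
    for merging_term, objects in merge_library.items():
        if objects.issubset(result):  # all combining objects are present
            first = min(result.index(o) for o in objects)
            result = [t for t in result if t not in objects]
            result.insert(first, merging_term)  # keep the position of the first object
    return result
```

If only some of the combining objects are present, the list is left unchanged, matching the "all combining objects" condition of the claim.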
5. The automatic coding method of Chinese operation technique information according to claim 1, wherein,
Before described step 3, also comprise: based on the omission terminology bank set up in advance, pre-service is carried out to described one or more title to be encoded;
Described omission terminology bank comprises some omission terms and coding thereof; Wherein, described omission term is the single standard terminology that can substitute at least two standard terminologys simultaneously occurred that International Classification of Diseases ICD specifies; Described omission term is one in described at least two standard terminologys simultaneously occurred; Described at least two standard terminologys simultaneously occurred are the omission object of this omission term; Described omission terminology bank also comprises whole omission objects that each omits term;
Described based on omission terminology bank, pretreated step is carried out to described one or more title to be encoded, comprise: judge in described one or more title to be encoded, whether comprise whole omission objects of any one or more omission term, if comprise, then whole omission objects of described any one or more omission term are replaced to corresponding omission term.
6. The automatic coding method of Chinese operation technique information according to any one of claims 1 to 5, wherein, after said step 3, the method also comprises:
Step 4, matching each title to be encoded whose coding has not been determined with the without encryption descriptions in a without encryption description storehouse; if the match succeeds, performing a default treatment step so that this title to be encoded is represented by a coding; if the match fails, sending this title to be encoded to an artificial treatment platform for artificial treatment;
Wherein the without encryption description storehouse comprises some without encryption descriptions;
The without encryption descriptions comprise:
Preset words that represent operation technique information but from which the operation technique title cannot be determined;
The disease name preset;
The nomenclature of drug preset; And,
The medical treatment consumptive materials title preset.
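Step 4 amounts to a final fallback branch, sketched below. The sample descriptions, the `DEFAULT_CODE` value, and the use of a plain list as the artificial treatment queue are all assumptions; the source does not say what the default treatment step assigns.

```python
from typing import List, Optional

# Hypothetical sample descriptions: a vague operation word, a disease name,
# a drug name, and a consumable name.
WITHOUT_ENCRYPTION_DESCRIPTIONS = {"探查", "高血压", "阿司匹林", "纱布"}
DEFAULT_CODE = "NO-CODE"  # assumed placeholder produced by the default treatment step


def handle_uncoded(title: str, manual_queue: List[str]) -> Optional[str]:
    """Give a default coding to known uncodable descriptions; queue the rest."""
    if title in WITHOUT_ENCRYPTION_DESCRIPTIONS:
        return DEFAULT_CODE
    manual_queue.append(title)  # hand over for artificial treatment
    return None
```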
7. The automatic coding method of Chinese operation technique information according to claim 1, wherein said step 2 comprises:
Step 21, carries out pre-service to described Chinese operation technique information character string, obtains pretreated Chinese operation technique information character string;
Step 22, based on the body dictionary set up in advance, orientation dictionary, grade dictionary, is cut into some first kind substrings and/or Second Type substring by described pretreated Chinese operation technique information character string;
Wherein the body dictionary comprises the standard terminology storehouse and the expansion terminology bank, the standard terminologies and the expansion terms being bodies; the orientation dictionary comprises some directional terminologies, each being a word that describes the orientation at which an operation technique is directed; the grade dictionary comprises some grade terms, each being a word that describes the rank or type of an operation technique;
Described first kind substring directly can mate with the body in described body dictionary, and described Second Type substring directly can not mate with the body in described body dictionary;
Step 23, is defined as title to be encoded by the first kind substring be syncopated as and Second Type substring.
8. The automatic coding method of Chinese operation technique information according to claim 7, wherein said step 21 comprises:
Carrying out form normalization on the non-Chinese characters in the Chinese operation technique information character string, and deleting the non-medical terms in the character string, to obtain the pretreated Chinese operation technique information character string; wherein the non-medical terms are provided by a non-medical term dictionary set up in advance, and a non-medical term is a word, phrase, or sentence that serves as a remark.
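One possible reading of step 21 is sketched below. It assumes that "form normalization" means folding full-width letters, digits, and punctuation to their half-width forms (done here with Unicode NFKC normalization), and it uses a hypothetical two-entry non-medical term dictionary.

```python
import unicodedata

# Hypothetical non-medical (remark) terms.
NON_MEDICAL_TERMS = ["备注", "详见病历"]


def pretreat(text: str) -> str:
    """Normalize non-Chinese characters and delete non-medical terms."""
    # NFKC maps full-width forms such as 'Ａ', '１', '（' to 'A', '1', '('.
    text = unicodedata.normalize("NFKC", text)
    # Delete longer terms first so a short term never splits a longer one.
    for term in sorted(NON_MEDICAL_TERMS, key=len, reverse=True):
        text = text.replace(term, "")
    return text.strip()
```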
9. The automatic coding method of Chinese operation technique information according to claim 8, wherein said step 22 comprises:
Judging whether the pretreated Chinese operation technique information character string comprises symbols;
If the pretreated Chinese operation technique information character string comprises symbols, matching the characters between every two adjacent symbols in the character string, as a whole, with the bodies in the body dictionary; if the match succeeds, cutting out the characters between these two adjacent symbols as a first kind substring; if the match fails, determining these two adjacent symbols and the characters between them to be a temporarily uncut character string, and judging whether the temporarily uncut character string comprises a preset special symbol;
If the temporarily uncut character string comprises a special symbol, searching for the character model to which the temporarily uncut character string belongs, and cutting the temporarily uncut character string according to the segmentation rule corresponding to that character model; matching the cut-out characters with the bodies in the body dictionary; if the match succeeds, taking the cut-out characters as a first kind substring; if the match fails, taking the cut-out characters as a Second Type substring; wherein the character model is provided by a character model storehouse set up in advance, and each character model has a one-to-one corresponding segmentation rule;
If the temporarily uncut character string does not comprise a special symbol, directly determining the temporarily uncut character string to be a Second Type substring;
If the pretreated Chinese operation technique information character string does not comprise symbols, adopting a mechanical Chinese word segmentation method to match the single characters or multiple continuous characters in the character string with the bodies in the body dictionary;
If all characters in the pretreated Chinese operation technique information character string can be matched with bodies, cutting out the single characters or multiple continuous characters as first kind substrings according to the bodies they match;
If the pretreated Chinese operation technique information character string contains a single character or multiple continuous characters that fail to match any body, judging whether the unmatched single character or multiple continuous characters are a directional terminology or a grade term;
When the unmatched single character or multiple continuous characters are a directional terminology or a grade term, merging them, according to their position in the pretreated Chinese operation technique information character string, with the body-matchable single character or multiple continuous characters before or after them and cutting out the merged result as a Second Type substring, and cutting out the remaining body-matchable single characters or multiple continuous characters in the character string as first kind substrings;
When the unmatched single character or multiple continuous characters are not a directional terminology or a grade term, cutting out the pretreated Chinese operation technique information character string as a whole as a Second Type substring.
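The symbol-free branch relies on a mechanical Chinese word segmentation method. The claim does not name a variant, so the sketch below assumes forward maximum matching, one common mechanical method, and uses a hypothetical two-entry body dictionary. Each returned piece is flagged with whether it matched a body.

```python
from typing import List, Set, Tuple


def forward_max_match(text: str, bodies: Set[str]) -> List[Tuple[str, bool]]:
    """Segment text greedily, always trying the longest dictionary entry first.

    Returns (piece, matched) pairs; matched is False for a single character
    that no body in the dictionary covers.
    """
    max_len = max(map(len, bodies)) if bodies else 1
    pieces: List[Tuple[str, bool]] = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in bodies:
                pieces.append((candidate, True))
                i += size
                break
        else:  # no dictionary entry starts at position i
            pieces.append((text[i], False))
            i += 1
    return pieces
```

In the terms of the claim, an unmatched piece such as a leading orientation word would then be checked against the orientation and grade dictionaries and, if found there, merged with its matched neighbor into a Second Type substring.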
10. The automatic coding method of Chinese operation technique information according to claim 7, wherein the step in said step 3 of searching for the standard terminology or expansion term that matches the title to be encoded comprises:
If the title to be encoded is a first kind substring, determining the body that this first kind substring matches as the standard terminology or expansion term that matches this title to be encoded;
If the title to be encoded is a Second Type substring, then:
Each body in Second Type substring and body dictionary is carried out to the parsing of the first dimension, obtain some first dimension analysis results of Second Type substring, and some first dimension analysis results of each body;
Each first dimension analysis result of described Second Type substring is mated with each first dimension analysis result of each body in described body dictionary, judges whether to exist the body that each first dimension analysis result all matches with each first dimension analysis result of described Second Type substring;
If there is the body that each first dimension analysis result all matches with each first dimension analysis result of described Second Type substring, then this body is defined as the body that described Second Type substring matches;
If there is no the body that all matches with each first dimension analysis result of described Second Type substring of each first dimension analysis result, then choose the part first dimension analysis result in whole first dimension analysis results of described Second Type substring to mate with the part first dimension analysis result in whole first dimension analysis results of each body in described body dictionary, and the body that the described part first dimension analysis result judging whether to exist described part first dimension analysis result and described Second Type substring matches;
If the body that the described part first dimension analysis result that there is described part first dimension analysis result and described Second Type substring matches, then this body is defined as the body that described Second Type substring matches;
If the body that the described part first dimension analysis result that there is not described part first dimension analysis result and described Second Type substring matches, then each body in described Second Type substring and described body dictionary is carried out to the parsing of the second dimension, obtain some second dimension analysis results of described Second Type substring, and some second dimension analysis results of each body in described body dictionary;
Based on some second dimension analysis results of described Second Type substring, and some second dimension analysis results of described body, calculate the matching degree of described Second Type substring and each body;
According to the matching degree of described Second Type substring and each body, determine the body that one or more body matches as described Second Type substring;
By the body that described Second Type substring matches, be defined as standard terminology that described title to be encoded matches or expand term.
11. The automatic coding method of Chinese operation technique information according to claim 10, wherein the first dimension analysis results of the Second Type substring or of the body are respectively:
The directional terminology in the Second Type substring or the body;
The grade term in the Second Type substring or the body;
The characters within brackets in the Second Type substring or the body;
The characters after a dash in the Second Type substring or the body; and,
The characters in the Second Type substring or the body other than the directional terminology, the grade term, the characters within brackets, and the characters after a dash;
The part first dimension analysis results chosen from the whole first dimension analysis results of the Second Type substring or the body comprise: the characters in the Second Type substring or the body other than the directional terminology, the grade term, the characters within brackets, and the characters after a dash; and one or more of the following:
The directional terminology and grade term in the Second Type substring or the body;
The characters within brackets in the Second Type substring or the body;
The characters after a dash in the Second Type substring or the body.
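The first dimension parsing listed above can be sketched as follows. The orientation and grade entries are hypothetical samples, the dash is assumed to be an ASCII hyphen, and the field names of the returned dictionary are illustrative only.

```python
import re
from typing import Dict

DIRECTION_TERMS = ["左侧", "右侧", "双侧"]  # hypothetical orientation dictionary
GRADE_TERMS = ["根治性", "部分"]  # hypothetical grade dictionary


def parse_first_dimension(name: str) -> Dict[str, str]:
    """Split a name into orientation, grade, bracket, after-dash and core parts."""
    name = name.replace("（", "(").replace("）", ")")
    bracket = "".join(re.findall(r"\(([^()]*)\)", name))  # characters within brackets
    name = re.sub(r"\([^()]*\)", "", name)
    head, _, after_dash = name.partition("-")  # characters after the dash
    direction = next((t for t in DIRECTION_TERMS if t in head), "")
    grade = next((t for t in GRADE_TERMS if t in head), "")
    core = head.replace(direction, "", 1).replace(grade, "", 1)  # remaining characters
    return {"direction": direction, "grade": grade, "bracket": bracket,
            "after_dash": after_dash, "core": core}
```

Two names match on the whole first dimension when all five fields agree; a partial match, as described in the claim, always includes the `core` field plus one or more of the others.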
12. The automatic coding method of Chinese operation technique information according to claim 10, wherein the second dimension analysis results of the Second Type substring or of the body are respectively:
Each Chinese character of the Second Type substring or the body;
The initial consonant of each Chinese character of the Second Type substring or the body;
The final (simple or compound vowel) of each Chinese character of the Second Type substring or the body;
The first character of the Second Type substring or the body;
The pinyin of the first character of the Second Type substring or the body; and,
The non-Chinese characters in the Second Type substring or the body.
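The second dimension parsing can be sketched with a small pinyin table. The four-entry `PINYIN` table is a hypothetical stand-in for a full pinyin resource, and the field names are illustrative.

```python
from typing import Dict, List, Tuple, Union

# Hypothetical miniature pinyin table: character -> (initial consonant, final).
PINYIN: Dict[str, Tuple[str, str]] = {
    "胃": ("w", "ei"), "切": ("q", "ie"), "除": ("ch", "u"), "术": ("sh", "u"),
}


def is_chinese(ch: str) -> bool:
    """True for characters in the basic CJK Unified Ideographs block."""
    return "\u4e00" <= ch <= "\u9fff"


def parse_second_dimension(name: str) -> Dict[str, Union[str, List[str]]]:
    """Produce the six second dimension analysis results for a name."""
    hanzi = [c for c in name if is_chinese(c)]
    first = name[:1]
    return {
        "chars": hanzi,
        "initials": [PINYIN[c][0] for c in hanzi if c in PINYIN],
        "finals": [PINYIN[c][1] for c in hanzi if c in PINYIN],
        "first_char": first,
        "first_char_pinyin": "".join(PINYIN.get(first, ("", ""))),
        "non_chinese": "".join(c for c in name if not is_chinese(c)),
    }
```

Comparing initials and finals separately is what lets a title with a homophone typo still score against the intended body.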
13. The automatic coding method of Chinese operation technique information according to claim 10, wherein the step of calculating the matching degree of the Second Type substring and each body, based on the second dimension analysis results of the Second Type substring and the second dimension analysis results of the body, comprises:
Calculating the similarity of the Second Type substring and each body according to the following formula:
$$M = \sum_{t \in q} \left( \mathrm{tf}(t \in d) \cdot \mathrm{idf}(t)^{2} \cdot t.\mathrm{getBoost}() \cdot \mathrm{norm}(t, d) \right)$$
Wherein, M represents similarity;
t represents each second dimension analysis result of the Second Type substring;
q represents the Second Type substring;
t in q ranges over the second dimensions of the Second Type substring;
d represents the body;
tf(t in d) represents, within the same second dimension, the frequency with which the second dimension analysis result of the Second Type substring matches the second dimension analysis result of the body;
idf(t) is computed from T and T(t), wherein T represents the total number of bodies in the body dictionary, and T(t) represents the total number of bodies whose second dimension analysis results all match the respective second dimension analysis results of the Second Type substring;
t.getBoost() represents the preset weight of each second dimension;
norm(t, d) represents the length normalization factor of the body;
The similarity calculated is defined as the matching degree of described Second Type substring and each body.
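The similarity M can be sketched in a few lines. Two parts of this sketch are assumptions, because the text above does not reproduce them: idf is taken as `1 + ln(T / (T(t) + 1))` and the length normalization factor as `1 / sqrt(length)`, both borrowed from the classic TF-IDF scoring used by full-text search engines. T(t) is approximated by the number of bodies that contain the result `t` in the same dimension.

```python
import math
from typing import Dict, List

Parsed = Dict[str, List[str]]  # second dimension name -> analysis results


def similarity(query: Parsed, body: Parsed, all_bodies: List[Parsed],
               boosts: Dict[str, float]) -> float:
    """Sum tf * idf^2 * boost * norm over every analysis result t of the query."""
    total = len(all_bodies)  # T: number of bodies in the body dictionary
    score = 0.0
    for dim, results in query.items():
        body_results = body.get(dim, [])
        if not body_results:
            continue
        norm = 1.0 / math.sqrt(len(body_results))  # assumed length normalization
        for t in results:
            tf = body_results.count(t)  # matches within the same dimension
            matched = sum(1 for b in all_bodies if t in b.get(dim, []))  # T(t), approximated
            idf = 1.0 + math.log(total / (matched + 1))  # assumed idf definition
            score += tf * idf ** 2 * boosts.get(dim, 1.0) * norm
    return score
```

With two bodies 胃切除术 and 肝切除术 and the query 胃切除, the first body scores higher because the rarer character 胃 carries a larger idf than the shared characters.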
14. The automatic coding method of Chinese operation technique information according to claim 10, wherein the step of calculating the matching degree of the Second Type substring and each body, based on the second dimension analysis results of the Second Type substring and the second dimension analysis results of the body, comprises:
Determining each Chinese character in the Second Type substring;
Calculating the cosine degree of confidence of the Second Type substring and each body matched with it according to the following formula:
$$N = \frac{\sum_{j=1}^{V} w_{Q,j} \times w_{d',j}}{\sqrt{\sum_{j=1}^{V} w_{Q,j}^{2} \times \sum_{j=1}^{V} w_{d',j}^{2}}}$$
Calculating the total degree of confidence of the Second Type substring and each body matched with it according to the following formula:
S=M×a+N×b
Wherein, N represents cosine degree of confidence;
V represents the Chinese character sum that Second Type substring and the body matched thereof comprise;
Q represents Second Type substring;
d' represents the body that the Second Type substring matches;
w_{Q,j} represents the frequency with which the j-th Chinese character occurs in the Second Type substring;
w_{d',j} represents the frequency with which the j-th Chinese character occurs in the body that the Second Type substring matches;
j represents the sequence number of the Chinese characters that the Second Type substring and its matched body comprise;
S represents total degree of confidence;
M represents similarity;
A represents the preset weights that similarity M is corresponding;
B represents the preset weights that cosine degree of confidence N is corresponding;
Further, similarity M is according to following formulae discovery:
$$M = \sum_{t \in q} \left( \mathrm{tf}(t \in d) \cdot \mathrm{idf}(t)^{2} \cdot t.\mathrm{getBoost}() \cdot \mathrm{norm}(t, d) \right)$$
Wherein, t represents each second dimension analysis result of Second Type substring;
q represents the Second Type substring;
t in q ranges over the second dimensions of the Second Type substring;
d represents the body;
tf(t in d) represents, within the same second dimension, the frequency with which the second dimension analysis result of the Second Type substring matches the second dimension analysis result of the body;
idf(t) is computed from T and T(t), wherein T represents the total number of bodies in the body dictionary, and T(t) represents the total number of bodies whose second dimension analysis results all match the respective second dimension analysis results of the Second Type substring;
t.getBoost() represents the preset weight of each second dimension;
norm(t, d) represents the length normalization factor of the body;
The total degree of confidence calculated is defined as the matching degree of described Second Type substring and each body.
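The cosine degree of confidence N and the total degree of confidence S can be sketched as below, assuming both inputs have already been reduced to their Chinese characters and assuming the usual cosine form with a square root in the denominator. The weights `a` and `b` are preset values that the source does not specify.

```python
import math
from collections import Counter


def cosine_confidence(substring: str, body: str) -> float:
    """Cosine of the character-frequency vectors of the two strings."""
    wq, wd = Counter(substring), Counter(body)
    vocab = set(wq) | set(wd)  # the V distinct characters of both strings
    dot = sum(wq[c] * wd[c] for c in vocab)
    denom = math.sqrt(sum(v * v for v in wq.values()) * sum(v * v for v in wd.values()))
    return dot / denom if denom else 0.0


def total_confidence(m: float, n: float, a: float, b: float) -> float:
    """S = M * a + N * b, with a and b the preset weights."""
    return m * a + n * b
```

Because N is bounded by 1 while M is not, the choice of `a` and `b` decides how much the character-overlap signal can move the ranking produced by M.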
15. The automatic coding method of Chinese operation technique information according to claim 10, wherein the step of determining, according to the matching degree of the Second Type substring and each body, one or more bodies as the bodies that the Second Type substring matches comprises:
Sorting all bodies by their matching degree with the Second Type substring, and determining a predetermined number of the top-ranked bodies as the bodies that the Second Type substring matches;
Or,
Determining the one or more bodies whose matching degree with the Second Type substring reaches a predetermined threshold as the bodies that the Second Type substring matches.
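The two selection strategies can be sketched side by side. The function names and the score dictionary layout (body name mapped to matching degree) are illustrative choices.

```python
from typing import Dict, List


def top_ranked(scores: Dict[str, float], count: int) -> List[str]:
    """Return the `count` bodies with the highest matching degree."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:count]]


def reaching_threshold(scores: Dict[str, float], threshold: float) -> List[str]:
    """Return every body whose matching degree reaches the threshold."""
    return [name for name, score in scores.items() if score >= threshold]
```

The top-ranked variant always returns candidates, even poor ones, while the threshold variant may return none, in which case the title would remain without a determined coding.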
16. An automatic coding system for Chinese operation technique information, comprising:
An import module, for inputting Chinese operation technique information;
A natural language processing module, for carrying out natural language processing on the Chinese operation technique information to obtain one or more titles to be encoded;
A matching and code-assigning module, for searching, based on a standard terminology storehouse and an expansion terminology bank set up in advance, for the standard terminology or expansion term that matches the title to be encoded, and determining the coding of the successfully matched standard terminology or expansion term as the coding of the title to be encoded;
Wherein, described standard terminology storehouse comprises some standard terminologys and coding thereof, described standard terminology is the operation technique title specified in International Classification of Diseases ICD, and the coding of described standard terminology is the coding of the corresponding operation technique title specified in International Classification of Diseases ICD;
The expansion terminology bank comprises some expansion terms and their codings; each expansion term is a word that has a synonymy relation or a genus-species relation with a standard terminology;
The coding of each expansion term is consistent with the coding of the standard terminology with which it has the synonymy or genus-species relation.
17. The automatic coding system of Chinese operation technique information according to claim 16, wherein,
Described matching module, also for based on the Hypothetical classification terminology bank set up in advance, searches the Hypothetical classification term matched with described title to be encoded; And by the coding of the Hypothetical classification term that the match is successful, be defined as the coding of described title to be encoded;
Described Hypothetical classification terminology bank comprises some Hypothetical classification terms and coding thereof;
Described Hypothetical classification term represents ad hoc type treatment means, and described ad hoc type treatment means corresponds to multiple resection operation type, and described multiple resection operation type is described standard terminology;
The coding of described Hypothetical classification term is consistent with the coding of the organ full resection operation type in described multiple resection operation type or Partial Resection type of surgery.
18. The automatic coding system of Chinese operation technique information according to claim 16, wherein,
Described matching module, also for based on the odd encoder terminology bank set up in advance, searches the odd encoder term matched with described title to be encoded; And by the coding of the odd encoder term that the match is successful, be defined as the coding of described title to be encoded;
Described odd encoder terminology bank comprises some odd encoder terms and coding thereof;
Described odd encoder term is ad hoc type operation technique type; The prerequisite that described ad hoc type operation technique type performs is another kind of operation technique type; Described ad hoc type operation technique type and described another kind of operation technique type are described standard terminology or described expansion term;
The combination of the coding being encoded to described ad hoc type operation technique type of described odd encoder term and the coding of described another kind of operation technique type.
19. The automatic coding system of Chinese operation technique information according to claim 16, also comprising:
Merging treatment module, for based on the merging terminology bank set up in advance, carries out pre-service to described one or more title to be encoded;
Described merging terminology bank comprises some merging terms and coding thereof; Wherein, described merging term is the single standard terminology that can substitute the other standards term that at least two occur simultaneously that International Classification of Diseases ICD specifies; Described at least two other standards terms simultaneously occurred are the combining objects of this merging term; Described merging terminology bank also comprises each whole combining objects merging term;
Described merging treatment module, specifically for judging in described one or more title to be encoded, whether comprise whole combining objects of any one or more merging term, if comprise, then whole combining objects of described any one or more merging term are replaced to corresponding merging term.
20. The automatic coding system of Chinese operation technique information according to claim 16, also comprising:
Omit processing module, for based on the omission terminology bank set up in advance, pre-service is carried out to described one or more title to be encoded;
Described omission terminology bank comprises some omission terms and coding thereof; Wherein, described omission term is the single standard terminology that can substitute at least two standard terminologys simultaneously occurred that International Classification of Diseases ICD specifies; Described omission term is one in described at least two standard terminologys simultaneously occurred; Described at least two standard terminologys simultaneously occurred are the omission object of this omission term; Described omission terminology bank also comprises whole omission objects that each omits term;
Described omission processing module, specifically for carrying out pretreated step to described one or more title to be encoded, comprise: judge in described one or more title to be encoded, whether comprise whole omission objects of any one or more omission term, if comprise, then whole omission objects of described any one or more omission term are replaced to corresponding omission term.
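Because an omission term is itself one of its omission objects, the replacement in this claim amounts to keeping that one term and dropping the others. The following Python sketch illustrates this under stated assumptions: the names to be coded are already a list of standard terms, and the library layout, the function name `apply_omission_terms`, and the placeholder terms and codes are hypothetical rather than taken from the patent.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical omission term library: omission term -> (code, omission objects).
# The omission term is always a member of its own omission objects.
OmissionLibrary = Dict[str, Tuple[str, FrozenSet[str]]]

OMISSION_LIBRARY: OmissionLibrary = {
    "term X": ("00.02", frozenset({"term X", "term Y"})),
}


def apply_omission_terms(names: List[str], library: OmissionLibrary) -> List[str]:
    """Keep only the omission term when all of its omission objects are present."""
    result = list(names)
    for omission_term, (_code, objects) in library.items():
        # Act only when ALL omission objects of this term are present.
        if objects <= set(result):
            # Drop every omission object except the omission term itself.
            result = [
                name for name in result
                if name not in objects or name == omission_term
            ]
    return result


print(apply_omission_terms(["term Y", "term X", "term Z"], OMISSION_LIBRARY))
# -> ['term X', 'term Z']
```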
CN201510496500.3A 2015-08-13 2015-08-13 Automatic coding method and system for Chinese surgical operation information Active CN105069123B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510496500.3A CN105069123B (en) 2015-08-13 2015-08-13 Automatic coding method and system for Chinese surgical operation information

Publications (2)

Publication Number Publication Date
CN105069123A (en) 2015-11-18
CN105069123B (en) 2018-06-26

Family

ID=54498493

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510496500.3A Active CN105069123B (en) 2015-08-13 2015-08-13 Automatic coding method and system for Chinese surgical operation information

Country Status (1)

Country Link
CN (1) CN105069123B (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102456100A (en) * 2010-11-03 2012-05-16 通用电气公司 Systems, methods, and apparatus for computer-assisted full medical code scheme to code scheme mapping
US20130086069A1 (en) * 2010-11-03 2013-04-04 General Electric Company Systems, methods, and apparatus for computer-assisted full medical code scheme to code scheme mapping
CN104156415A (en) * 2014-07-31 2014-11-19 沈阳锐易特软件技术有限公司 Mapping processing system and method for solving problem of standard code control of medical data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
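Lin Dongsheng (林冬盛): "Research and Implementation of Chinese Word Segmentation Algorithms", China Master's Theses Full-text Database, Information Science and Technology series *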

Cited By (34)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105630873A (en) * 2015-12-18 2016-06-01 河南思维自动化设备股份有限公司 Graphical assistance editing method for announcement in station
CN105630873B (en) * 2015-12-18 2018-12-25 河南思维自动化设备股份有限公司 Graphical assistance editing method for announcement in station
CN105963022A (en) * 2016-04-19 2016-09-28 中国中医科学院中医临床基础医学研究所 Treatment encoder
CN105963022B (en) * 2016-04-19 2018-08-14 中国中医科学院中医临床基础医学研究所 Treatment encoder
CN106874643A (en) * 2016-12-27 2017-06-20 中国科学院自动化研究所 Method and system for automatically constructing knowledge base to realize auxiliary diagnosis and treatment based on word vectors
CN106874643B (en) * 2016-12-27 2020-02-28 中国科学院自动化研究所 Method and system for automatically constructing knowledge base to realize auxiliary diagnosis and treatment based on word vectors
CN108257667A (en) * 2016-12-28 2018-07-06 中国科学院深圳先进技术研究院 Data processing method and terminal device
CN108320778A (en) * 2017-01-16 2018-07-24 医渡云(北京)技术有限公司 Medical record ICD coding method and system
CN106844308A (en) * 2017-01-20 2017-06-13 天津艾登科技有限公司 Method for automatic disease code conversion using semantic recognition
CN106844308B (en) * 2017-01-20 2020-04-03 天津艾登科技有限公司 Method for automatic disease code conversion using semantic recognition
CN107705839A (en) * 2017-10-25 2018-02-16 山东众阳软件有限公司 Disease automatic coding method and system
CN107577826B (en) * 2017-10-25 2018-05-15 山东众阳软件有限公司 Disease classification coding method and system based on raw diagnostic data
CN107705839B (en) * 2017-10-25 2020-06-26 山东众阳软件有限公司 Disease automatic coding method and system
CN107577826A (en) * 2017-10-25 2018-01-12 山东众阳软件有限公司 Disease classification coding method and system based on raw diagnostic data
CN108182207A (en) * 2017-12-15 2018-06-19 上海长江科技发展有限公司 Intelligent coding method and system for Chinese surgical operation based on word segmentation network
CN108182207B (en) * 2017-12-15 2020-11-13 中电科软件信息服务有限公司 Intelligent coding method and system for Chinese surgical operation based on word segmentation network
CN108182977A (en) * 2018-02-05 2018-06-19 南方医科大学顺德医院(佛山市顺德区第人民医院) Patient diagnosis coding method and system
CN108831522A (en) * 2018-05-28 2018-11-16 陈丽璇 Automatically coded medical insurance disease score billing system and construction method thereof
CN109273062A (en) * 2018-08-09 2019-01-25 北京爱医声科技有限公司 ICD intelligent auxiliary coding system
CN109256216A (en) * 2018-08-14 2019-01-22 平安医疗健康管理股份有限公司 Medical data processing method, device, computer equipment and storage medium
CN109256216B (en) * 2018-08-14 2023-06-27 平安医疗健康管理股份有限公司 Medical data processing method, medical data processing device, computer equipment and storage medium
CN109918655A (en) * 2019-02-27 2019-06-21 浙江数链科技有限公司 Logistics term library generation method and device
CN109918655B (en) * 2019-02-27 2023-11-14 浙江数链科技有限公司 Logistics term library generation method and device
CN110442844A (en) * 2019-07-03 2019-11-12 北京达佳互联信息技术有限公司 Data processing method, device, electronic equipment and storage medium
CN110442844B (en) * 2019-07-03 2023-09-26 北京达佳互联信息技术有限公司 Data processing method, device, electronic equipment and storage medium
CN111128388A (en) * 2019-12-03 2020-05-08 东软集团股份有限公司 Value range data matching method and device and related products
CN111128388B (en) * 2019-12-03 2024-02-27 东软集团股份有限公司 Value range data matching method and device and related products
CN111933244A (en) * 2020-08-17 2020-11-13 医渡云(北京)技术有限公司 Medicine data encoding method and device, computer readable medium and electronic equipment
CN112131868A (en) * 2020-09-22 2020-12-25 上海亿普医药科技有限公司 Clinical trial medical coding method
CN112131867A (en) * 2020-09-22 2020-12-25 上海亿普医药科技有限公司 Clinical trial medical coding system
CN112632910A (en) * 2020-12-21 2021-04-09 北京惠及智医科技有限公司 Operation encoding method, electronic device and storage device
CN112749307A (en) * 2020-12-30 2021-05-04 杭州依图医疗技术有限公司 Medical data processing method and device and storage medium
CN115017326A (en) * 2022-05-12 2022-09-06 青岛普瑞盛医药科技有限公司 Medical coding method and device
CN115017326B (en) * 2022-05-12 2023-08-18 青岛普瑞盛医药科技有限公司 Medical coding method and device

Also Published As

Publication number Publication date
CN105069123B (en) 2018-06-26

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant