CN108182207B - Intelligent coding method and system for Chinese surgical operation based on word segmentation network - Google Patents

Publication number
CN108182207B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711350705.6A
Other languages
Chinese (zh)
Other versions
CN108182207A (en)
Inventor
李本文
赵蕾
段珂
任永超
邹智超
罗世利
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CETC Software Information Services Co., Ltd
Original Assignee
Shanghai Changjiang Technology Development Co ltd
Cetc Software Information Services Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Changjiang Technology Development Co ltd and Cetc Software Information Services Co ltd
Priority to CN201711350705.6A
Publication of CN108182207A
Application granted
Publication of CN108182207B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2452 Query translation
    • G06F16/24522 Translation of natural language queries to structured queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention provides an intelligent coding method and system for Chinese surgical operations based on a word segmentation network, comprising the following steps: step 1, acquiring a Chinese surgical operation name; step 2, matching the Chinese operation name against a dynamically expanded word segmentation network, extracting key information from the name, and forming a key phrase from the extracted key information; step 3, matching the key phrase against a phrase-code mapping table and outputting either an unmatched result or an ICD code. Compared with the prior art, the invention has the following beneficial effects: 1. The standard codes of medical case data from heterogeneous sources are unified, which provides solid data standardization support for management decisions on medical quality and medical insurance funds and promotes the gradual standardization of clinical medical practice in China. 2. ICD coding is completed automatically without manual participation, with advantages such as high coding speed, low cost and high accuracy. 3. Unified standard ICD coding.

Description

Intelligent coding method and system for Chinese surgical operation based on word segmentation network
Technical Field
The invention relates to the technical field of medical information management, in particular to an intelligent coding method and system for Chinese surgical operation based on a word segmentation network.
Background
The International Classification of Diseases (ICD) is an internationally unified disease classification established by the WHO. It groups diseases into an ordered system according to characteristics such as etiology, pathology, clinical manifestation and anatomical location, and expresses that system through codes. The ICD standardizes and formats disease names, and it is the application basis for clinical information systems such as medical informatization and hospital information management. The standardization of Chinese medical terms lags far behind that of Western medical terms, and it began with the large-scale introduction and translation of international standards. The Chinese edition of ICD-10 is designated by the national health council as the universal standard for clinical diagnosis and disease terminology.
However, diseases are wide in coverage, numerous in variety and complicated in naming. In disease reporting, medical record statistics, medical insurance reimbursement and similar work, medical institutions in China use disease codes very differently, each having made local modifications to the international standard ICD codes provided by the WHO. The resulting codes are multi-source and heterogeneous, incomplete, and difficult to unify. An intelligent coding method for Chinese surgical operation information is therefore needed that can automatically apply natural language processing to the differing operation information of different medical institutions and standardize it into ICD codes of a universal standard. This provides solid data standardization support for management decisions on medical quality and medical insurance funds, allows ICD information to be shared as widely as possible, and reflects national health conditions; the ICD codes also serve as tools and data for medical research and teaching.
The closest prior art is an automatic coding method and system for Chinese surgical operation information proposed by the medical information technology (Beijing) Co., Ltd. Based on a standard term library and an expanded term library, that method searches for a standard term or expanded term matching the name to be coded and takes the code of the successfully matched term as the code of the name to be coded. The standard terms are the disease terms contained in the referenced ICD version, and the expanded terms are words that have a synonymous relationship or a genus-species relationship with the standard terms; an expanded term has the same code as the standard term to which it is so related. That method mainly studies the genus-species relationship among terms. The present technique instead first constructs a word segmentation network, matches the input operation name against it, and extracts key information from the operation name to obtain a keyword combination consisting of six parts: site, procedure type, approach, etiology, purpose and other. It then matches this combination against a pre-constructed word combination and code mapping table and obtains the final ICD code for the Chinese name through exact or fuzzy matching. It is not concerned with the dependencies between terms and focuses solely on natural language processing.
Application No. 201510496500.3, entitled "An automatic coding method and system for Chinese operation information", is mainly based on a standard term library and an expanded term library: it searches for the standard term or expanded term matching the name to be coded and takes the code of the successfully matched term as the code of the name to be coded. The standard terms are the disease terms contained in the referenced ICD version; the expanded terms are colloquial names, alternative names or acronyms of standard terms, subclass disease terms of standard terms, or newly generated disease terms.
That patent document mainly studies the genus-species relationship of terms: when an expanded term is a subclass disease term of a standard term or a newly generated disease term, it is assigned the code of the standard term closest to it in the genus-species relationship. The present method first constructs a word segmentation network, matches the input operation name against it, extracts key information from the operation name to obtain a keyword combination consisting of six parts (site, procedure type, approach, etiology, purpose and other), matches this combination against a pre-constructed word combination and code mapping table, and obtains the final ICD code for the Chinese name through exact or fuzzy matching. It is not concerned with the dependencies between terms and focuses solely on natural language processing.
Application No. 201610571791.2, entitled "Diagnosis related grouping method and system based on intelligent coding adaptation", mainly aims to unify codes into the international standard diagnosis codes (ICD-10-CM) and operation codes (ICD-10-PCS) recognized by a case grouping system, thereby improving the adaptability of that system. The present technique adapts more broadly and more deeply: any Chinese diagnosis name or operation name can be adapted to the ICD diagnosis codes or ICD operation codes of a specified version, which addresses the multi-source heterogeneity, incompleteness and non-uniform standards of hospital codes and facilitates hospital informatization. The adaptation methods also differ considerably. That patent document requires an additional input during adaptation, namely the original diagnosis code, whereas the present technique does not use the original code: it extracts key information by word segmentation based entirely on the operation name and then performs exact or fuzzy matching. The present technical scheme is also the basis and precondition for applying that patent document. In addition, the technique can be applied to code matching work in hospital medical record departments, for example mapping a previously used ICD version to the latest ICD version, which can improve the efficiency of medical record coders.
Application No. 201510831116.4, entitled "An intelligent diagnosis and operation code retrieval method", mainly retrieves, from the whole preset set of diagnosis or operation code strings, the set of codes that best matches the diagnosis or operation name entered by the user; its focus is the degree to which the retrieval target string matches entries in a preset database. The two methods are not similar: that patent performs no adaptation and is implemented by retrieval, whereas the present technique first constructs a word segmentation network that can be continuously enriched and improved to widen the adaptation range, so that the closest ICD code of a specified version can be obtained simply by entering any Chinese name.
Disclosure of Invention
In view of the defects in the prior art, the object of the invention is to provide an intelligent coding method and system for Chinese surgical operations based on a word segmentation network.
To solve the above technical problem, the intelligent coding method for Chinese surgical operations based on a word segmentation network provided by the invention comprises the following steps:
step 1, acquiring a Chinese surgical operation name;
step 2, matching the Chinese operation name against a dynamically expanded word segmentation network, extracting key information from the Chinese operation name, and forming a key phrase from the extracted key information;
step 3, matching the key phrase against a phrase-code mapping table and outputting either an unmatched result or an ICD code.
Preferably, the key information includes site information, procedure-type information, approach information, etiology information, purpose information, and other information.
Preferably, step 2 comprises:
matching site information in the key information from long to short against the site word library in the dynamically expanded word segmentation network, and combining the priority of each piece of site information in the site word sequence table to obtain a site word;
matching procedure-type information in the key information from long to short against the procedure-type word library in the dynamically expanded word segmentation network, and combining the priority of each piece of procedure-type information in the procedure-type word sequence table to obtain a procedure-type word;
matching approach information in the key information from long to short against the approach word library in the dynamically expanded word segmentation network, and combining the priority of each piece of approach information in the approach word sequence table to obtain an approach word;
matching etiology information in the key information from long to short against the etiology word library in the dynamically expanded word segmentation network, and combining the priority of each piece of etiology information in the etiology word sequence table to obtain an etiology word;
matching purpose information in the key information from long to short against the purpose word library in the dynamically expanded word segmentation network, and combining the priority of each piece of purpose information in the purpose word sequence table to obtain a purpose word;
matching other information in the key information from long to short against the other word library in the dynamically expanded word segmentation network, and combining the priority of each piece of other information in the other word sequence table to obtain an other word;
the key phrase comprises at least the site word, procedure-type word, approach word, etiology word, purpose word and other word.
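The long-to-short matching described above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the lexicon contents, the category keys and the function names `longest_match` and `extract_key_phrase` are all assumptions, and the priority given by the sequence tables is reduced here to word length alone.

```python
from typing import Dict, List, Optional

CATEGORIES = ("site", "procedure", "approach", "etiology", "purpose", "other")

# Hypothetical word libraries; real ones would be built from the training set.
LEXICONS: Dict[str, List[str]] = {
    "site": ["胆囊", "胃", "阑尾"],
    "procedure": ["切除术", "部分切除术", "修补术"],
    "approach": ["腹腔镜", "经腹"],
    "etiology": [],
    "purpose": [],
    "other": [],
}


def longest_match(name: str, words: List[str]) -> Optional[str]:
    """Return the longest lexicon word contained in the name, or None."""
    # Longer words are tried first, so "部分切除术" wins over "切除术".
    for word in sorted(words, key=len, reverse=True):
        if word in name:
            return word
    return None


def extract_key_phrase(
    name: str, lexicons: Dict[str, List[str]] = LEXICONS
) -> Dict[str, Optional[str]]:
    """Build the six-part key phrase for one operation name."""
    return {c: longest_match(name, lexicons.get(c, [])) for c in CATEGORIES}


if __name__ == "__main__":
    print(extract_key_phrase("腹腔镜胆囊切除术"))
```

A category with no matching word is left as `None`, which the later matching stages can treat as an absent keyword.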
Preferably, the adapting in step 3 comprises:
step 3.1, first-stage adaptation: performing exact adaptation between the key phrase and the phrase-code mapping table;
if the key phrase matches an entry in the phrase-code mapping table, outputting the ICD code corresponding to the key phrase and ending the intelligent coding method;
if the key phrase does not match any entry in the phrase-code mapping table, proceeding to step 3.2;
step 3.2, second-stage adaptation: performing fuzzy adaptation between the key phrase and the phrase-code mapping table.
Preferably, the second-stage adaptation of step 3.2 comprises:
step 3.2.1, searching the dynamically expanded word segmentation network for the path containing the most matched keywords;
step 3.2.2, counting how many of the remaining keywords in the path match the key information of the input Chinese operation name, and selecting a matching path; wherein
the remaining keywords are the keywords in the key phrase that have not been matched.
Preferably, step 3.2.1 comprises:
step 3.2.1.1, screening from the dynamically expanded word segmentation network all first screening paths that contain the matched site word; if the key phrase contains no site word, outputting an unmatched result and ending the intelligent coding method;
step 3.2.1.2, screening from the first screening paths the second screening paths that contain the matched procedure-type word, approach word, etiology word, purpose word and other word;
the path is the one among all second screening paths that contains the most matched keywords, selected in the order procedure-type word, approach word, etiology word, purpose word and other word.
Preferably, in step 3.2.2, when the numbers of matched keywords are equal, the path with the fewest unmatched keywords is selected as the matching path, and the ICD code corresponding to the matching path is obtained from the phrase-code mapping table and output.
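The two-stage adaptation above (exact lookup first, then fuzzy selection) might look like the sketch below. It rests on several assumptions: a key phrase is a six-slot tuple in the order site, procedure type, approach, etiology, purpose, other; the mapping table is a plain dictionary; the codes are placeholders rather than real ICD codes; and "unmatched keywords" is read as keywords on the candidate path that the input does not contain. The name `encode` is hypothetical.

```python
from typing import Dict, Optional, Tuple

# One slot per category: site, procedure, approach, etiology, purpose, other.
Phrase = Tuple[Optional[str], ...]

# Hypothetical phrase-code mapping table with placeholder codes.
MAPPING: Dict[Phrase, str] = {
    ("胆囊", "切除术", "腹腔镜", None, None, None): "CODE-A",
    ("胆囊", "切除术", None, None, None, None): "CODE-B",
    ("胃", "部分切除术", None, None, None, None): "CODE-C",
}


def encode(phrase: Phrase, mapping: Dict[Phrase, str] = MAPPING) -> Optional[str]:
    """Return a code for the key phrase, or None for an unmatched result."""
    # Stage 1: exact adaptation.
    if phrase in mapping:
        return mapping[phrase]

    # Stage 2: fuzzy adaptation. Without a site word nothing can match.
    site = phrase[0]
    if site is None:
        return None
    candidates = [path for path in mapping if path[0] == site]
    if not candidates:
        return None

    def score(path: Phrase) -> Tuple[int, int]:
        pairs = list(zip(path[1:], phrase[1:]))
        matched = sum(1 for p, q in pairs if p is not None and p == q)
        unmatched = sum(1 for p, q in pairs if p is not None and p != q)
        # Most matched keywords first, then fewest unmatched keywords.
        return (-matched, unmatched)

    return mapping[min(candidates, key=score)]


if __name__ == "__main__":
    print(encode(("胆囊", "切除术", "经腹", None, None, None)))
```

In the example call both gallbladder paths match one keyword, so the tie is broken in favour of the path with no unmatched keyword.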
Preferably, the creation of the dynamically expanded word segmentation network comprises the following steps:
inputting ICD codes of different regions or different versions, together with the corresponding Chinese operation names, as the training set; if the same Chinese surgical operation name corresponds to different ICD codes, retaining the ICD code whose first character is smaller; if the current characters are equal, comparing the next characters, and so on up to the last character;
extracting from the operation names the keywords corresponding to site information, procedure-type information, approach information, etiology information, purpose information and other information, and constructing a site lexicon, a procedure-type lexicon, an approach lexicon, an etiology lexicon, a purpose lexicon and an other lexicon respectively;
constructing a site word sequence table, a procedure-type word sequence table, an approach word sequence table, an etiology word sequence table, a purpose word sequence table and an other word sequence table by ordering the words of each lexicon from long to short, and generating a sorted forest; wherein
the longer the word, the earlier its rank and the higher its priority.
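The two preparation rules above, keeping the smaller code for a duplicated name and ranking longer words first, can be sketched as follows. The function names are hypothetical and the codes are placeholders; the sketch assumes that the character-by-character comparison corresponds to ordinary lexicographic string comparison.

```python
from typing import Dict, Iterable, List, Tuple


def dedupe_training_set(records: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Keep one code per operation name: the lexicographically smallest.

    Python compares strings character by character from the left, which
    mirrors "compare the first character, then the next, up to the last".
    """
    kept: Dict[str, str] = {}
    for name, code in records:
        if name not in kept or code < kept[name]:
            kept[name] = code
    return kept


def build_sequence_table(words: Iterable[str]) -> List[str]:
    """Order lexicon words from long to short; rank 0 is highest priority."""
    # Ties in length are broken alphabetically only to make the order stable.
    return sorted(set(words), key=lambda w: (-len(w), w))


if __name__ == "__main__":
    training = [("阑尾切除术", "B2"), ("阑尾切除术", "A9"), ("阑尾切除术", "A10")]
    print(dedupe_training_set(training))
    print(build_sequence_table(["切除术", "部分切除术", "切除术"]))
```

Note that under this comparison "A10" ranks before "A9", because the second characters "1" and "9" decide the order.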
Preferably, the creation of the phrase-code mapping table comprises the following steps:
determining the ICD version to be referred to;
extracting the records in the training set whose Chinese operation names are the same as those in the ICD version to be referred to, and replacing the ICD codes in the training set with the ICD codes of that version;
extracting the records in the training set whose Chinese operation names differ from those in the ICD version to be referred to, matching the Chinese operation names of the ICD version to be referred to against the generated dynamically expanded word segmentation network, and taking the ICD code of the most similar record of that version as the mapping code; wherein
the most similar record of the specified version is the record with the most keywords matching the corresponding parts of the dynamically expanded word segmentation network or, when the numbers of matched keywords are equal, the record with the fewest unmatched keywords.
A system for the intelligent coding method of Chinese surgical operations based on a word segmentation network comprises a computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the above intelligent coding method.
Compared with the prior art, the invention has the following beneficial effects:
1. the standard codes of medical case data under heterogeneous conditions are unified, so that solid data standardization support is provided for the management decision of medical quality and medical insurance fund, and the gradual standardization and normalization of the clinical medical practice in China are promoted.
2. By the method, ICD coding can be automatically completed without manual participation, and the method has the advantages of high coding speed, low cost, high accuracy and the like.
3. The unified standard ICD codes reflect the state of national health and disease, and also serve as tools and data for medical research and teaching.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading the detailed description of non-limiting embodiments with reference to the following figures.
FIG. 1 is a flow chart of an intelligent encoding method for Chinese surgery operation based on word segmentation network according to the present invention;
FIG. 2 is a flow chart of the precise matching and fuzzy matching of the intelligent encoding method of Chinese surgery operation based on word segmentation network according to the present invention;
FIG. 3 is a flow chart of constructing the dynamically expanded word segmentation network and the sequence tables of site words, procedure-type words, approach words, etiology words, purpose words and other words in the intelligent coding method for Chinese surgery operation based on word segmentation network according to the present invention;
FIG. 4 is a flow chart of the first step of constructing the word combination and code mapping table in the intelligent coding method for Chinese surgery operation based on word segmentation network according to the present invention;
FIG. 5 is a flow chart of the second step of constructing the word combination and code mapping table in the intelligent coding method for Chinese surgery operation based on word segmentation network according to the present invention;
FIG. 6 is a flow chart of the third step of constructing the word combination and code mapping table in the intelligent coding method for Chinese surgery operation based on word segmentation network according to the present invention;
FIG. 7 is a flow chart of the fourth step of constructing the word combination and code mapping table in the intelligent coding method for Chinese surgery operation based on word segmentation network according to the present invention.
Detailed Description
The present invention will be described in detail with reference to specific examples. The following examples will assist those skilled in the art in further understanding the invention, but are not intended to limit the invention in any way. It should be noted that those skilled in the art can make various changes and modifications without departing from the spirit of the invention, all of which fall within the scope of the present invention.
The invention provides an intelligent coding method for Chinese surgical operations based on a word segmentation network, comprising the following steps:
Step 1: input a Chinese operation name.
Step 2: perform adaptation based on a pre-established dynamically expanded word segmentation network, extracting key information from the Chinese operation name in sequence to obtain a keyword combination consisting of six parts: site information, procedure-type information, approach information, etiology information, purpose information and other information. During adaptation, word segmentation proceeds from long to short and priority from high to low.
Step 2 further comprises the following steps 2.1 to 2.6 (shown in FIG. 1):
Step 2.1: after the Chinese operation name is input, match sites from long to short against the site word library in the word segmentation network, and combine the priority of each site in the pre-established site word sequence table to obtain the final site word; if there is no match, introduce the pre-established site synonym table and continue matching to obtain the final site word; if there is still no match, output "no match".
Step 2.2: match procedure types from long to short against the procedure-type word library in the word segmentation network, and combine the priority of each procedure type in the pre-established procedure-type word sequence table to obtain the final procedure-type word; if there is no match, introduce the pre-established procedure-type synonym table and continue matching to obtain the final procedure-type word.
Step 2.3: match approaches from long to short against the approach word library in the word segmentation network, and combine the priority of each approach in the pre-established approach word sequence table to obtain the final approach word; if there is no match, introduce the pre-established approach synonym table and continue matching to obtain the final approach word.
Step 2.4: match etiologies from long to short against the etiology word library in the word segmentation network, and combine the priority of each etiology in the pre-established etiology word sequence table to obtain the final etiology word; if there is no match, introduce the pre-established etiology synonym table and continue matching to obtain the final etiology word.
Step 2.5: match purposes from long to short against the purpose word library in the word segmentation network, and combine the priority of each purpose in the pre-established purpose word sequence table to obtain the final purpose word.
Step 2.6: match other words from long to short against the other word library in the word segmentation network, and combine the priority of each other word in the pre-established other word sequence table to obtain the final other word.
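The synonym fallback used in steps 2.1 to 2.4 can be illustrated with a small sketch. It assumes that a synonym table maps each alias to a canonical lexicon word and that the sequence table is already ordered from highest to lowest priority; the function name `match_with_synonyms` and the sample entries are hypothetical.

```python
from typing import Dict, List, Optional


def match_with_synonyms(
    name: str, sequence_table: List[str], synonyms: Dict[str, str]
) -> Optional[str]:
    """Match one category, falling back to the synonym table if needed."""
    # First pass: direct match in priority order (longest word first).
    for word in sequence_table:
        if word in name:
            return word
    # Second pass: try aliases, longest first, and return the canonical word.
    for alias in sorted(synonyms, key=len, reverse=True):
        if alias in name:
            return synonyms[alias]
    # For the site category the method would output "no match" at this point.
    return None


if __name__ == "__main__":
    site_table = ["阑尾", "胃"]            # hypothetical site sequence table
    site_synonyms = {"蚓突": "阑尾"}       # hypothetical alias -> canonical word
    print(match_with_synonyms("蚓突切除术", site_table, site_synonyms))
```

Steps 2.5 and 2.6 have no synonym table in the description, so for those categories the same function could be called with an empty dictionary.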
Step 3: match the keyword combination obtained in step 2 against the pre-constructed word combination and code mapping table. If a corresponding word combination path is found, the ICD code of the specified version corresponding to the operation name has been matched exactly and is output; if no record of such a word combination path is found, the fuzzy adaptation of the next step is required.
Step 4: word combinations left unmatched by exact matching are adapted further through fuzzy adaptation rules, and the ICD code corresponding to the word combination with the closest description is taken as the output result. The fuzzy adaptation rules comprise:
1) first, search the word segmentation network for all paths whose site part contains the matched "site word"; if none is found, search for all paths whose site part contains a matched "site synonym"; if still none is found, output "no match";
2) based on the paths resulting from the previous step, screen the paths whose procedure-type part contains the matched "procedure-type word"; if none is found, screen the paths containing a matched "procedure-type synonym"; if still none is found, proceed to the next step;
3) based on the paths resulting from the previous step, screen the paths whose approach part contains the matched "approach word"; if none is found, screen the paths containing a matched "approach synonym"; if still none is found, proceed to the next step;
4) based on the paths resulting from the previous step, screen the paths whose etiology part contains the matched "etiology word"; if none is found, screen the paths containing a matched "etiology synonym"; if still none is found, proceed to the next step;
5) based on the paths resulting from the previous step, screen the paths whose purpose part contains the matched "purpose word"; if none is found, screen the paths containing a matched "purpose synonym"; if still none is found, proceed to the next step;
6) based on the paths resulting from the previous step, screen the paths whose other part contains the matched "other word"; if none is found, the paths resulting from the previous step are the path L that meets the condition.
Path L is thus the path containing the most matched words, selected in the order site, procedure type, approach, etiology, purpose, other.
Secondly, the number of matched words in each corresponding part is calculated for the candidate paths; finally, if path L is empty, "no match" is output.
The specific implementation process of the exact matching and the fuzzy matching of step 3 and step 4 is shown in fig. 2.
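The sequential screening of the fuzzy adaptation rules can be sketched as a chain of filters. This is only an illustration under stated assumptions: paths are modelled as dictionaries keyed by category, the synonym lookups of rules 1) to 5) are omitted, a step that finds no matching path is assumed to carry the previous candidates forward unchanged, and the name `screen_paths` and the codes are hypothetical.

```python
from typing import Dict, List, Optional

CATEGORIES = ("site", "procedure", "approach", "etiology", "purpose", "other")
Path = Dict[str, str]


def screen_paths(paths: List[Path], phrase: Dict[str, Optional[str]]) -> List[Path]:
    """Narrow candidate paths category by category; [] means "no match"."""
    site = phrase.get("site")
    if site is None:
        return []
    # Rule 1: the site word is mandatory.
    current = [p for p in paths if p.get("site") == site]
    if not current:
        return []
    # Rules 2 to 6: optional narrowing in a fixed category order.
    for category in CATEGORIES[1:]:
        word = phrase.get(category)
        if word is None:
            continue
        narrowed = [p for p in current if p.get(category) == word]
        if narrowed:
            current = narrowed
    return current


if __name__ == "__main__":
    network = [
        {"site": "胆囊", "procedure": "切除术", "approach": "腹腔镜", "code": "X1"},
        {"site": "胆囊", "procedure": "造口术", "code": "X2"},
        {"site": "胃", "procedure": "切除术", "code": "X3"},
    ]
    query = {"site": "胆囊", "procedure": "切除术", "approach": "经腹"}
    print([p["code"] for p in screen_paths(network, query)])
```

Because earlier categories filter first, a site or procedure-type match always outranks a match on a later category, which is one reading of the ordering stated for path L.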
2. As shown in FIG. 3, establishing the dynamically expanded word segmentation network of step 2, the sequence tables of site words, procedure-type words, approach words, etiology words, purpose words and other words, and the synonym tables for site, procedure type, approach and etiology specifically includes the following steps:
First, ICD codes of different regions or different versions and the corresponding Chinese operation names are input as the training set; when the same Chinese name has different ICD codes, the record whose code ranks earlier is retained. Second, keywords corresponding to the six parts (site, procedure type, approach, etiology, purpose and other) are extracted from the operation names, and a site lexicon, procedure-type lexicon, approach lexicon, etiology lexicon, purpose lexicon and other lexicon are constructed respectively. Then, sequence tables of site words, procedure-type words, approach words, etiology words, purpose words and other words are constructed to generate a sorted forest, and a special sequence table is added to meet the need to specify special orders; the sequence tables specify the priorities of the site, procedure-type, approach, etiology, purpose and other keywords, and the earlier a word is ranked, the higher its priority. Finally, synonyms are extracted from phrases in the training set that have the same ICD operation code, forming a site synonym table, a procedure-type synonym table, an approach synonym table and an etiology synonym table. Not all phrases with the same code are generalized into synonyms: the screening strategy is to pair up the segmented words of each part across all phrases with the same code and count the pairs, and the higher the count, the higher the priority when matching.
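The pair-counting strategy for synonym candidates can be sketched as below. It assumes that each training record has already been segmented into a key phrase, that pairs are formed only among distinct words of the same category under the same code, and that the count is the number of codes under which a pair co-occurs. The function name `count_synonym_pairs` and the codes are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations
from typing import Dict, List, Optional, Set, Tuple

Record = Tuple[str, Dict[str, Optional[str]]]  # (ICD code, key phrase)


def count_synonym_pairs(records: List[Record], category: str) -> Counter:
    """Count how often two words of one category share the same code."""
    words_by_code: Dict[str, Set[str]] = defaultdict(set)
    for code, phrase in records:
        word = phrase.get(category)
        if word:
            words_by_code[code].add(word)

    counts: Counter = Counter()
    for words in words_by_code.values():
        # Sorting makes each unordered pair appear under a single key.
        for pair in combinations(sorted(words), 2):
            counts[pair] += 1
    return counts


if __name__ == "__main__":
    data = [
        ("C1", {"site": "阑尾"}),
        ("C1", {"site": "蚓突"}),
        ("C2", {"site": "阑尾"}),
        ("C2", {"site": "蚓突"}),
        ("C2", {"site": "盲肠"}),
    ]
    for pair, n in count_synonym_pairs(data, "site").most_common():
        print(pair, n)
```

Pairs with higher counts would be tried first during matching; a threshold for discarding low-count pairs is not given in the description and is left out here.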
3. Establishing the word combination and coding mapping table in the step 3, specifically comprising the following steps:
firstly, determining an ICD version of a surgical operation to be referred to; extracting records with the same operation names as the operation names of the specified versions in the training set, directly replacing the ICD codes in the training set with the ICD codes of the specified versions, and completing the mapping of the part of codes; matching the operation name of the specified version with the generated word segmentation network to complete automatic word segmentation and obtain a set S in the figure 4 if the training set is provided with records different from the diagnosis name of the specified version; by means of the first four digits of the ICD codes of the training set ICD codes and the ICD codes of the set S, a set with the same first four digits and the same formula is searched, the most similar appointed version records are searched in the set (wherein the most similar is defined as the most matched characters with the corresponding parts of the participle, namely the number of the matched characters, the access, the cause of disease, the purpose and the like are sequentially matched, the number of the matched characters and the number of the unmatched characters are counted, the most matched characters are the most similar records, and the most unmatched characters with the least number are selected as the most similar records when the matched characters are the same), and the ICD codes of the most similar appointed version records are used. The specific process is as shown in fig. 4, a set S1 with different first four-bit codes and a set S2 with the same first four-bit codes but without the same formula are obtained, and they will be used as input data for the next matching and continue to adapt.
Second, operation type matching is performed on set S2 using the generated operation type synonym table, and the most similar specified-version record is then searched for; the ICD code of that record is used as the mapping code. The process is shown in fig. 5.
Third, for the records that failed to match in the previous two steps, the set of specified-version records that share the first three digits of the ICD code with the training-set record and have the same operation type is found, and the most similar specified-version record is searched for in that set; the ICD code of that record is used as the mapping code. The process is shown in fig. 6. It yields a set S3 of records whose first three digits differ and a set S4 of records whose first three digits are the same but for which no record has the same operation type; both sets are used as input to the next matching stage.
Fourth, as shown in fig. 7, operation type matching is performed on set S4 using the generated operation type synonym table, and the most similar specified-version record is then searched for; the ICD code of that record is used as the mapping code.
Finally, training-set records that still cannot be matched are resolved manually, which completes the code mapping from all training-set records to the specified version.
The foregoing describes specific embodiments of the present invention. It is to be understood that the invention is not limited to these embodiments, and that those skilled in the art may make various changes or modifications within the scope of the appended claims without departing from the substance of the invention. The embodiments of the present application, and the features of those embodiments, may be combined with one another arbitrarily provided they do not conflict.

Claims (7)

1. An intelligent coding method for Chinese surgical operations based on a word segmentation network, characterized by comprising the following steps:
step 1, acquiring a Chinese surgical operation name;
step 2, adapting the Chinese surgical operation name to a dynamically expanded word segmentation network, extracting key information from the Chinese surgical operation name, and forming a key phrase from the extracted key information;
step 3, adapting the key phrase to a phrase code mapping table and outputting either an unmatched result or an ICD code;
the key information comprises site information, operation type information, approach information, cause information, purpose information and other information;
step 2 comprises:
matching the key information against the site information from long to short according to the site lexicon in the dynamically expanded word segmentation network, and combining the priority of each piece of site information in the site word order table to obtain site words;
matching the key information against the operation type information from long to short according to the operation type lexicon in the dynamically expanded word segmentation network, and combining the priority of each piece of operation type information in the operation type word order table to obtain operation type words;
matching the key information against the approach information from long to short according to the approach lexicon in the dynamically expanded word segmentation network, and combining the priority of each piece of approach information in the approach word order table to obtain approach words;
matching the key information against the cause information from long to short according to the cause lexicon in the dynamically expanded word segmentation network, and combining the priority of each piece of cause information in the cause word order table to obtain cause words;
matching the key information against the purpose information from long to short according to the purpose lexicon in the dynamically expanded word segmentation network, and combining the priority of each piece of purpose information in the purpose word order table to obtain purpose words;
matching the key information against the other information from long to short according to the other lexicon in the dynamically expanded word segmentation network, and combining the priority of each piece of other information in the other word order table to obtain other words;
the key phrase comprises at least site words, operation type words, approach words, cause words, purpose words and other words;
the adaptation in step 3 comprises:
step 3.1, first-level adaptation: exactly adapting the key phrase to the phrase code mapping table;
if the key phrase matches an entry in the phrase code mapping table, outputting the ICD code corresponding to the key phrase and ending the intelligent coding method;
if the key phrase does not match any entry in the phrase code mapping table, proceeding to step 3.2;
step 3.2, second-level adaptation: fuzzily adapting the key phrase to the phrase code mapping table.
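For illustration only, the long-to-short lexicon matching of step 2 resembles forward longest-match segmentation. The Python sketch below is one possible reading, not the claimed implementation: the function names, the single merged lookup table, and the rule that a word missing from the order table ranks last are all assumptions.

```python
def extract_keywords(name, lexicons):
    """Scan the name left to right, trying the longest lexicon word first.

    lexicons: dict mapping a category name to a set of words.
    Returns a dict mapping each category to the words found, in order.
    """
    word_to_category = {}
    for category, words in lexicons.items():
        for word in words:
            word_to_category.setdefault(word, category)
    max_len = max((len(w) for w in word_to_category), default=0)

    found = {category: [] for category in lexicons}
    i = 0
    while i < len(name):
        for length in range(min(max_len, len(name) - i), 0, -1):
            piece = name[i:i + length]
            category = word_to_category.get(piece)
            if category is not None:
                found[category].append(piece)
                i += length
                break
        else:
            i += 1  # no lexicon word starts here; skip one character
    return found


def pick_by_priority(words, order_table):
    """Choose the word that appears earliest in the order table."""
    rank = {word: position for position, word in enumerate(order_table)}
    return min(words, key=lambda w: rank.get(w, len(rank)), default=None)


# Hypothetical lexicons and order table.
lexicons = {
    "site": {"胆囊", "胆"},
    "operation_type": {"切除术", "切除"},
    "approach": {"腹腔镜"},
}
found = extract_keywords("腹腔镜胆囊切除术", lexicons)
chosen_site = pick_by_priority(["肝", "胆囊"], ["胆囊", "肝"])
```

Trying longer words first is what keeps "胆囊" from being split into the shorter lexicon entry "胆" plus an unknown character.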
2. The intelligent coding method for Chinese surgical operations based on a word segmentation network according to claim 1, wherein the second-level adaptation of step 3.2 comprises:
step 3.2.1, searching the dynamically expanded word segmentation network for the path containing the most matched keywords;
step 3.2.2, counting how many of the remaining keywords in the path match the key information of the input Chinese surgical operation name, and selecting a matching path; wherein
the remaining keywords are the keywords in the key phrase that have not been matched.
3. The intelligent coding method for Chinese surgical operations based on a word segmentation network according to claim 2, wherein step 3.2.1 comprises:
step 3.2.1.1, screening out of the dynamically expanded word segmentation network all first screening paths that contain the matched site word; if the key phrase contains no site word, outputting an unmatched result and ending the intelligent coding method;
step 3.2.1.2, screening out of the first screening paths the second screening paths that contain matched operation type words, approach words, cause words, purpose words and other words;
the path is the one, among all second screening paths, that contains the most keywords matched on operation type words, approach words, cause words, purpose words and other words.
4. The intelligent coding method for Chinese surgical operations based on a word segmentation network according to claim 3, wherein in step 3.2.2, when several paths have the same number of matched keywords, the path with the fewest unmatched keywords is selected as the matching path, and the ICD code corresponding to the matching path is obtained from the phrase code mapping table and output.
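As a non-authoritative illustration of the two-level adaptation in claims 1 to 4, the following Python sketch treats the phrase code mapping table as a dictionary keyed by a six-part tuple and treats each key as a path. That data layout, the names `phrase_key` and `encode`, and the placeholder codes are assumptions made for the example.

```python
CATEGORIES = ("site", "operation_type", "approach", "cause", "purpose", "other")
UNMATCHED = None  # stands for the "unmatched result"


def phrase_key(words):
    """Order the key phrase as a fixed tuple so it can be used as a dict key."""
    return tuple(words.get(category) for category in CATEGORIES)


def encode(words, mapping):
    """Exact lookup first, then fuzzy selection among paths with the same site."""
    key = phrase_key(words)
    if key in mapping:  # first-level (exact) adaptation
        return mapping[key]

    site = words.get("site")
    if site is None:  # no site word: fuzzy adaptation is not attempted
        return UNMATCHED

    best_code, best_score = UNMATCHED, None
    for path, code in mapping.items():
        if path[0] != site:
            continue  # keep only paths containing the matched site word
        matched = unmatched = 0
        for q, p in zip(key[1:], path[1:]):
            if q is None and p is None:
                continue
            if q == p:
                matched += 1
            else:
                unmatched += 1
        score = (matched, -unmatched)  # most matched, then fewest unmatched
        if best_score is None or score > best_score:
            best_code, best_score = code, score
    return best_code


# Hypothetical mapping table with placeholder codes.
mapping = {
    ("胆囊", "切除术", "腹腔镜", None, None, None): "CODE_A",
    ("胆囊", "切除术", None, None, None, None): "CODE_B",
    ("阑尾", "切除术", None, None, None, None): "CODE_C",
}
exact = encode({"site": "胆囊", "operation_type": "切除术"}, mapping)
fuzzy = encode({"site": "胆囊", "operation_type": "切除术",
                "approach": "腹腔镜", "purpose": "活检"}, mapping)
no_site = encode({"operation_type": "切除术"}, mapping)
```

The sketch returns a code whenever some path shares the site word, even with no other keyword in common; the claims do not state a minimum, so a real system might add one.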
5. The intelligent coding method for Chinese surgical operations based on a word segmentation network according to claim 4, wherein creating the dynamically expanded word segmentation network comprises the following steps:
taking ICD codes of different regions or different versions, together with the corresponding Chinese surgical operation names, as training-set input; if the same Chinese surgical operation name corresponds to different ICD codes, retaining the ICD code whose first character is smaller; if the characters at the current position are the same, comparing the next character, and so on up to the last character.
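The character-by-character comparison described in this claim behaves like ordinary lexicographic string ordering. A minimal Python sketch, assuming codes are plain strings and using made-up names and codes:

```python
def deduplicate(records):
    """Keep, for each operation name, the code that sorts first.

    records: iterable of (operation_name, icd_code) pairs.
    Python compares strings character by character from the left,
    which mirrors the rule of comparing the first character, then the next.
    """
    kept = {}
    for name, code in records:
        if name not in kept or code < kept[name]:
            kept[name] = code
    return kept


# Illustrative strings only; not verified ICD assignments.
records = [
    ("胆囊切除术", "51.23"),
    ("胆囊切除术", "51.22"),
    ("阑尾切除术", "47.09"),
]
training_set = deduplicate(records)
```

When one code is a prefix of another (for example "51.2" and "51.22"), Python treats the shorter one as smaller; the claim does not say how that case should be handled.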
6. The intelligent coding method for Chinese surgical operations based on a word segmentation network according to claim 5, wherein creating the phrase code mapping table comprises the following steps:
determining the ICD version to be used as the reference;
extracting the training-set records whose Chinese surgical operation names are identical to those in the reference ICD version, and replacing the ICD codes of those records with the ICD codes of the reference ICD version;
extracting the training-set records whose Chinese surgical operation names differ from those in the reference ICD version, matching the Chinese surgical operation names of the reference ICD version against the generated dynamically expanded word segmentation network, and taking the ICD code of the most similar record of the reference ICD version as the mapping code; wherein
the most similar record of the reference ICD version is the record with the most keywords matched against the corresponding parts of the dynamically expanded word segmentation network or, when the number of matched keywords is the same, the record with the fewest unmatched keywords.
7. A system for the intelligent coding method for Chinese surgical operations based on a word segmentation network, comprising a computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the intelligent coding method for Chinese surgical operations based on a word segmentation network according to any one of claims 1 to 6.
CN201711350705.6A 2017-12-15 2017-12-15 Intelligent coding method and system for Chinese surgical operation based on word segmentation network Active CN108182207B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711350705.6A CN108182207B (en) 2017-12-15 2017-12-15 Intelligent coding method and system for Chinese surgical operation based on word segmentation network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711350705.6A CN108182207B (en) 2017-12-15 2017-12-15 Intelligent coding method and system for Chinese surgical operation based on word segmentation network

Publications (2)

Publication Number Publication Date
CN108182207A CN108182207A (en) 2018-06-19
CN108182207B true CN108182207B (en) 2020-11-13

Family

ID=62546087

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711350705.6A Active CN108182207B (en) 2017-12-15 2017-12-15 Intelligent coding method and system for Chinese surgical operation based on word segmentation network

Country Status (1)

Country Link
CN (1) CN108182207B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109522552B (en) * 2018-11-09 2023-08-29 天津开心生活科技有限公司 Normalization method and device of medical information, medium and electronic equipment
CN109785959A (en) * 2018-12-14 2019-05-21 平安医疗健康管理股份有限公司 A kind of disease code method and apparatus
CN110032715A (en) * 2019-03-21 2019-07-19 武汉金豆医疗数据科技有限公司 A kind of method of disease code conversion
CN111128388B (en) * 2019-12-03 2024-02-27 东软集团股份有限公司 Value range data matching method and device and related products
CN111695336A (en) * 2020-04-26 2020-09-22 平安科技(深圳)有限公司 Disease name code matching method and device, computer equipment and storage medium
CN112131867A (en) * 2020-09-22 2020-12-25 上海亿普医药科技有限公司 Clinical trial medical coding system
CN112632910A (en) * 2020-12-21 2021-04-09 北京惠及智医科技有限公司 Operation encoding method, electronic device and storage device
CN112700825B (en) * 2020-12-30 2024-03-05 杭州依图医疗技术有限公司 Medical data processing method, device and storage medium
CN112802566A (en) * 2020-12-31 2021-05-14 医渡云(北京)技术有限公司 Method and device for encoding electronic medical record
CN114155968A (en) * 2021-04-29 2022-03-08 深圳市康比特信息技术有限公司 Method for establishing mapping relation, and method and equipment for auditing surgical operation
CN115017326B (en) * 2022-05-12 2023-08-18 青岛普瑞盛医药科技有限公司 Medical coding method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6915254B1 (en) * 1998-07-30 2005-07-05 A-Life Medical, Inc. Automatically assigning medical codes using natural language processing
CN104156415A (en) * 2014-07-31 2014-11-19 沈阳锐易特软件技术有限公司 Mapping processing system and method for solving problem of standard code control of medical data
CN105069123A (en) * 2015-08-13 2015-11-18 易保互联医疗信息科技(北京)有限公司 Automatic coding method and system for Chinese surgical operation information
CN105574103A (en) * 2015-12-11 2016-05-11 浙江大学 Method and system for automatically establishing medical term mapping relationship based on word segmentation and coding
CN106202955A (en) * 2016-07-19 2016-12-07 中电科软件信息服务有限公司 Diagnosis associated packets method and system based on intellectual coded adaptation

Also Published As

Publication number Publication date
CN108182207A (en) 2018-06-19

Similar Documents

Publication Publication Date Title
CN108182207B (en) Intelligent coding method and system for Chinese surgical operation based on word segmentation network
CN108182972B (en) Intelligent coding method and system for Chinese disease diagnosis based on word segmentation network
Hu et al. Improved lexically constrained decoding for translation and monolingual rewriting
US9858270B2 (en) Converting data into natural language form
CN105069124B (en) A kind of International Classification of Diseases coding method of automation and system
CN109344250B (en) Rapid structuring method of single disease diagnosis information based on medical insurance data
CN110717034A (en) Ontology construction method and device
US20140351228A1 (en) Dialog system, redundant message removal method and redundant message removal program
CN112650840A (en) Intelligent medical question-answering processing method and system based on knowledge graph reasoning
CN110516260A (en) Entity recommended method, device, storage medium and equipment
CN108922633A (en) A kind of disease name standard convention method and canonical system
CN111475623A (en) Case information semantic retrieval method and device based on knowledge graph
WO2021208444A1 (en) Method and apparatus for automatically generating electronic cases, a device, and a storage medium
CA2853627C (en) Automatic creation of clinical study reports
CN110929498B (en) Method and device for calculating similarity of short text and readable storage medium
US11645447B2 (en) Encoding textual information for text analysis
CN112328800A (en) System and method for automatically generating programming specification question answers
CN115983233B (en) Electronic medical record duplicate checking rate estimation method based on data stream matching
CN114625748A (en) SQL query statement generation method and device, electronic equipment and readable storage medium
CN117854715B (en) Intelligent diagnosis assisting system based on inquiry analysis
JP6867963B2 (en) Summary Evaluation device, method, program, and storage medium
Shah et al. Improvement of Soundex algorithm for Indian language based on phonetic matching
CN116521837A (en) Map question-answering method, system and computer readable medium based on context semantic retrieval
CN115203206A (en) Data content searching method and device, computer equipment and readable storage medium
CN114676258A (en) Disease classification intelligent service method based on patient symptom description text

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20190422

Address after: Room 501-503, 43 Block 1485, Jialuo Road, Jiading District, Shanghai, 201800

Applicant after: CETC Software Information Services Co., Ltd.

Applicant after: Shanghai Changjiang science and Technology Development Co Ltd

Address before: Room 106-7, 50 Jiling Road, Jing'an District, Shanghai, 2003

Applicant before: Shanghai Changjiang science and Technology Development Co Ltd

GR01 Patent grant