CN111739601A - Normalization method, device and readable medium for non-standard disease names - Google Patents

Publication number
CN111739601A
Authority
CN
China
Prior art keywords
disease
standard
determining
disease name
standard disease
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010594595.3A
Other languages
Chinese (zh)
Other versions
CN111739601B (en)
Inventor
刘文丽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Health Medical Big Data Co ltd
Original Assignee
Shandong Health Medical Big Data Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Health Medical Big Data Co ltd
Priority to CN202010594595.3A
Publication of CN111739601A
Application granted
Publication of CN111739601B
Active

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60 ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/103 Formatting, i.e. changing of presentation of documents
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H70/00 ICT specially adapted for the handling or processing of medical references

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Public Health (AREA)
  • Primary Health Care (AREA)
  • Medical Informatics (AREA)
  • Epidemiology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a normalization method, device and readable medium for non-standard disease names. The method comprises: acquiring a non-standard disease name; determining first identification information of the non-standard disease name according to its spoken (colloquial) disease type; determining a disease name to be normalized from the first identification information and a first body part feature word; performing the following steps for each standard disease name in the ICD version to be referred to: determining second identification information of a target standard disease name according to its standard disease type, and determining an intermediate standard disease name of the target standard disease name from the second identification information and a second body part feature word; calculating the distance between the disease name to be normalized and each intermediate standard disease name according to a set calculation rule; and determining, in the ICD version to be referred to, the standard disease name corresponding to the intermediate standard disease name with the minimum distance. The scheme of the invention can accurately normalize non-standard disease names.

Description

Normalization method, device and readable medium for non-standard disease names
Technical Field
The invention relates to the technical field of medical informatization, in particular to a normalization method and device for non-standard disease names and a readable medium.
Background
In China, electronic medical records store a large number of diagnosis names (i.e. disease names), most of which are non-standard spoken (colloquial) disease names (such as lung cancer, senile dementia and the like). The International Classification of Diseases (ICD) classifies diseases according to their etiology, anatomical site, clinical manifestations and pathology. The most widely used ICD version worldwide is ICD-10, published by the World Health Organization (WHO) in 1992; countries and regions can extend ICD-10 into localized versions as needed. Normalizing non-standard disease names to such a standard is therefore a pressing problem.
At present, the industry mainly uses two approaches: building a disease normalization library manually, and building one with computer assistance. The former requires dedicated personnel for long-term maintenance, so labor costs are high. The latter generally judges word similarity by computing the Euclidean distance between word vectors after word vectorization (i.e. word2vec); however, in disease name normalization most disease names are standalone nouns that lack context, so word vectorization with word2vec cannot be applied.
To address the problem of the latter approach, word similarity is currently judged with a conventional edit distance. For example, suppose the input non-standard disease name is "lung cancer" and the disease normalization library stores "pneumonia" and "lung malignancy". By edit distance, the distance between "lung cancer" and "pneumonia" is 1 (one Chinese character differs), and the distance between "lung cancer" and "lung malignancy" is 4 (four Chinese characters differ). The name with the smaller distance, "pneumonia", is output, so the non-standard disease name "lung cancer" is normalized to the standard disease name "pneumonia". This is incorrect: the correct output is the standard disease name "lung malignancy". Therefore, a method that relies on edit distance alone still cannot accurately normalize non-standard disease names.
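The failure case above can be reproduced with a plain character-level Levenshtein distance. The sketch below is illustrative only: the Chinese strings are assumed originals of the translated terms ("lung cancer", "pneumonia", "lung malignancy"), and the function and variable names are not from the patent.

```python
def edit_distance(a: str, b: str) -> int:
    """Character-level Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Assumed Chinese originals of the translated example terms.
query = "肺癌"                    # "lung cancer"
library = ["肺炎", "肺恶性肿瘤"]   # "pneumonia", "lung malignancy"

distances = {name: edit_distance(query, name) for name in library}
nearest = min(library, key=lambda name: distances[name])
# distances is {"肺炎": 1, "肺恶性肿瘤": 4}; nearest is "肺炎" (the wrong answer)
```

Under these assumptions the nearest entry is "pneumonia", which is the incorrect normalization described in the text.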
Disclosure of Invention
The embodiment of the invention provides a normalization method, a normalization device and a readable medium of non-standard disease names, which can accurately normalize the non-standard disease names.
In a first aspect, embodiments of the present invention provide a method for normalizing non-standard disease names, including:
acquiring a non-standard disease name;
judging whether the non-standard disease name comprises a first body part characteristic word and a spoken language disease type;
if so, determining first identification information of the non-standard disease name according to the spoken language disease type, wherein the first identification information is used for indicating a standard disease type corresponding to the non-standard disease name in the ICD version to be referred;
determining the name of the disease to be normalized of the non-standard disease name according to the first identification information and the first body part feature word;
performing the following steps for each standard disease name in the ICD version to be referred to:
S1, if the standard disease name comprises the second body part feature word and the standard disease type, determining the standard disease name as a target standard disease name;
S2, determining second identification information of the target standard disease name according to the standard disease type, wherein the second identification information is used for indicating the standard disease type corresponding to the target standard disease name in the ICD version to be referred to;
S3, determining an intermediate standard disease name of the target standard disease name according to the second identification information and the second body part feature word;
calculating the distance between the disease name to be normalized and each intermediate standard disease name according to a set calculation rule, and obtaining the intermediate standard disease name with the minimum distance from the disease name to be normalized;
and determining a standard disease name corresponding to the intermediate standard disease name in the ICD version to be referred according to the intermediate standard disease name with the minimum distance from the disease name to be normalized.
In one possible design, before the determining whether the non-standard disease name includes the first body part feature word and the spoken disease type, the method further includes:
classifying each standard disease name in the ICD version to be referred according to a standard disease type, and forming a plurality of first disease groups arranged according to a set sequence;
according to a plurality of first disease groups arranged in a set sequence, determining a spoken disease type corresponding to a standard disease type in each first disease group, and determining the first disease group containing the spoken disease type as a second disease group;
the determining the first identification information of the non-standard disease name according to the spoken language disease type comprises:
determining a second disease group containing the spoken disease type according to the spoken disease type, and determining the position of the second disease group in a plurality of second disease groups;
determining first identification information of the non-standard disease name according to the positions of the second disease groups in a plurality of second disease groups;
the determining of the second identification information of the target standard disease name according to the standard disease type includes:
according to the standard disease type, determining a second disease group containing the standard disease type, and determining the position of the second disease group in a plurality of second disease groups;
and determining second identification information of the target standard disease name according to the positions of the second disease groups in a plurality of second disease groups.
In one possible design, the set calculation rule includes:
judging whether the positions of the second disease groups corresponding to the non-standard disease names in the plurality of second disease groups are the same as the positions of the second disease groups corresponding to the target standard disease names in the plurality of second disease groups, if so, determining that the first distance is 0, and if not, determining that the first distance is 2;
judging whether the first body part feature words and the second body part feature words are the same or not, if so, determining that a second distance is 0, and if not, determining that the second distance is 1;
adding the first distance and the second distance.
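Stated compactly, the rule above can be written as follows. The symbols are introduced here only for illustration and do not appear in the patent: $p$ is the position of a second disease group, $w$ is a body part feature word, and the subscripts $q$ and $s$ refer to the non-standard disease name and the target standard disease name.

```latex
d(q, s) = d_1 + d_2, \qquad
d_1 = \begin{cases} 0, & p_q = p_s \\ 2, & p_q \neq p_s \end{cases}, \qquad
d_2 = \begin{cases} 0, & w_q = w_s \\ 1, & w_q \neq w_s \end{cases}
```

The distance therefore takes one of the values 0, 1, 2 or 3.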
In one possible design, the determining a standard disease name corresponding to the intermediate standard disease name in the ICD version to be referred to according to the intermediate standard disease name having the smallest distance from the disease name to be normalized includes:
storing the target standard disease name corresponding to the intermediate standard disease name with the minimum distance from the disease name to be normalized, thereby forming a non-standard ICD version;
determining a first disease name mapping relation, wherein the first disease name mapping relation is used for representing a one-to-one correspondence relation between a target standard disease name in the non-standard ICD version and a standard disease name corresponding to the target standard disease name in the ICD version to be referred to;
and determining a standard disease name corresponding to the intermediate standard disease name in the ICD version to be referred to according to the first disease name mapping relation.
In one possible design, after the determining whether the non-standard disease name includes the first body part feature word and the spoken disease type, the method further includes:
if not, storing the non-standard disease name into the non-standard ICD version;
and establishing a second disease name mapping relation between the non-standard disease name and a standard disease name corresponding to the non-standard disease name in the ICD version to be referred according to the non-standard disease name.
In a second aspect, an embodiment of the present invention provides a normalization apparatus for non-standard disease names, including:
the acquisition module is used for acquiring non-standard disease names;
the judging module is used for judging whether the non-standard disease names comprise first body part characteristic words and spoken language disease types or not;
if so, determining first identification information of the non-standard disease name according to the spoken language disease type, wherein the first identification information is used for indicating a standard disease type corresponding to the non-standard disease name in the ICD version to be referred;
the first determining module is used for determining the name of the disease to be normalized of the non-standard disease name according to the first identification information and the first body part feature word;
a loop module for performing the following steps for each standard disease name in the ICD version to be referred to:
S1, if the standard disease name comprises the second body part feature word and the standard disease type, determining the standard disease name as a target standard disease name;
S2, determining second identification information of the target standard disease name according to the standard disease type, wherein the second identification information is used for indicating the standard disease type corresponding to the target standard disease name in the ICD version to be referred to;
S3, determining an intermediate standard disease name of the target standard disease name according to the second identification information and the second body part feature word;
the calculation module is used for calculating the distance between the disease name to be normalized and each intermediate standard disease name according to a set calculation rule, and obtaining the intermediate standard disease name with the minimum distance from the disease name to be normalized;
and a second determining module, configured to determine, according to the intermediate standard disease name having the smallest distance from the disease name to be normalized, a standard disease name corresponding to the intermediate standard disease name in the ICD version to be referred to.
In one possible design, further comprising:
the classification module is used for classifying each standard disease name in the ICD version to be referred according to a standard disease type and forming a plurality of first disease groups arranged according to a set sequence;
the third determining module is used for determining the spoken language disease type corresponding to the standard disease type in each first disease group according to a plurality of first disease groups arranged in a set sequence, and determining the first disease group containing the spoken language disease type as a second disease group;
the judging module is further configured to:
determining a second disease group containing the spoken disease type according to the spoken disease type, and determining the position of the second disease group in a plurality of second disease groups;
determining first identification information of the non-standard disease name according to the positions of the second disease groups in a plurality of second disease groups;
the loop module is further configured to:
according to the standard disease type, determining a second disease group containing the standard disease type, and determining the position of the second disease group in a plurality of second disease groups;
and determining second identification information of the target standard disease name according to the positions of the second disease groups in a plurality of second disease groups.
In one possible design, the set calculation rule includes:
judging whether the positions of the second disease groups corresponding to the non-standard disease names in the plurality of second disease groups are the same as the positions of the second disease groups corresponding to the target standard disease names in the plurality of second disease groups, if so, determining that the first distance is 0, and if not, determining that the first distance is 2;
judging whether the first body part feature words and the second body part feature words are the same or not, if so, determining that a second distance is 0, and if not, determining that the second distance is 1;
adding the first distance and the second distance.
In a third aspect, an embodiment of the present invention provides a normalization apparatus for non-standard disease names, including: at least one memory and at least one processor;
the at least one memory to store a machine readable program;
the at least one processor is configured to invoke the machine-readable program to perform the method described above.
In a fourth aspect, embodiments of the present invention provide a computer-readable medium having stored thereon computer instructions, which, when executed by a processor, cause the processor to perform the method described above.
According to the scheme, the normalization method determines first identification information of the non-standard disease name according to the spoken disease type contained in it, and then determines the disease name to be normalized according to the first identification information and the first body part feature word. For each target standard disease name in the ICD version to be referred to, the method determines second identification information according to the standard disease type contained in that name, and determines an intermediate standard disease name according to the second identification information and the second body part feature word. The distance between the disease name to be normalized and each intermediate standard disease name is then calculated according to a set calculation rule, the intermediate standard disease name with the minimum distance is obtained, and the corresponding standard disease name in the ICD version to be referred to is determined. In this way, the large errors caused by judging word similarity with a conventional edit distance are avoided, and non-standard disease names can be accurately normalized.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a flow chart of a method for normalizing non-standard disease names provided by one embodiment of the present invention;
FIG. 2 is a flow chart of a method for normalizing non-standard disease names provided by another embodiment of the present invention;
FIG. 3 is a schematic diagram of a device for normalizing non-standard disease names according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a normalization apparatus for non-standard disease names provided by an embodiment of the present invention;
FIG. 5 is a schematic diagram of a normalization apparatus for non-standard disease names according to another embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer and more complete, the technical solutions in the embodiments of the present invention will be described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention, and based on the embodiments of the present invention, all other embodiments obtained by a person of ordinary skill in the art without creative efforts belong to the scope of the present invention.
FIG. 1 is a flow chart of a method for normalizing non-standard disease names provided by one embodiment of the present invention. As shown in FIG. 1, the method may include the steps of:
step 101, acquiring a non-standard disease name;
step 102, judging whether the non-standard disease name comprises a first body part characteristic word and a spoken language disease type;
if so, determining first identification information of the non-standard disease name according to the spoken language disease type, wherein the first identification information is used for indicating a standard disease type corresponding to the non-standard disease name in the ICD version to be referred;
step 103, determining a disease name to be normalized of the non-standard disease name according to the first identification information and the first body part feature word;
step 104, executing the following steps for each standard disease name in the ICD version to be referred to:
S1, if the standard disease name comprises the second body part feature word and the standard disease type, determining the standard disease name as a target standard disease name;
S2, determining second identification information of the target standard disease name according to the standard disease type, wherein the second identification information is used for indicating the standard disease type corresponding to the target standard disease name in the ICD version to be referred to;
S3, determining an intermediate standard disease name of the target standard disease name according to the second identification information and the second body part feature word;
step 105, calculating the distance between the disease name to be normalized and each intermediate standard disease name according to a set calculation rule, and obtaining the intermediate standard disease name with the minimum distance from the disease name to be normalized;
and step 106, determining a standard disease name corresponding to the intermediate standard disease name in the ICD version to be referred to according to the intermediate standard disease name with the minimum distance to the disease name to be normalized.
In the embodiment of the invention, the normalization method determines first identification information of the non-standard disease name according to the spoken disease type contained in it, and then determines the disease name to be normalized according to the first identification information and the first body part feature word. For each target standard disease name in the ICD version to be referred to, the method determines second identification information according to the standard disease type contained in that name, and determines an intermediate standard disease name according to the second identification information and the second body part feature word. The distance between the disease name to be normalized and each intermediate standard disease name is then calculated according to a set calculation rule, the intermediate standard disease name with the minimum distance is obtained, and the corresponding standard disease name in the ICD version to be referred to is determined. In this way, the large errors caused by judging word similarity with a conventional edit distance are avoided, and non-standard disease names can be accurately normalized.
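Steps 101 to 106 can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the vocabularies use English translations, feature words are found by simple substring search, "TB" and "lung tuberculosis" are hypothetical entries, the distance weights (2 for a type mismatch, 1 for a body part mismatch) follow the calculation rule given in the disclosure, and all function names are invented for the example.

```python
# Illustrative vocabularies (assumptions, using English translations).
BODY_PARTS = ["lung", "brain", "chest"]
STANDARD_TYPES = ["malignant tumor", "tuberculosis"]  # groups in a set order
SPOKEN_TO_STANDARD = {"cancer": "malignant tumor", "TB": "tuberculosis"}
ICD_NAMES = ["brain malignant tumor", "lung malignant tumor",
             "chest malignant tumor", "lung tuberculosis"]


def find_first(text, words):
    """Return the first word that occurs in text, or None."""
    return next((w for w in words if w in text), None)


def encode_query(name):
    """Steps 102-103: (type position, body part) for a non-standard name."""
    part = find_first(name, BODY_PARTS)
    spoken = find_first(name, list(SPOKEN_TO_STANDARD))
    if part is None or spoken is None:
        return None
    return (STANDARD_TYPES.index(SPOKEN_TO_STANDARD[spoken]), part)


def encode_standard(name):
    """Step 104 (S1-S3): (type position, body part) for a standard name."""
    part = find_first(name, BODY_PARTS)
    std_type = find_first(name, STANDARD_TYPES)
    if part is None or std_type is None:
        return None
    return (STANDARD_TYPES.index(std_type), part)


def distance(q, s):
    """Step 105: 2 for a type mismatch plus 1 for a body part mismatch."""
    return (0 if q[0] == s[0] else 2) + (0 if q[1] == s[1] else 1)


def normalize(name):
    """Step 106: standard name whose encoding is closest to the query."""
    q = encode_query(name)
    if q is None:
        return None  # no body part word or spoken type: handled separately
    candidates = []
    for std in ICD_NAMES:
        enc = encode_standard(std)
        if enc is not None:
            candidates.append((distance(q, enc), std))
    if not candidates:
        return None
    return min(candidates, key=lambda c: c[0])[1]
```

With these assumed vocabularies, `normalize("lung cancer")` returns `"lung malignant tumor"`, and `normalize("leukemia")` returns `None` because the name contains neither a body part feature word nor a spoken disease type.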
Based on the normalization method for non-standard disease names shown in fig. 1, in an embodiment of the present invention, before the determining whether the non-standard disease names include the first body part feature words and the spoken language disease types, the method further includes:
classifying each standard disease name in the ICD version to be referred according to a standard disease type, and forming a plurality of first disease groups arranged according to a set sequence;
according to a plurality of first disease groups arranged in a set sequence, determining a spoken disease type corresponding to a standard disease type in each first disease group, and determining the first disease group containing the spoken disease type as a second disease group;
the determining the first identification information of the non-standard disease name according to the spoken language disease type comprises:
determining a second disease group containing the spoken disease type according to the spoken disease type, and determining the position of the second disease group in a plurality of second disease groups;
determining first identification information of the non-standard disease name according to the positions of the second disease groups in a plurality of second disease groups;
the determining of the second identification information of the target standard disease name according to the standard disease type includes:
according to the standard disease type, determining a second disease group containing the standard disease type, and determining the position of the second disease group in a plurality of second disease groups;
and determining second identification information of the target standard disease name according to the positions of the second disease groups in a plurality of second disease groups.
In the embodiment of the invention, the first identification information of the non-standard disease name is obtained by finding the second disease group that contains the spoken disease type and determining the position of that group among the plurality of second disease groups. Likewise, the second identification information of the target standard disease name is obtained by finding the second disease group that contains the standard disease type and determining its position among the plurality of second disease groups. The disease name to be normalized and the intermediate standard disease names determined in this way can be compared in a manner that is more reasonable and consistent with the normalization logic: the standard disease name output from the ICD version to be referred to is the one whose intermediate standard disease name has the smallest calculated distance from the disease name to be normalized, so that non-standard disease names can be accurately normalized.
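One way to picture the identification information is as an index into the ordered list of second disease groups. The sketch below is an assumption made for illustration: the group contents, the spoken type "TB" and the name `position_of` are hypothetical, and English translations stand in for the original terms.

```python
# Hypothetical second disease groups in a set order. Each entry pairs a
# standard disease type with the spoken (colloquial) types mapped to it.
SECOND_GROUPS = [
    ("malignant tumor", ("cancer",)),
    ("tuberculosis", ("TB",)),
]


def position_of(disease_type):
    """Position of the second disease group containing the given type.

    Accepts either a standard or a spoken disease type; returns None when
    no group contains it.
    """
    for pos, (standard, spoken) in enumerate(SECOND_GROUPS):
        if disease_type == standard or disease_type in spoken:
            return pos
    return None


first_id = position_of("cancer")            # from the non-standard name
second_id = position_of("malignant tumor")  # from the target standard name
```

Because the spoken type and its standard type live in the same group, both lookups yield the same position, which is what makes the two encodings directly comparable.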
Based on the normalization method of the non-standard disease names shown in FIG. 1, in an embodiment of the present invention, the set calculation rule includes:
judging whether the positions of the second disease groups corresponding to the non-standard disease names in the plurality of second disease groups are the same as the positions of the second disease groups corresponding to the target standard disease names in the plurality of second disease groups, if so, determining that the first distance is 0, and if not, determining that the first distance is 2;
judging whether the first body part feature words and the second body part feature words are the same or not, if so, determining that a second distance is 0, and if not, determining that the second distance is 1;
adding the first distance and the second distance.
In the embodiment of the present invention, under the set calculation rule, the target standard disease name that correctly corresponds to the non-standard disease name has a first distance of 0 and a second distance of 0, so the correct standard disease name is output. Conversely, if either the first distance or the second distance is not 0, the output standard disease name may not correctly correspond to the non-standard disease name.
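The four possible outcomes of the rule can be enumerated with a small sketch. The function name and the sample positions and body part words are illustrative assumptions; only the weights 2 and 1 come from the text.

```python
def rule_distance(query_pos, query_part, target_pos, target_part):
    """First distance (group position) plus second distance (body part)."""
    first = 0 if query_pos == target_pos else 2
    second = 0 if query_part == target_part else 1
    return first + second


# Query encoded as (group position 0, body part "lung").
same_type_same_part = rule_distance(0, "lung", 0, "lung")    # 0
same_type_other_part = rule_distance(0, "lung", 0, "brain")  # 1
other_type_same_part = rule_distance(0, "lung", 1, "lung")   # 2
other_type_other_part = rule_distance(0, "lung", 1, "brain") # 3
```

Since a disease type mismatch costs more than a body part mismatch, a candidate with the right type but a different body part ranks ahead of one with the right body part but a different type.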
Based on the normalization method of non-standard disease names shown in fig. 1, in an embodiment of the present invention, the determining a standard disease name corresponding to an intermediate standard disease name in the ICD version to be referred to according to the intermediate standard disease name having the smallest distance from the disease name to be normalized includes:
storing the target standard disease name corresponding to the intermediate standard disease name with the minimum distance from the disease name to be normalized, thereby forming a non-standard ICD version;
determining a first disease name mapping relation, wherein the first disease name mapping relation is used for representing a one-to-one correspondence relation between a target standard disease name in the non-standard ICD version and a standard disease name in the ICD version to be referred to;
and determining a standard disease name corresponding to the intermediate standard disease name in the ICD version to be referred to according to the first disease name mapping relation.
In the embodiment of the present invention, the non-standard ICD version can be run in parallel with an existing ICD version (not the ICD version to be referred to above), which provides quality control over the normalization performed with the existing ICD version. For example, suppose the input non-standard disease name is "lung cancer" and the non-standard ICD version outputs "lung malignancy". If the existing ICD version also outputs "lung malignancy", its output for the non-standard disease name "lung cancer" is shown to be correct; if it outputs "pneumonia", its output for "lung cancer" is shown to be incorrect.
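The quality-control comparison can be sketched as a check of two lookup tables against each other. The dictionaries and the function name below are hypothetical stand-ins for the non-standard ICD version and the existing ICD version; the text does not prescribe a data structure.

```python
def quality_check(names, reference_lookup, existing_lookup):
    """List (name, existing output, reference output) where the two differ."""
    mismatches = []
    for name in names:
        expected = reference_lookup.get(name)
        actual = existing_lookup.get(name)
        if expected is not None and actual != expected:
            mismatches.append((name, actual, expected))
    return mismatches


# Hypothetical outputs of the two versions for the example in the text.
non_standard_icd = {"lung cancer": "lung malignancy"}
existing_icd = {"lung cancer": "pneumonia"}

report = quality_check(["lung cancer"], non_standard_icd, existing_icd)
# report is [("lung cancer", "pneumonia", "lung malignancy")]
```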
Based on the normalization method for non-standard disease names shown in fig. 1, in an embodiment of the present invention, after the determining whether the non-standard disease names include the first body part feature words and the spoken disease types, the method further includes:
if not, storing the non-standard disease name into the non-standard ICD version;
and establishing a second disease name mapping relation between the non-standard disease name and a standard disease name corresponding to the non-standard disease name in the ICD version to be referred according to the non-standard disease name.
In the embodiment of the present invention, if the non-standard disease name does not include the first body part feature word and the spoken disease type, the distance between the non-standard disease name and the standard disease name may be large, so the name cannot be accurately normalized by the distance calculation. Such non-standard disease names (for example, leukemia, senile dementia, and the like) account for a relatively small proportion of all non-standard disease names, so storing them in the non-standard ICD version can further improve the accuracy of non-standard disease name normalization.
As shown in fig. 2, another embodiment of the present invention further provides a method for normalizing non-standard disease names. The method comprises the following steps:
Step 201, obtaining a non-standard disease name.
In this step, the non-standard disease name may be obtained by manual input or voice input to the mobile terminal or the server.
Step 202, classifying each standard disease name in the ICD version to be referred to according to a standard disease type, and forming a plurality of first disease groups arranged according to a set sequence.
In this step, the ICD version to be referred to may be, for example, ICD-10, and the standard disease types may include malignant tumor, tuberculosis, inflammation, and the like. For example, the first disease group for malignant tumor includes standard disease names such as brain malignant tumor, lung malignant tumor, and chest malignant tumor. The set order of the first disease groups may be, for example, an order based on the first letter of the standard disease type, or any other set order, which is not limited herein. For example, a plurality of first disease groups arranged in a set order are formed as follows: [ malignant tumor ], [ tuberculosis ], [ inflammation ] … ….
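The grouping in step 202 can be sketched as follows. This is a minimal illustration, not the patented implementation: the sample names, the `TYPE_ORDER` list, and the function `build_first_groups` are hypothetical, and each standard name is assumed to arrive already tagged with its standard disease type.

```python
from collections import defaultdict

# Hypothetical sample: (standard disease name, standard disease type).
STANDARD_NAMES = [
    ("brain malignant tumor", "malignant tumor"),
    ("lung malignant tumor", "malignant tumor"),
    ("pulmonary tuberculosis", "tuberculosis"),
    ("pneumonia", "inflammation"),
    ("encephalitis", "inflammation"),
]

# Assumed "set order" of the standard disease types.
TYPE_ORDER = ["malignant tumor", "tuberculosis", "inflammation"]


def build_first_groups(names, type_order):
    """Bucket standard names by type and return the buckets in the set order."""
    buckets = defaultdict(list)
    for name, disease_type in names:
        buckets[disease_type].append(name)
    return [(disease_type, buckets[disease_type]) for disease_type in type_order]


first_groups = build_first_groups(STANDARD_NAMES, TYPE_ORDER)
```

The position of each group in `first_groups` is what later steps rely on, so the order must stay fixed once chosen.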
Step 203, according to a plurality of first disease groups arranged in a set sequence, determining a spoken disease type corresponding to the standard disease type in each first disease group, and determining the first disease group containing the spoken disease type as a second disease group.
In this step, taking malignant tumor as an example, the spoken disease type corresponding to malignant tumor is generally "cancer"; to increase the number of spoken disease types, other spoken variants of "cancer" may also be included, so that an input non-standard disease name can be matched and output more accurately. Taking inflammation as another example, the spoken disease type corresponding to inflammation is generally "inflammation", and spoken variants may likewise be included; where the spoken disease type is the same as the standard disease type, the "inflammation" field is not repeated. For example, a plurality of second disease groups arranged in a set order are formed as follows: [ malignant tumor, cancer … … ], [ tuberculosis … … ], [ inflammation, inflammation … … ] … ….
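A possible way to form the second disease groups of step 203 is sketched below. The `SPOKEN_TYPES` table and the function name are assumptions made for illustration; the sketch also assumes that a spoken type identical to the standard type is stored only once.

```python
# Assumed set order of standard disease types.
TYPE_ORDER = ["malignant tumor", "tuberculosis", "inflammation"]

# Hypothetical table of spoken disease types per standard disease type.
SPOKEN_TYPES = {
    "malignant tumor": ["cancer"],
    "tuberculosis": [],
    "inflammation": ["inflammation"],
}


def build_second_groups(type_order, spoken_types):
    """Extend each group with its spoken types, skipping exact repeats."""
    groups = []
    for disease_type in type_order:
        terms = [disease_type]
        for spoken in spoken_types.get(disease_type, []):
            if spoken not in terms:
                terms.append(spoken)
        groups.append(terms)
    return groups


second_groups = build_second_groups(TYPE_ORDER, SPOKEN_TYPES)
```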
Step 204, judging whether the non-standard disease name comprises a first body part feature word and a spoken language disease type.
In this step, since there are many non-standard disease names, the embodiment of the present invention divides them into two types. The first type includes a first body part feature word and a spoken language disease type (for example, lung cancer, brain cancer, etc.); since this type accounts for a large proportion of non-standard disease names, the embodiment of the present invention mainly considers how to output these names accurately. The second type does not include a first body part feature word and a spoken language disease type (for example, leukemia, senile dementia, etc.).
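The judgment in step 204 might look like the sketch below. The vocabularies `BODY_PARTS` and `SPOKEN_TYPES` and the function `split_name` are hypothetical; whitespace tokenization is an assumption that suits the English examples only, and names in other languages would likely need substring or dictionary matching instead.

```python
# Hypothetical vocabularies; a real system would load these from curated lists.
BODY_PARTS = {"lung", "brain", "chest"}
SPOKEN_TYPES = {"cancer", "inflammation"}


def split_name(name):
    """Return (body part, spoken type) if both are present, otherwise None."""
    words = name.lower().split()
    part = next((w for w in words if w in BODY_PARTS), None)
    spoken = next((w for w in words if w in SPOKEN_TYPES), None)
    if part is None or spoken is None:
        return None
    return part, spoken
```

Under these assumptions, `split_name("lung cancer")` yields a pair, while names such as "leukemia" fall into the second type and return `None`.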
Step 205, according to the spoken language disease type, determining a second disease group comprising the spoken language disease type, and determining the position of the second disease group in a plurality of second disease groups.
In this step, for example, the input non-standard disease name is "lung cancer", the second disease group containing the spoken disease type "cancer" is [ malignancy, cancer … … ] according to the spoken disease type "cancer", and the position of the second disease group in several second disease groups, for example, the first position, is determined.
Step 206, determining the first identification information of the non-standard disease name according to the position of the second disease group in the plurality of second disease groups.
In this step, continuing the previous example, if there are n second disease groups and "lung cancer" falls in the first group, the first identification information of the non-standard disease name may be, for example, "10000 … … 00" (where the number of 0s is n-1); other identification information may also be used, which is not described herein.
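The identification information in this example behaves like a one-hot string over the group positions. A minimal sketch, assuming zero-based positions and hypothetical helper names `group_position` and `one_hot`:

```python
def group_position(term, second_groups):
    """Return the zero-based index of the group containing the term, or None."""
    for index, group in enumerate(second_groups):
        if term in group:
            return index
    return None


def one_hot(position, n):
    """Build a string of length n with a single '1' at the given position."""
    if not 0 <= position < n:
        raise ValueError("position out of range")
    return "".join("1" if i == position else "0" for i in range(n))


# Illustrative second disease groups in their set order.
second_groups = [["malignant tumor", "cancer"], ["tuberculosis"], ["inflammation"]]
first_id = one_hot(group_position("cancer", second_groups), len(second_groups))
```

With n groups the string always contains exactly n-1 zeros, which matches the count given in the text.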
Step 207, determining the disease name to be normalized of the non-standard disease name according to the first identification information and the first body part feature word.
In this step, as described in the previous example, the first identification information of "lung cancer" is "10000 … … 00" (where the number of 0 is n-1), the first body part feature word of "lung cancer" is "lung", and the determined disease name to be normalized may be "10000 … … 00 lung" (where the number of 0 is n-1), for example.
Step 208, performing the following steps for each standard disease name in the ICD version to be referred to:
S1, if the standard disease name includes the second body part feature word and the standard disease type, determining the standard disease name as the target standard disease name.
In this step, since the input non-standard disease name in the previous example includes a first body part feature word and a spoken disease type (e.g., lung cancer, brain cancer, etc.), the standard disease names in the ICD version to be referred to must be screened for those that likewise consist of a body part and a disease type. The screened target standard disease names include a second body part feature word and a standard disease type (e.g., lung malignant tumor, brain malignant tumor, etc.).
S21, according to the standard disease type, determining a second disease group containing the standard disease type, and determining the position of the second disease group in a plurality of second disease groups.
In this step, standard disease names such as "lung malignancy" and "brain malignancy" would be grouped into the first group described in the previous example, i.e., [ malignancy, cancer … … ], according to their standard disease type, "malignancy", and further standard disease names such as "pneumonia" and "encephalitis" would be grouped into the third group described in the previous example, i.e., [ inflammation, inflammation … … ], according to their standard disease type, "inflammation".
S22, determining second identification information of the target standard disease name according to the position of the second disease group in the plurality of second disease groups.
In this step, continuing the previous example, if there are n second disease groups and "lung malignant tumor" and "brain malignant tumor" fall in the first group, the second identification information of the target standard disease name may be, for example, "10000 … … 00" (where the number of 0s is n-1); other identification information may also be used, which is not illustrated here.
For another example, "pneumonia" and "encephalitis" fall in the third group, and the second identification information of the target standard disease name may be, for example, "00100 … … 00" (where the number of 0s is n-1); other identification information may also be used, which is not illustrated here.
S3, determining the intermediate standard disease name of the target standard disease name according to the second identification information and the second body part feature words.
In this step, as described in the previous example, the second identification information of "lung malignant tumor" is "10000 … … 00" (where the number of 0 s is n-1), the second body part feature word of "lung malignant tumor" is "lung", and the determined intermediate standard disease name may be "10000 … … 00 lung" (where the number of 0 s is n-1), for example.
As another example, the second identification information of "brain malignancy" is "10000 … … 00" (where the number of 0 s is n-1), the second body part feature word of "brain malignancy" is "brain", and the determined intermediate standard disease name may be "10000 … … 00 brain" (where the number of 0 s is n-1), for example.
As another example, the second identification information of "pneumonia" is "00100 … … 00" (where the number of 0 s is n-1), the second body part feature word of "pneumonia" is "lung", and the determined intermediate standard disease name may be "00100 … … 00 lung" (where the number of 0 s is n-1), for example.
As another example, the second identification information of "encephalitis" is "00100 … … 00" (where the number of 0 s is n-1), the second body part feature word of "encephalitis" is "brain", and the determined intermediate standard disease name may be "00100 … … 00 brain" (where the number of 0 s is n-1), for example.
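The four intermediate standard disease names above can be produced by one encoding routine, sketched here under stated assumptions: each standard name is taken to be pre-annotated with its body part and standard disease type (the `STANDARD` table is hypothetical), only three groups are used for brevity, and the intermediate name is represented as a pair rather than a concatenated string.

```python
# Illustrative second disease groups in their set order (n = 3 here).
SECOND_GROUPS = [["malignant tumor", "cancer"], ["tuberculosis"], ["inflammation"]]

# Hypothetical annotation: standard name -> (second body part word, standard type).
STANDARD = {
    "lung malignant tumor": ("lung", "malignant tumor"),
    "brain malignant tumor": ("brain", "malignant tumor"),
    "pneumonia": ("lung", "inflammation"),
    "encephalitis": ("brain", "inflammation"),
}


def encode(body_part, disease_term, groups):
    """Return (identification string, body part) for a body part and disease term."""
    position = next(i for i, group in enumerate(groups) if disease_term in group)
    ident = "".join("1" if i == position else "0" for i in range(len(groups)))
    return ident, body_part


intermediate = {
    name: encode(part, disease_type, SECOND_GROUPS)
    for name, (part, disease_type) in STANDARD.items()
}
```

The same `encode` call applied to the spoken form, for example `encode("lung", "cancer", SECOND_GROUPS)`, gives the disease name to be normalized, so both sides of the later comparison share one representation.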
Step 209, calculating the distance between the disease name to be normalized and each intermediate standard disease name according to a set calculation rule, and obtaining the intermediate standard disease name with the minimum distance from the disease name to be normalized.
In this step, the setting of the calculation rule includes:
judging whether the positions of the second disease groups corresponding to the non-standard disease names in the plurality of second disease groups are the same as the positions of the second disease groups corresponding to the target standard disease names in the plurality of second disease groups, if so, determining that the first distance is 0, and if not, determining that the first distance is 2;
judging whether the first body part feature words and the second body part feature words are the same or not, if so, determining that a second distance is 0, and if not, determining that the second distance is 1;
adding the first distance and the second distance.
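The calculation rule above can be written as a short function. This is a sketch that assumes each name is represented as an (identification string, body part) pair, so that equal identification strings mean equal group positions; the function name `distance` is illustrative.

```python
def distance(query, candidate):
    """Sum of the first distance (0 or 2) and the second distance (0 or 1)."""
    query_id, query_part = query
    candidate_id, candidate_part = candidate
    # First distance: 0 if both names fall in the same second disease group, else 2.
    first = 0 if query_id == candidate_id else 2
    # Second distance: 0 if the body part feature words are the same, else 1.
    second = 0 if query_part == candidate_part else 1
    return first + second
```

Because the group mismatch is weighted 2 and the body part mismatch 1, a candidate with the right disease type but the wrong body part still ranks ahead of one with the wrong disease type.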
In the previous example, the disease name to be normalized of "lung cancer" is "10000 … … 00 lung", the intermediate standard disease name of "lung malignant tumor" is "10000 … … 00 lung", that of "brain malignant tumor" is "10000 … … 00 brain", that of "pneumonia" is "00100 … … 00 lung", and that of "encephalitis" is "00100 … … 00 brain" (where the number of 0s is n-1 in each case). According to the above calculation rule, it can be obtained that:
the first distance between "lung cancer" and "lung malignant tumor" is 0 and the second distance is 0, giving a total distance of 0;
the first distance between "lung cancer" and "brain malignant tumor" is 0 and the second distance is 1, giving a total distance of 1;
the first distance between "lung cancer" and "pneumonia" is 2 and the second distance is 0, giving a total distance of 2;
the first distance between "lung cancer" and "encephalitis" is 2 and the second distance is 1, giving a total distance of 3.
As can be seen, "lung malignant tumor" has the smallest total distance, so the standard disease name corresponding to "lung cancer" is "lung malignant tumor", and accurate output can be achieved.
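The selection of the minimum-distance candidate in this worked example can be reproduced with the sketch below. It assumes three disease groups for brevity, represents each name as an (identification string, body part) pair, and uses illustrative names throughout.

```python
# Disease name to be normalized for "lung cancer" (n = 3 groups assumed).
query = ("100", "lung")

# Intermediate standard disease names for the four candidates in the example.
candidates = {
    "lung malignant tumor": ("100", "lung"),
    "brain malignant tumor": ("100", "brain"),
    "pneumonia": ("001", "lung"),
    "encephalitis": ("001", "brain"),
}


def distance(a, b):
    """First distance (0 or 2 by group) plus second distance (0 or 1 by body part)."""
    return (0 if a[0] == b[0] else 2) + (0 if a[1] == b[1] else 1)


distances = {name: distance(query, code) for name, code in candidates.items()}
best = min(distances, key=distances.get)
```

If several candidates tied at the minimum, `min` would return the first one in insertion order; the source does not specify a tie-breaking rule, so that behaviour is an artifact of the sketch.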
Step 210, storing the target standard disease name corresponding to the intermediate standard disease name, and determining the target standard disease name as a non-standard ICD version.
In this step, the determined non-standard ICD version may be used to guide quality control of an existing ICD version. For example, the input non-standard disease name is "lung cancer" and the non-standard ICD version outputs "lung malignant tumor". If the existing ICD version also outputs "lung malignant tumor", its output for the non-standard disease name "lung cancer" is correct; if it outputs "pneumonia", its output for the non-standard disease name "lung cancer" is incorrect.
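The quality-control comparison described here amounts to checking two outputs against each other. A minimal sketch, in which both systems are represented as plain dictionaries from input name to output name (an assumption; the source does not describe their interfaces):

```python
def quality_check(reference_outputs, existing_outputs):
    """Return inputs whose existing output differs from the reference output.

    Each value is a pair (existing output, reference output).
    """
    return {
        name: (existing_outputs.get(name), expected)
        for name, expected in reference_outputs.items()
        if existing_outputs.get(name) != expected
    }


# Output of the non-standard ICD version, used as the reference.
reference = {"lung cancer": "lung malignant tumor"}

agreeing = quality_check(reference, {"lung cancer": "lung malignant tumor"})
disagreeing = quality_check(reference, {"lung cancer": "pneumonia"})
```

An empty result means the existing version agreed on every checked input.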
Step 211, determining a first disease name mapping relation.
In this step, the first disease name mapping relationship is used to characterize a one-to-one correspondence relationship between the target standard disease name in the non-standard ICD version and the standard disease name in the ICD version to be referred to.
Step 212, determining a standard disease name corresponding to the intermediate standard disease name in the ICD version to be referred to according to the first disease name mapping relation.
In this step, by using the set first disease name mapping relationship, the effect that the determined non-standard ICD version guides the quality control of the existing ICD version can be achieved.
Step 213, storing the non-standard disease name into the non-standard ICD version.
In this step, non-standard disease names such as "leukemia", "senile dementia", etc. may be stored in the non-standard ICD version.
Step 214, establishing a second disease name mapping relationship between the non-standard disease name and the standard disease name corresponding to the non-standard disease name in the ICD version to be referred to according to the non-standard disease name.
In this step, if the non-standard disease name does not include the first body part feature word and the spoken disease type, the distance between the non-standard disease name and the standard disease name may be large, so the name cannot be accurately normalized by the distance calculation. Such non-standard disease names (e.g., leukemia, senile dementia, etc.) account for a relatively small proportion of all non-standard disease names, so storing them in the non-standard ICD version can further improve the accuracy of non-standard disease name normalization.
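Steps 213 and 214 suggest a direct lookup for names that the distance calculation cannot handle. The sketch below shows one way the two paths might be combined; the functions `parse` and `by_distance` are hypothetical stand-ins for the earlier steps, and the mapped value is a placeholder because the source does not state which standard names these inputs map to.

```python
def normalize(name, parse, by_distance, second_mapping):
    """Use the distance path when the name parses, else the second mapping."""
    parsed = parse(name)
    if parsed is None:
        # No body part word plus spoken type: fall back to the stored mapping.
        return second_mapping.get(name)
    return by_distance(parsed)


# Hypothetical stand-ins for the parsing and distance steps.
def parse(name):
    return ("lung", "cancer") if name == "lung cancer" else None


def by_distance(parsed):
    return "lung malignant tumor"


# Placeholder target; a curator would supply the actual standard name.
second_mapping = {"leukemia": "<standard name chosen by a curator>"}
```

Names found in neither path return `None`, which a caller could treat as a signal to add a new entry to the second mapping.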
As shown in fig. 3 and 4, the embodiment of the present invention provides a normalization apparatus for non-standard disease names and a device in which the apparatus is located. The apparatus embodiments may be implemented by software, by hardware, or by a combination of hardware and software. In terms of hardware, fig. 3 is a hardware structure diagram of the device in which the normalization apparatus for non-standard disease names provided in an embodiment of the present invention is located; in addition to the processor, the memory, the network interface, and the nonvolatile memory shown in fig. 3, the device may also include other hardware, such as a forwarding chip responsible for processing packets. Taking a software implementation as an example, as shown in fig. 4, the apparatus is a logical apparatus formed by the CPU of the device in which it is located reading the corresponding computer program instructions from the non-volatile memory into the memory and running them.
As shown in fig. 4, the normalization apparatus for non-standard disease names provided in this embodiment includes:
an obtaining module 401, configured to obtain a non-standard disease name;
a determining module 402, configured to determine whether the non-standard disease name includes a first body part feature word and a spoken language disease type;
if so, determining first identification information of the non-standard disease name according to the spoken language disease type, wherein the first identification information is used for indicating a standard disease type corresponding to the non-standard disease name in the ICD version to be referred;
a first determining module 403, configured to determine, according to the first identification information and the first body part feature word, a to-be-normalized disease name of the non-standard disease name;
a loop module 404, configured to perform the following steps for each standard disease name in the ICD version to be referred to:
S1, if the standard disease name comprises the second body part feature word and the standard disease type, determining the standard disease name as a target standard disease name;
S2, determining second identification information of the target standard disease name according to the standard disease type, wherein the second identification information is used for indicating the standard disease type corresponding to the target standard disease name in the ICD version to be referred to;
S3, determining an intermediate standard disease name of the target standard disease name according to the second identification information and the second body part feature word;
a calculating module 405, configured to calculate a distance between the disease name to be normalized and each intermediate standard disease name according to a set calculation rule, and obtain an intermediate standard disease name with a minimum distance from the disease name to be normalized;
a second determining module 406, configured to determine, according to the intermediate standard disease name with the smallest distance from the disease name to be normalized, a standard disease name corresponding to the intermediate standard disease name in the ICD version to be referred to.
In an embodiment of the present invention, the obtaining module 401 may be configured to execute step 101 in the foregoing method embodiment, the determining module 402 may be configured to execute step 102 in the foregoing method embodiment, the first determining module 403 may be configured to execute step 103 in the foregoing method embodiment, the loop module 404 may be configured to execute step 104 in the foregoing method embodiment, the calculating module 405 may be configured to execute step 105 in the foregoing method embodiment, and the second determining module 406 may be configured to execute step 106 in the foregoing method embodiment.
As shown in fig. 5, in an embodiment of the present invention, the normalization device for non-standard disease names further includes:
a classification module 407, configured to classify each standard disease name in the ICD version to be referred to according to a standard disease type, and form a plurality of first disease groups arranged according to a set order;
a third determining module 408, configured to determine a spoken language disease type corresponding to the standard disease type in each first disease group according to a plurality of first disease groups arranged in a set order, and determine the first disease group containing the spoken language disease type as a second disease group;
the determining module 402 is further configured to:
determining a second disease group containing the spoken disease type according to the spoken disease type, and determining the position of the second disease group in a plurality of second disease groups;
determining first identification information of the non-standard disease name according to the positions of the second disease groups in a plurality of second disease groups;
the loop module 404 is further configured to:
according to the standard disease type, determining a second disease group containing the standard disease type, and determining the position of the second disease group in a plurality of second disease groups;
and determining second identification information of the target standard disease name according to the positions of the second disease groups in a plurality of second disease groups.
In an embodiment of the present invention, the setting of the calculation rule includes:
judging whether the positions of the second disease groups corresponding to the non-standard disease names in the plurality of second disease groups are the same as the positions of the second disease groups corresponding to the target standard disease names in the plurality of second disease groups, if so, determining that the first distance is 0, and if not, determining that the first distance is 2;
judging whether the first body part feature words and the second body part feature words are the same or not, if so, determining that a second distance is 0, and if not, determining that the second distance is 1;
adding the first distance and the second distance.
It is to be understood that the illustrated structure of the embodiments of the present invention does not constitute a specific limitation of the normalization means for non-standard disease names. In other embodiments of the invention, the normalization means for non-standard disease names may include more or fewer components than shown, or some components may be combined, some components may be split, or a different arrangement of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
Because the content of information interaction, execution process, and the like among the modules in the device is based on the same concept as the method embodiment of the present invention, specific content can be referred to the description in the method embodiment of the present invention, and is not described herein again.
The embodiment of the invention also provides a device for normalizing the non-standard disease names, which comprises: at least one memory and at least one processor;
the at least one memory to store a machine readable program;
the at least one processor is configured to invoke the machine readable program to perform a method for normalization of non-standard disease names in any embodiment of the invention.
Embodiments of the present invention also provide a computer-readable medium storing instructions for causing a computer to perform a method of normalizing non-standard disease names as described herein. Specifically, a system or an apparatus equipped with a storage medium may be provided, the storage medium storing software program code that realizes the functions of any of the above-described embodiments, and a computer (or a CPU or MPU) of the system or the apparatus is caused to read out and execute the program code stored in the storage medium.
In this case, the program code itself read from the storage medium can realize the functions of any of the above-described embodiments, and thus the program code and the storage medium storing the program code constitute a part of the present invention.
Examples of the storage medium for supplying the program code include a floppy disk, a hard disk, a magneto-optical disk, an optical disk (e.g., CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW, DVD+RW), a magnetic tape, a nonvolatile memory card, and a ROM. Alternatively, the program code may be downloaded from a server computer via a communications network.
Further, it should be clear that the functions of any one of the above-described embodiments can be implemented not only by executing the program code read out by the computer, but also by performing a part or all of the actual operations by an operation method or the like operating on the computer based on instructions of the program code.
Further, it is to be understood that the program code read out from the storage medium is written to a memory provided in an expansion board inserted into the computer or to a memory provided in an expansion unit connected to the computer, and then causes a CPU or the like mounted on the expansion board or the expansion unit to perform part or all of the actual operations based on instructions of the program code, thereby realizing the functions of any of the above-described embodiments.
In summary, the normalization method, apparatus and readable medium for non-standard disease names provided in the embodiments of the present invention at least have the following advantages:
1. In the embodiment of the invention, the normalization method of the non-standard disease name determines the first identification information of the non-standard disease name according to the spoken disease type contained in the non-standard disease name, and then determines the disease name to be normalized of the non-standard disease name according to the first identification information and the first body part feature word. The method also determines second identification information of a target standard disease name in the ICD version to be referred to according to the standard disease type contained in the target standard disease name, and determines an intermediate standard disease name of the target standard disease name according to the second identification information and a second body part feature word. The distance between the disease name to be normalized and each intermediate standard disease name is then calculated according to a set calculation rule, the intermediate standard disease name with the minimum distance from the disease name to be normalized is obtained, and the standard disease name corresponding to that intermediate standard disease name in the ICD version to be referred to is determined. With this arrangement, the large errors caused by judging word similarity with a conventional edit distance can be avoided, and non-standard disease names can be accurately normalized.
2. In an embodiment of the invention, the first identification information of the non-standard disease name may be obtained by determining the second disease group that includes the spoken disease type and the position of that second disease group in the plurality of second disease groups, and the second identification information of the target standard disease name may be obtained by determining the second disease group that includes the standard disease type and the position of that second disease group in the plurality of second disease groups. In this way, the distance calculated between the disease name to be normalized and each intermediate standard disease name is more reasonable and accords with the normalization logic; that is, the standard disease name in the ICD version to be referred to that corresponds to the intermediate standard disease name with the smallest calculated distance from the disease name to be normalized is the correct output, so that non-standard disease names can be accurately normalized.
3. In the embodiment of the present invention, the calculation rule is set so that, for the target standard disease name that correctly corresponds to the non-standard disease name, both the first distance and the second distance are 0, and the correct standard disease name is therefore output. That is, if either the first distance or the second distance is not 0, the output standard disease name may not correctly correspond to the non-standard disease name.
4. In the embodiment of the present invention, by setting the non-standard ICD version, the non-standard ICD version can be run simultaneously with an existing ICD version (other than the ICD version to be referred to above), so as to implement quality control of the normalization performed by the existing ICD version. For example, the input non-standard disease name is "lung cancer" and the non-standard ICD version outputs "lung malignant tumor". If the existing ICD version also outputs "lung malignant tumor", its output for the non-standard disease name "lung cancer" is correct; if it outputs "pneumonia", its output for the non-standard disease name "lung cancer" is incorrect.
5. In the embodiment of the present invention, if the non-standard disease name does not include the first body part feature word and the spoken disease type, the distance between the non-standard disease name and the standard disease name may be large, so the name cannot be accurately normalized by the distance calculation. Such non-standard disease names (such as "leukemia", "senile dementia", and the like) account for a relatively small proportion of all non-standard disease names, so storing them in the non-standard ICD version can further improve the accuracy of non-standard disease name normalization.
The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the scope of protection of the present application.

Claims (10)

1. A method for normalizing non-standard disease names, comprising:
acquiring a non-standard disease name;
judging whether the non-standard disease name comprises a first body part characteristic word and a spoken language disease type;
if so, determining first identification information of the non-standard disease name according to the spoken language disease type, wherein the first identification information is used for indicating a standard disease type corresponding to the non-standard disease name in the ICD version to be referred;
determining the name of the disease to be normalized of the non-standard disease name according to the first identification information and the first body part feature word;
performing the following steps for each standard disease name in the ICD version to be referred to:
S1, if the standard disease name comprises the second body part feature word and the standard disease type, determining the standard disease name as a target standard disease name;
S2, determining second identification information of the target standard disease name according to the standard disease type, wherein the second identification information is used for indicating the standard disease type corresponding to the target standard disease name in the ICD version to be referred to;
S3, determining an intermediate standard disease name of the target standard disease name according to the second identification information and the second body part feature word;
calculating the distance between the disease name to be normalized and each intermediate standard disease name according to a set calculation rule, and obtaining the intermediate standard disease name with the minimum distance from the disease name to be normalized;
and determining a standard disease name corresponding to the intermediate standard disease name in the ICD version to be referred according to the intermediate standard disease name with the minimum distance from the disease name to be normalized.
2. The method of claim 1,
before the determining whether the non-standard disease name includes the first body part feature word and the spoken language disease type, further comprising:
classifying each standard disease name in the ICD version to be referred to according to its standard disease type, and forming a plurality of first disease groups arranged in a set order;
according to the plurality of first disease groups arranged in the set order, determining the spoken language disease type corresponding to the standard disease type in each first disease group, and determining the first disease group containing the spoken language disease type as a second disease group;
the determining the first identification information of the non-standard disease name according to the spoken language disease type comprises:
determining, according to the spoken language disease type, the second disease group containing the spoken language disease type, and determining the position of that second disease group among the plurality of second disease groups;
determining the first identification information of the non-standard disease name according to the position of the second disease group among the plurality of second disease groups;
the determining of the second identification information of the target standard disease name according to the standard disease type includes:
determining, according to the standard disease type, the second disease group containing the standard disease type, and determining the position of that second disease group among the plurality of second disease groups;
and determining the second identification information of the target standard disease name according to the position of the second disease group among the plurality of second disease groups.
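The grouping in claim 2 can be illustrated with a short Python sketch. It rests on one reading of the claim, namely that a first disease group becomes a second disease group when its standard disease type has a spoken language counterpart, and that the identification information is the group's position in the resulting ordered list. The function names, the example types, and the example names are hypothetical.

```python
def build_first_groups(standard_names, standard_types):
    """Group standard names by type; dict order follows standard_types (the set order)."""
    groups = {t: [] for t in standard_types}
    for name in standard_names:
        matched = next((t for t in standard_types if t in name), None)
        if matched is not None:
            groups[matched].append(name)
    return groups


def build_second_groups(first_groups, spoken_by_standard):
    """Keep, in order, only the groups whose standard type has a spoken counterpart."""
    return [
        (std_type, spoken_by_standard[std_type], names)
        for std_type, names in first_groups.items()
        if std_type in spoken_by_standard
    ]


def identification(disease_type, second_groups):
    """Return the position of the second group holding this spoken or standard type."""
    for position, (std_type, spoken_type, _names) in enumerate(second_groups):
        if disease_type in (std_type, spoken_type):
            return position
    return None


# Hypothetical example data.
TYPES = ["benign neoplasm", "malignant neoplasm", "fracture"]
SPOKEN = {"malignant neoplasm": "cancer", "fracture": "broken bone"}
NAMES = ["malignant neoplasm of lung", "fracture of femur", "benign neoplasm of lung"]

SECOND_GROUPS = build_second_groups(build_first_groups(NAMES, TYPES), SPOKEN)
```

Because the spoken type "cancer" and the standard type "malignant neoplasm" sit in the same second disease group, both resolve to the same position, which is what lets the first and second identification information be compared directly.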
3. The method of claim 2, wherein the set calculation rule comprises:
judging whether the position, among the plurality of second disease groups, of the second disease group corresponding to the non-standard disease name is the same as that of the second disease group corresponding to the target standard disease name; if so, determining that a first distance is 0, and if not, determining that the first distance is 2;
judging whether the first body part feature word and the second body part feature word are the same; if so, determining that a second distance is 0, and if not, determining that the second distance is 1;
adding the first distance and the second distance to obtain the distance.
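The calculation rule of claim 3 is small enough to write out directly. The sketch below is illustrative only; the function name and the argument layout are assumptions, while the weights 0/2 and 0/1 come from the claim.

```python
def rule_distance(first_position, first_part, second_position, second_part):
    """Sum of the two distances defined in claim 3."""
    # First distance: 0 if both names fall in the same second disease group, else 2.
    first_distance = 0 if first_position == second_position else 2
    # Second distance: 0 if the body part feature words are identical, else 1.
    second_distance = 0 if first_part == second_part else 1
    return first_distance + second_distance
```

The possible results are 0, 1, 2 and 3. Since a disease type mismatch costs 2 and a body part mismatch costs only 1, a candidate with the right disease type always ranks ahead of one with the wrong disease type, whatever the body part.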
4. The method according to claim 1, wherein determining the standard disease name corresponding to the intermediate standard disease name in the ICD version to be referred to according to the intermediate standard disease name having the smallest distance from the disease name to be normalized comprises:
storing the target standard disease name corresponding to the intermediate standard disease name with the minimum distance from the disease name to be normalized, and determining the target standard disease name as a non-standard ICD version;
determining a first disease name mapping relation, wherein the first disease name mapping relation is used for representing a one-to-one correspondence relation between a target standard disease name in the non-standard ICD version and a standard disease name corresponding to the target standard disease name in the ICD version to be referred to;
and determining a standard disease name corresponding to the intermediate standard disease name in the ICD version to be referred to according to the first disease name mapping relation.
5. The method of claim 4,
after the determining whether the non-standard disease name includes the first body part feature word and the spoken language disease type, further comprising:
if not, storing the non-standard disease name into the non-standard ICD version;
and establishing, according to the non-standard disease name, a second disease name mapping relation between the non-standard disease name and the standard disease name corresponding to it in the ICD version to be referred to.
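One way to picture the two mapping relations of claims 4 and 5 is as a pair of lookup tables kept alongside the non-standard ICD version. The Python sketch below is a hedged illustration: the class `NonStandardIcdStore`, its method names, and the example entries are hypothetical, and the claims do not say how the standard name for an unmatched non-standard name is obtained, so here the caller simply supplies it.

```python
class NonStandardIcdStore:
    """Hypothetical container for the 'non-standard ICD version' and its two mappings."""

    def __init__(self):
        # Claim 4: target standard name -> standard name in the ICD version referred to.
        self.first_mapping = {}
        # Claim 5: unmatched non-standard name -> standard name in the ICD version referred to.
        self.second_mapping = {}

    def record_match(self, target_name, reference_name):
        """Store a matched target standard name together with its one-to-one mapping."""
        self.first_mapping[target_name] = reference_name

    def record_unmatched(self, non_standard_name, reference_name):
        """Store a name lacking a body part word or spoken type, with a supplied mapping."""
        self.second_mapping[non_standard_name] = reference_name

    def lookup(self, name):
        """Resolve a stored name through the first mapping, then the second; else None."""
        if name in self.first_mapping:
            return self.first_mapping[name]
        return self.second_mapping.get(name)


store = NonStandardIcdStore()
store.record_match("malignant neoplasm of lung", "malignant neoplasm of lung")
store.record_unmatched("the flu", "influenza")
```

Keeping the two tables separate mirrors the claims, which distinguish names resolved through the distance calculation from names that bypass it.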
6. A normalization device for non-standard disease names, characterized by comprising:
an acquisition module, configured to acquire a non-standard disease name;
a judging module, configured to judge whether the non-standard disease name comprises a first body part feature word and a spoken language disease type;
if so, determining first identification information of the non-standard disease name according to the spoken language disease type, wherein the first identification information is used for indicating a standard disease type corresponding to the non-standard disease name in the ICD version to be referred to;
a first determining module, configured to determine a disease name to be normalized for the non-standard disease name according to the first identification information and the first body part feature word;
a loop module, configured to perform the following steps for each standard disease name in the ICD version to be referred to:
S1: if the standard disease name comprises a second body part feature word and a standard disease type, determining the standard disease name as a target standard disease name;
S2: determining second identification information of the target standard disease name according to the standard disease type, wherein the second identification information is used for indicating the standard disease type corresponding to the target standard disease name in the ICD version to be referred to;
S3: determining an intermediate standard disease name of the target standard disease name according to the second identification information and the second body part feature word;
a calculation module, configured to calculate the distance between the disease name to be normalized and each intermediate standard disease name according to a set calculation rule, and to obtain the intermediate standard disease name with the minimum distance from the disease name to be normalized;
and a second determining module, configured to determine, according to the intermediate standard disease name having the smallest distance from the disease name to be normalized, a standard disease name corresponding to the intermediate standard disease name in the ICD version to be referred to.
7. The apparatus of claim 6, further comprising:
a classification module, configured to classify each standard disease name in the ICD version to be referred to according to its standard disease type and to form a plurality of first disease groups arranged in a set order;
a third determining module, configured to determine, according to the plurality of first disease groups arranged in the set order, the spoken language disease type corresponding to the standard disease type in each first disease group, and to determine the first disease group containing the spoken language disease type as a second disease group;
the judging module is further configured to:
determining, according to the spoken language disease type, the second disease group containing the spoken language disease type, and determining the position of that second disease group among the plurality of second disease groups;
determining the first identification information of the non-standard disease name according to the position of the second disease group among the plurality of second disease groups;
the loop module is further configured to:
determining, according to the standard disease type, the second disease group containing the standard disease type, and determining the position of that second disease group among the plurality of second disease groups;
and determining the second identification information of the target standard disease name according to the position of the second disease group among the plurality of second disease groups.
8. The apparatus of claim 7, wherein the set calculation rule comprises:
judging whether the position, among the plurality of second disease groups, of the second disease group corresponding to the non-standard disease name is the same as that of the second disease group corresponding to the target standard disease name; if so, determining that a first distance is 0, and if not, determining that the first distance is 2;
judging whether the first body part feature word and the second body part feature word are the same; if so, determining that a second distance is 0, and if not, determining that the second distance is 1;
adding the first distance and the second distance to obtain the distance.
9. An apparatus for normalizing disease names, comprising: at least one memory and at least one processor;
the at least one memory is configured to store a machine-readable program;
the at least one processor is configured to invoke the machine-readable program to perform the method of any one of claims 1 to 5.
10. A computer-readable medium, characterized in that it has stored thereon computer instructions which, when executed by a processor, cause the processor to carry out the method of any one of claims 1 to 5.
CN202010594595.3A 2020-06-28 2020-06-28 Normalization method, device and readable medium for non-standard disease names Active CN111739601B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010594595.3A CN111739601B (en) 2020-06-28 2020-06-28 Normalization method, device and readable medium for non-standard disease names

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010594595.3A CN111739601B (en) 2020-06-28 2020-06-28 Normalization method, device and readable medium for non-standard disease names

Publications (2)

Publication Number Publication Date
CN111739601A (en) 2020-10-02
CN111739601B (en) 2022-03-29

Family

ID=72651269

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010594595.3A Active CN111739601B (en) 2020-06-28 2020-06-28 Normalization method, device and readable medium for non-standard disease names

Country Status (1)

Country Link
CN (1) CN111739601B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107545143A (en) * 2017-09-04 2018-01-05 复旦大学 The mapping method of disease and human body
CN108182972A (en) * 2017-12-15 2018-06-19 上海长江科技发展有限公司 The intelligent coding method and system of Chinese medical diagnosis on disease based on participle network
CN109215771A (en) * 2018-05-29 2019-01-15 平安医疗健康管理股份有限公司 Medical mapping relations library method for building up, device, computer equipment and storage medium
CN109582797A (en) * 2018-12-13 2019-04-05 泰康保险集团股份有限公司 Obtain method, apparatus, medium and electronic equipment that classification of diseases is recommended
CN109698016A (en) * 2018-12-11 2019-04-30 中国科学院深圳先进技术研究院 Disease automatic coding and device
CN110032728A (en) * 2019-02-01 2019-07-19 阿里巴巴集团控股有限公司 The standardized conversion method of disease name and device
CN110517787A (en) * 2019-08-30 2019-11-29 山东健康医疗大数据有限公司 A kind of clinical data group classification method based on Chinese medical main suit's analysis
CN110660459A (en) * 2019-08-30 2020-01-07 腾讯科技(深圳)有限公司 Method, device, server and storage medium for controlling medical record quality

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113096799A (en) * 2021-04-25 2021-07-09 北京百度网讯科技有限公司 Quality control method and device
CN113096799B (en) * 2021-04-25 2024-04-02 北京百度网讯科技有限公司 Quality control method and device

Also Published As

Publication number Publication date
CN111739601B (en) 2022-03-29

Similar Documents

Publication Publication Date Title
JP2004139484A (en) Form processing device, program for implementing it, and program for creating form format
US20060116862A1 (en) System and method for tokenization of text
CN111090641B (en) Data processing method and device, electronic equipment and storage medium
CN111144210B (en) Image structuring processing method and device, storage medium and electronic equipment
CN101655837A (en) Method for detecting and correcting error on text after voice recognition
CN108320808A (en) Analysis of medical record method and apparatus, equipment, computer readable storage medium
CN110555096A (en) User intention identification method, system, terminal and medium
US20190286692A1 (en) Computing machine and template management method
CN111739601B (en) Normalization method, device and readable medium for non-standard disease names
US20060005169A1 (en) Software development system and method
JP2019032704A (en) Table data structuring system and table data structuring method
CN113138990B (en) Data blood margin construction and tracing method, device and equipment
CN112700763B (en) Voice annotation quality evaluation method, device, equipment and storage medium
CN113823404A (en) Medical big data-based method for standardizing medical terms for construction of specific diseases
CN113436730A (en) Hospital disease diagnosis classification automatic coding method and system
JP2000089786A (en) Method for correcting speech recognition result and apparatus therefor
CN116360794A (en) Database language analysis method, device, computer equipment and storage medium
CN110837494B (en) Method and device for identifying unspecified diagnosis coding errors of medical record home page
CN111339756B (en) Text error detection method and device
JP2009087378A (en) Business form processor
CN116880826B (en) Visualized code generation method
JP4521377B2 (en) Form processing apparatus, program for executing the apparatus, and form format creation program
CN116186271B (en) Medical term classification model training method, classification method and device
CN117273001A (en) Medical record entity extraction method and device
CN111610948B (en) Intelligent formula online editing system and method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant