CN112445917A - Method and device for constructing a traditional medical disease ontology
- Publication number
- CN112445917A CN202011222616.5A
- Authority
- CN
- China
- Prior art keywords
- disease
- matching
- terms
- term
- traditional medical
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
Abstract
The invention discloses a method and a device for constructing a traditional medical disease ontology. The method comprises: building an ontology classification framework; and mapping the set of traditional medical disease terms belonging to each classification to a reference set of traditional medical disease terms. The set of traditional medical disease terms belonging to the ontology classification serves as the matching source set, and the reference set serves as the matching target set. The mapping comprises exactly matching each disease term in the matching source set against each disease term in the matching target set, where exact matching means that the disease term in the matching source set is semantically equivalent to the disease term in the matching target set. The method and the device are oriented to classification statistics and construct the traditional medical disease ontology automatically; they also establish a cross-mapping between the ontology and the reference set of traditional medical disease terms, which is a basis for promoting the standardized management of medical services.
Description
Technical Field
The invention relates to information processing technology and traditional Chinese medicine, and in particular to a method and a device for constructing a traditional medical disease ontology.
Background
Traditional Chinese medicine (TCM) originated in China, reflects the strengths and characteristics of Chinese medical science, and is an important component of the world's traditional medicine and of China's national culture. Its theory of syndrome differentiation and treatment, its clinical practice, and its guidance and standardization are living examples of inheriting the essence of the tradition while innovating on a sound foundation. The development of TCM is highly valued at home and abroad, and the construction of a TCM standards system is being actively promoted. Since 1995, the Chinese government has successively issued national standards such as the "Clinical Diagnosis and Treatment Terminology of Traditional Chinese Medicine" (disease part), and in 2009 the World Health Organization accepted traditional medical terms, represented by those of TCM, into the International Classification of Diseases (ICD), which promoted the integration of traditional and modern medicine. However, these national standards are expressed as semi-structured data and have no cross-mapping to ICD classification codes. Constructing a traditional medical disease system that both humans and machines can understand, and unifying it with the ICD at the semantic level, would therefore facilitate classification statistics for TCM diseases and would, to some extent, support smart healthcare construction and the Healthy China policy.
Ontology was originally a philosophical term, derived from ancient Greek and dating from the 17th century, meaning "a systematic description of objective existence in the world", i.e., the theory of being. Outside philosophy, in information science, the generally accepted definition is the one proposed by Studer et al. in 1998: "an ontology is a formal, explicit specification of a shared conceptualization". This definition captures four features of an ontology: it is shared, explicit, conceptual, and formal. Specifically, an ontology describes the important concepts of a domain and the semantic relationships between them, and these concepts and relationships must be commonly recognized and explicitly defined. In addition, an ontology uses an internationally standardized formal language, the Web Ontology Language (OWL), to describe domain concepts and their semantic relationships, so that the knowledge can be understood by both humans and machines, which removes obstacles to information transfer and communication between them. Because of these characteristics, ontologies provide machine-understandable domain knowledge for applications such as artificial intelligence; ontologies and knowledge graphs have become the two cores of knowledge organization and intelligent application in knowledge engineering.
The biomedical field has long been at the forefront of ontology research. Since the advent of the Gene Ontology (GO) in 1990, the construction and application of biomedical ontologies have received much attention and have produced a number of highly influential results, including the Disease Ontology (DO), the Human Phenotype Ontology (HPO), and the Ontology of Adverse Events (OAE). At present, biomedical ontologies are mainly applied to the alignment and integration of terms across databases, to basic medical research, and to the development of intelligent decision systems. For example, integrating the disease names in a disease ontology with those in a rat gene database can effectively improve the annotation of disease names across species; some scholars have used customized software such as ChipInfo to carry out research based on the Gene Ontology, such as microarray analysis and gene function prediction; and researchers have built special-purpose ontologies for particular application requirements and used them to develop intelligent decision systems for disease diagnosis and treatment, disease risk assessment, and similar tasks.
The ICD is currently the internationally accepted, authoritative tool for grouped statistics, used mainly for disease and cause-of-death statistics. On June 18, 2018, the World Health Organization released the latest version, ICD-11; its Chinese version was officially released by the National Health Commission on December 21 of the same year for use nationwide. The Chinese version of ICD-11 comprises 28 chapters. Chapter 26 covers traditional medical disease terms and is divided into two subsections, traditional medical diseases and traditional medical syndromes. It is significant for standardizing the disease classification and coding used by medical institutions, for improving the standards system for Chinese and Western medical terminology, for raising the standardization level of medical services and the efficiency of medical management, and for promoting the exchange of diagnosis and treatment information.
Disclosure of Invention
The invention provides a method and a device for constructing a traditional medical disease ontology that automatically perform semantic matching between the traditional medical disease ontology and an existing traditional medical disease term system and obtain the matching result quickly and accurately.
To achieve the above technical object, in one aspect the invention discloses a method for constructing a traditional medical disease ontology. The method comprises: building an ontology classification framework; and mapping the set of traditional medical disease terms belonging to each classification to a reference set of traditional medical disease terms, wherein the set of traditional medical disease terms belonging to the ontology classification is taken as the matching source set and the reference set is taken as the matching target set. The mapping comprises: exactly matching each disease term in the matching source set against each disease term in the matching target set to find the exactly matched disease term pairs, wherein exact matching means that the disease term in the matching source set is semantically equivalent to the disease term in the matching target set.
Further, in the method, the exact matching includes at least one of the following: the disease term in the matching source set is identical to the disease term in the matching target set; the disease term in the matching source set has the same main body as the disease term in the matching target set; a synonym of a disease term in one of the two sets is identical to a disease term in the other set; and the disease term in the matching source set is synonymous with the disease term in the matching target set.
Further, in the method, the mapping further includes: for disease terms that fail exact matching, upward matching each such disease term in the matching source set against the disease terms in the matching target set to find the upward-matched disease term pairs, wherein upward matching means that the connotation and extension of the disease term in the matching target set are larger than those of the disease term in the matching source set.
Further, in the method, the upward matching follows these principles: by the proximity principle, when several superior disease terms of a disease term in the matching source set are exactly matched to disease terms in the matching target set, the matching-target-set disease term matched by the superior disease term of the nearest level is selected as the match; and when the level difference between the matching source set and the matching target set is 1 and specific and/or non-specific disease terms exist among the next-level disease terms of the matching source set, a related match is established between the matching-source-set disease term and the specific or non-specific disease term at the next level under the matching-target-set disease term, rather than an upward match.
Further, in the method, the mapping further includes: for disease terms that fail upward matching, downward matching each such disease term in the matching source set against the disease terms in the matching target set to find the downward-matched disease term pairs, wherein downward matching means that the connotation and extension of the disease term in the matching target set are smaller than those of the disease term in the matching source set.
Further, in the method, the downward matching follows these principles: by the proximity principle, the matching-target-set term that is exactly matched by the nearest-level subordinate disease term of the matching-source-set disease term is selected for downward matching; and when a disease term in the matching source set could establish a downward matching relationship with several disease terms in the matching target set, no downward matching is performed.
Further, in the method, the mapping further includes: for disease terms that fail downward matching, related matching each such disease term in the matching source set against the disease terms in the matching target set to find the related-matched disease term pairs, wherein related matching means that the disease term in the matching source set and the disease term in the matching target set each partially contain the connotation and extension of the other.
Further, in the method, the related matching follows these principles: if the disease term in the matching source set has child disease terms, it is related-matched to a non-specific disease term in the matching target set; if the disease term in the matching source set has no child disease terms, it is related-matched to a specific disease term in the matching target set; and, by the proximity principle, the specific or non-specific disease term is selected under the matching-target-set disease term that is exactly matched by the nearest-level superior disease term.
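The order in which the four matching types are tried (exact, then upward, then downward, then related, each applied only to terms the previous type failed to match) can be sketched as a simple fallback chain. This is a minimal illustration, not the patented implementation: the function `map_term`, the relation labels, and the toy lookup tables are all hypothetical, and the sketch ignores the special case in which the upward-matching rules themselves produce a related match.

```python
from typing import Callable, Optional, Sequence, Tuple

# A matcher takes a source term and returns a target term, or None on failure.
Matcher = Callable[[str], Optional[str]]


def map_term(term: str,
             matchers: Sequence[Tuple[str, Matcher]]) -> Optional[Tuple[str, str]]:
    """Try each matcher in order; return (relation, target) for the first hit."""
    for relation, matcher in matchers:
        target = matcher(term)
        if target is not None:
            return relation, target
    return None  # the term stays unmapped


# Toy lookup tables standing in for real matchers (hypothetical data).
exact_table = {"s1": "t1"}
broad_table = {"s2": "t2"}
narrow_table = {"s3": "t3"}
related_table = {"s4": "t4"}

# Order matters: a later matcher is consulted only if all earlier ones fail.
cascade = [
    ("exact", exact_table.get),
    ("broad", broad_table.get),
    ("narrow", narrow_table.get),
    ("related", related_table.get),
]

results = {s: map_term(s, cascade) for s in ["s1", "s2", "s3", "s4", "s5"]}
```

Here `results["s5"]` is `None`, which corresponds to a source term left for manual handling.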
To achieve the above technical object, in another aspect the invention discloses a device for constructing a traditional medical disease ontology, including: a classification framework building unit for building an ontology classification framework; and a mapping unit for mapping the set of traditional medical disease terms belonging to each classification to a reference set of traditional medical disease terms, wherein the set of traditional medical disease terms belonging to the ontology classification is taken as the matching source set and the reference set is taken as the matching target set. The mapping unit comprises an exact matching module for exactly matching each disease term in the matching source set against each disease term in the matching target set to find the exactly matched disease term pairs, wherein exact matching means that the disease term in the matching source set is semantically equivalent to the disease term in the matching target set.
To achieve the above technical object, in yet another aspect, the present invention discloses a computing device. The computing device includes: one or more processors, and a memory coupled with the one or more processors, the memory storing instructions that, when executed by the one or more processors, cause the one or more processors to perform the above-described method.
To achieve the above technical objects, in yet another aspect, the present invention discloses a machine-readable storage medium. The machine-readable storage medium stores executable instructions that, when executed, cause the machine to perform the above-described method.
The beneficial effects of the invention are as follows:
The method and the device for constructing a traditional medical disease ontology are oriented to classification statistics. Drawing on the successful experience of typical biomedical ontologies at home and abroad, in particular the Disease Ontology, they construct the traditional medical disease ontology automatically by reusing the TCM disease names and content structure of the existing national standard. They also establish a cross-mapping between the ontology and the reference set of traditional medical disease terms; disease classifications and codes, surgical operation classifications and codes, the medical record front page, and medical terminology are important foundations for promoting the standardization of medical services and their management. The semantic association between the traditional medical disease term set in the ontology and the reference set is established automatically, and the matching result is obtained quickly and accurately.
Drawings
In the drawings:
FIG. 1 is a structural example of the traditional medical disease ontology provided in Embodiment 1 of the present invention;
FIG. 2 is a flowchart of the method for constructing a traditional medical disease ontology provided in Embodiment 2 of the present invention;
FIGS. 3A, 3B, 3C and 3D are four example diagrams of exact matching in Embodiment 2 of the present invention;
FIG. 4 is an example diagram of upward matching in Embodiment 2 of the present invention;
FIG. 5 is an example diagram of downward matching in Embodiment 2 of the present invention;
FIG. 6 is an example diagram of related matching in Embodiment 2 of the present invention;
FIG. 7 is a schematic structural diagram of the device for constructing a traditional medical disease ontology provided in Embodiment 3 of the present invention;
FIG. 8 is a block diagram of a computing device for the construction of a traditional medical disease ontology according to an embodiment of the present invention.
Detailed Description
The method and device for constructing a traditional medical disease ontology provided by the present invention are explained in detail below with reference to the drawings.
First, the national standard "Clinical Diagnosis and Treatment Terminology of Traditional Chinese Medicine" (disease part) and its content structure are briefly introduced.
The national standard "Clinical Diagnosis and Treatment Terminology of Traditional Chinese Medicine" (disease part) was first released in 1997. In 2017, the National Administration of Traditional Chinese Medicine revised the standard on the basis of a comprehensive review of its application, producing the "Clinical Diagnosis and Treatment Terminology of Traditional Chinese Medicine" (disease part, 2017 edition) (draft for comment). The revised draft includes 17 major categories, such as exogenous diseases, parasitic diseases, poisoning and accidental injury diseases, visceral diseases and related diseases, children's diseases, eye diseases, and symptom terms for provisional diagnosis, with 1356 traditional medical disease names in total. It mainly resolves the inconsistency between some disease terms in the original standard and in the national standard "Classification and Codes of Diseases of Traditional Chinese Medicine" on the one hand and the classification used by the international disease classification system on the other, and it is notable for unambiguous disease names, standardized definitions, and easy lookup of synonyms and near-synonyms. Table 1 shows some of the disease names in the revised draft; its main contents are the classification code (i.e., the hierarchy), the traditional medical disease name in Chinese and English, the definition, and the search terms (i.e., synonyms and near-synonyms).
TABLE 1 partial contents and structural examples of the national Standard "Chinese medicine clinical diagnosis and treatment terminology" (disease part)
The traditional medical disease ontology provided in Embodiment 1 of the present invention is described below.
This embodiment draws fully on the experience of constructing the Disease Ontology. By reusing the content and structure of the "Clinical Diagnosis and Treatment Terminology of Traditional Chinese Medicine" (disease part, 2017 edition) (draft for comment), it builds the classes of the traditional medical disease ontology, their hierarchy, and class attributes such as labels (label), synonyms (has_exact_synonym), definitions (definition), and cross-mappings to other vocabularies (database_cross_reference). Internationalized Resource Identifiers (IRIs) serve as globally unique identifiers for ontology entities and facilitate interaction and reuse among different ontologies. An IRI in the "traditional medical disease ontology" takes the form "TCMO_" followed by seven digits, with the digits increasing from 0000001; for example, the IRI of "diseases caused by exogenous diseases" is "TCMO_0000774". The class hierarchy is established from the classification codes of the national standard disease names, and the Chinese and English class labels, synonyms, and definitions reuse the Chinese and English disease names, search terms, and definitions, respectively. In addition, the "traditional medical disease ontology" records cross-mappings to the national standard and to ICD-11. For example, the mapping of "exogenous seasonal sickness" to the national standard is expressed as "database_cross_reference GB/T2017:A01.01", and its cross-mapping to ICD-11 is expressed as "database_cross_reference ICD11-ZH:SE2Z". FIG. 1 shows the basic content structure of the "traditional medical disease ontology" according to Embodiment 1 and its correspondence with the national standard and the Chinese version of ICD-11.
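The identifier and cross-reference formats described above can be illustrated with two small helpers. This is a sketch under stated assumptions: the function names `make_iri` and `split_xref` are hypothetical, and the cross-reference strings are written without the stray spaces that appear in the translated text.

```python
from typing import Tuple


def make_iri(sequence_number: int) -> str:
    """Format a sequence number as 'TCMO_' followed by seven zero-padded digits."""
    if not 1 <= sequence_number <= 9_999_999:
        raise ValueError("sequence number must fit in seven digits and start at 1")
    return f"TCMO_{sequence_number:07d}"


def split_xref(xref: str) -> Tuple[str, str]:
    """Split a database_cross_reference value into (vocabulary prefix, code)."""
    prefix, sep, code = xref.partition(":")
    if not sep or not prefix or not code:
        raise ValueError(f"not a 'prefix:code' cross-reference: {xref!r}")
    return prefix, code


first_iri = make_iri(1)          # numbering starts at 0000001
example_iri = make_iri(774)      # the IRI quoted in the text
national_xref = split_xref("GB/T2017:A01.01")
icd_xref = split_xref("ICD11-ZH:SE2Z")
```

Keeping the vocabulary prefix separate from the code makes it straightforward to group an ontology class's mappings by target vocabulary.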
Fig. 2 is a flowchart of a method for constructing a traditional medical disease ontology according to embodiment 2 of the present invention.
As shown in FIG. 2, in step S210 an ontology classification framework is built. The framework is built on the basis of an existing disease classification system.
In step S220, the set of traditional medical disease terms belonging to each classification is mapped to a reference set of traditional medical disease terms. With the set of traditional medical disease terms belonging to the ontology classification taken as the matching source set and the reference set taken as the matching target set, step S220 includes the following process: each disease term in the matching source set is exactly matched against each disease term in the matching target set to find the exactly matched disease term pairs, where exact matching means that the disease term in the matching source set is semantically equivalent to the disease term in the matching target set.
The exact matching may include at least one of the following: the disease term in the matching source set is identical to the disease term in the matching target set; the disease term in the matching source set has the same main body as the disease term in the matching target set; a synonym of a disease term in one of the two sets is identical to a disease term in the other set; and the disease term in the matching source set is synonymous with the disease term in the matching target set.
Taking the Chinese version of the International Classification of Diseases-11 (ICD-11) as the reference set of traditional medical disease terms, the mapping between the traditional medical disease ontology and the Chinese version of ICD-11 is carried out by computer processing, which can be supplemented by manual review, to establish the semantic association between the two. The former is the matching source set, shown in the figures as source Table A (1356 TCM disease names in total); the latter is the matching target set, shown in the figures as target Table B (251 classification names related to traditional medical diseases).
Exact matching (Exact Match) indicates semantic equivalence, i.e., the connotation and extension of the traditional medical disease names in Table A and Table B are equal. In this method, each of the following cases counts as an exact match: (1) the disease terms in the two tables are identical, e.g., "insomnia" (A04.01.12) in Table A and "insomnia" (SD84) in Table B, as shown in FIG. 3A; (2) the disease name in Table A is identical to the classification name in Table B except for the suffix "disease", e.g., "pavor" (A04.01.09) and "pavor" (SA10), as shown in FIG. 3B; (3) a synonym of the disease name in Table A is identical to the classification name in Table B, e.g., "thoracic obstruction heartburn" (A04.01.01, synonym "thoracic obstruction") and "thoracic obstruction" (L4-SA2), as shown in FIG. 3C; (4) the wording differs but the meaning is the same, e.g., "visceral diseases and related diseases" (A04) and "visceral system diseases" (L2-SA0), as shown in FIG. 3D.
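The four exact-match cases can be sketched as an ordered series of checks. This is an illustrative sketch only: `exact_match_case` and its parameters are hypothetical names, the English suffix " disease" stands in for the Chinese suffix, and case (4) cannot be decided by string comparison, so the sketch assumes a curated set of equivalent name pairs is supplied.

```python
from typing import Collection, FrozenSet, Optional, Tuple


def strip_suffix(name: str, suffix: str) -> str:
    """Drop a trailing suffix if present (case 2 compares names without it)."""
    if suffix and name.endswith(suffix):
        return name[: -len(suffix)]
    return name


def exact_match_case(
    src: str,
    tgt: str,
    src_synonyms: Collection[str] = (),
    suffix: str = " disease",
    equivalents: FrozenSet[Tuple[str, str]] = frozenset(),
) -> Optional[int]:
    """Return which exact-match case (1 to 4) applies, or None if none does."""
    if src == tgt:                                    # (1) identical names
        return 1
    if strip_suffix(src, suffix) == strip_suffix(tgt, suffix):
        return 2                                      # (2) differ only by suffix
    if tgt in src_synonyms:                           # (3) synonym equals target
        return 3
    if (src, tgt) in equivalents:                     # (4) curated same-meaning pair
        return 4
    return None


case1 = exact_match_case("insomnia", "insomnia")
case2 = exact_match_case("example disease", "example")  # placeholder names
case3 = exact_match_case(
    "thoracic obstruction heartburn",
    "thoracic obstruction",
    src_synonyms=["thoracic obstruction"],
)
case4 = exact_match_case(
    "visceral diseases and related diseases",
    "visceral system diseases",
    equivalents=frozenset(
        {("visceral diseases and related diseases", "visceral system diseases")}
    ),
)
no_case = exact_match_case("insomnia", "jaundice")
```

The checks run from the cheapest and most certain comparison to the one that depends on curated knowledge, so a pair is labelled with the strongest evidence available.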
In this embodiment, step S220 may further include the following step: for disease terms that fail exact matching, each such disease term in the matching source set is upward matched against the disease terms in the matching target set to find the upward-matched disease term pairs, where upward matching means that the connotation and extension of the disease term in the matching target set are larger than those of the disease term in the matching source set. The upward matching may follow these principles: by the proximity principle, when several superior disease terms of a disease term in the matching source set are exactly matched to disease terms in the matching target set, the matching-target-set disease term matched by the superior disease term of the nearest level is selected as the match; and when the level difference between the matching source set and the matching target set is 1 and specific and/or non-specific disease terms exist among the next-level disease terms of the matching source set, a related match is established between the matching-source-set disease term and the specific or non-specific disease term at the next level under the matching-target-set disease term, rather than an upward match.
Still taking the Chinese version of ICD-11 as the matching target set, upward matching (Broad Match) denotes a containment relationship in which the connotation and extension of the classification name in Table B are larger than those of the disease name in Table A. Upward matching follows two principles: (1) by the proximity principle, when several superior disease names of a disease name in Table A are exactly matched to classification names in Table B, the Table B classification name matched by the superior disease name of the nearest level is selected; (2) to stay closer to the nature of the disease, when the level difference between Table A and Table B is 1 and specific and/or non-specific classification names exist at the next level in Table B, a related match is established between the Table A class and the specific or non-specific classification name under the Table B class, rather than an upward match. For example, "yang yellow disease" (A04.02.03.01) in Table A is a level-4 disease name with no exact match. Its level-3 superior, "jaundice" (A04.02.03), is exactly matched to "jaundice" (SA01) in Table B; its level-2 superior, "hepatic disease" (A04.02), is exactly matched to "hepatic disease class" (L3-SA0) in Table B; and there are no specific or non-specific classification names under "jaundice" (SA01). By the proximity principle, an upward matching relationship is therefore established between "yang yellow disease" (A04.02.03.01) and "jaundice" (SA01), as shown in FIG. 4.
Here, non-specific, or "unspecified" (NOS), is a special classification name used when the specific subtype cannot be determined because the information available during actual coding is incomplete. "Other specific" categories exist because classification systems such as the ICD do not enumerate every classification name: only a few common "specific" categories are listed, and any other subtype that can be determined from the actual information is assigned to the "other specific" category.
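The upward matching principles above can be sketched as follows. This is a minimal illustration, not the patented implementation. It assumes that the source hierarchy is encoded in dotted codes (as in the Table A examples), that the exact matches are already available as a dictionary, and that "level difference 1" means the matched superior term is the immediate parent. The names `ancestors`, `broad_match`, `exact`, and `residuals` are hypothetical.

```python
from typing import Dict, Iterator, List, Optional, Tuple


def ancestors(code: str) -> Iterator[str]:
    """Yield superior codes nearest-first: 'A04.02.03.01' -> 'A04.02.03', 'A04.02', 'A04'."""
    parts = code.split(".")
    for n in range(len(parts) - 1, 0, -1):
        yield ".".join(parts[:n])


def broad_match(
    code: str,
    exact: Dict[str, str],
    residuals: Dict[str, List[str]],
) -> Optional[Tuple[str, str]]:
    """Return ('broad', target) or ('related', target), or None if no superior term matched.

    exact     -- assumed mapping: source code -> exactly matched target code
    residuals -- assumed mapping: target code -> its other-specific/unspecified children
    """
    for distance, superior in enumerate(ancestors(code), start=1):
        target = exact.get(superior)
        if target is None:
            continue  # this superior term has no exact match; try the next one up
        # Principle (2): immediate parent matched and the target has residual
        # children, so defer to correlation matching under that target.
        if distance == 1 and residuals.get(target):
            return ("related", target)
        # Principle (1): the nearest exactly matched superior term wins.
        return ("broad", target)
    return None


# The worked example from the text: no residual categories under SA01.
exact = {"A04.02.03": "SA01", "A04.02": "L3-SA0"}
print(broad_match("A04.02.03.01", exact, residuals={}))  # ('broad', 'SA01')
```

Returning the relation kind alongside the target keeps the hand-off to correlation matching explicit; which residual child is finally chosen is left to the correlation matching rules.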
In this embodiment, step S220 may further include the following step: for disease terms that fail upward matching, each disease term in the matching source set is matched downward against the disease terms in the matching target set to find the disease term pairs for which downward matching succeeds, where downward matching means that the connotation and extension of the disease term in the matching target set are narrower than those of the disease term in the matching source set. Downward matching may follow these principles: applying the proximity principle, the matching target set term that is exactly matched by the lower-level disease term, closest in level, of the matching source set disease term is selected for downward matching; and when a disease term in the matching source set could establish a downward matching relationship with several disease terms in the matching target set, no downward match is made.
Still taking the Chinese version of the International Classification of Diseases-11 as the matching target set, downward matching (Narrow Match) denotes that the connotation and extension of the classification name in Table B are narrower than those of the disease name in Table A. In this embodiment, downward matching follows these principles: applying the proximity principle, the Table B classification name that is exactly matched by the lower-level disease name, closest in level, of the Table A disease name is selected for downward matching; and when a Table A category could establish a downward matching relationship with several Table B classification names, no downward match is made. For example, the disease name "blackessence" (A11.01.04) in Table A has no exact match, while only one disease name at the next level down, "mixed eye" (A11.01.04.11), can be exactly matched, to "mixed eye" (SC74) in Table B; a downward matching relationship is therefore established between "blackessence" (A11.01.04) and "mixed eye" (A11.01.04.11), as shown in fig. 5.
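A minimal sketch of the downward matching rule follows, under stated assumptions: dotted source codes encode the hierarchy, exact matches are given as a dictionary, and "several candidates" is interpreted as more than one distinct target term at the nearest matched level. The function returns the target-set term reached through the matched child (SC74 in the example). The name `narrow_match` and its parameters are hypothetical.

```python
from typing import Dict, Iterable, Optional, Set


def narrow_match(
    code: str,
    all_source_codes: Iterable[str],
    exact: Dict[str, str],
) -> Optional[str]:
    """Return the target code for a downward match, or None if absent or ambiguous."""
    prefix = code + "."  # the trailing dot avoids matching siblings such as 'A11.01.040'
    targets_by_depth: Dict[int, Set[str]] = {}
    for candidate in all_source_codes:
        if candidate.startswith(prefix) and candidate in exact:
            depth = candidate.count(".") - code.count(".")
            targets_by_depth.setdefault(depth, set()).add(exact[candidate])
    if not targets_by_depth:
        return None  # no lower-level term has an exact match
    nearest = targets_by_depth[min(targets_by_depth)]  # proximity principle
    # Several distinct targets at the nearest level: do not match downward.
    return next(iter(nearest)) if len(nearest) == 1 else None


source_codes = ["A11.01.04", "A11.01.04.11"]
exact = {"A11.01.04.11": "SC74"}
print(narrow_match("A11.01.04", source_codes, exact))  # SC74
```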
In this embodiment, step S220 may further include the following step: for disease terms that fail downward matching, each disease term in the matching source set is correlation-matched against the disease terms in the matching target set to find the disease term pairs for which correlation matching succeeds, where correlation matching means that the connotation and extension of the disease term in the matching source set and of the disease term in the matching target set each partially contain the other. Correlation matching follows these principles: if the disease term in the matching source set has child disease terms, it is correlation-matched to a non-specific disease term in the matching target set; if the disease term in the matching source set has no child disease terms, it is correlation-matched to a specific disease term in the matching target set; and, applying the proximity principle, the specific or non-specific disease term is selected from under the matching target set disease term that is exactly matched by the superior disease term closest in level.
Still taking the Chinese version of the International Classification of Diseases-11 as the matching target set, correlation matching (Related Match) denotes an intersection relationship: the connotation and extension of the Table A and Table B terms each partially contain the other. In this example, Table A contains temporary diagnostic syndrome terms; some of these disease names can be exactly matched to classification names in Table B, and the remaining terms that fail exact matching are treated as correlation matches. In addition, correlation matching follows these principles: if the disease name in Table A has child disease names, it is correlation-matched to an unspecified classification name in Table B; if the Table A category has no child disease names, it is correlation-matched to an other specific classification name; and, applying the proximity principle, the other specific or unspecified classification name is selected from under the Table B category that is exactly matched by the superior disease name closest in level. For example, Table A contains "New Cold" (A01.03.01.01), which has no child disease names. The disease name one level above it is "Warm" (A01.03.01) and the disease name three levels above it is "external Cold" (A01); these are exactly matched with "Warm" (L3-SE0) and "external Cold" (L2-SD9) in Table B, respectively. Under "Warm" (L3-SE0) there are "other specific Cold" (SE0Y) and "unspecified Warm" (SE0Z), and under "external Cold" (L2-SD9) there are "other specific Cold" (SE2Y) and "unspecified Cold" (SE2Z). Because "New Cold" (A01.03.01.01) has no child disease names, an other specific classification name should be selected, and "other specific Cold" (SE0Y) is the one closest in level, so a correlation match is established between the two, as shown in fig. 6.
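The correlation matching rules can be illustrated with the sketch below. It assumes dotted source codes, a dictionary of exact matches, and a dictionary listing the residual ("other specific" and "unspecified") child of each target category. Falling through to a higher superior term when the nearest one lacks the wanted residual child is an assumption of this sketch, not something the text states. All function and parameter names are hypothetical.

```python
from typing import Dict, Iterable, Iterator, Optional


def ancestors(code: str) -> Iterator[str]:
    """Yield superior codes nearest-first: 'A01.03.01.01' -> 'A01.03.01', 'A01.03', 'A01'."""
    parts = code.split(".")
    for n in range(len(parts) - 1, 0, -1):
        yield ".".join(parts[:n])


def related_match(
    code: str,
    all_source_codes: Iterable[str],
    exact: Dict[str, str],
    residuals: Dict[str, Dict[str, str]],
) -> Optional[str]:
    """Pick the residual target category for a term that has no other match."""
    has_children = any(c.startswith(code + ".") for c in all_source_codes)
    # Terms with children map to "unspecified"; leaf terms map to "other specific".
    wanted = "unspecified" if has_children else "other_specific"
    for superior in ancestors(code):  # proximity principle: nearest first
        target = exact.get(superior)
        if target is not None and wanted in residuals.get(target, {}):
            return residuals[target][wanted]
    return None


# The worked example from the text.
source_codes = ["A01", "A01.03", "A01.03.01", "A01.03.01.01"]
exact = {"A01.03.01": "L3-SE0", "A01": "L2-SD9"}
residuals = {
    "L3-SE0": {"other_specific": "SE0Y", "unspecified": "SE0Z"},
    "L2-SD9": {"other_specific": "SE2Y", "unspecified": "SE2Z"},
}
print(related_match("A01.03.01.01", source_codes, exact, residuals))  # SE0Y
```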
In this embodiment, the four matching relationships above are established in the order exact matching, upward matching, downward matching, correlation matching. Alternatively, the order exact matching, downward matching, upward matching, correlation matching may be used.
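The ordering can be sketched as a simple cascade in which each stage is tried only when the previous ones fail. This is an illustrative sketch: the stage functions are stand-ins (plain dictionary lookups), and the names `map_term`, `Matcher`, and the relation labels are hypothetical.

```python
from typing import Callable, Optional, Sequence, Tuple

# A matcher takes a source code and returns a target code, or None on failure.
Matcher = Callable[[str], Optional[str]]


def map_term(code: str, stages: Sequence[Tuple[str, Matcher]]) -> Tuple[str, Optional[str]]:
    """Try each (relation, matcher) stage in order and return the first success."""
    for relation, matcher in stages:
        target = matcher(code)
        if target is not None:
            return relation, target
    return "unmatched", None


# Stand-in matchers built from lookup tables, using the codes from the examples.
exact = {"A04.02.03": "SA01"}.get
broad = {"A04.02.03.01": "SA01"}.get
narrow = {"A11.01.04": "SC74"}.get
related = {"A01.03.01.01": "SE0Y"}.get

default_order = [("exact", exact), ("broad", broad), ("narrow", narrow), ("related", related)]
alternative_order = [("exact", exact), ("narrow", narrow), ("broad", broad), ("related", related)]

print(map_term("A04.02.03.01", default_order))  # ('broad', 'SA01')
```

Passing the stage list as data makes the alternative order a one-line change rather than a second code path.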
Fig. 7 is a schematic structural diagram of a traditional medical disease ontology construction apparatus provided in embodiment 3 of the present invention. As shown in fig. 7, the traditional medical disease ontology construction apparatus 700 provided by this embodiment includes a classification frame building unit 710 and a mapping unit 720. The mapping unit 720 includes an exact matching module 721.
The classification frame building unit 710 is used for building an ontology classification frame. The operation of the classification frame building unit 710 may refer to the operation of step S210 described above with reference to fig. 2.
The mapping unit 720 is configured to implement mapping of the traditional medical condition term sets belonging to the respective classifications to the reference traditional medical condition term set, wherein the traditional medical condition term sets belonging to the ontology classification are used as matching source sets, and the reference traditional medical condition term set is used as a matching target set. The operation of the mapping unit 720 may refer to the operation of step S220 described above with reference to fig. 2.
The exact matching module 721 is configured to exactly match each disease term in the matching source set with each disease term in the matching target set, respectively, and find a pair of disease terms that are successfully matched, where exact matching means that the disease terms in the matching source set are semantically equivalent to the disease terms in the matching target set.
The exact matching may include at least one of the following matching modes: the disease term in the matching source set is identical to the disease term in the matching target set; the disease term in the matching source set is identical to the subject of the disease term in the matching target set; a synonym of a disease term in one of the matching source set and the matching target set is identical to a disease term in the other set; and the disease term in the matching source set is synonymous with the disease term in the matching target set.
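The four exact matching modes can be sketched as a single predicate. This is a hedged illustration rather than the patented implementation: the `Term` structure, its `subject` field (the core name of a target term with qualifiers removed), and the use of a shared synonym as a stand-in for "synonymous" are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Tuple


@dataclass(frozen=True)
class Term:
    code: str
    name: str
    synonyms: FrozenSet[str] = frozenset()
    subject: str = ""  # hypothetical: core name of the term, if recorded


def is_exact_match(src: Term, tgt: Term) -> bool:
    """Check the four modes in turn; any one is sufficient."""
    if src.name == tgt.name:  # mode 1: identical names
        return True
    if tgt.subject and src.name == tgt.subject:  # mode 2: same as the target's subject
        return True
    if tgt.name in src.synonyms or src.name in tgt.synonyms:  # mode 3: synonym equals term
        return True
    # mode 4 (approximation): the two terms share at least one synonym
    return bool(src.synonyms & tgt.synonyms)


def find_exact_pairs(source: Iterable[Term], target: Iterable[Term]) -> List[Tuple[str, str]]:
    """Compare every source term with every target term and keep the matching pairs."""
    target = list(target)
    return [(s.code, t.code) for s in source for t in target if is_exact_match(s, t)]


table_a = [Term("A04.02.03", "jaundice"), Term("A04.02", "hepatic disease")]
table_b = [
    Term("SA01", "jaundice"),
    Term("L3-SA0", "hepatic disease class", subject="hepatic disease"),
]
print(find_exact_pairs(table_a, table_b))
# [('A04.02.03', 'SA01'), ('A04.02', 'L3-SA0')]
```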
The mapping unit 720 may further include an upward matching module configured to, for disease terms that fail exact matching, match each disease term in the matching source set upward against the disease terms in the matching target set and find the disease term pairs for which upward matching succeeds, where upward matching means that the connotation and extension of the disease term in the matching target set are broader than those of the disease term in the matching source set. Upward matching may follow these principles: applying the proximity principle, when several superior disease terms of a disease term in the matching source set are exactly matched with disease terms in the matching target set, the matching target set term matched by the superior disease term closest in level is selected as the matching term; and when the level difference between the matching source set and the matching target set is 1 and specific and/or non-specific disease terms exist among the next-level disease terms of the matching source set, a correlation match is established between the matching source set disease term and the specific or non-specific disease term at the next level of the matching target set disease term, instead of an upward match.
The mapping unit 720 may further include a downward matching module configured to, for disease terms that fail upward matching, match each disease term in the matching source set downward against the disease terms in the matching target set and find the disease term pairs for which downward matching succeeds, where downward matching means that the connotation and extension of the disease term in the matching target set are narrower than those of the disease term in the matching source set. Downward matching may follow these principles: applying the proximity principle, the matching target set term that is exactly matched by the lower-level disease term, closest in level, of the matching source set disease term is selected for downward matching; and when a disease term in the matching source set could establish a downward matching relationship with several disease terms in the matching target set, no downward match is made.
The mapping unit 720 may further include a correlation matching module configured to, for disease terms that fail downward matching, correlation-match each disease term in the matching source set against the disease terms in the matching target set and find the disease term pairs for which correlation matching succeeds, where correlation matching means that the connotation and extension of the disease term in the matching source set and of the disease term in the matching target set each partially contain the other. Correlation matching follows these principles: if the disease term in the matching source set has child disease terms, it is correlation-matched to a non-specific disease term in the matching target set; if the disease term in the matching source set has no child disease terms, it is correlation-matched to a specific disease term in the matching target set; and, applying the proximity principle, the specific or non-specific disease term is selected from under the matching target set disease term that is exactly matched by the superior disease term closest in level.
Fig. 8 is a block diagram of a computing device for the construction process of a traditional medical disease ontology according to an embodiment of the present invention.
As shown in fig. 8, computing device 800 may include at least one processor 810, a memory 820, an internal memory 830, a communication interface 840, and an internal bus 850, with the at least one processor 810, the memory 820, the internal memory 830, and the communication interface 840 connected together via the bus 850. The at least one processor 810 executes at least one computer-readable instruction (i.e., an element described above as being implemented in software) stored or encoded in a computer-readable storage medium (i.e., the memory 820).
In one embodiment, the memory 820 stores computer-executable instructions that, when executed, cause the at least one processor 810 to: build an ontology classification frame; and map the traditional medical condition term sets belonging to the respective classifications to the reference traditional medical condition term set, wherein the traditional medical condition term set belonging to an ontology classification is taken as the matching source set and the reference traditional medical condition term set is taken as the matching target set. Implementing the mapping comprises: exactly matching each disease term in the matching source set against each disease term in the matching target set to find the disease term pairs that match successfully, where exact matching means that the disease term in the matching source set is semantically equivalent to the disease term in the matching target set.
It should be understood that the computer-executable instructions stored in the memory 820, when executed, cause the at least one processor 810 to perform the various operations and functions described above in connection with fig. 1-6 in the various embodiments of the present disclosure.
In the present disclosure, computing device 800 may include, but is not limited to: personal computers, server computers, workstations, desktop computers, laptop computers, notebook computers, mobile computing devices, smart phones, tablet computers, cellular phones, Personal Digital Assistants (PDAs), handsets, messaging devices, wearable computing devices, consumer electronics, and the like.
According to one embodiment, a program product, such as a non-transitory machine-readable medium, is provided. A non-transitory machine-readable medium may have instructions (i.e., elements described above as being implemented in software) that, when executed by a machine, cause the machine to perform various operations and functions described above in connection with fig. 1-6 in various embodiments of the present disclosure.
Specifically, a system or apparatus equipped with a readable storage medium may be provided, the readable storage medium storing software program code that implements the functions of any of the above embodiments, and a computer or processor of the system or apparatus is caused to read out and execute the instructions stored in the readable storage medium.
In this case, the program code itself read from the readable medium can realize the functions of any of the above-described embodiments, and thus the machine-readable code and the readable storage medium storing the machine-readable code form part of the present invention.
Examples of the readable storage medium include floppy disks, hard disks, magneto-optical disks, optical disks (e.g., CD-ROMs, CD-R, CD-RWs, DVD-ROMs, DVD-RAMs, DVD-RWs), magnetic tapes, nonvolatile memory cards, and ROMs. Alternatively, the program code may be downloaded from a server computer or from the cloud via a communications network.
The above description is only an embodiment of the present invention and is not intended to limit the scope of the claims. All equivalent structures or equivalent process transformations made using the content of the specification and the drawings, whether applied directly or indirectly in other related technical fields, are likewise included in the scope of the claims.
Claims (11)
1. A method for constructing a traditional medical disease ontology, characterized by comprising the following steps:
building an ontology classification frame;
implementing a mapping of the traditional medical disease term sets belonging to the respective classifications to a reference traditional medical disease term set, wherein
the traditional medical disease term set belonging to an ontology classification is taken as a matching source set and the reference traditional medical disease term set is taken as a matching target set, and implementing the mapping from the traditional medical disease term sets belonging to the respective classifications to the reference traditional medical disease term set comprises:
exactly matching each disease term in the matching source set against each disease term in the matching target set to find the disease term pairs that match successfully, wherein exact matching means that the disease term in the matching source set is semantically equivalent to the disease term in the matching target set.
2. The method for constructing the ontology of traditional medical diseases according to claim 1, wherein the exact matching comprises at least one of the following matching manners:
the disease terms in the matching source set are the same as the disease terms in the matching target set;
the disease term in the matching source set is the same as the subject of the disease term in the matching target set;
synonyms of disease terms in one set of the matching source set and the matching target set are the same as disease terms in the other set; and
the disease terms in the matching source set have the same connotation as the disease terms in the matching target set.
3. The method for constructing a traditional medical disease ontology according to claim 1, wherein implementing the mapping of the traditional medical disease term sets belonging to the respective classifications to the reference traditional medical disease term set further comprises: for disease terms that fail exact matching, matching each disease term in the matching source set upward against the disease terms in the matching target set to find the disease term pairs for which upward matching succeeds, wherein upward matching means that the connotation and extension of the disease term in the matching target set are broader than those of the disease term in the matching source set.
4. The method for constructing the ontology of traditional medical diseases according to claim 3, wherein the upward matching comprises the following principles:
applying the proximity principle, when several superior disease terms of a disease term in the matching source set are exactly matched with disease terms in the matching target set, the matching target set term matched by the superior disease term closest in level is selected as the matching term;
when the level difference between the matching source set and the matching target set is 1 and specific and/or non-specific disease terms exist among the next-level disease terms of the matching source set, a correlation match is established between the matching source set disease term and the specific or non-specific disease term at the next level of the matching target set disease term, instead of an upward match.
5. The method for constructing a traditional medical disease ontology according to claim 3, wherein implementing the mapping of the traditional medical disease term sets belonging to the respective classifications to the reference traditional medical disease term set further comprises: for disease terms that fail upward matching, matching each disease term in the matching source set downward against the disease terms in the matching target set to find the disease term pairs for which downward matching succeeds, wherein downward matching means that the connotation and extension of the disease term in the matching target set are narrower than those of the disease term in the matching source set.
6. The method for constructing the ontology of traditional medical diseases according to claim 5, wherein the downward matching comprises the following principles:
applying the proximity principle, the matching target set term that is exactly matched by the lower-level disease term, closest in level, of the matching source set disease term is selected for downward matching;
when the disease term in the matching source set can establish a down-matching relationship with a plurality of disease terms in the matching target set, the down-matching is not performed.
7. The method for constructing a traditional medical disease ontology according to claim 5, wherein implementing the mapping of the traditional medical disease term sets belonging to the respective classifications to the reference traditional medical disease term set further comprises: for disease terms that fail downward matching, correlation-matching each disease term in the matching source set against the disease terms in the matching target set to find the disease term pairs for which correlation matching succeeds, wherein correlation matching means that the connotation and extension of the disease term in the matching source set and of the disease term in the matching target set each partially contain the other.
8. The method for constructing the ontology of traditional medical diseases according to claim 7, wherein the correlation matching comprises the following principles:
if the disease term in the matching source set has child disease terms, it is correlation-matched to a non-specific disease term in the matching target set; if the disease term in the matching source set has no child disease terms, it is correlation-matched to a specific disease term in the matching target set;
applying the proximity principle, the specific or non-specific disease term is selected from under the matching target set disease term that is exactly matched by the superior disease term closest in level.
9. A device for constructing a traditional medical disease ontology, characterized by comprising:
a classification frame building unit for building an ontology classification frame;
a mapping unit for implementing the mapping of the traditional medical disease term sets belonging to the respective classifications to the reference traditional medical disease term set, wherein the traditional medical disease term sets belonging to the ontology classification are used as the matching source set, the reference traditional medical disease term set is used as the matching target set,
the mapping unit comprises an exact matching module configured to exactly match each disease term in the matching source set against each disease term in the matching target set to find the disease term pairs that match successfully, wherein exact matching means that the disease term in the matching source set is semantically equivalent to the disease term in the matching target set.
10. A computing device, comprising:
one or more processors, and
a memory coupled with the one or more processors, the memory storing instructions that, when executed by the one or more processors, cause the one or more processors to perform the method of any of claims 1-8.
11. A machine-readable storage medium having stored thereon executable instructions that, when executed, cause the machine to perform the method of any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011222616.5A CN112445917A (en) | 2020-11-05 | 2020-11-05 | Method and device for constructing traditional medical disease body |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112445917A (en) | 2021-03-05 |
Family
ID=74735854
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011222616.5A Pending CN112445917A (en) | 2020-11-05 | 2020-11-05 | Method and device for constructing traditional medical disease body |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112445917A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113221543A (en) * | 2021-05-07 | 2021-08-06 | 中国医学科学院医学信息研究所 | Medical term integration method and system |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105069124A (en) * | 2015-08-13 | 2015-11-18 | 易保互联医疗信息科技(北京)有限公司 | Automatic ICD (International Classification of Diseases) coding method and system |
CN105574103A (en) * | 2015-12-11 | 2016-05-11 | 浙江大学 | Method and system for automatically establishing medical term mapping relationship based on word segmentation and coding |
CN106919793A (en) * | 2017-02-24 | 2017-07-04 | 黑龙江特士信息技术有限公司 | A kind of data standardization processing method and device of medical big data |
CN110096635A (en) * | 2019-04-17 | 2019-08-06 | 广东技术师范大学 | A kind of the inquiry visual display method and device of traditional Chinese and western medicine medicine information |
CN111797207A (en) * | 2020-07-14 | 2020-10-20 | 山东健康医疗大数据有限公司 | Method for realizing hospital diagnosis data standardization |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113221543A (en) * | 2021-05-07 | 2021-08-06 | 中国医学科学院医学信息研究所 | Medical term integration method and system |
CN113221543B (en) * | 2021-05-07 | 2023-10-10 | 中国医学科学院医学信息研究所 | Medical term integration method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20210305 |