CN112445917A - Method and device for constructing a traditional medical disease ontology - Google Patents

Method and device for constructing a traditional medical disease ontology

Info

Publication number
CN112445917A
Authority
CN
China
Prior art keywords
disease
matching
terms
term
traditional medical
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011222616.5A
Other languages
Chinese (zh)
Inventor
朱彦
陈超
刘静
贾李蓉
高博
刘丽红
聂莹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute Of Information On Traditional Chinese Medicine Cacms
Original Assignee
Institute Of Information On Traditional Chinese Medicine Cacms
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute Of Information On Traditional Chinese Medicine Cacms
Priority to CN202011222616.5A
Publication of CN112445917A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention discloses a method and a device for constructing a traditional medical disease ontology. The method comprises: building an ontology classification framework; and mapping the set of traditional medical disease terms belonging to each classification to a reference set of traditional medical disease terms. The set of traditional medical disease terms belonging to the ontology classifications is taken as the matching source set and the reference set as the matching target set, and the mapping comprises exactly matching each disease term in the matching source set against each disease term in the matching target set, where exact matching means that a disease term in the matching source set is semantically equivalent to a disease term in the matching target set. The method and the device are oriented to classification statistics and construct a traditional medical disease ontology automatically; they also establish a cross-mapping between the ontology and the reference set of traditional medical disease terms, which is a basis for promoting the standardized management of medical services.

Description

Method and device for constructing a traditional medical disease ontology
Technical Field
The invention relates to information processing technology and traditional Chinese medicine, and in particular to a method and a device for constructing a traditional medical disease ontology.
Background
Traditional Chinese medicine is a medical science that originated in China; it reflects the strengths and characteristics of Chinese medical science and is an important component of the world's traditional medicine and of China's national culture. Its syndrome differentiation and disease differentiation, its theory and clinical practice, and its guidelines and standards put into practice the inheritance of its essence and innovation on that foundation. The development of traditional Chinese medicine receives close attention at home and abroad, and the construction of a traditional Chinese medicine standard system is being actively promoted. Since 1995, the Chinese government has successively issued national standards such as "Clinical Diagnosis and Treatment Terminology of Traditional Chinese Medicine" (disease part), and in 2009 the World Health Organization accepted traditional medical terms, represented by those of traditional Chinese medicine, into the International Classification of Diseases (ICD), which promoted the integration of traditional and modern medicine. However, these national standards take the form of semi-structured data and have no cross-mapping to ICD classification codes. Constructing a traditional medical disease system that both humans and machines can understand, and unifying it with the ICD at the semantic level, therefore facilitates the classification statistics of traditional Chinese medicine diseases and, to some extent, supports smart healthcare construction and the Healthy China policy.
The term ontology originated in philosophy: it was coined in the 17th century from ancient Greek roots and means "a systematic description of objective existence in the world", i.e., the theory of being. In information science, outside philosophy, the generally accepted definition is the one proposed by Studer et al. in 1998, "an ontology is an explicit, formal specification of a shared conceptualization", which captures four hallmarks of an ontology: it is shared, explicit, conceptual, and formal. Specifically, an ontology describes the important concepts of a field and the semantic relationships between them, and these concepts and relationships must be agreed upon and clearly defined. In addition, an ontology uses an international formal language, the Web Ontology Language (OWL), to describe domain concepts and their semantic relationships in a standardized way, so that the knowledge can be understood by both humans and machines, removing obstacles to information transfer and communication between them. Because of these characteristics, ontologies supply machine-understandable domain knowledge for applications such as artificial intelligence; ontologies and knowledge graphs have become the two cores of knowledge organization and intelligent application in knowledge engineering.
The biomedical field has been at the forefront of ontology research. Since the advent of the Gene Ontology (GO) in the 1990s, the construction and application of biomedical ontologies have received much attention, producing a group of highly influential results, including the Disease Ontology (DO), the Human Phenotype Ontology (HPO), and the Ontology of Adverse Events (OAE). At present, biomedical ontologies are mainly applied to aligning and integrating terms across databases, to basic medical research, and to developing intelligent decision systems. For example, integrating the disease names in a disease ontology with those in a rat gene database can effectively improve the annotation of disease names across species; some scholars carry out gene-ontology-based research such as microarray analysis and gene function prediction using purpose-built software such as ChipInfo; and researchers construct special-purpose ontologies for different application requirements and then develop intelligent decision systems for disease diagnosis and treatment, disease risk assessment, and the like.
The ICD is currently the authoritative, internationally used tool for classification and grouped statistics, mainly of diseases and causes of death. On June 18, 2018, the World Health Organization released the latest version, ICD-11; its Chinese version was officially released by the National Health Commission on December 21 of the same year for use throughout the country. The Chinese version of ICD-11 comprises 28 chapters. Chapter 26 covers traditional medical disease terms and is divided into two subsections, traditional medical diseases and traditional medical patterns (syndromes). It is significant for standardizing the disease classification and coding used by medical institutions, improving the standard system of Chinese and Western medical terminology, raising the level of standardization of medical services and the efficiency of medical management, and promoting the exchange of diagnosis and treatment information.
Disclosure of Invention
The invention provides a method and a device for constructing a traditional medical disease ontology, which automatically perform semantic matching between the traditional medical disease ontology and an existing system of traditional medical disease terms and obtain matching results quickly and accurately.
To achieve the above technical object, in one aspect the invention discloses a method for constructing a traditional medical disease ontology. The method comprises: building an ontology classification framework; and mapping the set of traditional medical disease terms belonging to each classification to a reference set of traditional medical disease terms. The set of traditional medical disease terms belonging to the ontology classifications is taken as the matching source set and the reference set as the matching target set, and the mapping comprises: exactly matching each disease term in the matching source set against each disease term in the matching target set to find the disease term pairs that are exactly matched, where exact matching means that a disease term in the matching source set is semantically equivalent to a disease term in the matching target set.
Further, in the method for constructing a traditional medical disease ontology, the exact matching includes at least one of the following: a disease term in the matching source set is identical to a disease term in the matching target set; the main body of a disease term in the matching source set is identical to that of a disease term in the matching target set; a synonym of a disease term in one of the two sets is identical to a disease term in the other set; and a disease term in the matching source set has the same meaning as a disease term in the matching target set.
Further, in the method for constructing a traditional medical disease ontology, mapping the set of traditional medical disease terms belonging to each classification to the reference set of traditional medical disease terms further includes: for disease terms that fail exact matching, matching the disease terms in the matching source set upward against the disease terms in the matching target set to find the disease term pairs that are upward matched, where upward matching means that the connotation and extension of the disease term in the matching target set are larger than those of the disease term in the matching source set.
Further, in the method for constructing a traditional medical disease ontology, the upward matching follows these principles: under the proximity principle, when several superior disease terms of a disease term in the matching source set are exactly matched to disease terms in the matching target set, the target-set term matched by the superior term of the closest level is selected as the matching term; and when the level difference between the matching source set and the matching target set is 1 and specified and/or unspecified disease terms exist among the next-level disease terms of the matching target set, a related match is established between the source-set disease term and the specified or unspecified disease term at the next level of the target-set disease term, rather than an upward match.
Further, in the method for constructing a traditional medical disease ontology, mapping the set of traditional medical disease terms belonging to each classification to the reference set of traditional medical disease terms further includes: for disease terms that fail upward matching, matching the disease terms in the matching source set downward against the disease terms in the matching target set to find the disease term pairs that are downward matched, where downward matching means that the connotation and extension of the disease term in the matching target set are smaller than those of the disease term in the matching source set.
Further, in the method for constructing a traditional medical disease ontology, the downward matching follows these principles: under the proximity principle, the target-set term that is exactly matched by the inferior disease term closest in level to the source-set disease term is selected for the downward match; and no downward match is made when a disease term in the matching source set could establish a downward matching relationship with several disease terms in the matching target set.
Further, in the method for constructing a traditional medical disease ontology, mapping the set of traditional medical disease terms belonging to each classification to the reference set of traditional medical disease terms further includes: for disease terms that fail downward matching, performing related matching between the disease terms in the matching source set and the disease terms in the matching target set to find the disease term pairs that are related matched, where related matching means that the disease term in the matching source set and the disease term in the matching target set each partially contain the connotation and extension of the other.
Further, in the method for constructing a traditional medical disease ontology, the related matching follows these principles: if a disease term in the matching source set has child disease terms, it is related-matched to an unspecified disease term of the matching target set; if it has no child disease terms, it is related-matched to a specified disease term of the matching target set; and under the proximity principle, the specified or unspecified term is chosen from under the target-set disease term matched by the exactly matched superior disease term of the closest level.
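The four strategies above form a cascade: each later strategy is applied only to terms that the earlier ones failed to match. A minimal Python sketch of that control flow follows. The function name `cascade` and the lookup-table matchers are hypothetical stand-ins introduced here for illustration; only the three code pairs are taken from the examples in the detailed description.

```python
from typing import Callable, Optional

# A matcher takes a source term code and returns a target code, or None on failure.
Matcher = Callable[[str], Optional[str]]

def cascade(source_codes: list[str],
            matchers: list[tuple[str, Matcher]]) -> dict[str, tuple[str, str]]:
    """Try each matcher in order and record the first one that succeeds."""
    results: dict[str, tuple[str, str]] = {}
    for code in source_codes:
        for kind, matcher in matchers:
            target = matcher(code)
            if target is not None:
                results[code] = (kind, target)
                break  # later strategies apply only after a failure
    return results

# Toy matchers backed by lookup tables (assumption: real matchers would
# implement the exact, upward, downward and related rules themselves).
exact = {"A04.01.12": "SD84"}.get
broad = {"A04.02.03.01": "SA01"}.get
narrow = {"A11.01.04": "SC74"}.get
related = lambda code: None  # placeholder that never matches

mapping = cascade(
    ["A04.01.12", "A04.02.03.01", "A11.01.04"],
    [("exact", exact), ("broad", broad), ("narrow", narrow), ("related", related)],
)
```

Recording the strategy name alongside the target code keeps the kind of semantic relation available for later manual review.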
To achieve the above technical object, in another aspect the invention discloses a device for constructing a traditional medical disease ontology, including: a classification framework building unit for building an ontology classification framework; and a mapping unit for mapping the set of traditional medical disease terms belonging to each classification to a reference set of traditional medical disease terms. The set of traditional medical disease terms belonging to the ontology classifications is taken as the matching source set and the reference set as the matching target set. The mapping unit comprises an exact matching module for exactly matching each disease term in the matching source set against each disease term in the matching target set to find the disease term pairs that are exactly matched, where exact matching means that a disease term in the matching source set is semantically equivalent to a disease term in the matching target set.
To achieve the above technical object, in yet another aspect the invention discloses a computing device. The computing device includes one or more processors and a memory coupled to the one or more processors, the memory storing instructions that, when executed by the one or more processors, cause the one or more processors to perform the above method.
To achieve the above technical object, in yet another aspect the invention discloses a machine-readable storage medium. The machine-readable storage medium stores executable instructions that, when executed, cause the machine to perform the above method.
The invention has the following beneficial effects:
The method and the device for constructing a traditional medical disease ontology are oriented to classification statistics. Drawing fully on the successful experience of typical biomedical ontologies at home and abroad, in particular the Disease Ontology, they construct the traditional medical disease ontology automatically by reusing the traditional Chinese medicine disease names and content structure of the existing national standard. They also establish a cross-mapping between the ontology and the reference set of traditional medical disease terms; disease classification and codes, surgical operation classification and codes, the medical record front page, and medical terminology are important foundations for standardizing medical services and their management. The semantic association between the set of traditional medical disease terms in the ontology and the reference set of traditional medical disease terms is established automatically, and matching results are obtained quickly and accurately.
Drawings
In the drawings:
FIG. 1 is a structural example of the traditional medical disease ontology provided in Embodiment 1 of the present invention;
FIG. 2 is a flowchart of the method for constructing a traditional medical disease ontology according to Embodiment 2 of the present invention;
FIGS. 3A, 3B, 3C and 3D are four examples of exact matching in Embodiment 2 of the present invention;
FIG. 4 is an example of upward matching in Embodiment 2 of the present invention;
FIG. 5 is an example of downward matching in Embodiment 2 of the present invention;
FIG. 6 is an example of related matching in Embodiment 2 of the present invention;
FIG. 7 is a schematic structural diagram of the device for constructing a traditional medical disease ontology provided in Embodiment 3 of the present invention;
FIG. 8 is a block diagram of a computing device for constructing a traditional medical disease ontology according to an embodiment of the present invention.
Detailed Description
The method and device for constructing a traditional medical disease ontology provided by the present invention are explained in detail below with reference to the accompanying drawings.
First, the national standard "Clinical Diagnosis and Treatment Terminology of Traditional Chinese Medicine" (disease part) and its content structure are briefly introduced.
The national standard "Clinical Diagnosis and Treatment Terminology of Traditional Chinese Medicine" (disease part) was first released in 1997. In 2017, the National Administration of Traditional Chinese Medicine revised the standard on the basis of a comprehensive review of its application, producing "Clinical Diagnosis and Treatment Terminology of Traditional Chinese Medicine" (disease part, 2017 edition) (draft for comments). The revised draft covers 17 major categories, including exogenous diseases, parasitic diseases, poisoning and accidental injuries, visceral diseases and related diseases, children's diseases, eye diseases, and symptom terms for provisional diagnosis, with 1356 traditional medical disease names in total. It mainly resolves the problem that some disease terms in the original standard and in the national standard for the classification and codes of traditional Chinese medicine diseases are inconsistent with the classification of the International Classification of Diseases, and it is characterized by well-defined disease names, standardized definitions, and easy lookup of synonyms and near-synonyms. Table 1 shows some of the disease names in the revised draft; its main contents are the classification codes (i.e., the hierarchy), the traditional medical disease names in Chinese and English, the definitions, and the search terms (i.e., synonyms and near-synonyms).
TABLE 1 Partial contents and structure of the national standard "Clinical Diagnosis and Treatment Terminology of Traditional Chinese Medicine" (disease part)
[Table 1 appears as an image in the original publication: BDA0002762598870000071]
The traditional medical disease ontology provided in Embodiment 1 of the present invention is described below.
This embodiment draws fully on the construction experience of the Disease Ontology and reuses the content and structure of "Clinical Diagnosis and Treatment Terminology of Traditional Chinese Medicine" (disease part, 2017 edition) (draft for comments) to build the classes of the traditional medical disease ontology, their hierarchy, and class attributes such as labels (label), synonyms (has_exact_synonym), definitions (definition), and cross-mappings to other vocabularies (database_cross_reference). Internationalized Resource Identifiers (IRIs) serve as globally unique identifiers for ontology classes, which facilitates interaction and reuse among different ontologies. An IRI in the traditional medical disease ontology takes the form "TCMO_" followed by seven digits, numbered upward from 0000001; for example, the IRI of "diseases caused by exogenous diseases" is "TCMO_0000774". The class hierarchy is built from the classification codes of the national standard disease names, and the Chinese and English class labels, synonyms, and definitions reuse the standard's Chinese and English disease names, search terms, and definitions respectively. In addition, the traditional medical disease ontology records cross-mappings to the national standard and to ICD-11; for example, the mapping of "exogenous seasonal sickness" to the national standard is expressed as "database_cross_reference GB/T2017:A01.01", and its cross-mapping to ICD-11 as "database_cross_reference ICD11-ZH:SE2Z". FIG. 1 shows the basic content structure of the traditional medical disease ontology of Embodiment 1 and its correspondence with the national standard and the Chinese version of ICD-11.
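The class attributes and the IRI format described in this embodiment can be pictured with a small record type. The following Python sketch is illustrative only: the names `make_iri` and `DiseaseClass` are hypothetical, the field names simply mirror the attribute names in the text, and the IRI number given to the example class is a placeholder because the text does not state it.

```python
from dataclasses import dataclass, field

def make_iri(n: int) -> str:
    """Format a local identifier as 'TCMO_' plus seven zero-padded digits."""
    if not 1 <= n <= 9_999_999:
        raise ValueError("identifier must fit in seven digits")
    return f"TCMO_{n:07d}"

@dataclass
class DiseaseClass:
    iri: str
    label: str
    definition: str = ""
    has_exact_synonym: list[str] = field(default_factory=list)
    database_cross_reference: list[str] = field(default_factory=list)

# The cross-references are the two quoted in the text; the IRI number 1 is
# a placeholder (assumption), not the class's actual identifier.
seasonal = DiseaseClass(
    iri=make_iri(1),
    label="exogenous seasonal sickness",
    database_cross_reference=["GB/T2017:A01.01", "ICD11-ZH:SE2Z"],
)
```

Keeping the cross-references as prefixed strings lets one class point to several vocabularies at once, as in the national standard and ICD-11 example above.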
FIG. 2 is a flowchart of the method for constructing a traditional medical disease ontology according to Embodiment 2 of the present invention.
As shown in FIG. 2, in step S210 an ontology classification framework is built. The framework is built on the basis of an existing disease classification system.
In step S220, the set of traditional medical disease terms belonging to each classification is mapped to a reference set of traditional medical disease terms. The set of traditional medical disease terms belonging to the ontology classifications is taken as the matching source set and the reference set as the matching target set, and step S220 includes the following process: exactly matching each disease term in the matching source set against each disease term in the matching target set to find the disease term pairs that are exactly matched, where exact matching means that a disease term in the matching source set is semantically equivalent to a disease term in the matching target set.
The exact matching may include at least one of the following: a disease term in the matching source set is identical to a disease term in the matching target set; the main body of a disease term in the matching source set is identical to that of a disease term in the matching target set; a synonym of a disease term in one of the two sets is identical to a disease term in the other set; and a disease term in the matching source set has the same meaning as a disease term in the matching target set.
Taking the Chinese version of the International Classification of Diseases-11 (ICD-11) as the reference set of traditional medical disease terms as an example, the mapping between the traditional medical disease ontology and the Chinese version of ICD-11 is computed by machine, which can assist manual review and establishes the semantic association between the two. The former is the matching source set, shown in the figures as source table A (1356 traditional Chinese medicine disease names in total); the latter is the matching target set, shown in the figures as target table B (251 classification names related to traditional medical diseases).
Exact matching (Exact Match) indicates semantic equivalence, i.e., the connotation and extension of the traditional medical disease names in table A and table B are equal. In this method, each of the following counts as an exact match: (1) the disease terms in the two tables are identical, e.g., "insomnia" (A04.01.12) in table A and "insomnia" (SD84) in table B, as shown in FIG. 3A; (2) the disease name in table A is identical to the classification name in table B apart from the suffix "disease", e.g., "pavor" (A04.01.09) and "pavor" (SA10), as shown in FIG. 3B; (3) a synonym of the disease name in table A is identical to the classification name in table B, e.g., "thoracic obstruction heartburn" (A04.01.01, synonym "thoracic obstruction") and "thoracic obstruction" (L4-SA2), as shown in FIG. 3C; (4) the wording differs but the meaning is the same, e.g., "visceral diseases and related diseases" (A04) and "visceral system diseases" (L2-SA0), as shown in FIG. 3D.
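The four exact-match cases can be checked in sequence. The following Python sketch is one possible reading, not the patented implementation: the function names are hypothetical, the English suffix " disease" stands in for the suffix of the original Chinese names, and case (4) is handled with a hand-curated equivalence table, which is an assumption because the text does not say how same-meaning terms are detected. Which side carries the suffix in the "pavor" pair is also assumed.

```python
from typing import Optional

def strip_suffix(name: str, suffix: str = " disease") -> str:
    """Drop a trailing suffix so that only the main body of the term remains."""
    if name.endswith(suffix) and len(name) > len(suffix):
        return name[: -len(suffix)]
    return name

def exact_match(name: str, synonyms: list[str], target: dict[str, str],
                equivalents: dict[str, str]) -> Optional[str]:
    """Return the target code of an exact match, or None if all four cases fail."""
    if name in target:                       # case 1: identical terms
        return target[name]
    stem = strip_suffix(name)
    for target_name, code in target.items():
        if strip_suffix(target_name) == stem:  # case 2: identical main body
            return code
    for synonym in synonyms:                 # case 3: synonym equals target term
        if synonym in target:
            return target[synonym]
    curated = equivalents.get(name)          # case 4: curated same-meaning pair
    if curated in target:
        return target[curated]
    return None

# Target table B: classification name -> code (pairs taken from the examples above).
table_b = {
    "insomnia": "SD84",
    "pavor": "SA10",
    "thoracic obstruction": "L4-SA2",
    "visceral system diseases": "L2-SA0",
}
same_meaning = {"visceral diseases and related diseases": "visceral system diseases"}
```

For instance, `exact_match("thoracic obstruction heartburn", ["thoracic obstruction"], table_b, same_meaning)` resolves through the synonym branch.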
In this embodiment, step S220 may further include the following step: for disease terms that fail exact matching, matching the disease terms in the matching source set upward against the disease terms in the matching target set to find the disease term pairs that are upward matched, where upward matching means that the connotation and extension of the disease term in the matching target set are larger than those of the disease term in the matching source set. The upward matching may follow these principles: under the proximity principle, when several superior disease terms of a disease term in the matching source set are exactly matched to disease terms in the matching target set, the target-set term matched by the superior term of the closest level is selected as the matching term; and when the level difference between the matching source set and the matching target set is 1 and specified and/or unspecified disease terms exist among the next-level disease terms of the matching target set, a related match is established between the source-set disease term and the specified or unspecified disease term at the next level of the target-set disease term, rather than an upward match.
Still taking the Chinese version of ICD-11 as the matching target set, upward matching (Broad Match) denotes a containment relationship: the connotation and extension of the classification name in table B are larger than those of the disease name in table A. Upward matching follows these principles: (1) under the proximity principle, when several superior disease names of a disease name in table A are exactly matched to classification names in table B, the table B classification name matched by the superior disease name of the closest level is selected; (2) to stay closer to the nature of the disease, when the level difference between table A and table B is 1 and specified and/or unspecified classification names exist among the next-level classification names in table B, a related match is established between the table A category and the specified or unspecified classification name under the table B category, rather than an upward match. For example, "yang yellow disease" (A04.02.03.01) in table A is a level-4 disease name with no exact match. Its level-3 superior, "jaundice" (A04.02.03), is exactly matched to "jaundice" (SA01) in table B, and its level-2 superior, "hepatic disease" (A04.02), to "hepatic disease class" (L3-SA0). Since no specified or unspecified classification names exist under "jaundice" (SA01), the proximity principle yields an upward match between "yang yellow disease" (A04.02.03.01) and "jaundice" (SA01), as shown in FIG. 4.
Here, "unspecified" (not otherwise specified, NOS) is a special classification name used when, in actual coding, incomplete information makes it impossible to determine the specific subtype. "Other specified" exists because a classification system such as the ICD does not enumerate every classification name and lists only a number of common specific categories; other subtypes that can be determined from the actual information are assigned to the "other specified" category.
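The proximity principle for upward matching amounts to walking up the dotted classification code and stopping at the first superior term that has an exact match. The Python sketch below follows that reading. The names `ancestors`, `broad_match` and `has_residuals` are hypothetical, and treating "level difference of 1" as "the matched superior term is the direct parent" is an interpretation, not something the text states outright.

```python
from typing import Optional

def ancestors(code: str) -> list[str]:
    """Superior codes of a dotted classification code, nearest first."""
    parts = code.split(".")
    return [".".join(parts[:i]) for i in range(len(parts) - 1, 0, -1)]

def broad_match(code: str, exact: dict[str, str],
                has_residuals: set[str]) -> Optional[tuple[str, str]]:
    """Return ("broad", target), ("related", target), or None.

    exact maps source codes to exactly matched target codes; has_residuals
    holds target codes that have specified or unspecified child categories.
    """
    for distance, ancestor in enumerate(ancestors(code), start=1):
        target = exact.get(ancestor)
        if target is None:
            continue  # this superior term has no exact match; go one level up
        if distance == 1 and target in has_residuals:
            # Defer to related matching under this target instead of a broad match.
            return ("related", target)
        return ("broad", target)
    return None

# Exact matches from the jaundice example; SA01 has no residual categories.
exact = {"A04.02.03": "SA01", "A04.02": "L3-SA0"}
result = broad_match("A04.02.03.01", exact, has_residuals=set())
```

In the "related" case the function returns only the parent target; choosing between its specified and unspecified child is left to the related-matching rules.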
In this embodiment, step S220 may further include the steps of: for the disease terms which fail to be matched upwards, the disease terms in the matching source set are respectively matched with the disease terms in the matching target set downwards, and the disease term pairs which are successfully matched downwards are found, wherein downwards matching means that the connotation and the extension of the disease terms in the matching target set are smaller than those of the disease terms in the matching source set. The down-matching may include the following principles: selecting matching target set terms matched with the lower-level disease terms of the matching source set disease terms with the closest matching grades to be matched downwards according to the principle of closeness; when the disease term in the matching source set can establish a down-matching relationship with a plurality of disease terms in the matching target set, the down-matching is not performed.
Still taking the Chinese version of the International Classification of Diseases-11 as the matching target set, downward matching (Narrow Match) indicates that the connotation and extension of the classification name in Table B are smaller than those of the disease name in Table A. In this embodiment, downward matching follows these principles: applying the proximity principle, the Table B classification name exactly matched by the lower-level disease name of the closest level under the Table A disease name is selected for downward matching; when a Table A category could establish a downward matching relationship with several Table B classification names, no downward matching is performed. For example, the disease name "blackessence" (A11.01.04) in Table A has no exact match, while the only disease name "mixed eye" (A11.01.04.11) at the next level down can be exactly matched to "mixed eye" (SC74) in Table B. A downward matching relationship is therefore established between "blackessence" (A11.01.04) and "mixed eye" (SC74), as shown in fig. 5.
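A minimal sketch of the downward matching rule follows. It rests on one reading of the text: the search descends level by level, stops at the first level where any lower-level term has an exact match, and gives up if more than one distinct Table B name qualifies at that level. The function name `downward_match` and the `children` and `exact` dictionaries are hypothetical, and `children` lists only the one lower-level term mentioned in the example.

```python
from typing import Dict, List, Optional


def downward_match(
    code: str,
    children: Dict[str, List[str]],
    exact: Dict[str, str],
) -> Optional[str]:
    """Return the single Table B code matched at the nearest lower level."""
    level = children.get(code, [])
    while level:
        targets = {exact[c] for c in level if c in exact}
        if len(targets) == 1:
            return targets.pop()
        if len(targets) > 1:
            return None  # ambiguous: several Table B names qualify
        # Nothing matched at this level, so descend one level further.
        level = [g for c in level for g in children.get(c, [])]
    return None


# Data from the "blackessence" example; the hierarchy is truncated.
children = {"A11.01.04": ["A11.01.04.11"]}
exact = {"A11.01.04.11": "SC74"}
print(downward_match("A11.01.04", children, exact))
```

Returning `None` in the ambiguous case leaves the term for the next matching stage instead of picking one candidate arbitrarily.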
In this embodiment, step S220 may further include the following step: for disease terms that fail downward matching, related matching is performed between each disease term in the matching source set and the disease terms in the matching target set to find the disease term pairs that are successfully matched, where related matching means that the disease term in the matching source set and the disease term in the matching target set each partially contain the connotation and extension of the other. Related matching follows these principles: if the source-set disease term has sub-disease terms, it is related-matched to an unspecified disease term in the matching target set; if the source-set disease term has no sub-disease terms, it is related-matched to an other specific disease term in the matching target set; applying the proximity principle, the other specific or unspecified disease term is selected under the target-set disease term matched by the exactly matched superior disease term of the closest level.
Still taking the Chinese version of the International Classification of Diseases-11 as the matching target set, related matching (Related Match) represents an intersection relationship, i.e. the terms in Tables A and B each partially contain the connotation and extension of the other. In this example, Table A contains temporary diagnostic syndrome terms; some of these disease names can be exactly matched to classification names in Table B, and the remaining terms that fail exact matching are treated as related matches. In addition, related matching follows these principles: if the disease name in Table A has sub-disease names, it is related-matched to an unspecified classification name in Table B; if it has no sub-disease names, it is related-matched to an other specific classification name; applying the proximity principle, the other specific or unspecified classification name is selected under the Table B category matched by the exactly matched superior disease name of the closest level. For example, Table A contains "New Cold" (A01.03.01.01), which has no sub-disease names. The disease name one level above it is "Warm" (A01.03.01) and the disease name three levels above it is "external Cold" (A01); these are exactly matched to "Warm" (L3-SE0) and "external Cold" (L2-SD9), respectively. Under "Warm" (L3-SE0) there are "other specific Cold" (SE0Y) and "unspecified Warm" (SE0Z), and under "external Cold" (L2-SD9) there are "other specific Cold" (SE2Y) and "unspecified Cold" (SE2Z). Since "New Cold" (A01.03.01.01) has no sub-disease names, an other specific classification name should be selected, and "other specific Cold" (SE0) is the classification name of the closest level, so a related match is established between the two, as shown in fig. 6.
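The selection rule for related matching can be sketched as follows. The sketch assumes dot-separated Table A codes and a hypothetical `residual` dictionary that records, for each Table B category, its "other specific" and "unspecified" classification names; the function names are illustrative. Continuing upward when the nearest matched category has no suitable residual name is an assumption of this sketch, not something the text states.

```python
from typing import Dict, Optional


def parent_code(code: str) -> Optional[str]:
    """Return the immediate superior code, or None at the top level."""
    head, sep, _ = code.rpartition(".")
    return head if sep else None


def related_match(
    code: str,
    has_children: bool,
    exact: Dict[str, str],
    residual: Dict[str, Dict[str, str]],
) -> Optional[str]:
    """Pick the residual Table B name under the nearest matched superior."""
    # Terms with sub-disease names go to "unspecified"; leaves go to
    # "other specific".
    kind = "unspecified" if has_children else "other_specific"
    ancestor = parent_code(code)
    while ancestor is not None:
        target = exact.get(ancestor)
        if target is not None and kind in residual.get(target, {}):
            return residual[target][kind]
        ancestor = parent_code(ancestor)
    return None


# Codes taken from the "New Cold" example in the text.
exact = {"A01.03.01": "L3-SE0", "A01": "L2-SD9"}
residual = {
    "L3-SE0": {"other_specific": "SE0Y", "unspecified": "SE0Z"},
    "L2-SD9": {"other_specific": "SE2Y", "unspecified": "SE2Z"},
}
print(related_match("A01.03.01.01", False, exact, residual))
```

With the example data, the walk stops at "Warm" (L3-SE0) because it is the closest matched superior, so the residual names under "external Cold" (L2-SD9) are never consulted.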
In this embodiment, the four matching relationships are attempted in the order exact matching, upward matching, downward matching, related matching. Alternatively, the order exact matching, downward matching, upward matching, related matching may be used.
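The ordering can be sketched as a cascade in which each stage is tried only when the previous stages have failed. The `cascade` function, the `Matcher` type alias, the relation labels, and the toy lookup tables below are all hypothetical stand-ins for the four matching procedures described above.

```python
from typing import Callable, Optional, Sequence, Tuple

# A matcher takes a source term code and returns a target code or None.
Matcher = Callable[[str], Optional[str]]


def cascade(
    term: str, stages: Sequence[Tuple[str, Matcher]]
) -> Optional[Tuple[str, str]]:
    """Return (relation, target) from the first stage that succeeds."""
    for relation, matcher in stages:
        target = matcher(term)
        if target is not None:
            return relation, target
    return None


# Toy stand-ins: dictionaries play the role of precomputed results.
exact_table = {"A04.02.03": "SA01"}
broad_table = {"A04.02.03.01": "SA01"}
stages = [
    ("exact", exact_table.get),
    ("broad", broad_table.get),
    ("narrow", lambda term: None),
    ("related", lambda term: None),
]
print(cascade("A04.02.03.01", stages))
```

Swapping the second and third entries of `stages` gives the alternative order mentioned above without changing the function.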
Fig. 7 is a schematic structural diagram of a traditional medical disease ontology construction apparatus provided in embodiment 3 of the present invention. As shown in fig. 7, the traditional medical disease ontology construction apparatus 700 provided by this embodiment includes a classification framework building unit 710 and a mapping unit 720. The mapping unit 720 includes an exact matching module 721.
The classification framework building unit 710 is used for building an ontology classification framework. The operation of the classification framework building unit 710 may refer to the operation of step S210 described above with reference to fig. 2.
The mapping unit 720 is configured to implement mapping of the traditional medical condition term sets belonging to the respective classifications to the reference traditional medical condition term set, wherein the traditional medical condition term sets belonging to the ontology classification are used as matching source sets, and the reference traditional medical condition term set is used as a matching target set. The operation of the mapping unit 720 may refer to the operation of step S220 described above with reference to fig. 2.
The exact matching module 721 is configured to exactly match each disease term in the matching source set with each disease term in the matching target set, respectively, and find a pair of disease terms that are successfully matched, where exact matching means that the disease terms in the matching source set are semantically equivalent to the disease terms in the matching target set.
Exact matching may include at least one of the following matching modes: the disease term in the matching source set is the same as the disease term in the matching target set; the disease term in the matching source set is the same as the subject of the disease term in the matching target set; a synonym of a disease term in one of the matching source set and the matching target set is the same as a disease term in the other set; and the disease term in the matching source set is synonymous with the disease term in the matching target set.
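Two of these modes, identical terms and a synonym in one set equal to a term in the other, lend themselves to a mechanical check, sketched below. The `normalize` and `is_exact_match` functions and the synonym dictionaries are hypothetical, the normalization (lowercasing and whitespace removal) is an assumption, and "term A", "alias A", and "term B" are invented placeholders rather than real disease terms. The modes that depend on the subject or connotation of a term are not covered.

```python
from typing import Dict, Set


def normalize(name: str) -> str:
    """Lowercase and strip all whitespace so trivial variants compare equal."""
    return "".join(name.split()).lower()


def is_exact_match(
    source: str,
    target: str,
    source_syn: Dict[str, Set[str]],
    target_syn: Dict[str, Set[str]],
) -> bool:
    """True if the names are identical or one is a listed synonym of the other."""
    s, t = normalize(source), normalize(target)
    if s == t:
        return True
    s_syn = {normalize(x) for x in source_syn.get(source, set())}
    t_syn = {normalize(x) for x in target_syn.get(target, set())}
    return t in s_syn or s in t_syn


# Placeholder synonym table for the source set; the target set has none.
source_syn: Dict[str, Set[str]] = {"term A": {"alias A"}}
target_syn: Dict[str, Set[str]] = {}
print(is_exact_match("Jaundice", "jaundice", source_syn, target_syn))
print(is_exact_match("term A", "Alias A", source_syn, target_syn))
print(is_exact_match("term A", "term B", source_syn, target_syn))
```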
The mapping unit 720 may further include an upward matching module, configured to, for disease terms that fail exact matching, match each disease term in the matching source set upward against the disease terms in the matching target set and find the disease term pairs that match upward successfully, where upward matching means that the connotation and extension of the disease term in the matching target set are larger than those of the disease term in the matching source set. Upward matching may follow these principles: applying the proximity principle, when several superior disease terms of a disease term in the matching source set are exactly matched to disease terms in the matching target set, the target-set disease term matched by the superior disease term of the closest level is selected; when the level difference between the matching source set and the matching target set is 1 and other specific and/or unspecified disease terms exist among the next-level disease terms of the matching source set, a related match is established between the source-set disease term and the other specific or unspecified disease term at the next level under the target-set disease term, rather than an upward match.
The mapping unit 720 may further include a downward matching module, configured to, for disease terms that fail upward matching, match each disease term in the matching source set downward against the disease terms in the matching target set and find the disease term pairs that match downward successfully, where downward matching means that the connotation and extension of the disease term in the matching target set are smaller than those of the disease term in the matching source set. Downward matching may follow these principles: applying the proximity principle, the matching target set term exactly matched by the lower-level disease term of the closest level under the source-set disease term is selected for downward matching; when a disease term in the matching source set could establish a downward matching relationship with several disease terms in the matching target set, no downward matching is performed.
The mapping unit 720 may further include a related matching module, configured to, for disease terms that fail downward matching, perform related matching between each disease term in the matching source set and the disease terms in the matching target set and find the disease term pairs that are successfully matched, where related matching means that the disease term in the matching source set and the disease term in the matching target set each partially contain the connotation and extension of the other. Related matching follows these principles: if the source-set disease term has sub-disease terms, it is related-matched to an unspecified disease term in the matching target set; if the source-set disease term has no sub-disease terms, it is related-matched to an other specific disease term in the matching target set; applying the proximity principle, the other specific or unspecified disease term is selected under the target-set disease term matched by the exactly matched superior disease term of the closest level.
Fig. 8 is a block diagram of a computing device for constructing a traditional medical disease ontology according to an embodiment of the present invention.
As shown in fig. 8, computing device 800 may include at least one processor 810, storage 820, memory 830, communication interface 840, and internal bus 850, with the at least one processor 810, storage 820, memory 830, and communication interface 840 connected together via bus 850. The at least one processor 810 executes at least one computer-readable instruction (i.e., an element described above as being implemented in software) stored or encoded in a computer-readable storage medium (i.e., the storage 820).
In one embodiment, the storage 820 stores computer-executable instructions that, when executed, cause the at least one processor 810 to: build an ontology classification framework; and implement a mapping of the traditional medical disease term sets belonging to each classification to the reference traditional medical disease term set, wherein the traditional medical disease term set belonging to the ontology classification is taken as the matching source set and the reference traditional medical disease term set is taken as the matching target set. Implementing the mapping comprises: exactly matching each disease term in the matching source set against each disease term in the matching target set to find the disease term pairs that are successfully matched, where exact matching means that the disease term in the matching source set is semantically equivalent to the disease term in the matching target set.
It should be understood that the computer-executable instructions stored in the storage 820, when executed, cause the at least one processor 810 to perform the various operations and functions described above in connection with fig. 1-6 in the various embodiments of the present disclosure.
In the present disclosure, computing device 800 may include, but is not limited to: personal computers, server computers, workstations, desktop computers, laptop computers, notebook computers, mobile computing devices, smart phones, tablet computers, cellular phones, Personal Digital Assistants (PDAs), handsets, messaging devices, wearable computing devices, consumer electronics, and the like.
According to one embodiment, a program product, such as a non-transitory machine-readable medium, is provided. A non-transitory machine-readable medium may have instructions (i.e., elements described above as being implemented in software) that, when executed by a machine, cause the machine to perform various operations and functions described above in connection with fig. 1-6 in various embodiments of the present disclosure.
Specifically, a system or apparatus may be provided with a readable storage medium storing software program code that implements the functions of any of the above embodiments, and a computer or processor of the system or apparatus reads out and executes the instructions stored in the readable storage medium.
In this case, the program code itself read from the readable medium can realize the functions of any of the above-described embodiments, and thus the machine-readable code and the readable storage medium storing the machine-readable code form part of the present invention.
Examples of the readable storage medium include floppy disks, hard disks, magneto-optical disks, optical disks (e.g., CD-ROMs, CD-R, CD-RWs, DVD-ROMs, DVD-RAMs, DVD-RWs), magnetic tapes, nonvolatile memory cards, and ROMs. Alternatively, the program code may be downloaded from a server computer or from the cloud via a communications network.
The above description is only an embodiment of the present invention, and not intended to limit the scope of the claims, and all equivalent structures or equivalent processes that are transformed by the content of the specification and the drawings, or directly or indirectly applied to other related technical fields are included in the scope of the claims.

Claims (11)

1. A method for constructing a traditional medical disease ontology is characterized by comprising the following steps:
building an ontology classification framework;
implementing a mapping of the traditional medical disease term sets belonging to each classification to a reference traditional medical disease term set, wherein,
the traditional medical disease term set belonging to the ontology classification is taken as a matching source set and the reference traditional medical disease term set is taken as a matching target set, and implementing the mapping of the traditional medical disease term sets belonging to each classification to the reference traditional medical disease term set comprises:
exactly matching each disease term in the matching source set against each disease term in the matching target set to find the disease term pairs that are successfully matched, wherein exact matching means that the disease term in the matching source set is semantically equivalent to the disease term in the matching target set.
2. The method for constructing the ontology of traditional medical diseases according to claim 1, wherein the exact matching comprises at least one of the following matching manners:
the disease terms in the matching source set are the same as the disease terms in the matching target set;
the disease term in the matching source set is the same as the subject of the disease term in the matching target set;
synonyms of disease terms in one set of the matching source set and the matching target set are the same as disease terms in the other set; and
the disease term in the matching source set has the same connotation as the disease term in the matching target set.
3. The method for constructing the ontology of traditional medical diseases according to claim 1, wherein implementing the mapping of the traditional medical disease term sets belonging to each classification to the reference traditional medical disease term set further comprises: for disease terms that fail exact matching, matching each disease term in the matching source set upward against each disease term in the matching target set to find the disease term pairs that match upward successfully, wherein upward matching means that the connotation and extension of the disease term in the matching target set are larger than those of the disease term in the matching source set.
4. The method for constructing the ontology of traditional medical diseases according to claim 3, wherein the upward matching comprises the following principles:
applying the proximity principle, when several superior disease terms of a disease term in the matching source set are exactly matched to disease terms in the matching target set, the target-set disease term matched by the superior disease term of the closest level is selected;
when the level difference between the matching source set and the matching target set is 1 and other specific and/or unspecified disease terms exist among the next-level disease terms of the matching source set, a related match is established between the source-set disease term and the other specific or unspecified disease term at the next level under the target-set disease term, rather than an upward match.
5. The method for constructing the ontology of traditional medical diseases according to claim 3, wherein implementing the mapping of the traditional medical disease term sets belonging to each classification to the reference traditional medical disease term set further comprises: for disease terms that fail upward matching, matching each disease term in the matching source set downward against the disease terms in the matching target set to find the disease term pairs that match downward successfully, wherein downward matching means that the connotation and extension of the disease term in the matching target set are smaller than those of the disease term in the matching source set.
6. The method for constructing the ontology of traditional medical diseases according to claim 5, wherein the downward matching comprises the following principles:
applying the proximity principle, selecting for downward matching the matching target set term exactly matched by the lower-level disease term of the closest level under the source-set disease term;
when the disease term in the matching source set can establish a down-matching relationship with a plurality of disease terms in the matching target set, the down-matching is not performed.
7. The method for constructing the ontology of traditional medical diseases according to claim 5, wherein implementing the mapping of the traditional medical disease term sets belonging to each classification to the reference traditional medical disease term set further comprises: for disease terms that fail downward matching, performing related matching between each disease term in the matching source set and the disease terms in the matching target set to find the disease term pairs that are successfully matched, wherein related matching means that the disease term in the matching source set and the disease term in the matching target set each partially contain the connotation and extension of the other.
8. The method for constructing the ontology of traditional medical diseases according to claim 7, wherein the correlation matching comprises the following principles:
if the source-set disease term has sub-disease terms, it is related-matched to an unspecified disease term in the matching target set; if the source-set disease term has no sub-disease terms, it is related-matched to an other specific disease term in the matching target set;
applying the proximity principle, the other specific or unspecified disease term is selected under the target-set disease term matched by the exactly matched superior disease term of the closest level.
9. A device for constructing a traditional medical disease ontology, characterized by comprising:
a classification framework building unit for building an ontology classification framework;
a mapping unit for implementing a mapping of the traditional medical disease term sets belonging to each classification to a reference traditional medical disease term set, wherein the traditional medical disease term set belonging to the ontology classification is used as a matching source set and the reference traditional medical disease term set is used as a matching target set,
the mapping unit comprising an exact matching module for exactly matching each disease term in the matching source set against each disease term in the matching target set to find the disease term pairs that are successfully matched, wherein exact matching means that the disease term in the matching source set is semantically equivalent to the disease term in the matching target set.
10. A computing device, comprising:
one or more processors, and
a memory coupled with the one or more processors, the memory storing instructions that, when executed by the one or more processors, cause the one or more processors to perform the method of any of claims 1-8.
11. A machine-readable storage medium having stored thereon executable instructions that, when executed, cause the machine to perform the method of any one of claims 1 to 8.
CN202011222616.5A 2020-11-05 2020-11-05 Method and device for constructing traditional medical disease body Pending CN112445917A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011222616.5A CN112445917A (en) 2020-11-05 2020-11-05 Method and device for constructing traditional medical disease body

Publications (1)

Publication Number Publication Date
CN112445917A true CN112445917A (en) 2021-03-05

Family

ID=74735854

Country Status (1)

Country Link
CN (1) CN112445917A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105069124A (en) * 2015-08-13 2015-11-18 易保互联医疗信息科技(北京)有限公司 Automatic ICD (International Classification of Diseases) coding method and system
CN105574103A (en) * 2015-12-11 2016-05-11 浙江大学 Method and system for automatically establishing medical term mapping relationship based on word segmentation and coding
CN106919793A (en) * 2017-02-24 2017-07-04 黑龙江特士信息技术有限公司 A kind of data standardization processing method and device of medical big data
CN110096635A (en) * 2019-04-17 2019-08-06 广东技术师范大学 A kind of the inquiry visual display method and device of traditional Chinese and western medicine medicine information
CN111797207A (en) * 2020-07-14 2020-10-20 山东健康医疗大数据有限公司 Method for realizing hospital diagnosis data standardization

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113221543A (en) * 2021-05-07 2021-08-06 中国医学科学院医学信息研究所 Medical term integration method and system
CN113221543B (en) * 2021-05-07 2023-10-10 中国医学科学院医学信息研究所 Medical term integration method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210305
