CN117292846A - Construction method and device of intestinal microorganism knowledge graph - Google Patents
- Publication number
- CN117292846A CN202311588837.8A
- Authority
- CN
- China
- Prior art keywords
- information
- entities
- entity
- related entities
- intestinal
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H70/00—ICT specially adapted for the handling or processing of medical references
- G16H70/20—ICT specially adapted for the handling or processing of medical references relating to practices or guidelines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
- G06N5/022—Knowledge engineering; Knowledge acquisition
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H70/00—ICT specially adapted for the handling or processing of medical references
- G16H70/40—ICT specially adapted for the handling or processing of medical references relating to drugs, e.g. their side effects or intended usage
Abstract
The invention relates to a method and a device for constructing an intestinal microorganism knowledge graph. The method comprises the following steps: acquiring initial literature information from research related to intestinal microorganisms; performing information extraction on the initial literature information to extract related entities, association information between the related entities and clinical annotation model information, and standardizing, sorting and integrating the extracted related entities to construct a data frame of the related entities; and constructing an intestinal microorganism knowledge graph based on the data frame of the related entities, the association information between the related entities and the clinical annotation model information. The intestinal microorganism knowledge graph established by the invention supports structured and systematic study of literature-reported associations among intestinal flora, diseases, drugs, nutritional diets, other active substances and the like. From an intestinal microorganism detection result, the risk of diseases related to the intestinal microorganisms can be obtained quickly, which facilitates issuing intestinal microorganism reports, saves labor and is convenient for users.
Description
Technical Field
The invention relates to the technical field of knowledge graphs, and in particular to a method and a device for constructing an intestinal microorganism knowledge graph.
Background
Intestinal microorganisms are defined as the collection of all microorganisms in the intestinal tract, including bacteria, fungi, protozoa and viruses. In recent years the intestinal microbiome has attracted the attention of researchers worldwide, and many studies have found that it is closely related to the health of the host and to various diseases, such as cancers, cardiovascular diseases and neurological diseases. In addition, the interaction of active substances such as drugs and diet with the intestinal microbiota plays a key role in human health, disease and the physiological response to various treatments; various exogenous active substances can alter the microbiota, thereby affecting its function and its communication with the host. Over the past decade, intestinal microorganism research has grown exponentially and produced a large number of research papers. The results reported in these papers are relatively scattered and have not been organized into a systematic, structured form, which hinders both the reuse of the results and the further mining of unknown information about intestinal microorganisms.
Therefore, the invention provides a construction method and a construction device of an intestinal microorganism knowledge graph.
Disclosure of Invention
Based on the above, it is necessary to provide a method and a device for constructing an intestinal microorganism knowledge graph.
In order to achieve the above object, the present invention provides a method for constructing an intestinal microorganism knowledge graph, comprising:
acquiring initial literature information of related researches of intestinal microorganisms;
performing information extraction on the initial literature information to extract related entities, association information between the related entities and clinical annotation model information, standardizing, sorting and integrating the extracted related entities to construct a data frame of the related entities, and constructing an intestinal microorganism entity library based on the data frame of the related entities; the data frame of the related entities comprises a related entity list and medical attribute information related to the corresponding related entities, and the clinical annotation model information comprises a clinical annotation model, knowledge data elements related to the corresponding clinical annotation model and clinical annotation description information;
based on the data frames of the related entities, the association information between the related entities and the clinical annotation model information, the identification of the data frames of the related entities or the identification of the clinical annotation model information is used as a node, the association information between the related entities and the subordinate relation information of the clinical annotation model and the corresponding knowledge data elements are used as directed line segments, and an intestinal microorganism knowledge graph is constructed.
Optionally, the related entities include microorganisms, diseases, drugs, nutritional diets, other active substances and literature.
Optionally, standardizing, sorting and integrating the related entities to construct a data frame of the related entities specifically includes:
the extracted related entity names are standardized to obtain standardized related entity names, and the standardized names are sorted and integrated according to microorganism affiliation information, disease affiliation information, drug affiliation information, nutritional diet affiliation information, other active substance affiliation information and literature affiliation information to obtain, respectively, a microorganism entity list, a disease entity list, a drug entity list, a nutritional diet entity list, an other active substance entity list and a literature entity list;
based on the microorganism entity list, the disease entity list, the drug entity list, the nutritional diet entity list, the other active substance entity list and the literature entity list, a data frame of the corresponding related entities is constructed.
Optionally, constructing the intestinal microorganism entity library based on the data frame of the related entities specifically includes:
taking the entity identification of the data frame of each related entity and the identification of the medical attribute information related to each related entity as nodes, taking the subordinate relation information between each related entity and its corresponding medical attribute information as directed line segments, and connecting each related entity with its corresponding medical attribute information to construct the intestinal microorganism entity library.
Optionally, extracting the association information between the related entities specifically includes:
performing information extraction on the initial literature information, using the entity identifiers of microorganisms, diseases, drugs, nutritional diets, other active substances and documents as nodes, using the relation information between the related entities as directed line segments, and connecting the related entities to obtain the association information between the related entities.
Optionally, extracting the clinical annotation model information specifically includes:
based on the interaction relations between related entities, obtaining a plurality of clinical annotation models based on intestinal microorganisms, extracting the knowledge data elements and the clinical annotation description information related to the corresponding clinical annotation models, taking the identification of each clinical annotation model and the identification of the knowledge data elements related to each clinical annotation model as nodes, taking the subordinate relation information between each clinical annotation model and its corresponding knowledge data elements as directed line segments, and connecting the clinical annotation description information with the corresponding knowledge data elements to obtain the clinical annotation model information.
Optionally, the clinical annotation descriptive information includes evidence-based relationships between related entities.
The invention also provides a device for constructing the intestinal microorganism knowledge graph, which comprises:
the data acquisition module is used for acquiring initial literature information of related researches of intestinal microorganisms;
the data extraction module is used for carrying out information extraction processing on the initial document information and extracting related entities, related information among the related entities and clinical annotation model information;
the data processing module is used for carrying out standardization, arrangement and integration processing on the extracted related entities and constructing a data frame of the related entities;
the intestinal microorganism entity library construction module is used for constructing an intestinal microorganism entity library based on a data frame of related entities;
the intestinal microorganism knowledge graph construction module is used for constructing an intestinal microorganism knowledge graph by taking the identification of the data frame of the related entities or the identification of the clinical annotation model information as nodes and the association information between the related entities and the subordinate relation information of the clinical annotation model and the corresponding knowledge data elements as directed line segments based on the data frame of the related entities, the association information between the related entities and the clinical annotation model information.
The invention also provides an electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the method when executing the computer program.
The invention also provides a computer readable storage medium storing a computer program which when executed by a processor implements the steps of the method.
The invention has the following advantages: according to the method and the device for constructing the intestinal microorganism knowledge graph, related entities, association information between the related entities and clinical annotation model information are each extracted from the acquired literature information of research related to intestinal microorganisms, a data frame of the related entities is generated from the extracted related entities, and the intestinal microorganism knowledge graph is constructed by combining the data frame of the related entities, the association information between the related entities and the clinical annotation model information.
Drawings
FIG. 1 is a flow chart of a method for constructing an intestinal microorganism knowledge graph;
FIG. 2 is a schematic structural diagram of a device for constructing an intestinal microorganism knowledge graph;
FIG. 3 is a schematic structural diagram of an electronic device.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail by the following detailed description with reference to the accompanying drawings. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
In order to facilitate understanding of the present embodiment, the following describes embodiments of the present invention in detail.
Example 1
Fig. 1 is a flowchart of a method for constructing an intestinal microorganism knowledge graph according to an embodiment of the invention.
Referring to fig. 1, the method includes the steps of:
S101, acquiring initial literature information of related researches of intestinal microorganisms.
In this embodiment, the initial literature information of the research related to intestinal microorganisms includes, but is not limited to, scientific literature data and patent literature data related to intestinal microorganisms obtained from authoritative websites. It should be understood that the authoritative websites include, but are not limited to, the U.S. public medical database (PubMed), the CNKI series of databases, the Wanfang Data knowledge service platform, the VIP database, etc., and that the data sources of the patent literature include, but are not limited to, patent collections, a large patent database, SOOPAT, the patent search and analysis platform of the intellectual property office, etc.
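As an illustration of the acquisition step, the sketch below only builds a search URL for PubMed through the public NCBI E-utilities `esearch` endpoint; it does not send the request. The patent does not say how literature is retrieved, so the use of E-utilities, the function name `build_pubmed_query` and the example search term are all assumptions.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint. Using it here is an assumption;
# the patent only names PubMed as one possible literature source.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pubmed_query(term: str, retmax: int = 20) -> str:
    """Return an esearch URL that would list PubMed IDs matching `term`."""
    params = {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Hypothetical search term for gut-microbiome literature.
url = build_pubmed_query("gut microbiota AND colorectal cancer")
print(url)
```

Fetching the URL and parsing the returned ID list is left out so that the sketch stays free of network access.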
S102, performing information extraction on the initial literature information to extract related entities, association information between the related entities and clinical annotation model information, standardizing, sorting and integrating the extracted related entities to construct a data frame of the related entities, and constructing an intestinal microorganism entity library based on the data frame of the related entities.
In this embodiment, the data frame of the related entity includes a related entity list and medical attribute information related to the corresponding related entity, and the clinical annotation model information includes a clinical annotation model and knowledge data elements and clinical annotation description information related to the corresponding clinical annotation model.
In this embodiment, the related entities include microorganisms, diseases, drugs, nutritional diets, other active substances and literature.
In this embodiment, standardizing, sorting and integrating the related entities to construct a data frame of the related entities specifically includes:
the extracted related entity names are standardized to obtain standardized related entity names, and the standardized names are sorted and integrated according to microorganism affiliation information, disease affiliation information, drug affiliation information, nutritional diet affiliation information, other active substance affiliation information and literature affiliation information to obtain, respectively, a microorganism entity list, a disease entity list, a drug entity list, a nutritional diet entity list, an other active substance entity list and a literature entity list;
based on the microorganism entity list, the disease entity list, the drug entity list, the nutritional diet entity list, the other active substance entity list and the literature entity list, a data frame of the corresponding related entities is constructed.
In the present embodiment, for example: microorganism names are unified according to the NCBI Taxonomy database standard; disease names are unified according to the MeSH and WHO standards; drug names are standardized using the DrugBank database; and so on.
In the present embodiment, for example: a hierarchical structure of the microorganism entities is established from the affiliation information of each microorganism using the NCBI Taxonomy hierarchical tree classification, and a microorganism entity list is generated after sorting and integration; a hierarchical structure of the disease entities is established from the affiliation information of each disease using a hierarchical tree classification such as MeSH, and a disease entity list is generated after sorting and integration; DrugBank, KEGG Drug, TTD Drug, PubChem, NPASS and the like are used to supplement drug names, a hierarchical structure of the drug entities is established from the affiliation information of each drug, and a drug entity list is generated after sorting and integration; and so on.
In this embodiment, a data frame of related entities is constructed, where the data frame of related entities includes a related entity list and medical attribute information related to the corresponding related entities, for example:
Medical attribute information related to a document includes, but is not limited to, the title, author, year, publisher, level of evidence, U.S. public medical database identification (PubMed ID), disease studied, study population, keywords, document link, document abstract, etc.
Medical attribute information related to a disease includes, but is not limited to, the medical subject heading identification (MeSH ID), aliases, symptoms, common etiology descriptions, common diagnostic method descriptions, etc.
Medical attribute information related to a microorganism includes, but is not limited to, the NCBI Taxonomy ID, name, classification level, parent taxon, microorganism description, etc.
Medical attribute information related to a drug includes, but is not limited to, reference drug database identifications (DrugBank ID, etc.), aliases, classifications, types, mechanisms of action, side effects, etc.
Medical attribute information related to a nutritional diet includes, but is not limited to, the diet ID, aliases, diet description, etc.
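One way to picture a data frame of related entities is as an entity list in which every entry carries its medical attributes. The sketch below uses Python dataclasses; the class names `EntityRecord` and `EntityFrame`, the attribute keys and the identifier values are illustrative assumptions, not taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class EntityRecord:
    entity_id: str   # placeholder ID; a real system might use a MeSH or NCBI Taxonomy ID
    entity_type: str
    name: str
    attributes: dict = field(default_factory=dict)  # medical attribute information


@dataclass
class EntityFrame:
    """Entity list plus attributes for a single entity type."""
    entity_type: str
    records: dict = field(default_factory=dict)  # entity_id -> EntityRecord

    def add(self, record: EntityRecord) -> None:
        if record.entity_type != self.entity_type:
            raise ValueError(f"expected {self.entity_type}, got {record.entity_type}")
        self.records[record.entity_id] = record

    def names(self) -> list:
        return [r.name for r in self.records.values()]


disease_frame = EntityFrame("disease")
disease_frame.add(
    EntityRecord(
        entity_id="DIS-0001",  # hypothetical identifier
        entity_type="disease",
        name="Cholecystitis",
        attributes={"aliases": [], "symptoms": [], "common_etiology": ""},
    )
)
print(disease_frame.names())
```

Keeping one frame per entity type mirrors the separate microorganism, disease, drug, nutritional diet, other active substance and literature lists described in the text.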
In this embodiment, the construction of the intestinal microorganism entity library based on the data frame of the related entities specifically includes:
taking the entity identification of the data frame of each related entity and the identification of the medical attribute information related to each related entity as nodes, taking the subordinate relation information between each related entity and its corresponding medical attribute information as directed line segments, and connecting each related entity with its corresponding medical attribute information to construct the intestinal microorganism entity library.
In this embodiment, the entity identifiers of microorganisms, diseases, drugs, nutritional diets, other active substances and documents, together with the identifiers of the medical attribute information related to each related entity, are used as nodes; the subordinate relation information between each related entity and its medical attribute information is used as a directed line segment; each entity is connected with its medical attribute elements, and the association between them is described, so that the intestinal microorganism entity library is constructed.
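The entity library described above is a small directed graph: entity nodes point to attribute nodes. The following sketch represents it with plain Python sets and tuples. The function name `build_entity_library`, the `has_attribute` edge label, the `#`-joined attribute identifiers and the entity IDs are assumptions made for illustration.

```python
def build_entity_library(entities):
    """Return (nodes, edges, values) linking each entity to its attributes.

    nodes  : set of node identifiers (entity IDs and attribute IDs)
    edges  : list of directed (entity_id, "has_attribute", attribute_id) triples
    values : attribute_id -> stored attribute value
    """
    nodes, edges, values = set(), [], {}
    for entity in entities:
        entity_id = entity["id"]
        nodes.add(entity_id)
        for attr_name, attr_value in entity["attributes"].items():
            attr_id = f"{entity_id}#{attr_name}"  # one attribute node per entity and attribute
            nodes.add(attr_id)
            values[attr_id] = attr_value
            edges.append((entity_id, "has_attribute", attr_id))
    return nodes, edges, values


# Hypothetical entity IDs; attribute names follow the examples in the text.
entities = [
    {"id": "MICRO-0001", "attributes": {"name": "Helicobacter pylori", "classification_level": "species"}},
    {"id": "DIS-0001", "attributes": {"name": "Cholecystitis"}},
]
nodes, edges, values = build_entity_library(entities)
print(len(nodes), len(edges))
```

A graph database or a library such as a triple store could hold the same structure; plain containers are used here only to keep the sketch dependency-free.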
In this embodiment, extracting association information between related entities specifically includes:
performing information extraction on the initial literature information, using the entity identifiers of microorganisms, diseases, drugs, nutritional diets, other active substances and documents as nodes, using the relation information between the related entities as directed line segments, and connecting the related entities to obtain the association information between the related entities.
In this embodiment, the association information between related entities is extracted from the initial literature information of the research related to intestinal microorganisms, and the association types include, but are not limited to: microorganism and disease; microorganism and drug; microorganism and nutritional diet; microorganism and other active substance; microorganism, disease and drug; microorganism, disease and nutritional diet; microorganism, disease and other active substance; and the like.
In this embodiment, the entity identifiers of microorganisms, diseases, drugs, nutritional diets, other active substances and documents are used as nodes, the relation information between the related entities is used as a directed line segment, and the related entities are connected to obtain the association information between the related entities. An association connects related entities and describes the relationship between them. Taking the microorganism-disease association as an example: in the document PMID: 28600626, an increase in the abundance of Prevotella is associated with colorectal cancer; in the document PMID: 27259999, a decrease in the abundance of granulosa is associated with head and neck cancer. In each case the association links a microorganism entity to a disease entity.
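An association of this kind can be stored as a directed edge that also records the supporting document. The sketch below encodes the Prevotella example from the text; the `Association` tuple, the relation label and the helper `index_by_source` are hypothetical names chosen for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Association(NamedTuple):
    source: str    # entity the directed edge starts from (here a microorganism)
    relation: str  # hypothetical relation label
    target: str    # entity the edge points to (here a disease)
    pmid: str      # supporting document


def index_by_source(associations):
    """Group directed associations by their source entity."""
    index = defaultdict(list)
    for assoc in associations:
        index[assoc.source].append(assoc)
    return index


associations = [
    # Example taken from the text: PMID 28600626.
    Association("Prevotella", "abundance_increase_associated_with", "colorectal cancer", "28600626"),
]
by_source = index_by_source(associations)
for assoc in by_source["Prevotella"]:
    print(f"{assoc.source} -[{assoc.relation}]-> {assoc.target} (PMID: {assoc.pmid})")
```

Associations involving three entity types, such as microorganism, disease and drug, could be stored as several edges that share one document identifier.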
In this embodiment, extracting clinical annotation model information specifically includes:
based on the interaction relation between related entities, obtaining a plurality of clinical annotation models based on intestinal microorganisms, extracting knowledge data elements and clinical annotation description information related to the corresponding clinical annotation models, taking the identification of the clinical annotation models and the identification of the knowledge data elements related to each clinical annotation model as nodes, taking the dependency relation information of the clinical annotation models and the corresponding knowledge data elements as directed line segments, and connecting the clinical annotation description information and the corresponding knowledge data elements to obtain clinical annotation model information; the clinical annotation descriptive information includes evidence-based relationships between related entities.
In this embodiment, a plurality of clinical annotation models based on intestinal microorganisms are determined based on the interaction relationship between related entities, the clinical annotation models being integrated clinical annotation framework structures of intestinal microorganism related research literature including, but not limited to, microorganism-disease association information models, microorganism-drug interaction models, microorganism-nutritional meal interaction models, microorganism-other active substance interaction models, microorganism, disease and drug interaction models, microorganism, disease and nutritional meal interaction models, microorganism, disease and other active substance interaction models, and the like.
In this embodiment, knowledge data elements related to the respective clinical annotation model are extracted, for example: in a microorganism-disease interaction model, knowledge data elements that need to be extracted and correlated include, but are not limited to: microorganisms, diseases, effects on microorganisms, sequencing techniques, literature information, and the like; in a microorganism-drug interaction model, knowledge data elements that need to be extracted and correlated include, but are not limited to: microorganisms, drug names, drug categories, effects on microorganisms, effect intensities, test types, test disease information, literature, etc.
In this embodiment, clinical annotation description information is extracted, and the clinical annotation description information includes evidence-based relationships between related entities, for example: p value, etc.
In this embodiment, the identification of each clinical annotation model and the identification of the knowledge data elements related to each clinical annotation model are used as nodes, the subordinate relation information between each clinical annotation model and its corresponding knowledge data elements is used as a directed line segment, the clinical annotation description information is connected with the corresponding knowledge data elements, and the association between them is described, so as to obtain the clinical annotation model information. For example, the document PMID: 20939110 reports that the infection rate of Helicobacter pylori in the gallbladder of patients with cholecystitis differs from that of the control group with statistical significance. The evidence-based relationship connects the entities and characterizes their association, that is, it associates the Helicobacter pylori entity with the cholecystitis entity. Additional knowledge data elements that need to be extracted include, but are not limited to: effects on microorganisms (e.g., incrustation, etc.) and sequencing techniques (e.g., 16S rRNA gene sequencing, etc.). The clinical annotation description information is: the difference between the positive rate of Helicobacter pylori DNA in cholecystitis and that in the control group is statistically significant (for example, P=0.007), which indicates a possible relationship between Helicobacter pylori DNA and cholecystitis.
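The Helicobacter pylori example can be written out as one clinical annotation record: a model node, one node per knowledge data element, directed edges from the model to its elements, and the description with its evidence attached. This is a sketch under stated assumptions; the function name `annotate`, the `has_element` edge label and the dictionary layout are not from the patent.

```python
def annotate(model_id, elements, description, evidence):
    """Build nodes and directed edges for one clinical annotation record."""
    nodes = {model_id}
    edges = []
    for element_name, value in elements.items():
        element_id = f"{model_id}/{element_name}"  # one node per knowledge data element
        nodes.add(element_id)
        edges.append((model_id, "has_element", element_id, value))
    return {"nodes": nodes, "edges": edges, "description": description, "evidence": evidence}


record = annotate(
    model_id="microorganism-disease",
    elements={
        "microorganism": "Helicobacter pylori",
        "disease": "cholecystitis",
        "sequencing_technique": "16S rRNA gene sequencing",
        "literature": "PMID:20939110",
    },
    description=(
        "The positive rate of Helicobacter pylori DNA in cholecystitis differs "
        "from that of the control group with statistical significance."
    ),
    evidence={"p_value": 0.007},  # value quoted as an example in the text
)
print(len(record["nodes"]), len(record["edges"]), record["evidence"]["p_value"])
```

Storing the P value next to the description keeps the evidence-based relationship available when the graph is queried later.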
S103, based on the data frames of the related entities, the association information between the related entities and the clinical annotation model information, the identification of the data frames of the related entities or the identification of the clinical annotation model information is used as a node, the association information between the related entities and the subordinate relation information of the clinical annotation model and the corresponding knowledge data elements are used as directed line segments, and the intestinal microorganism knowledge graph is constructed.
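Step S103 can be pictured as merging three inputs into one set of nodes and directed edges. The sketch below assumes that entity data frames are given as a mapping from entity identifier to attributes and that both kinds of relations are given as `(source, relation, target)` triples; these representations, the function names, and the relation labels are hypothetical and not taken from the embodiment.

```python
def build_knowledge_graph(entity_frames, associations, annotation_relations):
    """Merge entity frames, entity associations, and annotation relations.

    entity_frames: dict of entity id -> attribute dict
    associations: iterable of (source id, relation, target id) between entities
    annotation_relations: iterable of (model id, relation, element id)
    """
    nodes = {entity_id: dict(attrs) for entity_id, attrs in entity_frames.items()}
    edges = []
    for source, relation, target in [*associations, *annotation_relations]:
        # Endpoints missing from the entity frames (for example, annotation
        # model identifiers) are added as nodes without attributes.
        nodes.setdefault(source, {})
        nodes.setdefault(target, {})
        edges.append((source, relation, target))
    return nodes, edges


def outgoing(edges, node_id):
    """List the (relation, target) pairs of the directed edges leaving node_id."""
    return [(relation, target) for source, relation, target in edges if source == node_id]


entity_frames = {
    "Helicobacter pylori": {"type": "microorganism"},
    "cholecystitis": {"type": "disease"},
}
associations = [("Helicobacter pylori", "associated_with", "cholecystitis")]
annotation_relations = [
    ("microorganism-disease model", "has_element", "Helicobacter pylori"),
    ("microorganism-disease model", "has_element", "cholecystitis"),
]
nodes, edges = build_knowledge_graph(entity_frames, associations, annotation_relations)
print(outgoing(edges, "Helicobacter pylori"))
```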
In this embodiment, the intestinal microorganism knowledge graph constructed by the invention includes the intestinal microorganism entity library, the relation information among the entities, the gastrointestinal microorganism clinical annotation model information, and other contents. Based on an intestinal microorganism detection result, the risk of diseases related to intestinal microorganisms can be obtained quickly, which facilitates issuing intestinal microorganism reports. The graph also indicates whether adjusting the flora composition may help control disease progression and supports predicting a patient's response to treatment, providing a new approach to managing or treating human health problems through the interaction between the microbiota and the human body.
The method for constructing the intestinal microorganism knowledge graph provided by this embodiment comprises the following steps: acquiring initial literature information of research related to intestinal microorganisms; performing information extraction on the initial literature information to extract related entities, association information among the related entities, and clinical annotation model information, standardizing, arranging and integrating the extracted related entities, constructing a data frame of the related entities, and constructing an intestinal microorganism entity library based on the data frame of the related entities; and, based on the data frames of the related entities, the association information among the related entities, and the clinical annotation model information, constructing the intestinal microorganism knowledge graph by using the identification of the data frames of the related entities or the identification of the clinical annotation model information as nodes and using the association information among the related entities and the subordinate relation information between the clinical annotation model and the corresponding knowledge data elements as directed line segments. In short, the method extracts the related entities, the association information among them, and the clinical annotation model information from the acquired literature on intestinal microorganisms, generates a data frame of the related entities from the extracted entities, and constructs the intestinal microorganism knowledge graph by combining the data frame, the association information, and the clinical annotation model information.
Example II
On the basis of the first embodiment, this embodiment provides a device 200 for constructing an intestinal microorganism knowledge graph, which implements the steps of the construction method of the first embodiment. Referring to fig. 2, the device 200 mainly includes: a data acquisition module 210, a data extraction module 220, a data processing module 230, an intestinal microorganism entity library construction module 240, and an intestinal microorganism knowledge graph construction module 250, wherein,
the data acquisition module is used for acquiring initial literature information of related researches of intestinal microorganisms;
the data extraction module is used for carrying out information extraction processing on the initial document information and extracting related entities, related information among the related entities and clinical annotation model information;
the data processing module is used for carrying out standardization, arrangement and integration processing on the extracted related entities and constructing a data frame of the related entities;
the intestinal microorganism entity library construction module is used for constructing an intestinal microorganism entity library based on a data frame of related entities;
the intestinal microorganism knowledge graph construction module is used for constructing an intestinal microorganism knowledge graph by taking the identification of the data frame of the related entities or the identification of the clinical annotation model information as nodes and the association information between the related entities and the subordinate relation information of the clinical annotation model and the corresponding knowledge data elements as directed line segments based on the data frame of the related entities, the association information between the related entities and the clinical annotation model information.
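The division of labor among the five modules can be sketched as a simple pipeline of callables. Everything below is illustrative: `run_pipeline`, `normalize_names`, the synonym table, and the hard-coded stand-ins for the acquisition and extraction modules are assumptions made for this sketch, not the implementation of device 200.

```python
def normalize_names(names, synonyms):
    """Data processing step: map raw names to standard names and drop duplicates."""
    seen, result = set(), []
    for name in names:
        cleaned = name.strip()
        standard = synonyms.get(cleaned.lower(), cleaned)
        if standard not in seen:
            seen.add(standard)
            result.append(standard)
    return result


def run_pipeline(acquire, extract, process, build_library, build_graph):
    """Chain the five modules in the order given for device 200."""
    documents = acquire()                                   # module 210
    entities, associations, annotations = extract(documents)  # module 220
    frames = process(entities)                              # module 230
    library = build_library(frames)                         # module 240
    graph = build_graph(frames, associations, annotations)  # module 250
    return graph, library


def demo():
    # Hypothetical synonym table and hard-coded stubs standing in for the
    # acquisition and extraction modules; no real text mining happens here.
    synonyms = {"h. pylori": "Helicobacter pylori"}
    return run_pipeline(
        acquire=lambda: ["PMID:20939110"],
        extract=lambda docs: (
            {"microorganism": ["H. pylori", "Helicobacter pylori"],
             "disease": ["cholecystitis"]},
            [("Helicobacter pylori", "associated_with", "cholecystitis")],
            [],
        ),
        process=lambda entities: {
            kind: normalize_names(names, synonyms)
            for kind, names in entities.items()
        },
        build_library=lambda frames: {
            name: {"type": kind}
            for kind, names in frames.items() for name in names
        },
        build_graph=lambda frames, associations, annotations: {
            "nodes": [name for names in frames.values() for name in names],
            "edges": associations + annotations,
        },
    )


graph, library = demo()
print(graph["nodes"])
```

Passing the modules in as callables keeps each one replaceable, which matches the statement that the units may be implemented in software or hardware.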
Example III
On the basis of the first embodiment, this embodiment further provides an electronic device, as shown in fig. 3. The electronic device shown in fig. 3 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
As shown in fig. 3, the electronic device may include a processing means (e.g., a central processor, a graphics processor, etc.) 301 that may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 302 or a program loaded from a storage means 308 into a Random Access Memory (RAM) 303. In the RAM 303, various programs and data required for the operation of the electronic device are also stored. The processing device 301, the ROM 302, and the RAM 303 are connected to each other via a bus 304. An input/output (I/O) interface 305 is also connected to bus 304.
In general, the following devices may be connected to the I/O interface 305: input devices 306 including, for example, a touch screen, a touch panel, a keyboard, a mouse, a camera, etc., output devices 307 including, for example, a Liquid Crystal Display (LCD), a speaker, etc., storage devices 308 including, for example, a magnetic tape, a hard disk, etc., and communication devices 309. The communication means 309 may allow the electronic device to communicate with other devices wirelessly or by wire to exchange data. While fig. 3 shows an electronic device having various means, it is to be understood that not all of the illustrated means are required to be implemented or provided. More or fewer devices may be implemented or provided instead. Each block shown in fig. 3 may represent one device or a plurality of devices as needed.
In particular, according to some embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, some embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such embodiments, the computer program may be downloaded and installed from a network via communications device 309, or from storage device 308, or from ROM 302. The above-described functions defined in the methods of some embodiments of the present disclosure are performed when the computer program is executed by the processing means 301.
Example IV
The present embodiment provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method described above.
It should be noted that, in some embodiments of the present disclosure, the computer readable medium may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In some embodiments of the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In some embodiments of the present disclosure, however, the computer-readable signal medium may comprise a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
In this embodiment, the client, server, etc. may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed networks.
The computer readable medium may be contained in the electronic device, or may exist alone without being incorporated into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquire initial literature information of research related to intestinal microorganisms; perform information extraction on the initial literature information to extract related entities, association information among the related entities, and clinical annotation model information; standardize, arrange and integrate the extracted related entities, construct a data frame of the related entities, and construct an intestinal microorganism entity library based on the data frame; and construct the intestinal microorganism knowledge graph based on the data frames of the related entities, the association information among the related entities, and the clinical annotation model information.
Computer program code for carrying out operations for some embodiments of the present disclosure may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in some embodiments of the present disclosure may be implemented by means of software, or may be implemented by means of hardware. The described units may also be provided in a processor, for example, described as: a processor includes a data acquisition unit, a data extraction unit, a data processing unit, an entity library construction unit, and a knowledge graph construction unit. The names of these units do not constitute a limitation on the unit itself in some cases; for example, the data acquisition unit may also be described as "a unit that acquires initial literature information".
The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a Complex Programmable Logic Device (CPLD), and the like.
The foregoing is a further detailed description of the invention in connection with specific embodiments, and is not intended to limit the practice of the invention to such descriptions. It will be apparent to those skilled in the art that several simple deductions or substitutions may be made without departing from the spirit of the invention, and these should be considered to be within the scope of the invention.
Claims (10)
1. The construction method of the intestinal microorganism knowledge graph is characterized by comprising the following steps:
acquiring initial literature information of related researches of intestinal microorganisms;
extracting the initial literature information, extracting related entities, related information among the related entities and clinical annotation model information, standardizing, arranging and integrating the extracted related entities, constructing a data frame of the related entities, and constructing an intestinal microorganism entity library based on the data frame of the related entities; the data framework of the related entity comprises a related entity list and medical attribute information related to the corresponding related entity, and the clinical annotation model information comprises a clinical annotation model, knowledge data elements related to the corresponding clinical annotation model and clinical annotation description information;
based on the data frames of the related entities, the association information between the related entities and the clinical annotation model information, the identification of the data frames of the related entities or the identification of the clinical annotation model information is used as a node, the association information between the related entities and the subordinate relation information of the clinical annotation model and the corresponding knowledge data elements are used as directed line segments, and an intestinal microorganism knowledge graph is constructed.
2. The method for constructing an intestinal microbial knowledge graph according to claim 1, wherein the related entities comprise microorganisms, diseases, drugs, nutritional diet, other active substances and literature.
3. The method for constructing an intestinal microorganism knowledge graph according to claim 2, wherein the steps of normalizing, sorting and integrating the related entities comprise:
the extracted related entity names are subjected to standardized processing to obtain standardized results of the related entity names, and the related entity names are subjected to sorting and integrating processing according to the microorganism affiliation information, the disease affiliation information, the drug affiliation information, the nutritional diet affiliation information, the other active substance affiliation information and the literature affiliation information to respectively obtain a microorganism entity list, a disease entity list, a drug entity list, a nutritional diet entity list, an other active substance entity list and a literature entity list;
based on the microorganism entity list, the disease entity list, the drug entity list, the nutritional diet entity list, the other active substance entity list and the literature entity list, a data frame of the corresponding related entities is constructed.
4. The method for constructing an intestinal microorganism knowledge graph according to claim 3, wherein the constructing an intestinal microorganism entity library based on the data frame of related entities specifically comprises:
taking the entity identification of the data frame of each related entity and the identification of the medical attribute information related to each related entity as nodes, taking the subordinate relation information between each related entity and the corresponding medical attribute information as directed line segments, and connecting the related entities and the corresponding medical attribute information to construct the intestinal microorganism entity library.
5. The method for constructing an intestinal microorganism knowledge graph according to claim 2, wherein extracting the association information between the related entities comprises:
carrying out information extraction processing on the initial literature information, using the entity identifiers of microorganisms, diseases, drugs, nutritional diet, other active substances and literature as nodes, using the relation information among the related entities as directed line segments, and connecting the related entities to obtain the association information between the related entities.
6. The method for constructing an intestinal microorganism knowledge graph according to claim 2, wherein the extracting of the clinical annotation model information specifically comprises:
based on the interaction relations between related entities, a plurality of clinical annotation models based on intestinal microorganisms are obtained, knowledge data elements and clinical annotation description information related to the corresponding clinical annotation models are extracted, the identification of each clinical annotation model and the identification of the knowledge data elements related to each clinical annotation model are taken as nodes, the subordinate relation information between the clinical annotation models and the corresponding knowledge data elements is taken as directed line segments, and the clinical annotation description information and the corresponding knowledge data elements are connected to obtain the clinical annotation model information.
7. The method of claim 6, wherein the clinical annotation descriptive information comprises evidence-based relationships between related entities.
8. The device for constructing the intestinal microorganism knowledge graph is characterized by comprising the following components:
the data acquisition module is used for acquiring initial literature information of related researches of intestinal microorganisms;
the data extraction module is used for carrying out information extraction processing on the initial document information and extracting related entities, related information among the related entities and clinical annotation model information;
the data processing module is used for carrying out standardization, arrangement and integration processing on the extracted related entities and constructing a data frame of the related entities;
the intestinal microorganism entity library construction module is used for constructing an intestinal microorganism entity library based on a data frame of related entities;
the intestinal microorganism knowledge graph construction module is used for constructing an intestinal microorganism knowledge graph by taking the identification of the data frame of the related entities or the identification of the clinical annotation model information as nodes and the association information between the related entities and the subordinate relation information of the clinical annotation model and the corresponding knowledge data elements as directed line segments based on the data frame of the related entities, the association information between the related entities and the clinical annotation model information.
9. An electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1 to 7 when the computer program is executed.
10. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311588837.8A CN117292846A (en) | 2023-11-27 | 2023-11-27 | Construction method and device of intestinal microorganism knowledge graph |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117292846A (en) | 2023-12-26 |
Family
ID=89244850
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311588837.8A Pending CN117292846A (en) | 2023-11-27 | 2023-11-27 | Construction method and device of intestinal microorganism knowledge graph |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117292846A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117747096A (en) * | 2024-02-21 | 2024-03-22 | 神州医疗科技股份有限公司 | Auxiliary diagnosis and treatment system based on pathogroup knowledge base and construction method |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107357924A (en) * | 2017-07-25 | 2017-11-17 | 为朔医学数据科技(北京)有限公司 | A kind of precisely medical knowledge map construction method and apparatus |
CN114255884A (en) * | 2021-12-13 | 2022-03-29 | 首都医科大学附属北京安贞医院 | Hypertension drug treatment knowledge graph construction method and device |
CN114944199A (en) * | 2022-04-26 | 2022-08-26 | 北京邮电大学 | Artificial intelligence based strain screening method and device |
CN116226404A (en) * | 2023-03-13 | 2023-06-06 | 福建医科大学 | Knowledge graph construction method and knowledge graph system for intestinal-brain axis |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||