CN113035346A - Medical knowledge map-based disease category assessment device and method - Google Patents

Info

Publication number
CN113035346A
Authority
CN
China
Prior art keywords
dimensional
disease
module
knowledge
data
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110200064.6A
Other languages
Chinese (zh)
Other versions
CN113035346B (en)
Inventor
殷波
焦立博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Information Science and Technology University
Original Assignee
Beijing Information Science and Technology University
Application filed by Beijing Information Science and Technology University
Priority to CN202110200064.6A
Publication of CN113035346A
Application granted
Publication of CN113035346B
Legal status: Active

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H40/00 ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices
    • G16H40/60 ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices for the operation of medical equipment or devices
    • G16H40/67 ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices for the operation of medical equipment or devices for remote operation
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Biomedical Technology (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • Primary Health Care (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Animal Behavior & Ethology (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Business, Economics & Management (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the invention provide a disease category assessment device and method based on a medical knowledge graph. The device comprises: an acquisition module for acquiring N-dimensional pathology features; and a determining module for inputting the N-dimensional pathology features into a disease category evaluation model based on a medical knowledge graph to obtain the disease category corresponding to those features. The disease category evaluation model is obtained by training with a machine learning algorithm, using N-dimensional pathology feature samples as input data and the disease categories corresponding to those samples as output data. By means of big data and artificial intelligence technology, the embodiments can compensate for the current shortage of high-quality online medical resources and provide effective support for remote consultation.

Description

Medical knowledge map-based disease category assessment device and method
Technical Field
The invention relates to the technical field of computers, in particular to a medical knowledge graph-based disease category assessment device and method.
Background
At present, mobile internet technology and mobile smart terminals are accelerating the move of medical care onto the internet. The three telecom operators in China together have 1.598 billion mobile users, and the number of internet medical users in China passed 100 million in 2015 and reached 380 million in 2017. According to the 2018-2022 research report on market prospects and investment opportunities in the mobile medical industry released by the research institute of the Chinese commerce industry, the internet medical market has grown rapidly with the arrival of the paid-knowledge era and the relaxation of medical e-commerce policies: its scale reached 231.4 billion yuan in 2017 and is expected to exceed 1000 billion yuan in 2020. With the spread of mobile smart terminals, a wide variety of mobile medical software keeps emerging, offering hospitals and the general public more convenient medical information and treatment services. Medical information inquiry and online appointment registration have user utilization rates of 10.8% and 10.4% respectively, followed by online consultation, online purchase of medicines, medical devices and health products, and exercise and fitness management, which account for about 6% of internet users.
After the Opinions on Promoting the Development of "Internet + Medical Health" were issued and implemented, the major medical institutions in China established internet hospitals one after another, and internet medical companies also grew rapidly. Internet medical care now covers many disciplines of clinical medicine, including internal medicine, surgery, gynecology, pediatrics, rehabilitation, nursing, monitoring, imaging, stomatology, ophthalmology and otorhinolaryngology, psychiatry, dermatology, psychology, and medical education. However, although some important progress has been made, these services are mainly delivered in the form of registration and light consultation. Internet medical care in China currently concentrates on optimizing the medical service flow and on light consultation, and is of almost no help when a sudden major public health event occurs. As advanced technologies such as big data and artificial intelligence are applied to internet medical care, it can take on a growing role in the medical system and is expected to play a key role in responding to large-scale epidemic outbreaks.
Disclosure of Invention
In order to solve the problems in the prior art, embodiments of the present invention provide a disease category assessment apparatus and method based on a medical knowledge map.
In a first aspect, an embodiment of the present invention provides a medical knowledge map-based disease category assessment apparatus, including:
the acquisition module is used for acquiring N-dimensional pathology features;
the determining module is used for inputting the N-dimensional pathology features into a disease category evaluation model based on a medical knowledge graph to obtain the disease category corresponding to the N-dimensional pathology features; the disease category evaluation model based on the medical knowledge graph is obtained by training with a machine learning algorithm, using N-dimensional pathology feature samples as input data and the disease categories corresponding to those samples as output data.
Further, the disease category assessment model based on the medical knowledge map specifically includes:
the data labeling module is used for extracting N-dimensional pathological condition characteristics of the disease samples of the K categories;
the model training module is used for constructing and training a classification model, based on the N-dimensional pathology features of the disease samples of the K categories, by adopting a pairwise (one-versus-one) multi-class support vector machine algorithm with a Gaussian kernel function;
the knowledge map retrieval module is used for carrying out knowledge retrieval through key fields based on a preset medical knowledge map and determining N-dimensional pathology characteristics corresponding to the key fields;
and the evaluation module is used for inputting the N-dimensional pathology features corresponding to the key fields into the classification model for evaluation and determining the disease category corresponding to the N-dimensional pathology feature samples.
Further, the medical knowledge graph is stored in a graph database;
correspondingly, the knowledge-graph retrieval module is specifically configured to:
performing knowledge retrieval through key fields based on a graph database;
the graph database returns nodes and attributes associated with the key fields;
determining an N-dimensional pathology feature corresponding to the key field based on nodes and attributes associated with the key field.
Further, the apparatus further comprises:
the data query module is used for randomly extracting data from the medical knowledge database and importing the extracted data into the labeling platform;
the prior labeling module is used for labeling the imported data based on a labeling rule preset by the labeling platform;
and the data storage module is used for storing the labeled data into the labeled database.
Further, the evaluation module is specifically configured to:
inputting the N-dimensional pathology features corresponding to the key fields into the classification model for evaluation calculation, according to the pathology feature record table kept for each independent model during training, and taking a voting decision over the prediction results of the independent models to determine the evaluation score corresponding to the N-dimensional pathology feature sample;
determining the disease category based on the assessment score.
Further, the medical knowledge map-based disease category assessment model further comprises:
a data acquisition module for acquiring medical data based on a medical knowledge database;
and the data cleaning module is used for cleaning the medical data acquired from the medical knowledge database.
Further, the apparatus also includes:
a knowledge extraction module, a knowledge fusion module and a knowledge storage module, on the basis of which the medical knowledge graph is constructed; the knowledge extraction module is used for extracting the association relations among the N-dimensional pathology feature data of the disease samples of the K categories; the knowledge fusion module is used for extracting the similarities and differences between different relations, based on the association relations among the N-dimensional pathology feature data of the disease samples of the K categories, and for merging or distinguishing the relations accordingly; the knowledge storage module is used for storing the data of the knowledge extraction module and the knowledge fusion module.
In a second aspect, an embodiment of the present invention provides a method for evaluating disease categories based on a medical knowledge map, including:
acquiring N-dimensional pathology features;
inputting the N-dimensional pathology features into a disease category evaluation model based on a medical knowledge graph to obtain the disease category corresponding to the N-dimensional pathology features; the disease category evaluation model based on the medical knowledge graph is obtained by training with a machine learning algorithm, using N-dimensional pathology feature samples as input data and the disease categories corresponding to those samples as output data.
Further, the medical knowledge-map-based disease category assessment model comprises:
extracting N-dimensional pathology characteristics of the disease samples of the K categories;
constructing and training a classification model, based on the N-dimensional pathology features of the disease samples of the K categories, by adopting a pairwise (one-versus-one) multi-class support vector machine algorithm with a Gaussian kernel function;
performing knowledge retrieval through key fields based on a preset medical knowledge graph, and determining the N-dimensional pathology features corresponding to the key fields;
inputting the N-dimensional pathology features corresponding to the key fields into the classification model for evaluation, and determining the disease category corresponding to the N-dimensional pathology feature samples.
Further, the medical knowledge graph is stored in a graph database;
correspondingly, performing knowledge retrieval through key fields based on the preset medical knowledge graph and determining the N-dimensional pathology features corresponding to the key fields specifically includes:
performing knowledge retrieval through key fields based on a graph database;
the graph database returns nodes and attributes associated with the key fields;
determining an N-dimensional pathology feature corresponding to the key field based on nodes and attributes associated with the key field.
In a third aspect, the present invention further provides an electronic device, which includes a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor executes the program to implement the steps of the medical knowledge map-based disease category assessment method according to the second aspect.
In a fourth aspect, the present invention further provides a non-transitory computer-readable storage medium, on which a computer program is stored, the computer program, when executed by a processor, implementing the steps of the medical knowledge-map-based disease category assessment method according to the second aspect.
According to the above technical scheme, in the disease category assessment device and method based on the medical knowledge graph, the acquisition module acquires N-dimensional pathology features, and the determining module inputs the N-dimensional pathology features into a disease category evaluation model based on a medical knowledge graph to obtain the disease category corresponding to those features. The disease category evaluation model is obtained by training with a machine learning algorithm, using N-dimensional pathology feature samples as input data and the disease categories corresponding to those samples as output data. By means of big data and artificial intelligence technology, the embodiments can compensate for the current shortage of high-quality online medical resources and provide effective support for remote consultation.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
Fig. 1 is a schematic structural diagram of a medical knowledge graph-based disease category assessment device according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a medical knowledge graph-based disease category assessment device according to another embodiment of the present invention;
Fig. 3 is a flow chart of a medical knowledge graph-based disease category assessment method according to an embodiment of the present invention;
Fig. 4 is a schematic physical structure diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be described below in conjunction with the drawings in the embodiments of the present invention. Obviously, the described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments that a person skilled in the art can derive from the embodiments given herein without inventive effort fall within the scope of the present invention. The medical knowledge graph-based disease category assessment device provided by the present invention is explained and illustrated in detail below through specific examples.
Fig. 1 is a schematic structural diagram of a medical knowledge graph-based disease category assessment device according to an embodiment of the present invention. As shown in Fig. 1, the device includes an acquisition module 201 and a determining module 202, wherein:
the acquisition module 201 is configured to acquire N-dimensional pathology features;
the determining module 202 is configured to input the N-dimensional pathology features into a disease category evaluation model based on a medical knowledge graph, so as to obtain the disease category corresponding to the N-dimensional pathology features; the disease category evaluation model based on the medical knowledge graph is obtained by training with a machine learning algorithm, using N-dimensional pathology feature samples as input data and the disease categories corresponding to those samples as output data.
In the present embodiment, it should be noted that symptoms of physical discomfort, such as dizziness, headache, fever, dry eyes, red eyes, dim vision, tearing, stuffy nose, swelling of the nose and head, stomach ache, chest pain, chest distress and aching pain, are regarded as N-dimensional pathology features.
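As an illustration of how such symptoms could become an N-dimensional vector, the following minimal sketch encodes a list of reported symptoms as a 0/1 vector over a fixed vocabulary. The vocabulary, the binary encoding and the function name `encode_symptoms` are assumptions made for this example; the patent text does not specify how the features are encoded.

```python
# Hypothetical symptom vocabulary; its order fixes the meaning of each dimension.
SYMPTOMS = ["dizziness", "headache", "fever", "dry eyes", "red eyes",
            "stuffy nose", "stomach ache", "chest distress"]


def encode_symptoms(reported, vocabulary=SYMPTOMS):
    """Return an N-dimensional 0/1 vector with one entry per vocabulary symptom."""
    reported_set = {s.strip().lower() for s in reported}
    unknown = reported_set - set(vocabulary)
    if unknown:
        # Refuse silently dropping symptoms the vocabulary does not cover.
        raise ValueError(f"unknown symptoms: {sorted(unknown)}")
    return [1 if s in reported_set else 0 for s in vocabulary]


features = encode_symptoms(["Fever", "headache"])
print(features)  # [0, 1, 1, 0, 0, 0, 0, 0]
```

Here N equals the vocabulary size (8 in this toy case); a real system would presumably derive the vocabulary from the knowledge graph rather than hard-code it.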
In this embodiment, it should be noted that the disease category assessment model based on the medical knowledge graph is constructed by relying on medical knowledge graphs and clinical knowledge bases. Examples of knowledge graphs include traditional Chinese medicine knowledge graphs, the Chinese medical knowledge graph CMeKG version 2.0 (jointly constructed by the intelligent health and medical research group of the Artificial Intelligence Research Center of Peng Cheng Laboratory, the Institute of Computational Linguistics of Peking University, and the Natural Language Processing Laboratory of Zhengzhou University), and biomedical knowledge graphs. Examples of clinical knowledge bases include the clinical knowledge base query system developed by Quicky apricot (Just for health), the clinical diagnosis and treatment knowledge base V3.2, and the personal hygiene clinical knowledge base. The clinical knowledge base query system is a database query platform that integrates authoritative medical information sources at home and abroad; it contains the various kinds of information needed for clinical medication, such as drug package inserts, clinical guidelines and Chinese medicine prescriptions, and is an effective tool for doctors, pharmacists and other medical and health professionals to obtain drug information.
In this embodiment, the N-dimensional pathology features are input into a disease category evaluation model based on a medical knowledge graph, and the disease category corresponding to the N-dimensional pathology features is obtained. The disease category evaluation model is obtained by training with a machine learning algorithm, using N-dimensional pathology feature samples as input data and the disease categories corresponding to those samples as output data. For example, the three-dimensional pathology features heartburn, vomiting and diarrhea are input into the disease category evaluation model, and the corresponding disease category, gastroenteritis, is obtained. As a further example, pregnancy does not strictly belong to any disease category, but because it presents some initial symptoms, the disease category evaluation model trained with the machine learning algorithm is also applicable to it.
In this embodiment, it should be noted that the disease category evaluation model based on the medical knowledge graph may include one or more of a data labeling module, a model training module, a knowledge graph retrieval module and an evaluation module. The data labeling module extracts multi-dimensional information from disease samples of different categories in order to establish a baseline that serves as the basic data for model training. The model training module adopts a pairwise multi-class support vector machine algorithm and trains the model with the multi-dimensional features of diseases as input and the disease categories as output. The knowledge graph retrieval module retrieves, from the knowledge graph database, the multi-dimensional features related to a disease name, keyword or field. The evaluation module takes the input multi-dimensional features of a disease and, through the evaluation model calculation, outputs the disease category to which the disease most probably belongs.
According to the above technical scheme, in the disease category assessment device based on the medical knowledge graph provided by the embodiment of the invention, the acquisition module acquires N-dimensional pathology features, and the determining module inputs the N-dimensional pathology features into a disease category evaluation model based on a medical knowledge graph to obtain the disease category corresponding to those features. The disease category evaluation model is obtained by training with a machine learning algorithm, using N-dimensional pathology feature samples as input data and the disease categories corresponding to those samples as output data. By means of big data and artificial intelligence technology, the embodiment can compensate for the current shortage of high-quality online medical resources and provide effective support for remote consultation. For example, when an epidemic or a large-scale disease outbreak causes high-quality medical resources to be diverted to treating patients offline, the device can make up for the resulting shortage of high-quality online medical resources and provide intelligent support for remote consultation.
On the basis of the above embodiments, in this embodiment, the disease category assessment model based on the medical knowledge graph specifically includes:
the data labeling module is used for extracting N-dimensional pathology features of the disease samples of the K categories;
the model training module is used for constructing and training a classification model, based on the N-dimensional pathology features of the disease samples of the K categories, by adopting a pairwise (one-versus-one) multi-class support vector machine algorithm with a Gaussian kernel function;
the knowledge map retrieval module is used for carrying out knowledge retrieval through key fields based on a preset medical knowledge map and determining N-dimensional pathology characteristics corresponding to the key fields;
and the evaluation module is used for inputting the N-dimensional pathology features corresponding to the key fields into the classification model for evaluation and determining the disease category corresponding to the N-dimensional pathology feature samples.
In this embodiment, it should be noted that:
the data labeling module extracts multi-dimensional information from disease samples of different categories and establishes a baseline that serves as the basic data for model training;
the model training module extracts the N-dimensional information of the known sample data of K different disease categories and uses a pairwise multi-class support vector machine method with a Gaussian kernel function to construct a specific classification model, yielding an evaluation model that takes information such as data features as input and outputs the disease category; this allows the disease category evaluation model to be trained on a small amount of data;
the knowledge graph retrieval module searches the constructed medical knowledge graph by keyword and returns the information held in the medical knowledge graph for each dimension of the disease;
and the evaluation module inputs the per-dimension data about the disease returned by the knowledge graph retrieval module into the multi-class support vector machine evaluation model trained by the training module, and outputs the disease category to which the disease most probably belongs.
In this embodiment, it should be noted that the multi-class support vector machine is a generalized linear classifier that classifies data into multiple classes by supervised learning; its decision boundary is the maximum-margin hyperplane solved for the training samples.
The model construction steps of the paired multi-classification support vector machine voting model are specifically described here.
First, a sample set of known diseases is input. Assume that the number of samples is M, the disease feature dimension is N, and K diseases are known, so that the sample set is
$$\{(x_m, y_m)\}_{m=1}^{M}, \quad x_m \in \mathbb{R}^{N}, \quad y_m \in \{1, \dots, K\}.$$
Since the SVM is a binary classification method, and this embodiment needs to classify the sample set into K classes (i.e., K diseases), a pairwise multi-class SVM method is adopted: a binary support vector machine is constructed between every two classes. This embodiment uses $d_{ij}$ to denote the decision boundary of the binary support vector machine between the i-th disease class and the j-th disease class, so this embodiment uses
$$\frac{K(K-1)}{2}$$
decision boundaries to separate the training set into K classes.
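The count of pairwise decision boundaries can be checked by enumerating the class pairs directly. The sketch below is only an illustration; the disease labels and the function name are hypothetical.

```python
from itertools import combinations


def pairwise_classifier_pairs(classes):
    """List every unordered pair (i, j) of classes; one binary SVM is built per pair."""
    return list(combinations(classes, 2))


diseases = ["gastroenteritis", "influenza", "migraine", "rhinitis"]  # hypothetical labels
pairs = pairwise_classifier_pairs(diseases)
K = len(diseases)
print(len(pairs), K * (K - 1) // 2)  # 6 6
```

The number of binary classifiers grows quadratically in K, which is the usual trade-off of the pairwise scheme: each classifier is trained on only two classes of data, but there are many of them.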
Since the disease sample set is not guaranteed to be linearly separable, this embodiment first uses a transformation function
$$\varphi : \mathbb{R}^{N} \to \mathcal{H}$$
to map the N-dimensional features into a high-dimensional space in which they are linearly separable. The transformation is taken to be the one induced by the Gaussian kernel function, whose standard form is
$$k(x, x') = \langle \varphi(x), \varphi(x') \rangle = \exp\!\left(-\frac{\lVert x - x' \rVert^{2}}{2\sigma^{2}}\right).$$
Therefore, in the high-dimensional linearly separable space, the optimization problem of the support vector machine in this embodiment is converted into:
[Formula image BDA0002947796490000104]
To solve this problem, this embodiment uses the known disease sample set as the training set and separates it into K classes; for new disease data of unknown category, it is then only necessary to feed the data into the model to infer the disease category to which it belongs.
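For reference, the pairwise problem described above is commonly written in the textbook soft-margin form below. This is an assumption about the general shape of the optimization, not a transcription of the patent's own formula; the symbols $w_{ij}$, $b_{ij}$, $\xi_m$, $C$, $t_m$ and $S_{ij}$ are introduced here for illustration. $S_{ij}$ is the set of indices of samples belonging to class $i$ or class $j$, and $t_m = +1$ if $y_m = i$, $t_m = -1$ if $y_m = j$.

```latex
\begin{aligned}
\min_{w_{ij},\, b_{ij},\, \xi}\quad
  & \frac{1}{2}\lVert w_{ij}\rVert^{2} + C \sum_{m \in S_{ij}} \xi_m \\
\text{s.t.}\quad
  & t_m\bigl(\langle w_{ij}, \varphi(x_m)\rangle + b_{ij}\bigr) \ge 1 - \xi_m,
    \qquad \xi_m \ge 0, \qquad m \in S_{ij}, \\[4pt]
d_{ij}(x)
  &= \sum_{m \in S_{ij}} \alpha_m\, t_m\, k(x_m, x) + b_{ij},
  \qquad
  k(x, x') = \exp\!\left(-\frac{\lVert x - x'\rVert^{2}}{2\sigma^{2}}\right).
\end{aligned}
```

In this form the mapping $\varphi$ never has to be computed explicitly: the decision function $d_{ij}$ depends on the data only through kernel evaluations, with $\alpha_m$ the dual coefficients of the solved problem.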
The prior art lacks deep mining and analysis of medical knowledge, which makes it difficult to improve the accuracy of evaluating the disease category to which a disease belongs. To address this, the disease category assessment device based on the medical knowledge graph provided by the embodiment of the invention works as follows: the data labeling module extracts multi-dimensional information from disease samples of different categories and establishes a baseline that serves as the basic data for model training; the model training module takes the multi-dimensional features of diseases as input, uses known data sets of different disease categories, and trains the classification model with the pairwise multi-class support vector machine method; the knowledge graph retrieval module retrieves the multi-dimensional features related to a disease from the knowledge graph database according to the disease name; and the evaluation module takes the input multi-dimensional features of the disease and, through the evaluation model calculation, outputs the disease category to which the disease most probably belongs. This improves the accuracy of determining the disease category online.
On the basis of the above embodiment, in the present embodiment, the medical knowledge graph is stored in a graph database;
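The training and voting flow summarized above can be sketched in a few lines of standard-library Python. Note the substitution: instead of a support vector machine solver, each pairwise classifier here is a kernel perceptron, a simpler algorithm that also learns a Gaussian-kernel decision function but does not maximize the margin. The toy samples, the feature order and all function names are assumptions made for the illustration.

```python
import math
from collections import Counter
from itertools import combinations


def gaussian_kernel(x, z, sigma=1.0):
    """k(x, z) = exp(-||x - z||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def decision(xs, ts, alphas, bias, x):
    """Kernel decision value: bias + sum_m alpha_m * t_m * k(x_m, x)."""
    return bias + sum(a * t * gaussian_kernel(xk, x)
                      for a, t, xk in zip(alphas, ts, xs))


def train_binary(xs, ts, epochs=50):
    """Kernel perceptron (stand-in for an SVM solver) for labels in {+1, -1}."""
    alphas, bias = [0.0] * len(xs), 0.0
    for _ in range(epochs):
        mistakes = 0
        for m, (x, t) in enumerate(zip(xs, ts)):
            if t * decision(xs, ts, alphas, bias, x) <= 0:
                alphas[m] += 1.0  # strengthen the misclassified sample
                bias += t
                mistakes += 1
        if mistakes == 0:  # every training sample is on the correct side
            break
    return alphas, bias


def train_one_vs_one(samples, labels):
    """Train one binary classifier for each unordered pair of classes."""
    models = {}
    for ci, cj in combinations(sorted(set(labels)), 2):
        xs = [x for x, y in zip(samples, labels) if y in (ci, cj)]
        ts = [1 if y == ci else -1 for y in labels if y in (ci, cj)]
        models[(ci, cj)] = (xs, ts) + train_binary(xs, ts)
    return models


def predict(models, x):
    """Each pairwise model casts one vote; the class with most votes wins."""
    votes = Counter()
    for (ci, cj), (xs, ts, alphas, bias) in models.items():
        votes[ci if decision(xs, ts, alphas, bias, x) > 0 else cj] += 1
    return votes.most_common(1)[0][0]


# Toy data; feature order: heartburn, vomiting, diarrhea, fever, headache, stuffy nose.
samples = [[1, 1, 1, 0, 0, 0], [0, 1, 1, 0, 0, 0],
           [0, 0, 0, 1, 1, 1], [0, 0, 0, 1, 1, 0],
           [0, 0, 0, 0, 0, 1], [0, 0, 0, 0, 1, 1]]
labels = ["gastroenteritis", "gastroenteritis",
          "influenza", "influenza", "rhinitis", "rhinitis"]

models = train_one_vs_one(samples, labels)
print(predict(models, [1, 0, 1, 0, 0, 0]))  # gastroenteritis
```

With K = 3 classes, three pairwise models are trained. A query of heartburn plus diarrhea wins both votes that involve gastroenteritis, so the third model's vote cannot change the outcome.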
correspondingly, the knowledge-graph retrieval module is specifically configured to:
performing knowledge retrieval through key fields based on a graph database;
the graph database returns nodes and attributes associated with the key fields;
determining an N-dimensional pathology feature corresponding to the key field based on nodes and attributes associated with the key field.
In this implementation, it should be noted that, for the knowledge graph retrieval module, the disease category assessment device based on the medical knowledge graph according to the embodiment of the present invention searches the graph database stored in the knowledge graph using the input disease name keyword, and returns information such as the nodes and attributes associated with the disease node, thereby determining the N-dimensional pathology features corresponding to the key field.
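The retrieval steps above can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: a small in-memory dictionary stands in for the graph database, and `GRAPH`, `FEATURE_ORDER`, `retrieve`, and `to_feature_vector` are hypothetical names, as are the disease and symptom entries.

```python
# Hypothetical in-memory stand-in for the graph database.
GRAPH = {
    "nodes": {
        "common cold": {"attrs": {"contagious": 1.0}},
        "fever": {"attrs": {}},
        "cough": {"attrs": {}},
        "rash": {"attrs": {}},
    },
    "edges": [
        ("common cold", "has_symptom", "fever"),
        ("common cold", "has_symptom", "cough"),
    ],
}

# Assumed fixed ordering of the N feature dimensions (here N = 4).
FEATURE_ORDER = [
    "has_symptom:fever",
    "has_symptom:cough",
    "has_symptom:rash",
    "attr:contagious",
]


def retrieve(key_field):
    """Return the attributes and associated nodes for a disease name."""
    node = GRAPH["nodes"].get(key_field)
    if node is None:
        return None
    neighbours = [(rel, dst) for src, rel, dst in GRAPH["edges"] if src == key_field]
    return {"attrs": node["attrs"], "neighbours": neighbours}


def to_feature_vector(result):
    """Turn the returned nodes and attributes into an N-dimensional vector."""
    present = {f"{rel}:{dst}" for rel, dst in result["neighbours"]}
    vector = []
    for name in FEATURE_ORDER:
        if name.startswith("attr:"):
            vector.append(float(result["attrs"].get(name[len("attr:"):], 0.0)))
        else:
            vector.append(1.0 if name in present else 0.0)
    return vector


features = to_feature_vector(retrieve("common cold"))
```

Fixing the feature order once keeps the vector layout identical between training and evaluation.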
On the basis of the above embodiment, in this embodiment, the apparatus further includes:
the data query module is used for randomly extracting data based on the medical knowledge database and importing the extracted data into the labeling platform;
the prior labeling module is used for labeling the imported data based on a labeling rule preset by the labeling platform;
and the data storage module is used for storing the marked data into the marked database.
In this embodiment, it should be noted that the disease category assessment device based on the medical knowledge graph provided in the embodiment of the present invention performs three steps, namely data query, prior labeling and data storage, through the data query module, the prior labeling module and the data storage module, respectively. Data query: data to be labeled are randomly extracted from the medical knowledge database and imported into the labeling platform. Prior labeling: a medical expert labels the data according to the labeling rules preset by the labeling platform. Data storage: the labeled medical disease data are stored in the labeled disease database through the labeling platform.
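The three steps can be sketched as a small pipeline. Everything here is an assumption for illustration: the record layout, the rule list `RULES` (which stands in for the expert's preset labeling rules), and the function names are hypothetical, and the toy labels carry no clinical meaning.

```python
import random

# Hypothetical labeling rules: (predicate, label) pairs checked in order.
RULES = [
    (lambda rec: "rash" in rec["symptoms"], "dermatological"),
    (lambda rec: "cough" in rec["symptoms"], "respiratory"),
]


def query_random(records, k, seed=0):
    """Data query: randomly extract k records for labeling."""
    return random.Random(seed).sample(records, k)


def prior_label(record, rules):
    """Prior labeling: apply the first matching rule, else leave unlabeled."""
    for predicate, label in rules:
        if predicate(record):
            return {**record, "label": label}
    return {**record, "label": None}


def store(labeled, database):
    """Data storage: keep only records that received a label."""
    database.extend(rec for rec in labeled if rec["label"] is not None)
    return database


records = [
    {"id": 1, "symptoms": {"cough", "fever"}},
    {"id": 2, "symptoms": {"rash"}},
    {"id": 3, "symptoms": {"headache"}},
    {"id": 4, "symptoms": {"cough"}},
    {"id": 5, "symptoms": {"rash", "fever"}},
]
batch = query_random(records, k=3)
database = store([prior_label(rec, RULES) for rec in batch], [])
```

Passing an explicit seed makes the random extraction repeatable, which helps when a labeling batch has to be audited later.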
On the basis of the foregoing embodiment, in this embodiment, the evaluation module is specifically configured to:
inputting the N-dimensional pathology features corresponding to the key fields into the classification model according to a pathology feature recording table corresponding to each type of independent model recorded in the training process for evaluation calculation, and performing voting decision on the prediction result of each type of independent model to determine the evaluation score corresponding to the N-dimensional pathology feature sample; determining the disease category based on the assessment score.
In this embodiment, for example, the evaluation module performs two steps, feature extraction and evaluation scoring. That is, after features are extracted from the disease-related data retrieved by the knowledge graph retrieval module, the extracted feature data are input into the corresponding classifier models of the training module according to the data feature record table recorded for each independent model during training; evaluation calculation is performed, a voting decision is made over the prediction results of all the independent models, and finally a disease evaluation score is output, so that the disease category is determined based on the evaluation score. For example, if the output evaluation scores are 50 for fever, 60 for cold and 70 for hot cold, the disease category is determined to be hot cold.
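The voting decision and the score-based selection can be sketched as below. The text does not specify how votes are converted into scores, so the sketch treats them as two separate toy steps: tallying the winners of hypothetical pairwise models, and picking the category with the highest score from the example above. The names `tally_votes` and `decide` are assumptions.

```python
from collections import Counter


def tally_votes(pairwise_predictions):
    """Count how many pairwise (independent) models voted for each category.

    pairwise_predictions maps a (class_a, class_b) pair to the winning class.
    """
    return Counter(pairwise_predictions.values())


def decide(scores):
    """Pick the category with the highest score (first one wins on ties)."""
    return max(scores, key=scores.get)


# Scores from the example in the text.
scores = {"fever": 50, "cold": 60, "hot cold": 70}
category = decide(scores)

# Hypothetical outputs of the three pairwise models for K = 3 categories.
predictions = {
    ("fever", "cold"): "cold",
    ("fever", "hot cold"): "hot cold",
    ("cold", "hot cold"): "hot cold",
}
votes = tally_votes(predictions)
```

With these toy inputs both routes agree on the same category; in practice a tie-breaking rule would need to be chosen explicitly.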
On the basis of the above embodiments, in this embodiment, the medical knowledge-map-based disease category assessment model further includes:
a data acquisition module for acquiring medical data based on a medical knowledge database;
and the data cleaning module is used for cleaning the medical data acquired by the medical knowledge database.
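The text does not state which cleaning rules are applied, so the following is only a sketch of two common ones, assumed for illustration: dropping records with missing required fields and removing duplicates after normalizing case and whitespace. The record fields `name` and `symptoms` are hypothetical.

```python
def clean(records, required=("name", "symptoms")):
    """Drop incomplete records and exact duplicates after normalization."""
    seen = set()
    cleaned = []
    for rec in records:
        # Skip records where a required field is missing or empty.
        if any(not rec.get(field) for field in required):
            continue
        name = rec["name"].strip().lower()
        symptoms = tuple(sorted(s.strip().lower() for s in rec["symptoms"]))
        key = (name, symptoms)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"name": name, "symptoms": list(symptoms)})
    return cleaned


raw = [
    {"name": "Common Cold ", "symptoms": ["Cough", "fever"]},
    {"name": "common cold", "symptoms": ["fever", "cough"]},  # duplicate
    {"name": "", "symptoms": ["rash"]},                       # missing name
    {"name": "measles", "symptoms": []},                      # missing symptoms
]
cleaned = clean(raw)
```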
On the basis of the above embodiment, in this embodiment, the method further includes:
constructing a medical knowledge map based on a knowledge extraction module, a knowledge fusion module and a knowledge storage module; the knowledge extraction module is used for extracting the incidence relation among the N-dimensional pathology characteristic data of the disease samples of the K categories; the knowledge fusion module is used for extracting the similarity and the difference of different relations based on the incidence relation among the N-dimensional pathology characteristic data of the disease samples of the K categories, and merging or distinguishing based on the similarity and the difference of the different relations; the knowledge storage module is used for storing the data in the knowledge extraction module and the knowledge fusion module.
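One way to picture the merge-or-distinguish step of knowledge fusion is shown below. This is a sketch under assumptions: the text does not name a similarity measure, so Jaccard overlap between the entity pairs of two relation types is used here as a stand-in, and the threshold, triples, and function names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def fuse_relations(triples, threshold=0.5):
    """Merge relation types whose (head, tail) pairs overlap strongly."""
    pairs_by_rel = {}
    for head, rel, tail in triples:
        pairs_by_rel.setdefault(rel, set()).add((head, tail))

    canonical = {}
    for rel in sorted(pairs_by_rel):
        # Reuse an existing canonical relation if it is similar enough,
        # otherwise keep this relation as a distinct type.
        match = next(
            (c for c in sorted(set(canonical.values()))
             if jaccard(pairs_by_rel[rel], pairs_by_rel[c]) >= threshold),
            rel,
        )
        canonical[rel] = match
    return sorted({(h, canonical[r], t) for h, r, t in triples})


triples = [
    ("cold", "has_symptom", "cough"),
    ("cold", "has_symptom", "fever"),
    ("cold", "shows_symptom", "cough"),
    ("cold", "shows_symptom", "fever"),
    ("cold", "treated_by", "rest"),
]
fused = fuse_relations(triples)
```

Here `shows_symptom` is merged into `has_symptom` because both link the same entity pairs, while `treated_by` stays distinct.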
For better understanding of the present solution, the following further illustrates the contents of the present invention with reference to fig. 2, but the present invention is not limited to the following examples.
A data preprocessing module is formed on the basis of the data acquisition module, the data cleaning module and the data labeling module; a knowledge map generation module is formed on the basis of the knowledge extraction module, the knowledge fusion module and the knowledge storage module; and the reasoning module is formed based on the model training module, the matching module, the searching module and the evaluation module.
Specifically, the data preprocessing module interacts with the user and performs data collection, data cleaning, data labeling and the like; it extracts multi-dimensional information from disease samples of different categories and establishes a basic standard, which serves as the basic data for model training and prepares for the subsequent construction of the knowledge graph. The knowledge graph generation module completes knowledge extraction, knowledge fusion, knowledge storage and the like. The inference module completes model training, feature matching and feature searching: the model training module takes the multi-dimensional features of diseases as input, trains on known data sets of different disease categories, and builds the classification model with a pairwise-classification multi-class support vector machine method; the knowledge graph retrieval module retrieves the multi-dimensional features related to a disease from the knowledge graph database according to the disease name; and the evaluation module, given the input multi-dimensional features of the disease, computes with the evaluation model and outputs the disease category to which the disease most probably belongs. The result is then reasonably evaluated to generate the disease result.
The individual modules function as follows. The data acquisition module acquires user interaction data and other medical data. The data cleaning module performs basic cleaning of the acquired data. The data labeling module extracts multi-dimensional information from disease samples of different categories and establishes a basic standard, which serves as the basic data for model training. The knowledge extraction module extracts the association relations among the feature data of disease samples of different categories. The knowledge fusion module extracts the similarities and differences between different relations and merges or distinguishes the data accordingly. The knowledge storage module stores the processed knowledge. The model training module extracts N-dimensional information from known sample data of K different disease categories and uses a pairwise-classification multi-class support vector machine method with a Gaussian kernel function to construct the classification model, yielding an evaluation model that takes information such as data features as input and outputs the disease category; this allows a disease confirmation rate evaluation model to be trained even with a small data volume. The knowledge graph retrieval module performs knowledge retrieval on the constructed medical knowledge graph according to keywords and returns the information of each dimension of the disease held in the medical knowledge graph. The evaluation module inputs the per-dimension disease information retrieved by the knowledge graph retrieval module into the multi-class support vector machine evaluation model trained by the training module, and outputs the disease category to which the disease most probably belongs.
Fig. 3 is a schematic flow chart of a medical knowledge graph-based disease category assessment method according to an embodiment of the present invention. As shown in fig. 3, the method includes:
Step 101: obtaining N-dimensional disease state characteristics.
Step 102: inputting the N-dimensional disease state characteristics into a disease category evaluation model based on a medical knowledge graph to obtain a disease category corresponding to the N-dimensional disease state characteristics; the disease category evaluation model based on the medical knowledge graph is obtained by training with a machine learning algorithm, using N-dimensional pathology characteristic samples as input data and the disease categories corresponding to the N-dimensional pathology characteristic samples as output data.
On the basis of the above embodiments, in this embodiment, the medical knowledge-map-based disease category assessment model includes:
extracting N-dimensional pathology characteristics of the disease samples of the K categories;
based on the N-dimensional pathological condition characteristics of the disease samples of the K classes, adopting a multi-classification support vector machine algorithm of paired classification of Gaussian kernel function to construct a classification model for model training;
performing knowledge retrieval through key fields based on a preset medical knowledge map, and determining N-dimensional disease state characteristics corresponding to the key fields;
inputting the N-dimensional pathology features corresponding to the key fields into the classification model for evaluation, and determining the disease category corresponding to the N-dimensional pathology feature samples.
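For pairwise classification with K categories, one binary model is trained per pair of categories, giving K(K-1)/2 independent models. The sketch below only shows how a labeled training set could be split into those pairwise subproblems; the binary SVM training itself is omitted, and the function name and toy data are assumptions.

```python
from itertools import combinations


def pairwise_subproblems(samples, labels):
    """Split a K-class training set into one binary subset per class pair."""
    classes = sorted(set(labels))
    problems = {}
    for a, b in combinations(classes, 2):
        # Each binary model only sees samples from its two categories.
        problems[(a, b)] = [(x, y) for x, y in zip(samples, labels) if y in (a, b)]
    return problems


# Toy 2-dimensional features for K = 3 categories (illustrative values only).
samples = [[0.1, 0.2], [0.9, 0.8], [0.5, 0.5], [0.2, 0.1]]
labels = ["cold", "fever", "hot cold", "cold"]
problems = pairwise_subproblems(samples, labels)
K = len(set(labels))
```

Each subset would then be passed to a binary Gaussian-kernel SVM trainer, and the resulting models combined by voting at evaluation time.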
On the basis of the above embodiment, in the present embodiment, the medical knowledge graph stores a graph database;
correspondingly, performing knowledge retrieval through key fields based on a preset medical knowledge graph and determining the N-dimensional disease state characteristics corresponding to the key fields specifically comprises the following steps:
performing knowledge retrieval through key fields based on a graph database;
the graph database returns nodes and attributes associated with the key fields;
determining an N-dimensional pathology feature corresponding to the key field based on nodes and attributes associated with the key field.
The method for evaluating the disease category based on the medical knowledge graph provided by the embodiment of the invention can be carried out with the device for evaluating the disease category based on the medical knowledge graph of the above embodiment. Its technical principle and beneficial effects are similar; reference may be made to the above embodiment, and the details are not repeated here.
Based on the same inventive concept, an embodiment of the present invention provides an electronic device, which specifically includes the following components, with reference to fig. 4: a processor 301, a communication interface 303, a memory 302, and a communication bus 304;
the processor 301, the communication interface 303 and the memory 302 communicate with one another through the communication bus 304; the communication interface 303 is used for realizing information transmission between related equipment such as modeling software, an intelligent manufacturing equipment module library and the like; the processor 301 is configured to call the computer program in the memory 302, and when executing the computer program the processor implements the functions provided by the above-mentioned apparatus embodiments, for example, the following steps: acquiring N-dimensional disease state characteristics; inputting the N-dimensional disease state characteristics into a disease category evaluation model based on a medical knowledge graph to obtain a disease category corresponding to the N-dimensional disease state characteristics; the disease category evaluation model based on the medical knowledge graph is obtained by training with a machine learning algorithm, using N-dimensional pathology characteristic samples as input data and the disease categories corresponding to the N-dimensional pathology characteristic samples as output data.
Based on the same inventive concept, another embodiment of the present invention further provides a non-transitory computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the functions provided by the above-mentioned apparatus embodiments, for example: acquiring N-dimensional disease state characteristics; inputting the N-dimensional disease state characteristics into a disease category evaluation model based on a medical knowledge graph to obtain a disease category corresponding to the N-dimensional disease state characteristics; the disease category evaluation model based on the medical knowledge graph is obtained by training with a machine learning algorithm, using N-dimensional pathology characteristic samples as input data and the disease categories corresponding to the N-dimensional pathology characteristic samples as output data.
The above-described method embodiments are merely illustrative, wherein units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute the various embodiments or some parts of the embodiments.
In addition, in the present invention, terms such as "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
Moreover, in the present invention, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, apparatus, article, or device that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, apparatus, article, or device. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, apparatus, article, or device that comprises the element.
Furthermore, in the description herein, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Moreover, various embodiments or examples and features of various embodiments or examples described in this specification can be combined by one skilled in the art provided they do not conflict with one another.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may be modified or some technical features may be equivalently replaced; such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A medical knowledge map-based disease category assessment apparatus, comprising:
the acquisition module is used for acquiring N-dimensional pathological condition characteristics;
the determining module is used for inputting the N-dimensional disease state characteristics to a disease category evaluation model based on a medical knowledge graph to obtain a disease category corresponding to the N-dimensional disease state characteristics; the disease category evaluation model based on the medical knowledge graph adopts an N-dimensional pathology characteristic sample as input data, and a disease category corresponding to the N-dimensional pathology characteristic sample as output data, wherein the output data is obtained based on machine learning algorithm training.
2. The medical knowledge-map based disease category assessment apparatus according to claim 1, wherein said medical knowledge-map based disease category assessment model comprises:
the data labeling module is used for extracting N-dimensional pathological condition characteristics of the disease samples of the K categories;
the model training module is used for constructing a classification model by adopting a multi-classification support vector machine algorithm of paired classification of Gaussian kernel function for model training based on the N-dimensional pathological condition characteristics of the disease samples of the K classes;
the knowledge map retrieval module is used for carrying out knowledge retrieval through key fields based on a preset medical knowledge map and determining N-dimensional pathology characteristics corresponding to the key fields;
and the evaluation module is used for inputting the N-dimensional pathology features corresponding to the key fields into the classification model for evaluation, and determining the disease category corresponding to the N-dimensional pathology feature samples.
3. The medical knowledge graph-based disease category assessment apparatus according to claim 2, wherein said medical knowledge graph stores a graph database;
correspondingly, the knowledge-graph retrieval module is specifically configured to:
performing knowledge retrieval through key fields based on a graph database;
the graph database returns nodes and attributes associated with the key fields;
determining an N-dimensional pathology feature corresponding to the key field based on nodes and attributes associated with the key field.
4. The medical knowledge-graph-based disease category assessment apparatus according to claim 2, further comprising:
the data query module is used for randomly extracting data based on the medical knowledge database and importing the extracted data into the labeling platform;
the prior labeling module is used for labeling the imported data based on a labeling rule preset by the labeling platform;
and the data storage module is used for storing the marked data into the marked database.
5. The medical knowledge-graph-based disease category assessment apparatus according to claim 2, wherein said assessment module is specifically configured to:
inputting the N-dimensional pathology features corresponding to the key fields into the classification model according to a pathology feature recording table corresponding to each type of independent model recorded in the training process for evaluation calculation, and performing voting decision on the prediction result of each type of independent model to determine an evaluation score corresponding to the N-dimensional pathology feature sample;
determining the disease category based on the assessment score.
6. The medical knowledge-map based disease category assessment apparatus according to claim 1, wherein said medical knowledge-map based disease category assessment model further comprises:
a data acquisition module for acquiring medical data based on a medical knowledge database;
and the data cleaning module is used for cleaning the medical data acquired by the medical knowledge database.
7. The medical knowledge-map-based disease category assessment device of claim 1, further comprising:
constructing a medical knowledge map based on a knowledge extraction module, a knowledge fusion module and a knowledge storage module; the knowledge extraction module is used for extracting the incidence relation among the N-dimensional pathology characteristic data of the disease samples of the K categories; the knowledge fusion module is used for extracting the similarity and the difference of different relations based on the incidence relation among the N-dimensional pathology characteristic data of the disease samples of the K categories, and merging or distinguishing based on the similarity and the difference of the different relations; and the knowledge storage module is used for storing the data in the knowledge extraction module and the knowledge fusion module.
8. A disease category assessment method based on a medical knowledge map is characterized by comprising the following steps:
acquiring N-dimensional disease state characteristics;
inputting the N-dimensional disease state characteristics into a disease category evaluation model based on a medical knowledge graph to obtain a disease category corresponding to the N-dimensional disease state characteristics; the disease category evaluation model based on the medical knowledge graph adopts an N-dimensional pathology characteristic sample as input data, and a disease category corresponding to the N-dimensional pathology characteristic sample as output data, wherein the output data is obtained based on machine learning algorithm training.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program implements the medical knowledge base disease category assessment method according to claim 8.
10. A non-transitory computer-readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the medical knowledge-map-based disease category assessment method according to claim 8.
CN202110200064.6A 2021-02-22 2021-02-22 Disease category assessment device and method based on medical knowledge graph Active CN113035346B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110200064.6A CN113035346B (en) 2021-02-22 2021-02-22 Disease category assessment device and method based on medical knowledge graph

Publications (2)

Publication Number Publication Date
CN113035346A true CN113035346A (en) 2021-06-25
CN113035346B CN113035346B (en) 2023-09-22

Family

ID=76461223

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110200064.6A Active CN113035346B (en) 2021-02-22 2021-02-22 Disease category assessment device and method based on medical knowledge graph

Country Status (1)

Country Link
CN (1) CN113035346B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106295229A (en) * 2016-08-30 2017-01-04 青岛大学 A kind of mucocutaneous lymphnode syndrome grade predicting method based on medical data modeling
US20170071496A1 (en) * 2014-03-10 2017-03-16 H. Lee Moffitt Cancer Center And Research Institute, Inc. Radiologically identified tumor habitats
CN109346169A (en) * 2018-10-17 2019-02-15 长沙瀚云信息科技有限公司 A kind of artificial intelligence assisting in diagnosis and treatment system and its construction method, equipment and storage medium
CN110391021A (en) * 2019-07-04 2019-10-29 北京爱医生智慧医疗科技有限公司 A kind of disease inference system based on medical knowledge map
CN110911009A (en) * 2019-11-14 2020-03-24 南京医科大学 Clinical diagnosis aid decision-making system and medical knowledge map accumulation method
CN111657925A (en) * 2020-07-08 2020-09-15 中国科学院苏州生物医学工程技术研究所 Electrocardiosignal classification method, system, terminal and storage medium based on machine learning

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
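JUNXIAN CAI et al.: "Multi-ECGNet for ECG Arrythmia Multi-Label Classification", IEEE ACCESS, vol. 8, pages 110848 - 110858, XP011795222, DOI: 10.1109/ACCESS.2020.3001284 *
Lu Jiafa et al.: "Disease diagnosis based on deep learning", Journal of Medical Informatics, vol. 38, no. 04, pages 39 - 43 *
Ma Caiyun: "Research and application of an atrial fibrillation diagnosis model based on machine learning", China Master's Theses Full-text Database, Medicine and Health Sciences, no. 02, pages 062 - 461 *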

Also Published As

Publication number Publication date
CN113035346B (en) 2023-09-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant