CN113035346B - Disease category assessment device and method based on medical knowledge graph - Google Patents

Info

Publication number
CN113035346B
Authority
CN
China
Prior art keywords
disease
dimensional
module
data
knowledge graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110200064.6A
Other languages
Chinese (zh)
Other versions
CN113035346A (en
Inventor
殷波
焦立博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Information Science and Technology University
Original Assignee
Beijing Information Science and Technology University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Information Science and Technology University filed Critical Beijing Information Science and Technology University
Priority to CN202110200064.6A priority Critical patent/CN113035346B/en
Publication of CN113035346A publication Critical patent/CN113035346A/en
Application granted granted Critical
Publication of CN113035346B publication Critical patent/CN113035346B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H40/00ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices
    • G16H40/60ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices for the operation of medical equipment or devices
    • G16H40/67ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices for the operation of medical equipment or devices for remote operation
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Biomedical Technology (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • Primary Health Care (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Animal Behavior & Ethology (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Business, Economics & Management (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the invention provides a disease category assessment device and method based on a medical knowledge graph. The device comprises: an acquisition module for acquiring N-dimensional pathology features; and a determining module for inputting the N-dimensional pathology features into a disease category evaluation model based on a medical knowledge graph to obtain the disease category corresponding to the N-dimensional pathology features. The disease category evaluation model based on the medical knowledge graph is obtained by training with a machine learning algorithm, using N-dimensional disease feature samples as input data and the disease categories corresponding to those samples as output data. By means of big data and artificial intelligence technology, the embodiment can compensate for the current shortage of high-quality online medical resources and provide effective support for remote consultation.

Description

Disease category assessment device and method based on medical knowledge graph
Technical Field
The invention relates to the technical field of computers, in particular to a disease category assessment device and method based on a medical knowledge graph.
Background
At present, driven by mobile internet technology and mobile intelligent terminals, medical services are moving online quickly. The total number of mobile users of the three operators in China has reached 1.598 billion; the number of mobile medical users in China exceeded 100 million in 2015 and reached 380 million in 2017. According to the research report on the market prospects and investment opportunities of the mobile medical industry in 2018-2022 published by the China industry research institute, the arrival of the paid-knowledge era and the release of medical e-commerce policies have caused the internet medical market to grow rapidly: it reached 23.14 billion yuan in 2017 and was expected to exceed 100 billion yuan in 2020. With the popularization of mobile intelligent terminals, mobile medical software of all kinds keeps emerging and provides more convenient medical information and diagnosis and treatment services for hospitals and the general public. Medical information inquiry and online appointment registration have the highest usage rates, at 10.8% and 10.4% respectively, followed by online consultation, online purchase of medicines, medical equipment and health products, and exercise and fitness management, each of which is used by about 6% of internet users.
After the release of the Opinions on Promoting the Development of "Internet + Medical Health", the major medical institutions in China successively established internet hospitals, and internet medical institutions emerged in large numbers. Internet medical treatment now covers many clinical disciplines, including internal medicine, surgery, gynecology, pediatrics, rehabilitation, nursing, monitoring, imaging, stomatology, ear-nose-throat, psychiatry, dermatology, psychological education and medical education. So far, however, development has centered on services such as appointment registration and light consultation, where some important progress has been made. The internet medical treatment developed in China is basically concentrated in the optimization of medical service workflows and in light consultation, and has almost no strategy to offer when sudden and serious public health events occur. With the application of frontier technologies such as big data and artificial intelligence, internet medical treatment plays an increasingly large role in the medical system and is also expected to play a key role in coping with large-scale sudden epidemics.
Disclosure of Invention
Aiming at the problems existing in the prior art, the embodiment of the invention provides a disease category assessment device and method based on a medical knowledge graph.
In a first aspect, an embodiment of the present invention provides a disease category evaluation device based on a medical knowledge graph, including:
the acquisition module is used for acquiring the N-dimensional pathology characteristics;
the determining module is used for inputting the N-dimensional pathology features into a disease category evaluation model based on a medical knowledge graph to obtain the disease category corresponding to the N-dimensional pathology features; the disease category evaluation model based on the medical knowledge graph is obtained by training with a machine learning algorithm, using an N-dimensional disease feature sample as input data and the disease category corresponding to the N-dimensional disease feature sample as output data.
Further, the disease category evaluation model based on the medical knowledge graph specifically comprises:
the data labeling module is used for extracting N-dimensional pathology features of K categories of disease samples;
the model training module is used for constructing a classification model and performing model training, based on the N-dimensional pathology features of the K categories of disease samples, by adopting a pairwise multi-class support vector machine algorithm with a Gaussian kernel function;
the knowledge graph retrieval module is used for carrying out knowledge retrieval through the key field based on a preset medical knowledge graph and determining N-dimensional pathology features corresponding to the key field;
and the evaluation module is used for inputting the N-dimensional pathology features corresponding to the key fields into the classification model for evaluation and determining the disease category corresponding to the N-dimensional pathology feature sample.
Further, the medical knowledge graph is stored in a graph database;
correspondingly, the knowledge graph retrieval module is specifically configured to:
carrying out knowledge retrieval through key fields based on a graph database;
the graph database returns nodes and attributes associated with the key fields;
an N-dimensional pathology feature corresponding to the key field is determined based on the nodes and attributes associated with the key field.
Further, the apparatus further comprises:
the data query module is used for randomly extracting data from the medical knowledge database and importing the extracted data into the labeling platform;
the prior labeling module is used for labeling the imported data based on labeling rules preset by the labeling platform;
and the data storage module is used for storing the labeled data into the labeled database.
Further, the evaluation module is specifically configured to:
inputting the N-dimensional pathology features corresponding to the key fields into the classification model for evaluation, according to the pathology feature record table recorded during training for each independent model, and making a voting decision over the prediction results of the independent models to determine an evaluation value corresponding to the N-dimensional pathology feature sample;
and determining the category of the disease based on the evaluation value.
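The voting decision described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class names, the three toy pairwise models (simple threshold rules standing in for trained binary support vector machines) and the tie-breaking rule are hypothetical, and the per-model pathology feature record table is omitted.

```python
from collections import Counter
from itertools import combinations

def vote(classes, pairwise_models, x):
    """Let every pairwise model vote for one of its two classes; return the winner and the tally."""
    votes = Counter({c: 0 for c in classes})
    for i, j in combinations(classes, 2):
        winner = pairwise_models[(i, j)](x)  # each model returns either i or j
        votes[winner] += 1
    # Ties are broken by the order of `classes` (an arbitrary illustrative choice).
    best = max(classes, key=lambda c: votes[c])
    return best, dict(votes)

# Hypothetical toy setup: x = [fever, vomiting, headache] as 0/1 flags.
CLASSES = ["cold", "gastroenteritis", "migraine"]
MODELS = {
    ("cold", "gastroenteritis"): lambda x: "gastroenteritis" if x[1] else "cold",
    ("cold", "migraine"): lambda x: "cold" if x[0] else "migraine",
    ("gastroenteritis", "migraine"): lambda x: "gastroenteritis" if x[1] else "migraine",
}

label, tally = vote(CLASSES, MODELS, [0, 1, 0])
print(label, tally)
```

Here the vote tally plays the role of the evaluation value, and the class with the most votes is reported as the disease category.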
Further, the disease category evaluation model based on the medical knowledge graph further comprises:
the data acquisition module is used for acquiring medical data based on the medical knowledge database;
and the data cleaning module is used for cleaning the data of the medical data acquired by the medical knowledge database.
Further, the device is also used for:
constructing a medical knowledge graph based on the knowledge extraction module, the knowledge fusion module and the knowledge storage module; the knowledge extraction module is used for extracting association relations between N-dimensional pathology feature data of K categories of disease samples; the knowledge fusion module is used for extracting similarity and difference of different relations based on the association relation between the N-dimensional pathology feature data of the K categories of disease samples, and combining or distinguishing based on the similarity and the difference of the different relations; the knowledge storage module is used for storing the data in the knowledge extraction module and the knowledge fusion module.
In a second aspect, an embodiment of the present invention provides a disease category evaluation method based on a medical knowledge graph, including:
acquiring N-dimensional pathology features;
inputting the N-dimensional pathology features into a disease category evaluation model based on a medical knowledge graph to obtain the disease category corresponding to the N-dimensional pathology features; the disease category evaluation model based on the medical knowledge graph is obtained by training with a machine learning algorithm, using an N-dimensional disease feature sample as input data and the disease category corresponding to the N-dimensional disease feature sample as output data.
Further, the disease category evaluation model based on the medical knowledge graph comprises:
extracting N-dimensional pathology features of K categories of disease samples;
based on the N-dimensional pathology features of the K categories of disease samples, constructing a classification model and performing model training by adopting a pairwise multi-class support vector machine algorithm with a Gaussian kernel function;
carrying out knowledge retrieval through a key field based on a preset medical knowledge graph, and determining N-dimensional pathology features corresponding to the key field;
and inputting the N-dimensional pathology features corresponding to the key fields into the classification model for evaluation, and determining the type of the disease corresponding to the N-dimensional pathology feature sample.
Further, the medical knowledge graph is stored in a graph database;
correspondingly, knowledge retrieval is carried out through key fields based on a preset medical knowledge graph, and N-dimensional pathology features corresponding to the key fields are determined, specifically comprising:
carrying out knowledge retrieval through key fields based on a graph database;
the graph database returns nodes and attributes associated with the key fields;
an N-dimensional pathology feature corresponding to the key field is determined based on the nodes and attributes associated with the key field.
In a third aspect, an embodiment of the present invention further provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the disease category assessment method based on a medical knowledge graph according to the second aspect above when the processor executes the program.
In a fourth aspect, embodiments of the present invention also provide a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the medical knowledge graph based disease category assessment method according to the above second aspect.
As can be seen from the above technical solutions, the disease category assessment device and method based on a medical knowledge graph provided by the embodiments of the present invention obtain N-dimensional pathology features through an acquisition module; the determining module inputs the N-dimensional pathology features into a disease category evaluation model based on a medical knowledge graph to obtain the disease category corresponding to the N-dimensional pathology features; the disease category evaluation model based on the medical knowledge graph is obtained by training with a machine learning algorithm, using an N-dimensional disease feature sample as input data and the disease category corresponding to the N-dimensional disease feature sample as output data. By means of big data and artificial intelligence technology, the embodiment can compensate for the current shortage of high-quality online medical resources and provide effective support for remote consultation.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is apparent that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic structural diagram of a disease category evaluation device based on a medical knowledge graph according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of a disease category evaluation device based on a medical knowledge graph according to another embodiment of the present invention;
fig. 3 is a flow chart of a disease category evaluation method based on a medical knowledge graph according to an embodiment of the invention;
fig. 4 is a schematic physical structure of an electronic device according to an embodiment of the invention.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments are described clearly below with reference to the accompanying drawings. The described embodiments are some, but not all, embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without inventive effort fall within the scope of the present invention. The disease category evaluation device based on the medical knowledge graph provided by the invention is explained and illustrated in detail below through specific examples.
Fig. 1 is a schematic structural diagram of a disease category evaluation device based on a medical knowledge graph according to an embodiment of the present invention. As shown in fig. 1, the device includes an acquisition module 201 and a determining module 202, wherein:
the acquisition module 201 is configured to acquire N-dimensional pathology features;
the determining module 202 is configured to input the N-dimensional pathology features into a disease category evaluation model based on a medical knowledge graph, to obtain the disease category corresponding to the N-dimensional pathology features; the disease category evaluation model based on the medical knowledge graph is obtained by training with a machine learning algorithm, using an N-dimensional disease feature sample as input data and the disease category corresponding to the N-dimensional disease feature sample as output data.
In this embodiment, it should be noted that the N-dimensional pathology features are N-dimensional symptoms, such as dizziness, headache, fever, dry eyes, red and swollen eyes, blurred vision, tearing, nasal congestion, nasal swelling, stomach pain, chest tightness, soreness, and the like.
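One way to picture an N-dimensional symptom feature is as a fixed-order 0/1 vector over a symptom vocabulary. The sketch below is only an illustration under that assumption; the vocabulary `SYMPTOMS`, the binary encoding and the function name `encode_symptoms` are hypothetical and are not specified by the source.

```python
# Hypothetical symptom vocabulary; its order fixes the meaning of each dimension.
SYMPTOMS = ["dizziness", "headache", "fever", "dry eyes",
            "nasal congestion", "stomach pain", "chest tightness"]

def encode_symptoms(reported, vocabulary=SYMPTOMS):
    """Return an N-dimensional 0/1 vector with one entry per vocabulary symptom."""
    reported_set = {s.strip().lower() for s in reported}
    unknown = reported_set - set(vocabulary)
    if unknown:
        # Refuse symptoms outside the vocabulary instead of silently dropping them.
        raise ValueError(f"symptoms not in vocabulary: {sorted(unknown)}")
    return [1 if s in reported_set else 0 for s in vocabulary]

features = encode_symptoms(["Headache", "fever"])
print(features)
```

Graded or numeric features (for example temperature or symptom duration) would fit the same vector layout; the binary encoding is used here only to keep the sketch short.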
In this embodiment, it should be noted that the disease category evaluation model based on the medical knowledge graph is constructed with the help of medical knowledge graphs and clinical knowledge bases. Examples of medical knowledge graphs include the traditional Chinese medicine knowledge graph, the Chinese medical knowledge graph CMeKG 2.0 (constructed by the intelligent healthcare group of the artificial intelligence research center of Peng Cheng Laboratory and the natural language processing laboratory of Zhengzhou University), biomedical knowledge graphs, and the like. Examples of clinical knowledge bases include the clinical knowledge base query system developed by fast apricot (just for health), the clinical diagnosis and treatment knowledge base V3.2, the personal and health clinical knowledge base, and the like. The clinical knowledge base query system is a database query platform that integrates authoritative medical information sources at home and abroad; it collects the information required for clinical medication, such as drug package inserts, clinical guidelines and traditional Chinese medicine prescriptions, and is an effective tool for doctors, pharmacists and other medical and health professionals to obtain medical information.
In this embodiment, the N-dimensional pathology features are input into a disease category evaluation model based on a medical knowledge graph to obtain the disease category corresponding to the N-dimensional pathology features; the disease category evaluation model based on the medical knowledge graph is obtained by training with a machine learning algorithm, using an N-dimensional disease feature sample as input data and the disease category corresponding to the N-dimensional disease feature sample as output data. For example, the three-dimensional symptoms "nausea", "vomiting" and "diarrhea" are input into the disease category evaluation model based on the medical knowledge graph, and the corresponding disease category, gastroenteritis, is obtained. As another example, the five-dimensional pathology features "nausea", "vomiting", "somnolence", "frequent urination" and "urgent urination" are input into the model, and the corresponding category, pregnancy, is obtained. It should be noted that pregnancy is not strictly a disease category; however, because early pregnancy presents such symptoms, the disease category evaluation model trained with a machine learning algorithm is still applicable.
In this embodiment, it should be noted that the disease category evaluation model based on the medical knowledge graph may include one or more of a data labeling module, a model training module, a knowledge graph retrieval module and an evaluation module. The data labeling module extracts multidimensional information from samples of different diseases and establishes a baseline standard that serves as the basic data for model training. The model training module adopts a pairwise multi-class support vector machine algorithm and trains the model with the multidimensional disease features as input and the disease category as output. The knowledge graph retrieval module retrieves from the knowledge graph database, by disease name, keyword, field and the like, the multidimensional features related to that disease name, keyword or field. The evaluation module computes, from the input multidimensional features of the disease and the evaluation model, the disease category to which the disease most probably belongs, and outputs it.
As can be seen from the above technical solutions, the disease category evaluation device based on a medical knowledge graph provided by the embodiments of the present invention obtains N-dimensional pathology features through an acquisition module; the determining module inputs the N-dimensional pathology features into a disease category evaluation model based on a medical knowledge graph to obtain the disease category corresponding to the N-dimensional pathology features; the disease category evaluation model based on the medical knowledge graph is obtained by training with a machine learning algorithm, using an N-dimensional disease feature sample as input data and the disease category corresponding to the N-dimensional disease feature sample as output data. By means of big data and artificial intelligence technology, the embodiment can compensate for the current shortage of high-quality online medical resources and provide effective support for remote consultation. For example, when high-quality medical resources are reallocated to offline patients during an epidemic or a large-scale outbreak of disease, the device can make up for the resulting shortage of high-quality online medical resources and provide intelligent support for remote consultation.
On the basis of the foregoing embodiment, in this embodiment, the disease category evaluation model based on a medical knowledge graph specifically includes:
the data labeling module is used for extracting N-dimensional pathology features of K categories of disease samples;
the model training module is used for constructing a classification model and performing model training, based on the N-dimensional pathology features of the K categories of disease samples, by adopting a pairwise multi-class support vector machine algorithm with a Gaussian kernel function;
the knowledge graph retrieval module is used for carrying out knowledge retrieval through the key field based on a preset medical knowledge graph and determining N-dimensional pathology features corresponding to the key field;
and the evaluation module is used for inputting the N-dimensional pathology features corresponding to the key fields into the classification model for evaluation and determining the disease category corresponding to the N-dimensional pathology feature sample.
In this embodiment, it should be noted that:
The data labeling module is used for extracting multidimensional information from disease samples of different categories and establishing a baseline standard that serves as the basic data for model training.
The model training module extracts the N-dimensional information of sample data for K known disease categories and builds a classification model using a pairwise multi-class support vector machine with a Gaussian kernel function. The result is an evaluation model that takes information such as data features as input and outputs the disease category, and this disease category evaluation model can be trained even with a small amount of data.
The knowledge graph retrieval module is used for performing knowledge retrieval on the constructed medical knowledge graph according to keywords and returning the dimension information that the medical knowledge graph holds for the disease.
The evaluation module is used for inputting the disease-related dimension data returned by the knowledge graph retrieval module into the multi-class support vector machine evaluation model trained by the model training module, and outputting the disease category to which the disease most probably belongs.
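The two ingredients named for the model training module, pairwise classification and a Gaussian kernel, can be sketched as follows. This is a minimal illustration rather than the training procedure itself: the function names, the bandwidth parameter `sigma` and the `exp(-||x - z||^2 / (2 sigma^2))` parameterization are assumptions, since the source does not state which form of the kernel is used.

```python
import math
from itertools import combinations

def gaussian_kernel(x, z, sigma=1.0):
    """Gaussian (RBF) kernel value for two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def class_pairs(k):
    """All unordered class pairs (i, j), i < j: one binary classifier is trained per pair."""
    return list(combinations(range(k), 2))

pairs = class_pairs(4)
print(len(pairs), pairs)                    # K = 4 classes need 4 * 3 / 2 = 6 binary models
print(gaussian_kernel([1, 0, 1], [1, 0, 1]))  # identical vectors have kernel value 1.0
```

Each pair would be handed only the samples of its two classes, which is one reason pairwise schemes remain workable when each class has few samples.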
In this embodiment, it should be noted that the multi-class support vector machine is a generalized linear classifier that classifies data into multiple classes by supervised learning; its decision boundary is the maximum-margin hyperplane solved for the training samples.
The model building steps of the paired multi-class support vector machine voting model are specifically described herein.
First, a sample set of known diseases is input. Let the number of samples be M, the feature dimension of each disease sample be N, and the number of known diseases be K, i.e. the sample set is $\{(x_m, y_m)\}_{m=1}^{M}$ with $x_m \in \mathbb{R}^N$ and $y_m \in \{1, \dots, K\}$.
Because the support vector machine is a binary classification method and this embodiment requires classifying the sample set into K classes (i.e., K diseases), a pairwise multi-class support vector machine is employed. A binary support vector machine is constructed between every two classes, with $d_{ij}$ denoting the decision boundary of the binary support vector machine between the i-th disease class and the j-th disease class; this embodiment uses these $K(K-1)/2$ decision boundaries to divide the training set into K classes.
Since the disease sample set is not guaranteed to be linearly separable, this embodiment uses a transformation function $\phi(\cdot)$ to map the N-dimensional features into a high-dimensional space in which the samples are linearly separable. This transformation is realized with the Gaussian kernel.
Therefore, in the high-dimensional linearly separable space, the task is converted into the optimization problem of the support vector machine.
By solving this problem with the known disease sample set as the training set, this embodiment divides the samples into K classes; for new data of an unknown disease, the disease class to which it belongs can be inferred simply by feeding it into the model.
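The kernel and the optimization problem are not reproduced in this text, so the following is only a standard textbook form given under stated assumptions, not necessarily the exact formulation of the embodiment. It assumes samples $x_m \in \mathbb{R}^N$ with labels $y_m \in \{1,\dots,K\}$, a bandwidth $\sigma$, a penalty parameter $C$, slack variables $\xi_m$ and pair labels $t_m$, all of which are introduced here for illustration.

```latex
% Gaussian kernel induced by the mapping \phi (bandwidth \sigma is an assumed parameter)
\kappa(x, x') = \langle \phi(x), \phi(x') \rangle
              = \exp\!\left( -\frac{\lVert x - x' \rVert^{2}}{2\sigma^{2}} \right)

% Soft-margin problem for the pair of classes (i, j); the sum runs over samples
% with y_m \in \{i, j\}, and t_m = +1 if y_m = i, t_m = -1 if y_m = j
\min_{w_{ij},\, b_{ij},\, \xi}\;
  \frac{1}{2}\lVert w_{ij} \rVert^{2} + C \sum_{m} \xi_m
\quad \text{s.t.} \quad
  t_m \left( w_{ij}^{\top} \phi(x_m) + b_{ij} \right) \ge 1 - \xi_m,
  \qquad \xi_m \ge 0

% Decision boundary d_{ij} between class i and class j
d_{ij}:\; w_{ij}^{\top} \phi(x) + b_{ij} = 0
```

Under this form, one such problem is solved for each of the $K(K-1)/2$ class pairs, and the mapping $\phi$ never has to be computed explicitly because only kernel values $\kappa(x_m, x_n)$ enter the dual problem.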
The prior art lacks deep mining and analysis of medical knowledge, which makes it difficult to improve the accuracy of evaluating the category to which a disease belongs. To address this problem, the disease category evaluation device based on the medical knowledge graph provided by the embodiment of the invention extracts multidimensional information from disease samples of different categories through the data labeling module and establishes a baseline standard as the basic data for model training; through the model training module, it takes the multidimensional features of diseases as input and trains a classification model on a data set of known diseases of different kinds using a pairwise multi-class support vector machine; through the knowledge graph retrieval module, it retrieves the multidimensional features related to a disease from the knowledge graph database according to the disease name; and through the evaluation module, it computes with the evaluation model, from the input multidimensional features of the disease, the disease category to which the disease most probably belongs and outputs it, thereby improving the accuracy of determining the disease category online.
On the basis of the above embodiment, in this embodiment, the medical knowledge graph is stored in a graph database;
correspondingly, the knowledge graph retrieval module is specifically configured to:
carrying out knowledge retrieval through key fields based on a graph database;
the graph database returns nodes and attributes associated with the key fields;
an N-dimensional pathology feature corresponding to the key field is determined based on the nodes and attributes associated with the key field.
In this embodiment, it should be noted that the knowledge graph retrieval module of the disease category evaluation device based on the medical knowledge graph searches the graph database in which the knowledge graph is stored, using the input disease name keyword. The graph database returns information such as the nodes and attributes associated with the disease node, from which the N-dimensional pathology features corresponding to the key field are determined.
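The retrieval step can be illustrated with a small in-memory stand-in for the graph database. Everything here is hypothetical: the `GRAPH` layout, the relation name `has_symptom`, and the fixed `FEATURE_ORDER` used to turn nodes and attributes into an N-dimensional vector are assumptions, since the embodiment does not specify a schema or a query language.

```python
# Hypothetical stand-in for the graph database: each disease node maps to
# its attributes and to the nodes it is linked to.
GRAPH = {
    "cold": {
        "attributes": {"contagious": 1.0},
        "edges": [("has_symptom", "cough"), ("has_symptom", "fever")],
    },
}

# Hypothetical fixed feature order, so every disease yields an N-dimensional vector.
FEATURE_ORDER = ["cough", "fever", "rash", "contagious"]


def retrieve(key_field):
    """Return the neighbouring nodes and the attributes of the matched node."""
    node = GRAPH.get(key_field)
    if node is None:
        raise KeyError(f"no node for key field {key_field!r}")
    neighbours = [target for _, target in node["edges"]]
    return neighbours, node["attributes"]


def to_feature_vector(key_field):
    """Build the N-dimensional feature vector from nodes and attributes."""
    neighbours, attributes = retrieve(key_field)
    vector = []
    for name in FEATURE_ORDER:
        if name in attributes:
            vector.append(float(attributes[name]))
        else:
            vector.append(1.0 if name in neighbours else 0.0)
    return vector


features = to_feature_vector("cold")
```

In a real deployment the dictionary lookup would be replaced by a query against the graph database; the point of the sketch is only the shape of the data flow from key field to feature vector.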
On the basis of the foregoing embodiment, in this embodiment, the apparatus further includes:
the data query module is used for randomly extracting data from the medical knowledge database and importing the extracted data into the labeling platform;
the prior labeling module is used for labeling the imported data based on labeling rules preset by the labeling platform;
and the data storage module is used for storing the labeled data in the labeled database.
In this embodiment, it should be noted that the disease category evaluation device based on the medical knowledge graph provided by the embodiment of the present invention performs three steps, namely data query, prior labeling, and data storage, through the data query module, the prior labeling module, and the data storage module respectively. In data query, the data to be labeled are randomly extracted from the medical knowledge database and imported into the labeling platform. In prior labeling, medical professionals label the data according to the scoring rules, following the labeling rules of the labeling platform. In data storage, the labeled medical disease data are stored in the labeled disease database through the labeling platform.
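The three steps can be sketched as three small functions. This is a minimal sketch under stated assumptions: the record fields, the temperature threshold standing in for an expert's labeling rule, and the in-memory SQLite table standing in for the labeled database are all hypothetical.

```python
import random
import sqlite3


def query_records(records, sample_size, seed=0):
    """Data query: randomly draw the records to be labeled."""
    rng = random.Random(seed)
    return rng.sample(records, sample_size)


def label_record(record):
    """Prior labeling: hypothetical rule standing in for the expert's judgment."""
    return "fever" if record["temperature"] >= 38.0 else "other"


def store_labeled(conn, labeled):
    """Data storage: persist (id, label) pairs in the labeled database."""
    conn.execute("CREATE TABLE IF NOT EXISTS labeled (id INTEGER, label TEXT)")
    conn.executemany("INSERT INTO labeled VALUES (?, ?)", labeled)
    conn.commit()


# Hypothetical source records with one numeric field.
records = [{"id": i, "temperature": 36.5 + 0.5 * i} for i in range(6)]
conn = sqlite3.connect(":memory:")
sampled = query_records(records, 3)
store_labeled(conn, [(r["id"], label_record(r)) for r in sampled])
stored = conn.execute("SELECT COUNT(*) FROM labeled").fetchone()[0]
```

Fixing the random seed makes the draw reproducible, which is useful when the same sample has to be shown to several annotators.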
On the basis of the foregoing embodiment, in this embodiment, the evaluation module is specifically configured to:
inputting the N-dimensional pathology features corresponding to the key field into the classification model for evaluation, according to the pathology feature record table recorded during training for each independent model, and making a voting decision over the prediction results of the independent models to determine an evaluation value corresponding to the N-dimensional pathology feature sample; and determining the category of the disease based on the evaluation value.
In this embodiment, for example, the evaluation module performs two steps, feature extraction and evaluation. After features are extracted from the disease-related data retrieved by the knowledge graph retrieval module, the extracted feature data are input into the corresponding classifier models of the training module according to the data feature record table recorded during training for each independent model, the evaluation is calculated, and a voting decision is made over the prediction results of all independent models. Finally, a disease evaluation score is output, and the disease category is determined based on that score. For example, if the output evaluation value is 50 points for fever, 60 points for cold, and 70 points for hot cold, the disease category is determined to be hot cold.
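The voting decision and the final selection can be sketched as follows. The pairwise winners below are hypothetical, and the embodiment does not state how votes are converted into the 50/60/70-point evaluation values, so the sketch treats the vote tally and the score table as two separate inputs to the same "pick the highest" rule.

```python
from collections import Counter


def vote(pairwise_predictions):
    """Count one vote per binary model and return the tally."""
    return Counter(pairwise_predictions.values())


def decide(scores):
    """Pick the category with the highest evaluation value."""
    return max(scores, key=scores.get)


# Hypothetical winners of the three pairwise models for one sample.
predictions = {
    ("fever", "cold"): "cold",
    ("fever", "hot cold"): "hot cold",
    ("cold", "hot cold"): "hot cold",
}
tally = vote(predictions)

# Evaluation values from the example in the text.
category = decide({"fever": 50, "cold": 60, "hot cold": 70})
```

With these inputs both the vote tally and the score table select the same category, "hot cold".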
On the basis of the foregoing embodiment, in this embodiment, the disease category evaluation model based on a medical knowledge graph further includes:
the data acquisition module is used for acquiring medical data based on the medical knowledge database;
and the data cleaning module is used for cleaning the medical data acquired from the medical knowledge database.
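The embodiment does not say which cleaning operations are applied. As one plausible illustration, the sketch below drops records with missing required fields, normalizes case and whitespace, and removes duplicates; the field names `name` and `symptoms` are assumptions.

```python
def clean(records, required=("name", "symptoms")):
    """Drop incomplete records, normalize text, and remove duplicates."""
    seen = set()
    cleaned = []
    for record in records:
        # Skip records where a required field is missing or empty.
        if any(not record.get(field) for field in required):
            continue
        name = record["name"].strip().lower()
        symptoms = tuple(sorted(s.strip().lower() for s in record["symptoms"]))
        key = (name, symptoms)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"name": name, "symptoms": list(symptoms)})
    return cleaned


raw = [
    {"name": "Cold ", "symptoms": ["Cough", "fever"]},
    {"name": "cold", "symptoms": ["fever", "cough"]},
    {"name": "", "symptoms": ["rash"]},
]
result = clean(raw)
```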
On the basis of the above embodiment, this embodiment further includes:
constructing a medical knowledge graph based on the knowledge extraction module, the knowledge fusion module and the knowledge storage module; the knowledge extraction module is used for extracting association relations between N-dimensional pathology feature data of K categories of disease samples; the knowledge fusion module is used for extracting similarity and difference of different relations based on the association relation between the N-dimensional pathology feature data of the K categories of disease samples, and combining or distinguishing based on the similarity and the difference of the different relations; the knowledge storage module is used for storing the data in the knowledge extraction module and the knowledge fusion module.
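The merge-or-distinguish behaviour of the knowledge fusion module can be sketched on relation triples. This is a simplified illustration, not the embodiment's algorithm: the triple format and the `ALIASES` table that decides which relation names count as the same relation are assumptions.

```python
# Hypothetical alias table: relation names treated as the same relation.
ALIASES = {"shows": "has_symptom", "presents_with": "has_symptom"}


def fuse(triples):
    """Merge triples whose relations are aliases; keep distinct ones apart."""
    fused = set()
    for head, relation, tail in triples:
        fused.add((head, ALIASES.get(relation, relation), tail))
    return sorted(fused)


# Hypothetical output of the knowledge extraction module.
extracted = [
    ("cold", "shows", "cough"),
    ("cold", "has_symptom", "cough"),
    ("cold", "treated_by", "rest"),
]
graph = fuse(extracted)
```

The first two triples collapse into one because their relations are aliases, while the third stays separate; the fused list is what the knowledge storage module would then persist.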
For a better understanding of the present solution, the present invention is further described below with reference to Fig. 2, but the present invention is not limited to the following examples.
The data acquisition module, the data cleaning module, and the data labeling module form a data preprocessing module; the knowledge extraction module, the knowledge fusion module, and the knowledge storage module form a knowledge graph generation module; and the model training module, the matching module, the searching module, and the evaluation module form an inference module.
Specifically, the data preprocessing module interacts with the user and performs data acquisition, data cleaning, data labeling, and the like; it extracts multidimensional information from disease samples of different categories and establishes a baseline standard as the basic data for model training, in preparation for the subsequent construction of the knowledge graph. The knowledge graph generation module completes knowledge extraction, knowledge fusion, knowledge storage, and the like. The inference module completes model training, feature matching, and feature searching, and reasonably evaluates the results to generate the disease result. The individual modules work as follows.
The data acquisition module acquires user interaction data and other medical data.
The data cleaning module performs basic cleaning of the acquired data.
The data labeling module extracts multidimensional information from disease samples of different categories and establishes a baseline standard as the basic data for model training.
The knowledge extraction module extracts the association relations between the feature data of disease samples of different categories.
The knowledge fusion module extracts the similarities and differences between different relations and merges or distinguishes the data accordingly.
The knowledge storage module stores the processed knowledge.
The model training module extracts N-dimensional information from sample data of K known disease categories and builds the classification model with a pairwise multi-class support vector machine method using a Gaussian kernel function. The result is an evaluation model that takes information such as data features as input and outputs the disease category; this allows the disease confirmation rate evaluation model to be trained on a small amount of data.
The knowledge graph retrieval module performs knowledge retrieval on the constructed medical knowledge graph according to keywords, such as the name of the disease, and returns the multidimensional information on the disease held in the medical knowledge graph.
The evaluation module inputs the disease-related multidimensional data returned by the knowledge graph retrieval module into the multi-class support vector machine evaluation model trained by the training module, and outputs the disease category to which the disease most probably belongs.
Fig. 3 is a flow chart of a disease category evaluation method based on a medical knowledge graph according to an embodiment of the invention. As shown in Fig. 3, the method includes:
Step 101: acquiring N-dimensional pathology features.
Step 102: inputting the N-dimensional pathology features into a disease category evaluation model based on a medical knowledge graph to obtain the disease category corresponding to the N-dimensional pathology features; the disease category evaluation model based on the medical knowledge graph is obtained by training with a machine learning algorithm, using N-dimensional pathology feature samples as input data and the disease categories corresponding to those samples as output data.
On the basis of the foregoing embodiment, in this embodiment, the disease category evaluation model based on a medical knowledge graph includes:
extracting N-dimensional pathology features of K categories of disease samples;
based on the N-dimensional pathology features of the K categories of disease samples, constructing a classification model with a pairwise multi-class support vector machine algorithm using a Gaussian kernel function, and performing model training;
carrying out knowledge retrieval through a key field based on a preset medical knowledge graph, and determining N-dimensional pathology features corresponding to the key field;
and inputting the N-dimensional pathology features corresponding to the key fields into the classification model for evaluation, and determining the type of the disease corresponding to the N-dimensional pathology feature sample.
On the basis of the above embodiment, in this embodiment, the medical knowledge graph is stored in a graph database;
correspondingly, knowledge retrieval is carried out through key fields based on a preset medical knowledge graph, and N-dimensional pathology features corresponding to the key fields are determined, specifically comprising:
carrying out knowledge retrieval through key fields based on a graph database;
the graph database returns nodes and attributes associated with the key fields;
an N-dimensional pathology feature corresponding to the key field is determined based on the nodes and attributes associated with the key field.
The disease category evaluation method based on the medical knowledge graph provided by the embodiment of the invention can be used in particular with the disease category evaluation device based on the medical knowledge graph of the above embodiments; its technical principle and beneficial effects are similar and are not described again here.
Based on the same inventive concept, an embodiment of the present invention provides an electronic device which, referring to Fig. 4, specifically includes: a processor 301, a memory 302, a communication interface 303, and a communication bus 304;
wherein the processor 301, the communication interface 303, and the memory 302 communicate with one another through the communication bus 304; the communication interface 303 is used for transmitting information between related devices such as modeling software and an intelligent manufacturing equipment module library; the processor 301 is configured to invoke a computer program in the memory 302, and when the processor executes the computer program it implements the functions of the apparatus provided in the above apparatus embodiments, for example the following steps: acquiring N-dimensional pathology features; inputting the N-dimensional pathology features into a disease category evaluation model based on a medical knowledge graph to obtain the disease category corresponding to the N-dimensional pathology features; the disease category evaluation model based on the medical knowledge graph is obtained by training with a machine learning algorithm, using N-dimensional pathology feature samples as input data and the disease categories corresponding to those samples as output data.
Based on the same inventive concept, a further embodiment of the present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the functions of the apparatus provided by the apparatus embodiments described above, for example: acquiring N-dimensional pathology features; inputting the N-dimensional pathology features into a disease category evaluation model based on a medical knowledge graph to obtain the disease category corresponding to the N-dimensional pathology features; the disease category evaluation model based on the medical knowledge graph is obtained by training with a machine learning algorithm, using N-dimensional pathology feature samples as input data and the disease categories corresponding to those samples as output data.
The embodiments described above are merely illustrative: units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the solution of the embodiment. Those of ordinary skill in the art can understand and implement the present invention without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus a necessary general hardware platform, or of course by means of hardware. Based on such understanding, the foregoing technical solution, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product, which may be stored in a computer readable storage medium such as a ROM/RAM, a magnetic disk, or an optical disk, and which includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the various embodiments or in some parts of the embodiments.
Furthermore, in the present disclosure, terms such as "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "plurality" means at least two, for example, two, three, etc., unless specifically defined otherwise.
Moreover, in the present invention, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, apparatus, article, or device that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, apparatus, article, or device. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, apparatus, article or device that comprises the element.
Furthermore, in the description herein, reference to the terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, schematic representations of the above terms are not necessarily directed to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples, as well as features of various embodiments or examples, described in this specification may be combined by those skilled in the art without contradiction.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced equivalently; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (9)

1. A disease category assessment device based on a medical knowledge graph, comprising:
the acquisition module is used for acquiring the N-dimensional pathology characteristics;
the determining module is used for inputting the N-dimensional pathology features into a disease category evaluation model based on a medical knowledge graph to obtain the disease category corresponding to the N-dimensional pathology features; the disease category evaluation model based on the medical knowledge graph adopts an N-dimensional disease feature sample as input data and a disease category corresponding to the N-dimensional disease feature sample as output data, wherein the output data is obtained based on machine learning algorithm training;
wherein, the disease category assessment model based on medical knowledge graph comprises:
the data labeling module is used for extracting N-dimensional pathology features of K categories of disease samples;
the model training module is used for constructing a classification model by adopting a multi-classification support vector machine algorithm of the paired classification of the Gaussian kernel function based on the N-dimensional pathology characteristics of the K classes of disease samples to carry out model training;
the knowledge graph retrieval module is used for carrying out knowledge retrieval through the key field based on a preset medical knowledge graph and determining N-dimensional pathology features corresponding to the key field;
the evaluation module is used for inputting the N-dimensional pathology features corresponding to the key fields into the classification model for evaluation and determining the disease category corresponding to the N-dimensional pathology feature sample;
the multi-classification support vector machine algorithm adopting the Gaussian kernel function for paired classification constructs a classification model for model training, and the method comprises the following steps:
obtaining a sample set of known diseases $X = \{x_i\}$, where $i = 1, \dots, M$, the number of samples in $X$ is $M$, the characteristic dimension of the disease is $N$, and the sample set of known diseases corresponds to $K$ diseases;
mapping the N-dimensional features into a high-dimensional linearly separable space based on the Gaussian kernel function of the linear transformation;
constructing a binary support vector machine between every two diseases, wherein $d_{ij}$ represents the binary support vector machine decision boundary between the i-th disease class and the j-th disease class;
in the high-dimensional linearly separable space, dividing the sample set of known diseases into K classes based on the decision boundaries $d_{ij}$.
2. The medical knowledge-graph-based disease category assessment device of claim 1, wherein the medical knowledge graph stores a graph database;
correspondingly, the knowledge graph retrieval module is specifically configured to:
carrying out knowledge retrieval through key fields based on a graph database;
the graph database returns nodes and attributes associated with the key fields;
an N-dimensional pathology feature corresponding to the key field is determined based on the nodes and attributes associated with the key field.
3. The medical knowledge-graph-based disease category assessment device of claim 1, further comprising:
the data query module is used for randomly extracting data based on the medical knowledge database and importing the extracted data into the labeling platform;
the priori marking module is used for marking the imported data based on marking rules preset by the marking platform;
and the data storage module is used for storing the marked data into the marked database.
4. The medical knowledge graph based disease category assessment device of claim 1, wherein the assessment module is specifically configured to:
inputting the N-dimensional pathology features corresponding to the key fields into the classification model according to a pathology feature record table corresponding to each type of independent model recorded in the training process, performing evaluation calculation, performing voting decision on the prediction result of each type of independent model, and determining an evaluation value corresponding to the N-dimensional pathology feature sample;
and determining the category of the disease based on the evaluation value.
5. The medical knowledge-based disease category assessment device of claim 1, wherein the medical knowledge-based disease category assessment model further comprises:
the data acquisition module is used for acquiring medical data based on the medical knowledge database;
and the data cleaning module is used for cleaning the data of the medical data acquired by the medical knowledge database.
6. The medical knowledge-graph-based disease category assessment device of claim 1, further comprising:
constructing a medical knowledge graph based on the knowledge extraction module, the knowledge fusion module and the knowledge storage module; the knowledge extraction module is used for extracting association relations between N-dimensional pathology feature data of K categories of disease samples; the knowledge fusion module is used for extracting similarity and difference of different relations based on the association relation between the N-dimensional pathology feature data of the K categories of disease samples, and combining or distinguishing based on the similarity and the difference of the different relations; the knowledge storage module is used for storing the data in the knowledge extraction module and the knowledge fusion module.
7. A disease category assessment method based on medical knowledge graph, comprising:
acquiring N-dimensional pathology features;
inputting the N-dimensional pathology features into a disease category evaluation model based on a medical knowledge graph to obtain the disease category corresponding to the N-dimensional pathology features; the disease category evaluation model based on the medical knowledge graph adopts an N-dimensional disease feature sample as input data and a disease category corresponding to the N-dimensional disease feature sample as output data, wherein the output data is obtained based on machine learning algorithm training;
wherein, the disease category assessment model based on medical knowledge graph comprises:
the data labeling module is used for extracting N-dimensional pathology features of K categories of disease samples;
the model training module is used for constructing a classification model by adopting a multi-classification support vector machine algorithm of the paired classification of the Gaussian kernel function based on the N-dimensional pathology characteristics of the K classes of disease samples to carry out model training;
the knowledge graph retrieval module is used for carrying out knowledge retrieval through the key field based on a preset medical knowledge graph and determining N-dimensional pathology features corresponding to the key field;
the evaluation module is used for inputting the N-dimensional pathology features corresponding to the key fields into the classification model for evaluation and determining the disease category corresponding to the N-dimensional pathology feature sample;
the multi-classification support vector machine algorithm adopting the Gaussian kernel function for paired classification constructs a classification model for model training, and the method comprises the following steps:
obtaining a sample set of known diseases $X = \{x_i\}$, where $i = 1, \dots, M$, the number of samples in $X$ is $M$, the characteristic dimension of the disease is $N$, and the sample set of known diseases corresponds to $K$ diseases;
mapping the N-dimensional features into a high-dimensional linearly separable space based on the Gaussian kernel function of the linear transformation;
constructing a binary support vector machine between every two diseases, wherein $d_{ij}$ represents the binary support vector machine decision boundary between the i-th disease class and the j-th disease class;
in the high-dimensional linearly separable space, dividing the sample set of known diseases into K classes based on the decision boundaries $d_{ij}$.
8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the medical knowledge graph based disease category assessment method of claim 7 when the program is executed by the processor.
9. A non-transitory computer readable storage medium having stored thereon a computer program, which when executed by a processor, implements the medical knowledge graph based disease category assessment method of claim 7.
CN202110200064.6A 2021-02-22 2021-02-22 Disease category assessment device and method based on medical knowledge graph Active CN113035346B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110200064.6A CN113035346B (en) 2021-02-22 2021-02-22 Disease category assessment device and method based on medical knowledge graph

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110200064.6A CN113035346B (en) 2021-02-22 2021-02-22 Disease category assessment device and method based on medical knowledge graph

Publications (2)

Publication Number Publication Date
CN113035346A CN113035346A (en) 2021-06-25
CN113035346B true CN113035346B (en) 2023-09-22

Family

ID=76461223

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110200064.6A Active CN113035346B (en) 2021-02-22 2021-02-22 Disease category assessment device and method based on medical knowledge graph

Country Status (1)

Country Link
CN (1) CN113035346B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106295229A (en) * 2016-08-30 2017-01-04 青岛大学 A kind of mucocutaneous lymphnode syndrome grade predicting method based on medical data modeling
CN109346169A (en) * 2018-10-17 2019-02-15 长沙瀚云信息科技有限公司 A kind of artificial intelligence assisting in diagnosis and treatment system and its construction method, equipment and storage medium
CN110391021A (en) * 2019-07-04 2019-10-29 北京爱医生智慧医疗科技有限公司 A kind of disease inference system based on medical knowledge map
CN110911009A (en) * 2019-11-14 2020-03-24 南京医科大学 Clinical diagnosis aid decision-making system and medical knowledge map accumulation method
CN111657925A (en) * 2020-07-08 2020-09-15 中国科学院苏州生物医学工程技术研究所 Electrocardiosignal classification method, system, terminal and storage medium based on machine learning

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015138385A1 (en) * 2014-03-10 2015-09-17 H. Lee Moffitt Cancer Center And Research Institute, Inc. Radiologically identifed tumor habitats

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
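Multi-ECGNet for ECG Arrythmia Multi-Label Classification; Junxian Cai et al.; IEEE Access; Vol. 8; 110848-110858 *
Research and Application of an Atrial Fibrillation Diagnosis Model Based on Machine Learning; Ma Caiyun; China Master's Theses Full-text Database, Medicine and Health Sciences (No. 02); E062-461 *
Disease Diagnosis Based on Deep Learning; Lu Jiafa et al.; Journal of Medical Informatics; Vol. 38 (No. 04); 39-43 *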

Also Published As

Publication number Publication date
CN113035346A (en) 2021-06-25

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant