CN115083599A - Knowledge graph-based preliminary diagnosis and treatment method for disease state - Google Patents

Publication number
CN115083599A
Authority
CN
China
Prior art keywords
disease
entity
medical record
symptom
data set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210823192.0A
Other languages
Chinese (zh)
Inventor
刘鹏
张真
高中强
左成婷
张堃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Innovative Data Technologies Inc
Original Assignee
Nanjing Innovative Data Technologies Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Innovative Data Technologies Inc filed Critical Nanjing Innovative Data Technologies Inc
Priority to CN202210823192.0A priority Critical patent/CN115083599A/en
Publication of CN115083599A publication Critical patent/CN115083599A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/374 Thesaurus
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/242 Dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H70/00 ICT specially adapted for the handling or processing of medical references
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Artificial Intelligence (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Pathology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Medical Treatment And Welfare Office Work (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

The invention discloses a knowledge graph-based preliminary diagnosis and treatment method for a disease state, comprising the following steps: S1, collecting electronic medical records to construct a medical record data set, performing word segmentation, and identifying aliases of diseases or symptoms; S2, labeling the entities and entity relations in the processed medical record data set using a named entity recognition model and an entity relation extraction model, respectively; S3, constructing an entity and relation labeling data set in RDF format, and using it to construct a disease diagnosis knowledge graph; S4, performing a preliminary diagnosis of the disease state based on symptom purity and disease information entropy, and obtaining a recommended preliminary treatment plan by combining the disease diagnosis knowledge graph with a collaborative recommendation algorithm. The method not only provides a preliminary diagnosis of the disease state but also recommends a preliminary treatment plan for the patient, and can effectively improve the accuracy of the recommended treatment plan.

Description

Knowledge graph-based preliminary diagnosis and treatment method for disease state
Technical Field
The invention relates to the technical field of medical knowledge graphs, and in particular to a knowledge graph-based preliminary diagnosis and treatment method for a disease state.
Background
The knowledge graph is a semantic web technology and has become a research focus in current search engine development. A knowledge graph is intended to characterize the entities and concepts of the real world and the associations between them. It extracts knowledge from Internet text and organizes it as a relation network in graph form, giving researchers a relational view for analyzing problems.
As a big data technology, the knowledge graph is visual and makes the relationships between entities easy to analyze. It expresses the mass of information on the Internet in a form closer to human cognition, provides a better way to organize, manage and understand information, offers intuitive, quantitative knowledge discovery, and is well suited to research and medical applications. Knowledge graph technology is currently applied to intelligent semantic search, knowledge question answering, and data analysis and decision support, among other areas.
An Electronic Medical Record (EMR) is a medical record that medical staff generate with a medical institution's information system during medical activities and that can be stored, managed, transmitted and reproduced; it consists of digitized information such as text, symbols, charts, graphs, data and images. Hospitals accumulate large numbers of medical records as they use electronic medical records. How to process the massive medical data collected by large hospitals efficiently is a problem of serious concern to every enterprise in the medical and health industry. The invention therefore provides a knowledge graph-based preliminary diagnosis and treatment method for a disease state.
Disclosure of Invention
In view of the problems in the related art, the invention provides a knowledge graph-based preliminary diagnosis and treatment method for a disease state, so as to overcome the technical problems of the existing related art.
To this end, the invention adopts the following technical solution:
A knowledge graph-based preliminary diagnosis and treatment method for a disease state, comprising the following steps:
S1, collecting electronic medical records to construct a medical record data set, performing word segmentation on the fields in the medical record data set using a dictionary-based word segmentation algorithm, and identifying aliases of diseases or symptoms;
S2, labeling the entities and entity relations in the processed medical record data set using a named entity recognition model and an entity relation extraction model, respectively;
S3, constructing an entity and relation labeling data set in RDF format, and using it to construct a disease diagnosis knowledge graph;
S4, performing a preliminary diagnosis of the disease state based on symptom purity and disease information entropy, and obtaining a recommended preliminary treatment plan by combining the disease diagnosis knowledge graph with a collaborative recommendation algorithm.
Further, in S1, collecting the electronic medical records to construct the medical record data set and performing word segmentation on its fields using the dictionary-based word segmentation algorithm further comprises the following step:
cleaning and preprocessing the medical record data in the medical record data set with a data processing module, including medical record word segmentation and the removal of null data, invalid data, words without emotional significance, and duplicate data.
Further, performing word segmentation on the fields in the medical record data set using the dictionary-based word segmentation algorithm comprises the following step:
matching the words in a pre-established word segmentation dictionary against the fields in the medical record data set one by one according to a preset strategy, identifying the dictionary words contained in each field, and returning the identified words as useful information.
Further, in S2, labeling the entities and entity relations in the processed medical record data set using the named entity recognition model and the entity relation extraction model, respectively, comprises the following steps:
S21, dividing the segmented medical record data set into a manual annotation data set and an automatic annotation data set according to a preset ratio;
S22, having medical experts use their professional knowledge to label the entities and entity relations in the manual annotation data set;
S23, inputting the labeled medical record data of the manual annotation data set into a pre-constructed named entity recognition model and entity relation extraction model for training;
S24, inputting the medical record data of the automatic annotation data set one by one into the trained named entity recognition model and entity relation extraction model to recognize and automatically label the entities and entity relations.
Further, the entities comprise basic entity information on symptoms, diseases, body parts, medicines, departments and population groups, and the entity relations comprise the body part-symptom, body part-disease, symptom-disease, disease-department, medicine-disease, medicine-symptom and medicine-population relations.
Further, the disease diagnosis knowledge graph consists of 6 entity types and 7 entity relation types and is represented by a directed graph $G = (V, E)$, where $V = \{v_1, v_2, \ldots, v_n\}$ is the set of vertices representing the different entities and $E$ is the set of edges representing the different types of relations between entities.
Further, in S4, performing the preliminary diagnosis of the disease state based on symptom purity and disease information entropy and obtaining the recommended preliminary treatment plan by combining the disease diagnosis knowledge graph with the collaborative recommendation algorithm comprises the following steps:
S41, calculating the purity P of each symptom in the knowledge graph and the disease information entropy S of the diseases associated with that symptom, and deriving the disease state from the purity P and the disease information entropy S;
S42, obtaining the recommended preliminary treatment plan by combining the disease diagnosis knowledge graph with the collaborative recommendation algorithm.
Further, in S41, calculating the purity P of each symptom in the knowledge graph and the disease information entropy S of the diseases associated with that symptom, and deriving the disease state from the purity P and the disease information entropy S, comprises the following steps:
S411, calculating the purity P of each symptom in the knowledge graph, selecting the symptom with the highest purity, and calculating the disease information entropy S of the diseases associated with that symptom, where the purity P is calculated as:
Figure BDA0003742065870000031
the calculation formula of the disease information entropy S is as follows:
Figure BDA0003742065870000032
where $N$ is the number of diseases, $V_i$ is the quantitative relation value of a disease associated with the symptom or symptom combination, $N^2$ is the square of the number of diseases associated with the symptom or symptom combination, $V_i'$ is the quantitative relation value between the symptom or symptom combination and the disease, and $V_i''$ denotes all quantitative relation values of the symptoms associated with the disease;
S412, determining whether the disease information entropy S exceeds a preset threshold or whether the current symptom is the last one; if so, storing the disease with the largest disease information entropy S and deleting it from the original disease list; repeating these steps for N iterations to obtain N diseases; and finally calculating, for each of these diseases, the disease information entropy over all input symptoms, ranking the diseases accordingly, and deriving the preliminary disease state from the ranking.
Further, in S42, obtaining the recommended preliminary treatment plan by combining the disease diagnosis knowledge graph with the collaborative recommendation algorithm comprises the following steps:
S421, acquiring the patient's disease state information and analyzing it with the disease diagnosis knowledge graph to obtain a first recommended treatment plan;
S422, using the collaborative recommendation algorithm to recommend a treatment plan from cases similar to the patient's disease state information, to obtain a second recommended treatment plan;
S423, combining the first and second recommended treatment plans to obtain the recommended preliminary treatment plan.
Further, in S422, using the collaborative recommendation algorithm to obtain the second recommended treatment plan comprises the following steps:
S4221, acquiring the patient's symptom information data and calculating the similarity between the records in the medical record data set and the patient's symptom information data, where the similarity is calculated as:
Figure BDA0003742065870000041
where $sim(u_1, u_2)$ is the similarity between patients $u_1$ and $u_2$, $S$ is the total number of symptoms, and
Figure BDA0003742065870000042
equals 1 when the two answers are the same and 0 when they differ in the single-choice case; in the multiple-choice case:
Figure BDA0003742065870000043
where $S_1$ is the number of multiple-choice options;
S4222, identifying similar patients according to the above formula, and selecting and ranking the several most similar patients to obtain a case group;
S4223, calculating, for the patient's current condition, the recommendation value of each treatment plan used by the cases in the case group, and recommending a treatment plan to the patient according to the recommendation values to obtain the second recommended treatment plan, where the recommendation value is calculated as:
Figure BDA0003742065870000044
where $p$ is the recommendation value, $sim(u, u_i)$ is the similarity between patient $u$ and patient $u_i$, $r_i$ is the average estimate for patient symptom $i$ over the case group,
Figure BDA0003742065870000051
is the average estimate of the cases for patient symptom $i$, and $n$ is the total number of business services.
The beneficial effects of the invention are as follows. The electronic medical record data set is segmented with a dictionary-based word segmentation algorithm, and its entities and entity relations are labeled with a named entity recognition model and an entity relation extraction model, so that a disease diagnosis knowledge graph can be constructed from the entity and relation labeling data set. On this basis, a preliminary diagnosis of the disease state can be made from symptom purity and disease information entropy, and a preliminary treatment plan can be recommended to the patient by combining the disease diagnosis knowledge graph with a collaborative recommendation algorithm. Combining the two in this way effectively improves the accuracy of the recommended treatment plan.
Drawings
To illustrate the embodiments of the invention or the technical solutions of the prior art more clearly, the drawings needed in the embodiments are briefly described below. The drawings described below show only some embodiments of the invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flow chart of a knowledge graph-based preliminary diagnosis and treatment method for a disease state according to an embodiment of the invention.
Detailed Description
To further explain the embodiments, the invention is provided with drawings, which form part of the disclosure, illustrate the embodiments and, together with the description, explain their principles of operation. With reference to them, those of ordinary skill in the art will understand other possible embodiments and the advantages of the invention. The components in the drawings are not drawn to scale, and like reference numerals generally denote like elements.
According to an embodiment of the invention, a knowledge graph-based preliminary diagnosis and treatment method for a disease state is provided.
The invention is further described below with reference to the drawings and the detailed description. As shown in Fig. 1, the knowledge graph-based preliminary diagnosis and treatment method for a disease state according to an embodiment of the invention comprises the following steps:
S1, collecting electronic medical records to construct a medical record data set, performing word segmentation on the fields in the medical record data set using a dictionary-based word segmentation algorithm, and identifying aliases of diseases or symptoms;
In S1, collecting the electronic medical records to construct the medical record data set and performing word segmentation on its fields using the dictionary-based word segmentation algorithm further comprises the following step:
cleaning and preprocessing the medical record data in the medical record data set with a data processing module, including medical record word segmentation and the removal of null data, invalid data, words without emotional significance, and duplicate data.
Specifically, the aliases of diseases or symptoms are identified with a pre-constructed automatic disease alias recognition model, as follows:
obtaining corpus data from the medical record data set; constructing a disease domain ontology; expanding the disease domain ontology to obtain an expanded disease domain ontology; automatically labeling the corpus data with the expanded disease domain ontology to obtain a training corpus carrying expanded disease alias labels; building the automatic disease alias recognition model from that training corpus; and using the model to recognize disease aliases in the fields to be processed, for example recognizing that Alzheimer's disease is also called senile dementia.
Specifically, performing word segmentation on the fields in the medical record data set using the dictionary-based word segmentation algorithm comprises the following step:
matching the words in a pre-established word segmentation dictionary (a combination of a general dictionary and a specialized medical dictionary) against the fields in the medical record data set one by one according to a preset strategy, identifying the dictionary words contained in each field, and returning the identified words as useful information.
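The dictionary matching described above can be sketched as follows. This is a minimal illustration that assumes forward maximum matching as the "preset strategy", which the text does not specify; the function name `forward_max_match` and the sample dictionaries are hypothetical.

```python
def forward_max_match(text, dictionary, max_len=None):
    """Greedy forward maximum matching against a word dictionary.

    Returns only the dictionary words found in the text, in order.
    Assumption: the patent's "preset strategy" is not specified; this is
    one common dictionary-based strategy, not necessarily the one used.
    """
    if max_len is None:
        max_len = max((len(word) for word in dictionary), default=1)
    found = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first, then shrink the window.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in dictionary:
                match = candidate
                break
        if match is not None:
            found.append(match)
            i += len(match)
        else:
            i += 1  # no dictionary word starts here; skip one character
    return found


# Hypothetical dictionaries: a general one and a medical one, merged.
general_dictionary = {"患者"}            # "patient"
medical_dictionary = {"头痛", "发热"}    # "headache", "fever"
segmentation_dictionary = general_dictionary | medical_dictionary

words = forward_max_match("患者头痛发热三天", segmentation_dictionary)
print(words)
```

Characters that start no dictionary word are skipped, so only recognized words are returned as "useful information".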
S2, labeling the entities and entity relations in the processed medical record data set using a named entity recognition model and an entity relation extraction model, respectively;
In S2, this labeling comprises the following steps:
S21, dividing the segmented medical record data set into a manual annotation data set and an automatic annotation data set according to a preset ratio;
S22, having medical experts use their professional knowledge to label the entities and entity relations in the manual annotation data set;
Specifically, the entities comprise basic entity information on symptoms, diseases, body parts, medicines, departments and population groups, and the entity relations comprise the body part-symptom, body part-disease, symptom-disease, disease-department, medicine-disease, medicine-symptom and medicine-population relations.
S23, inputting the labeled medical record data of the manual annotation data set into a pre-constructed named entity recognition model and entity relation extraction model for training;
S24, inputting the medical record data of the automatic annotation data set one by one into the trained named entity recognition model and entity relation extraction model to recognize and automatically label the entities and entity relations.
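The split in S21 can be sketched as below. The 20% manual share, the fixed seed and the name `split_records` are assumptions for illustration only; the text says just that the ratio is preset.

```python
import random


def split_records(records, manual_ratio=0.2, seed=0):
    """Split records into (manual annotation set, automatic annotation set).

    manual_ratio and seed are hypothetical defaults, not values from the text.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle
    cut = round(len(shuffled) * manual_ratio)
    return shuffled[:cut], shuffled[cut:]


records = [f"record_{i}" for i in range(10)]
manual_set, automatic_set = split_records(records)
print(len(manual_set), len(automatic_set))
```

The manual set would then go to the experts in S22, and the automatic set would be labeled by the trained models in S24.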
S3, constructing an entity and relationship labeling data set in an RDF format, and constructing a knowledge graph based on disease diagnosis by using the entity and relationship labeling data set;
The disease diagnosis knowledge graph consists of 6 entity types and 7 entity relation types and is represented by a directed graph $G = (V, E)$, where $V = \{v_1, v_2, \ldots, v_n\}$ is the set of vertices representing the different entities and $E$ is the set of edges representing the different types of relations between entities.
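As a rough illustration of S3, labeled relations can be held as (head, relation, tail) triples, serialized as RDF N-Triples lines, and loaded into a directed graph with a vertex set V and typed edges E. The namespace `http://example.org/kg/`, the relation names and the placeholder entities are all hypothetical.

```python
from collections import defaultdict

BASE = "http://example.org/kg/"  # hypothetical namespace, not from the source


def to_ntriple(head, relation, tail):
    """Serialize one labeled relation as an N-Triples line."""
    return f"<{BASE}{head}> <{BASE}{relation}> <{BASE}{tail}> ."


def build_graph(triples):
    """Return (V, E): the vertex set and typed directed edges per head vertex."""
    vertices = set()
    edges = defaultdict(list)  # head -> [(relation, tail), ...]
    for head, relation, tail in triples:
        vertices.update((head, tail))
        edges[head].append((relation, tail))
    return vertices, dict(edges)


# Placeholder entities and relation names chosen only for illustration.
triples = [
    ("symptom_a", "symptom_of", "disease_x"),
    ("disease_x", "treated_in", "department_y"),
    ("medicine_z", "treats", "disease_x"),
]

vertices, edges = build_graph(triples)
for triple in triples:
    print(to_ntriple(*triple))
```

A production system would more likely use an RDF library or a graph database; plain dictionaries are used here only to keep the sketch self-contained.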
S4, performing a preliminary diagnosis of the disease state based on symptom purity and disease information entropy, and obtaining a recommended preliminary treatment plan by combining the disease diagnosis knowledge graph with a collaborative recommendation algorithm.
In S4, this comprises the following steps:
S41, calculating the purity P of each symptom in the knowledge graph and the disease information entropy S of the diseases associated with that symptom, and deriving the disease state from the purity P and the disease information entropy S;
Specifically, S41 comprises the following steps:
S411, calculating the purity P of each symptom in the knowledge graph, selecting the symptom with the highest purity, and calculating the disease information entropy S of the diseases associated with that symptom, where the purity P is calculated as:
Figure BDA0003742065870000071
the calculation formula of the disease information entropy S is as follows:
Figure BDA0003742065870000081
where $N$ is the number of diseases, $V_i$ is the quantitative relation value of a disease associated with the symptom or symptom combination, $N^2$ is the square of the number of diseases associated with the symptom or symptom combination, $V_i'$ is the quantitative relation value between the symptom or symptom combination and the disease, and $V_i''$ denotes all quantitative relation values of the symptoms associated with the disease;
S412, determining whether the disease information entropy S exceeds a preset threshold or whether the current symptom is the last one; if so, storing the disease with the largest disease information entropy S and deleting it from the original disease list; repeating these steps for N iterations to obtain N diseases; and finally calculating, for each of these diseases, the disease information entropy over all input symptoms, ranking the diseases accordingly, and deriving the preliminary disease state from the ranking.
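The formulas for the purity P and the disease information entropy S are given only as images and are not reproduced in this text, so they cannot be implemented here. Purely to illustrate the general idea of scoring how concentrated a symptom's associated diseases are, the sketch below uses standard Shannon entropy as a stand-in; it is not the patent's formula, and the function names and weights are hypothetical.

```python
import math


def shannon_entropy(weights):
    """Shannon entropy (in bits) of a list of non-negative weights.

    Stand-in only: this is NOT the patent's disease information entropy S.
    """
    total = sum(weights)
    if total <= 0:
        return 0.0
    probabilities = [w / total for w in weights if w > 0]
    return -sum(p * math.log2(p) for p in probabilities)


def most_specific_symptom(symptom_disease_weights):
    """Pick the symptom whose disease weights are most concentrated.

    Assumption: low entropy is used here as a proxy for high purity.
    """
    return min(
        symptom_disease_weights,
        key=lambda s: shannon_entropy(list(symptom_disease_weights[s].values())),
    )


# Hypothetical symptom -> {disease: quantitative relation value} map.
weights = {
    "symptom_a": {"disease_x": 3, "disease_y": 3},
    "symptom_b": {"disease_x": 9, "disease_y": 1},
}
print(most_specific_symptom(weights))
```

Under this proxy, a symptom linked almost exclusively to one disease scores near zero entropy and would be examined first.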
S42, obtaining the recommended preliminary treatment plan by combining the disease diagnosis knowledge graph with the collaborative recommendation algorithm.
Specifically, S42 comprises the following steps:
S421, acquiring the patient's disease state information and analyzing it with the disease diagnosis knowledge graph to obtain a first recommended treatment plan;
S422, using the collaborative recommendation algorithm to recommend a treatment plan from cases similar to the patient's disease state information, to obtain a second recommended treatment plan, which specifically comprises the following steps:
S4221, acquiring the patient's symptom information data and calculating the similarity between the records in the medical record data set and the patient's symptom information data, where the similarity is calculated as:
Figure BDA0003742065870000082
where $sim(u_1, u_2)$ is the similarity between patients $u_1$ and $u_2$, $S$ is the total number of symptoms, and
Figure BDA0003742065870000083
equals 1 when the two answers are the same and 0 when they differ in the single-choice case; in the multiple-choice case:
Figure BDA0003742065870000084
where $S_1$ is the number of multiple-choice options;
S4222, identifying similar patients according to the above formula, and selecting and ranking the several most similar patients to obtain a case group;
S4223, calculating, for the patient's current condition, the recommendation value of each treatment plan used by the cases in the case group, and recommending a treatment plan to the patient according to the recommendation values to obtain the second recommended treatment plan, where the recommendation value is calculated as:
Figure BDA0003742065870000091
where $p$ is the recommendation value, $sim(u, u_i)$ is the similarity between patient $u$ and patient $u_i$, $r_i$ is the average estimate for patient symptom $i$ over the case group,
Figure BDA0003742065870000092
is the average estimate of the cases for patient symptom $i$, and $n$ is the total number of business services.
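Steps S4221 to S4223 follow the general shape of user-based collaborative filtering. Because the similarity and recommendation formulas appear only as images, the sketch below substitutes common choices and should be read as an assumption: exact match (1 or 0) for single-choice answers, Jaccard overlap for multiple-choice answers, and a similarity-weighted average of per-case plan ratings. All names and data are hypothetical.

```python
def item_match(a, b):
    """1 or 0 for single-choice answers; Jaccard overlap for multi-choice sets."""
    if isinstance(a, (set, frozenset)) and isinstance(b, (set, frozenset)):
        union = a | b
        return len(a & b) / len(union) if union else 1.0
    return 1.0 if a == b else 0.0


def similarity(u1, u2):
    """Mean per-symptom agreement between two patients' answer dicts."""
    keys = set(u1) | set(u2)
    if not keys:
        return 0.0
    return sum(item_match(u1.get(k), u2.get(k)) for k in keys) / len(keys)


def recommend(patient, cases, k=2):
    """Return the plan with the highest similarity-weighted average rating
    among the k most similar cases (the "case group")."""
    scored = [(similarity(patient, case["symptoms"]), case) for case in cases]
    case_group = sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
    totals, weights = {}, {}
    for sim, case in case_group:
        for plan, rating in case["ratings"].items():
            totals[plan] = totals.get(plan, 0.0) + sim * rating
            weights[plan] = weights.get(plan, 0.0) + sim
    values = {p: totals[p] / weights[p] for p in totals if weights[p] > 0}
    return max(values, key=values.get) if values else None


patient = {"fever": "high", "cough": {"dry"}}
cases = [
    {"symptoms": {"fever": "high", "cough": {"dry"}}, "ratings": {"plan_a": 0.9}},
    {"symptoms": {"fever": "high", "cough": {"dry", "night"}}, "ratings": {"plan_b": 0.6}},
    {"symptoms": {"fever": "none", "cough": {"wet"}}, "ratings": {"plan_b": 1.0}},
]
print(recommend(patient, cases))
```

The `ratings` field stands in for whatever per-case estimate the recommendation formula averages; the text does not define it further.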
S423, combining the first and second recommended treatment plans to obtain the recommended preliminary treatment plan.
Specifically, when the first and second recommended treatment plans are the same, that plan is taken as the recommended preliminary treatment plan. When they differ, it is determined whether the patient's number of symptoms exceeds a preset threshold: if so, the second recommended treatment plan is selected as the recommended preliminary treatment plan; otherwise, the first recommended treatment plan is selected.
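The selection rule in S423 is simple enough to state directly in code. The function name `choose_plan` and the threshold value in the example are hypothetical; the text says only that the threshold is preset.

```python
def choose_plan(first_plan, second_plan, symptom_count, threshold):
    """Combine the knowledge graph plan (first) and the collaborative plan (second)."""
    if first_plan == second_plan:
        return first_plan
    # Plans differ: prefer the collaborative plan only when the patient
    # reports more symptoms than the preset threshold.
    return second_plan if symptom_count > threshold else first_plan


print(choose_plan("plan_a", "plan_b", symptom_count=5, threshold=3))
```

With a threshold of 3, a patient reporting 5 symptoms receives the collaborative recommendation, while a patient reporting 3 or fewer receives the knowledge graph recommendation.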
In summary, according to the technical scheme of the invention, the electronic medical record data set is segmented with a dictionary-based word segmentation algorithm, and the entities and entity relationships in the data set are labeled with an entity relationship extraction model, so that a knowledge graph based on disease diagnosis can be constructed from the entity and relationship labeling data set. On this basis, the preliminary diagnosis of the disease state can be performed from the purity of disease symptoms and the disease information entropy, and a preliminary treatment scheme can be recommended to the patient by combining the knowledge graph based on disease diagnosis with the collaborative recommendation algorithm, which effectively improves the accuracy of the recommended treatment scheme.
The above description only illustrates the preferred embodiments of the present invention and is not to be construed as limiting the invention; any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention are intended to be included in its scope of protection.

Claims (10)

1. A knowledge graph-based preliminary diagnosis and treatment method for a disease state, characterized by comprising the following steps:
S1, collecting electronic medical records to construct a medical record data set, performing word segmentation on the fields in the medical record data set with a dictionary-based word segmentation algorithm, and analyzing and identifying aliases of diseases or symptoms;
S2, labeling the entities and entity relationships in the processed electronic medical record data set based on a named entity recognition model and an entity relationship extraction model, respectively;
S3, constructing an entity and relationship labeling data set in RDF format, and constructing a knowledge graph based on disease diagnosis from the entity and relationship labeling data set;
S4, performing preliminary diagnosis of the disease state based on the purity of disease symptoms and the disease information entropy, and obtaining a recommended preliminary treatment scheme by combining the knowledge graph based on disease diagnosis with a collaborative recommendation algorithm.
2. The method of claim 1, wherein collecting the electronic medical records to construct the medical record data set and performing word segmentation on the fields in the medical record data set with the dictionary-based word segmentation algorithm in S1 further comprises the following step:
cleaning and preprocessing the medical record data in the medical record data set with a data processing module, including medical record word segmentation, removal of null data, removal of invalid data, and removal of words without emotional significance and of duplicate data.
3. The method of claim 2, wherein performing word segmentation on the fields in the medical record data set with the dictionary-based word segmentation algorithm comprises the following step:
matching the words in a pre-established word segmentation dictionary against the fields in the medical record data set one by one according to a preset strategy, identifying the dictionary words contained in each field, and returning the identified words as useful information.
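As an illustration of dictionary-based matching, the following Python sketch uses forward maximum matching. The claim only speaks of "a preset strategy", so the choice of forward maximum matching is an assumption, and the function name and the sample dictionary are hypothetical.

```python
from typing import Iterable, List


def dictionary_segment(field: str, dictionary: Iterable[str]) -> List[str]:
    """Return the dictionary words found in `field`, scanning left to
    right and preferring the longest match at each position
    (forward maximum matching; ASSUMED strategy)."""
    words = set(dictionary)
    max_len = max((len(w) for w in words), default=0)
    found: List[str] = []
    i = 0
    while i < len(field):
        # Try the longest candidate first, then shorter ones.
        for size in range(min(max_len, len(field) - i), 0, -1):
            candidate = field[i:i + size]
            if candidate in words:
                found.append(candidate)
                i += size
                break
        else:
            # No dictionary word starts here; skip one character.
            i += 1
    return found


# Hypothetical dictionary: headache, fever, cough.
sample_dictionary = ["头痛", "发热", "咳嗽"]
# "Patient has had headache with fever for three days."
print(dictionary_segment("患者头痛伴发热三天", sample_dictionary))
```

Characters that do not belong to any dictionary word are skipped, so only the identified words are returned as useful information.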
4. The method of claim 1, wherein labeling the entities and entity relationships in the processed electronic medical record data set based on the named entity recognition model and the entity relationship extraction model in S2 comprises the following steps:
S21, dividing the segmented medical record data set into a manual medical record labeling data set and an automatic medical record labeling data set according to a preset proportion;
S22, having medical experts label the entities and entity relationships in the manual medical record labeling data set using their professional knowledge;
S23, inputting the labeled medical record data of the manual medical record labeling data set into a pre-constructed named entity recognition model and entity relationship extraction model for training;
S24, inputting the data of the automatic medical record labeling data set one by one into the trained named entity recognition model and entity relationship extraction model to perform entity recognition and automatic labeling of the entities and entity relationships.
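Step S21 can be sketched as a simple split by a preset proportion. The claim does not say how records are assigned to the two sets, so the seeded random shuffle below is an assumption, as are the function and parameter names.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_records(records: Sequence[T], manual_ratio: float,
                  seed: int = 0) -> Tuple[List[T], List[T]]:
    """Divide records into (manual labeling set, automatic labeling set).
    ASSUMPTION: assignment is a seeded random shuffle, so the split is
    reproducible; `manual_ratio` is the preset proportion in [0, 1]."""
    if not 0.0 <= manual_ratio <= 1.0:
        raise ValueError("manual_ratio must be between 0 and 1")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * manual_ratio)
    return shuffled[:cut], shuffled[cut:]
```

The manual set would then go to the experts in S22 and to training in S23, while the automatic set is labeled by the trained models in S24.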
5. The method of claim 1, wherein the entities comprise six kinds of basic entity information, namely symptom, disease, part, medicine, department and population, and the entity relationships comprise part-symptom, part-disease, symptom-disease, disease-department, medicine-disease, medicine-symptom and medicine-population relationships.
6. The method of claim 1, wherein the knowledge graph based on disease diagnosis is composed of 6 kinds of entities and 7 kinds of entity relationships and is represented by a directed graph G = (V, E), where V = {v_1, v_2, …, v_n} is the set of vertices representing the different entities, and E is the set of edges representing the different types of relationships between entities.
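A directed graph G = (V, E) with typed vertices and typed edges can be held in a very small data structure. The sketch below is illustrative only: the class name, the relation identifiers, the edge direction (head to tail in the order the relation is named) and the sample entities are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

# The six entity kinds and seven relation kinds named in the claims.
ENTITY_TYPES = {"symptom", "disease", "part", "medicine",
                "department", "population"}
RELATION_TYPES = {"part_symptom", "part_disease", "symptom_disease",
                  "disease_department", "medicine_disease",
                  "medicine_symptom", "medicine_population"}


@dataclass
class KnowledgeGraph:
    # V: entity name -> entity type
    vertices: Dict[str, str] = field(default_factory=dict)
    # E: directed, typed edges stored as (head, relation, tail)
    edges: Set[Tuple[str, str, str]] = field(default_factory=set)

    def add_entity(self, name: str, entity_type: str) -> None:
        if entity_type not in ENTITY_TYPES:
            raise ValueError(f"unknown entity type: {entity_type}")
        self.vertices[name] = entity_type

    def add_relation(self, head: str, relation: str, tail: str) -> None:
        if relation not in RELATION_TYPES:
            raise ValueError(f"unknown relation type: {relation}")
        if head not in self.vertices or tail not in self.vertices:
            raise KeyError("both endpoints must be added as entities first")
        self.edges.add((head, relation, tail))

    def neighbors(self, head: str, relation: str) -> List[str]:
        """Tails reachable from `head` over edges of type `relation`."""
        return sorted(t for h, r, t in self.edges
                      if h == head and r == relation)
```

Each stored edge is a (head, relation, tail) triple, which is the same shape as a triple in the RDF-format labeling data set.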
7. The method of claim 1, wherein performing preliminary diagnosis of the disease state based on the purity of disease symptoms and the disease information entropy and obtaining the recommended preliminary treatment scheme by combining the knowledge graph based on disease diagnosis with the collaborative recommendation algorithm in S4 comprises the following steps:
S41, calculating the purity P of each symptom in the knowledge graph and the disease information entropy S of the diseases related to the symptom, and analyzing to obtain the disease state based on the purity P and the disease information entropy S;
S42, obtaining the recommended preliminary treatment scheme by combining the knowledge graph based on disease diagnosis with the collaborative recommendation algorithm.
8. The method of claim 7, wherein calculating the purity P of each symptom in the knowledge graph and the disease information entropy S of the diseases related to the symptom, and analyzing to obtain the disease state based on the purity P and the disease information entropy S, in S41 comprises the following steps:
S411, calculating the purity P of each symptom in the knowledge graph, selecting the symptom with the highest purity, and calculating the disease information entropy S of the diseases related to that symptom, wherein the purity P is calculated as follows:
Figure FDA0003742065860000021
the calculation formula of the disease information entropy S is as follows:
Figure FDA0003742065860000031
wherein N represents the number of diseases, V_i represents the quantitative relationship value of the diseases associated with a symptom or combination of symptoms, N^2 represents the square of the number of diseases associated with a symptom or combination of symptoms, V_i' represents the quantitative relationship value of a symptom or combination of symptoms to a disease, and V_i'' represents all quantitative relationship values of the symptoms associated with the disease;
S412, judging whether the disease information entropy S is larger than a preset threshold or whether the symptom is the last one; if so, selecting the disease with the largest disease information entropy S, storing it, and deleting it from the original disease list; repeating these steps N times to obtain N diseases; and finally calculating the disease information entropy of these diseases over all input symptoms to rank them, and analyzing the ranking result to obtain the preliminary disease state.
9. The method of claim 1, wherein obtaining the recommended preliminary treatment scheme by combining the knowledge graph based on disease diagnosis with the collaborative recommendation algorithm in S42 comprises the following steps:
S421, acquiring the disease state information of the patient, and analyzing it with the knowledge graph based on disease diagnosis to obtain a first recommended treatment scheme;
S422, recommending a treatment scheme matching the disease state information of the patient by using the collaborative recommendation algorithm to obtain a second recommended treatment scheme;
S423, analyzing the first recommended treatment scheme together with the second recommended treatment scheme to obtain the recommended preliminary treatment scheme.
10. The method of claim 9, wherein recommending a treatment scheme matching the disease state information of the patient by using the collaborative recommendation algorithm in S422 comprises the following steps:
S4221, acquiring the symptom information data of the patient, and calculating the similarity between the medical record data set and the symptom information data of the patient, wherein the similarity is calculated as follows:
Figure FDA0003742065860000032
In the formula, sim(u_1, u_2) is the similarity between patients u_1 and u_2, S is the total number of symptoms, and
Figure FDA0003742065860000033
takes the value 1 when the two answers are the same and 0 when they differ in the single-choice case; in the multiple-choice case it is given by:
Figure FDA0003742065860000041
where S_1 is the number of multiple-choice options;
S4222, identifying similar patients according to the formula, and selecting and ranking the several patients with the highest similarity to obtain a case group;
S4223, calculating, for the treatment schemes of the cases in the case group, a recommendation value with respect to the current patient's condition, and recommending a treatment scheme for the patient according to the recommendation values to obtain the second recommended treatment scheme, wherein the treatment scheme recommendation value is calculated as follows:
Figure FDA0003742065860000042
wherein p represents the recommendation value, sim(u, u_i) is the similarity between patient u and patient u_i, r_i is an average estimate of patient symptoms i over the group of cases,
Figure FDA0003742065860000043
is the average estimate of the case for patient symptom i, and n represents the total number of business services.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210823192.0A CN115083599A (en) 2022-07-12 2022-07-12 Knowledge graph-based preliminary diagnosis and treatment method for disease state

Publications (1)

Publication Number Publication Date
CN115083599A (en) 2022-09-20

Family

ID=83259545

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210823192.0A Pending CN115083599A (en) 2022-07-12 2022-07-12 Knowledge graph-based preliminary diagnosis and treatment method for disease state

Country Status (1)

Country Link
CN (1) CN115083599A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination