CN111897967A - Medical inquiry recommendation method based on knowledge graph and social media - Google Patents

Info

Publication number
CN111897967A
Authority
CN
China
Prior art keywords
disease
medical
entity
doctor
entities
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010639311.8A
Other languages
Chinese (zh)
Inventor
孙艳春
黄罡
武家伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
Original Assignee
Peking University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University
Priority to CN202010639311.8A
Publication of CN111897967A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/288 Entity relationship models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/355 Class or cluster creation or modification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G06F16/90344 Query processing by using string matching techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06395 Quality analysis or management
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H40/00 ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices
    • G16H40/20 ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices for the management or administration of healthcare resources or facilities, e.g. managing hospital staff or surgery rooms
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H80/00 ICT specially adapted for facilitating communication between medical practitioners or patients, e.g. for collaborative diagnosis, therapy or health monitoring

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Medical Informatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Business, Economics & Management (AREA)
  • Public Health (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Educational Administration (AREA)
  • Primary Health Care (AREA)
  • Computational Linguistics (AREA)
  • Epidemiology (AREA)
  • Biomedical Technology (AREA)
  • Marketing (AREA)
  • Pathology (AREA)
  • Game Theory and Decision Science (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Artificial Intelligence (AREA)
  • Probability & Statistics with Applications (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a medical inquiry recommendation method based on a knowledge graph and social media. A medical knowledge graph is constructed from disease information openly available on the Internet and combined with medical review data from social media; the service quality of doctors and treatment departments is evaluated automatically against medical service quality evaluation indexes, and recommendations are provided to users, meeting to some extent the growing demand for mobile medical services. The invention performs disease self-diagnosis and doctor and hospital recommendation at the same time, giving users better service quality. Diseases are recommended by combining several kinds of information, which avoids the overly long, uninformative recommendation lists produced by simple symptom keyword matching, enriches the recommendation options, and makes it easier to surface diseases the user may potentially have. Based on medical review data and existing medical service quality evaluation indexes, the invention derives the service quality of doctors and hospitals and provides users with an open, easy-to-understand recommendation service.

Description

Medical inquiry recommendation method based on knowledge graph and social media
Technical Field
The invention relates to a data mining technology, in particular to a medical inquiry recommending method based on a knowledge graph and social media.
Background
Demand for mobile medical services is growing. In daily life, people often notice certain physical symptoms but cannot promptly identify the appropriate treatment department, or the hospitals and doctors with good service quality. The emergence of Internet medical services (such as Tencent medical treatment, medical inquiry, good doctor and the like) has alleviated this problem to some extent. However, most medical websites only provide functions such as popular disease education and online appointment booking. A few websites offer disease self-diagnosis, but most of these only allow the user to enter a single keyword and recommend diseases by keyword matching; the resulting disease list is so long that it has little value as a recommendation, and no doctors or hospitals are recommended.
Medical knowledge graphs have been studied extensively. Outside China, researchers such as Rotmensch and Wang have extracted information from electronic health records (EHRs), constructed various medical knowledge graphs, and applied them to drug recommendation systems, medical diagnosis assistance systems, and the like. In China, medical knowledge graphs have also been surveyed: marmon and michigan et al. introduce the core technologies for constructing medical knowledge graphs and summarize their application scenarios as clinical decision support systems, medical semantic search engines, medical question-answering systems, and the like.
In recent years, with the development of natural language processing, some studies have mined user sentiment from medical reviews. Hao et al. used an LDA (Latent Dirichlet Allocation) model to mine several sentiment attributes from review text on the Good Doctor website and briefly analyzed the polarity expressed on those attributes. Another study, likewise based on Good Doctor online review text, used a semantic dictionary and semantic frames to construct the sentiment topics of reviews and analyzed their sentiment polarity and intensity. Extracting sentiment attributes can reflect user satisfaction in specific aspects, but the methods used in these studies are limited and do not make full use of state-of-the-art methods in natural language processing.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a medical inquiry recommendation method based on a knowledge graph and social media.
The invention discloses a medical inquiry recommending method based on a knowledge graph and social media, which comprises the following steps:
1) acquiring open structured disease information from Internet medical services and extracting diseases and their related information, wherein the related information of a disease comprises symptom keywords, incidence rate, susceptible population, complications, treatment department and symptom description; further extracting age and gender information from the susceptible population; and constructing a "disease-symptom" medical knowledge graph comprising five types of entities and five types of relations, wherein the five entity types are: disease entity, department entity, age entity, gender entity and symptom keyword entity, the disease entity having a symptom description attribute and an incidence rate attribute; and the five relation types are: a complication relation between disease entities, a possession relation between a disease entity and a symptom keyword entity, a treatment-department relation between a disease entity and a department entity, a susceptible-age relation between a disease entity and an age entity, and a susceptible-gender relation between a disease entity and a gender entity;
2) for the medical knowledge graph constructed in step 1), training a knowledge graph embedding with a translational distance model, mapping the entities and relations of the medical knowledge graph to representations in a vector space, and thereby obtaining the embedding vectors of the disease entities in the medical knowledge graph;
3) acquiring medical review data openly available on the Internet, the medical review data comprising doctor names, the treatment department to which each doctor belongs, the hospital to which each treatment department belongs, and patient review texts about the doctors; labeling patient review texts according to medical service quality evaluation indexes; performing sentiment polarity analysis on each index dimension of the patient review texts with a natural language processing model; computing the favorable rate of each doctor; aggregating the patient review texts of the same treatment department according to the affiliation between doctors and treatment departments to obtain the favorable rate of each treatment department; and obtaining the Wilson scores of the doctors and treatment departments by the Wilson interval method;
4) the user inputs M kinds of information among symptom keywords, gender, age and symptom description, wherein the symptom keywords are mandatory and 1 ≤ M ≤ 4; the medical knowledge graph is queried according to the M kinds of information input by the user to construct an initial disease entity candidate set; the candidate set is expanded with the most similar disease entities according to the embedding vectors of the disease entities in the medical knowledge graph, so as to mine diseases the user may potentially have; finally, the recommended diseases are screened according to the similarity in the corresponding M aspects among gender, age, symptom keywords and symptom description, and, according to the treatment departments of the corresponding disease entities, the hospital whose treatment department has the highest Wilson score and the doctor with the highest Wilson score are recommended:
a) constructing the initial disease entity candidate set: querying the medical knowledge graph with the symptom keywords input by the user, screening several disease entities with the most similar symptom keywords according to the possession relation between disease entities and symptom keyword entities, and selecting several disease entities with the highest incidence rate according to the incidence rate attribute of the disease entities, to obtain the initial disease entity candidate set;
b) expanding the disease entity candidate set: based on the embedding vectors of the disease entities in the medical knowledge graph, selecting, for each disease entity in the candidate set, the one or more disease entities whose embedding vectors are closest to it in Euclidean distance, i.e. the most similar disease entities, and adding them to the initial candidate set to obtain the expanded disease entity candidate set, thereby mining diseases the user may potentially have;
c) giving the final recommended diseases: screening the recommendation result from the expanded disease entity candidate set according to the similarity in M aspects among gender, age, symptom keywords and symptom description, wherein gender and age similarity are based on string matching, symptom keyword similarity is based on set intersection, and the symptom description is converted to a vector with a term frequency-inverse document frequency (TF-IDF) model, the symptom description similarity being measured by the cosine similarity between vectors; finally, selecting the several disease entities with the highest sum of similarities over the M aspects as the recommendation result, and querying the medical knowledge graph to give the treatment department of each disease entity according to the treatment-department relation between disease entities and department entities;
d) for each disease entity, according to the treatment department of each finally recommended disease entity obtained in step c) and the Wilson scores of step 3), selecting the hospital whose treatment department has the highest score, and then selecting the highest-scoring doctor in that treatment department of that hospital to recommend to the user.
Further, in step 4) d), according to the treatment department of each finally recommended disease entity obtained in step c), several hospitals whose treatment departments have the highest Wilson scores from step 3) are selected; among these, the hospital ranked highest on the social media website is chosen, and the highest-scoring doctor in the treatment department of that hospital is then recommended to the user.
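The selection in step 4) d) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the invention's implementation: the record layout, the hypothetical `site_rank` field standing in for the ranking on the social media website, and the `top_k` cut-off are all introduced here for the example.

```python
from dataclasses import dataclass


@dataclass
class DeptScore:
    hospital: str
    department: str
    wilson: float    # Wilson score of this hospital's department (step 3)
    site_rank: int   # hypothetical ranking on the review site; 1 is best


@dataclass
class DoctorScore:
    name: str
    hospital: str
    department: str
    wilson: float    # Wilson score of this doctor (step 3)


def recommend(department, dept_scores, doctor_scores, top_k=3):
    """Pick a hospital and a doctor for one recommended treatment department."""
    candidates = [d for d in dept_scores if d.department == department]
    if not candidates:
        return None
    # Keep the top_k hospitals by department Wilson score ...
    shortlist = sorted(candidates, key=lambda d: d.wilson, reverse=True)[:top_k]
    # ... then prefer the one ranked best on the site.
    best = min(shortlist, key=lambda d: d.site_rank)
    doctors = [x for x in doctor_scores
               if x.hospital == best.hospital and x.department == department]
    top_doctor = max(doctors, key=lambda x: x.wilson) if doctors else None
    return best.hospital, (top_doctor.name if top_doctor else None)
```

With `top_k=1` the sketch reduces to the simpler rule of step d), where the hospital with the single highest department score is chosen directly.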
The terms "disease" and "disease entity" in step 1) refer to essentially the same thing: a disease entity is the representation within the knowledge graph, while a disease is the corresponding real-world concept; the same applies to the other entity types.
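The schema of step 1), with five entity types and five relation types, can be sketched as a small triple store. This is only an illustrative data structure: the English relation names, the class name `MedicalKG`, and the sample data are assumptions, and entity names are assumed to be unique across types for simplicity.

```python
from collections import defaultdict

ENTITY_TYPES = {"disease", "department", "age", "gender", "symptom_keyword"}

# relation name -> (head type, tail type); the English names are assumptions
RELATIONS = {
    "complication": ("disease", "disease"),
    "has_symptom": ("disease", "symptom_keyword"),
    "treated_in": ("disease", "department"),
    "susceptible_age": ("disease", "age"),
    "susceptible_gender": ("disease", "gender"),
}


class MedicalKG:
    def __init__(self):
        self.entity_type = {}            # entity name -> entity type
        self.attrs = defaultdict(dict)   # e.g. disease -> description, incidence
        self.triples = set()             # (head, relation, tail)

    def add_entity(self, name, etype, **attrs):
        if etype not in ENTITY_TYPES:
            raise ValueError(f"unknown entity type: {etype}")
        self.entity_type[name] = etype
        self.attrs[name].update(attrs)

    def add_relation(self, head, relation, tail):
        head_type, tail_type = RELATIONS[relation]
        if (self.entity_type.get(head) != head_type
                or self.entity_type.get(tail) != tail_type):
            raise ValueError(f"{relation} expects {head_type} -> {tail_type}")
        self.triples.add((head, relation, tail))

    def tails(self, head, relation):
        """All entities reachable from `head` via `relation`."""
        return {t for h, r, t in self.triples if h == head and r == relation}


# Illustrative data only; not taken from the source.
kg = MedicalKG()
kg.add_entity("influenza", "disease", description="fever and cough", incidence=0.1)
kg.add_entity("fever", "symptom_keyword")
kg.add_relation("influenza", "has_symptom", "fever")
```

Checking the head and tail types on insertion keeps each relation tied to the entity pair the schema defines for it.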
In step 2), the knowledge graph embedding is trained with one of the translational distance models.
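Translational distance models score a triple (head, relation, tail) by how close the translated head vector lies to the tail vector. The sketch below uses the TransE scoring rule purely for illustration, because it is the simplest member of this family; it is not necessarily the model the invention uses. It also shows the Euclidean nearest-neighbour lookup used in step 4) b). The function names and the toy vectors are assumptions.

```python
import math


def transe_distance(h, r, t):
    """TransE-style score: Euclidean norm of h + r - t (lower is more plausible)."""
    return math.sqrt(sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)))


def nearest_diseases(target, embeddings, k=1):
    """Return the k disease names whose embeddings are closest to `target`'s.

    `embeddings` maps disease name -> embedding vector (list of floats).
    """
    base = embeddings[target]
    others = [(math.dist(base, vec), name)
              for name, vec in embeddings.items() if name != target]
    return [name for _, name in sorted(others)[:k]]


# Toy two-dimensional vectors, for illustration only.
toy = {"a": [0.0, 0.0], "b": [1.0, 0.0], "c": [5.0, 5.0]}
closest = nearest_diseases("a", toy, k=1)
```

Expanding the candidate set then amounts to calling `nearest_diseases` for every disease in the initial set and taking the union of the results.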
In step 3), the natural language processing model is a deep learning model, such as LSTM (Long Short-Term Memory network) or BERT (Bidirectional Encoder Representations from Transformers). The sentiment polarity analysis and service quality evaluation comprise the following steps:
a) manually labeling the sentiment polarity of patient review texts according to the medical service quality evaluation indexes, wherein each service quality evaluation index dimension is labeled with one of three polarities: positive, neutral or negative, and the number of manually labeled patient review texts is not less than 6000;
b) converting the labeled review text data into a numerical representation and feeding it to the natural language processing model for training;
c) performing sentiment polarity analysis on the remaining review text data with the trained natural language processing model to obtain the favorable rate of each doctor, wherein the favorable rate of a doctor is the ratio of the number of positive patient review texts about that doctor to the total number of patient review texts about that doctor;
d) aggregating the patient review texts of the same treatment department of the same hospital according to the affiliation between doctors and treatment departments to obtain the favorable rate of that treatment department, wherein the favorable rate of a treatment department is the ratio of the number of positive patient review texts about all doctors in the department to the total number of patient review texts about all doctors in the department;
e) the Wilson score on each service quality evaluation index dimension for doctors and treatment departments is calculated as follows:
Figure BDA0002570877390000041
wherein p is the favorable rate; z_α is the quantile of the normal distribution, with a value range of [1.6, 6.0], used to measure the confidence of the score, the confidence ranging from 90% to 100%; and n is the total number of review texts;
f) according to the above formula, the Wilson score of each doctor and of each treatment department is obtained.
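The formula itself is only present as an image reference in this text. Assuming it is the standard lower bound of the Wilson score interval, which is the form commonly used for ranking by favorable rate, a minimal sketch with the same symbols p, z and n could look as follows. The function name and the default z = 2 (a value inside the stated range) are choices made for this example.

```python
import math


def wilson_lower_bound(positive, total, z=2.0):
    """Lower bound of the Wilson score interval for a favorable rate.

    positive: number of positive review texts
    total:    total number of review texts (n)
    z:        normal quantile z_alpha; 2 corresponds to roughly 95% confidence
    Assumes the standard Wilson lower bound; the source formula is not legible.
    """
    if total == 0:
        return 0.0
    p = positive / total
    z2 = z * z
    centre = p + z2 / (2 * total)
    margin = z * math.sqrt(p * (1 - p) / total + z2 / (4 * total * total))
    return (centre - margin) / (1 + z2 / total)


few_reviews = wilson_lower_bound(9, 10)      # 90% favorable, small sample
many_reviews = wilson_lower_bound(80, 100)   # 80% favorable, larger sample
```

Under this assumption the doctor with 80 positive reviews out of 100 scores higher than the one with 9 out of 10, which is the point of using the Wilson score instead of the raw favorable rate: a high rate backed by few reviews is discounted.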
The medical service quality evaluation indexes referenced here come from the SERVQUAL (Service Quality) model proposed by Berry et al., which has five dimensions: Tangibles (the physical facilities and environment of the service provider and the appearance of service personnel), Reliability (the ability of the service provider to fulfill its commitments), Responsiveness (the willingness of the service provider to help customers and provide prompt service), Assurance (the knowledge and courtesy of service personnel and their ability to convey trust and confidence), and Empathy (the willingness and ability of the service provider to care for customers and provide individualized service). In practical application, the invention adjusts these slightly to suit the characteristics of medical reviews, specifically:
1. the Tangibles dimension is removed, because medical reviews lack descriptions of hospital hardware facilities;
2. because Responsiveness and Empathy are expressed too similarly in medical reviews, the invention merges them into a single dimension.
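The two adjustments leave three evaluation dimensions. As a rough sketch, the mapping from the original SERVQUAL dimensions to the retained ones, together with a per-dimension favorable rate, might be written as follows. The label strings, the merged dimension name, and the input layout are assumptions made for this example.

```python
from collections import Counter

# Original SERVQUAL dimension -> dimension kept after the adjustment
# (None means the dimension is dropped). The merged name is an assumption.
DIMENSION_MAP = {
    "tangibles": None,
    "reliability": "reliability",
    "responsiveness": "responsiveness_empathy",
    "assurance": "assurance",
    "empathy": "responsiveness_empathy",
}


def favorable_rates(labels):
    """Share of positive labels per retained dimension.

    labels: iterable of (servqual_dimension, polarity) pairs, where polarity
    is "positive", "neutral" or "negative".
    """
    positive, total = Counter(), Counter()
    for dim, polarity in labels:
        kept = DIMENSION_MAP[dim]
        if kept is None:
            continue  # tangibles is not evaluated
        total[kept] += 1
        positive[kept] += polarity == "positive"
    return {dim: positive[dim] / total[dim] for dim in total}


# Illustrative labels only; not taken from the source.
sample = [
    ("tangibles", "positive"),
    ("responsiveness", "positive"),
    ("empathy", "negative"),
    ("reliability", "positive"),
]
rates = favorable_rates(sample)
```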
The invention has the advantages that:
A medical knowledge graph is constructed from disease information openly available on the Internet and combined with medical review data from social media; the service quality of doctors and hospitals is evaluated automatically against medical service quality evaluation indexes, and recommendations are provided to users, meeting to some extent the growing demand for mobile medical services.
Compared with other medical inquiry recommendation approaches, the invention has the following advantages: 1) it performs disease self-diagnosis and doctor and hospital recommendation at the same time, giving users better service quality; 2) it recommends diseases by combining several kinds of information, which avoids the overly long, uninformative recommendation lists produced by simple symptom keyword matching, and it uses the structured information in the medical knowledge graph to enrich the recommendation options and make it easier to surface diseases the user may potentially have; 3) it derives the service quality of doctors and hospitals from patient review text data and existing medical service quality evaluation indexes, and provides users with a more open and transparent recommendation service.
Drawings
FIG. 1 is a schematic diagram of the medical knowledge graph obtained by the medical inquiry recommendation method based on a knowledge graph and social media according to the invention.
Detailed Description
The invention is further described below through a specific embodiment with reference to the drawing.
The medical inquiry recommendation method based on a knowledge graph and social media of this embodiment comprises the following steps:
1) acquiring open structured disease information from Internet medical services and extracting diseases and their related information, wherein the related information of a disease comprises symptom keywords, incidence rate, susceptible population, complications, treatment department and symptom description; further extracting age and gender information from the susceptible population; and constructing a "disease-symptom" medical knowledge graph, as shown in FIG. 1, comprising five types of entities and five types of relations, wherein the five entity types are: disease entity, department entity, age entity, gender entity and symptom keyword entity, the disease entity having a symptom description attribute and an incidence rate attribute; and the five relation types are: a complication relation between disease entities, a possession relation between a disease entity and a symptom keyword entity, a treatment-department relation between a disease entity and a department entity, a susceptible-age relation between a disease entity and an age entity, and a susceptible-gender relation between a disease entity and a gender entity; the medical knowledge graph contains 15418 entities and 85303 relations in total;
2) training a knowledge graph embedding on the medical knowledge graph constructed in step 1) with the TransD model, with the following parameters: 150 iterations, vector length 100, learning rate 1.0, and stochastic gradient descent as the optimizer; the final training loss is 6.807; the entities and relations of the medical knowledge graph are thereby mapped to representations in a vector space, yielding the embedding vectors of the disease entities in the medical knowledge graph;
3) acquiring medical review data openly available on the Internet, the medical review data comprising doctor names, the treatment department to which each doctor belongs, the hospital to which each treatment department belongs, and patient review texts about the doctors; labeling patient review texts according to medical service quality evaluation indexes; performing sentiment polarity analysis on each index dimension of the patient review texts with a natural language processing model; computing the favorable rate of each doctor; aggregating the patient review texts of the same treatment department according to the affiliation between doctors and treatment departments to obtain the favorable rate of each treatment department; and obtaining the Wilson scores of the doctors and treatment departments by the Wilson interval method:
a) manually labeling the sentiment polarity of patient review texts according to the medical service quality evaluation indexes, wherein each service quality evaluation index dimension is labeled with one of three polarities: positive, neutral or negative; 6019 patient review texts are manually labeled;
b) converting the labeled review text data into a numerical representation and feeding it to a BERT model for training, wherein the BERT model loads the official pre-trained Chinese model chinese_L-12_H-768_A-12, the parameters are: number of iterations 1 and sequence length 200, the loss function is the categorical cross-entropy function (categorical_crossentropy), and the optimizer is Adam(0.00001);
c) performing sentiment polarity analysis on the remaining review text data with the trained natural language processing model to obtain the favorable rate of each doctor, wherein the favorable rate of a doctor is the ratio of the number of positive patient review texts about that doctor to the total number of patient review texts about that doctor;
d) aggregating the patient review texts of the same treatment department of the same hospital according to the affiliation between doctors and treatment departments to obtain the favorable rate of that treatment department, wherein the favorable rate of a treatment department is the ratio of the number of positive patient review texts about all doctors in the department to the total number of patient review texts of the department;
e) the Wilson score on each service quality evaluation index dimension for doctors and treatment departments is calculated as follows:
Figure BDA0002570877390000061
wherein p is the favorable rate, calculated as the proportion of positive reviews among all reviews; z_α is the quantile of the normal distribution, set to 2, which corresponds to a score confidence of about 95%; and n is the total number of review texts;
f) according to the above formula, the Wilson score of each doctor and of each treatment department is obtained and stored in a JSON (JavaScript Object Notation) file;
4) the symptom keyword, the sex, the age and the symptom description information input by the user contain M kinds of information of the symptom keyword, namely the symptom keyword is information which needs to be input, M is more than or equal to 1 and less than or equal to 4, a medical knowledge graph is inquired according to the M kinds of information input by the user, an initial disease entity alternative set is constructed, the most similar disease entity expansion alternative set is selected according to embedded vector information of the disease entity in the medical knowledge graph, the disease potentially suffered by the user is mined, finally, the recommended disease is screened according to the similarity of the sex, the age, the symptom keyword and the corresponding M aspects in the symptom description input by the user, the medical department and the doctor with the highest Wilson score are recommended according to the medical departments of the corresponding disease entities, and the hospital to which the recommended Welsson score is highest is given:
a) constructing the initial disease entity candidate set: the medical knowledge graph is queried with the symptom keywords input by the user; using the possession relation between disease entities and symptom keyword entities, several disease entities with the most similar symptom keywords are screened out; from these, several disease entities with the highest incidence are selected according to the incidence attribute of the disease entities, yielding the initial disease entity candidate set;
b) expanding the disease entity candidate set: based on the embedding vectors of the disease entities in the medical knowledge graph, for each disease entity in the candidate set, one or more disease entities whose embedding vectors have the smallest Euclidean distance to it, i.e. the most similar disease entities, are selected and added to the initial candidate set to obtain the expanded disease entity candidate set, thereby further mining diseases the user may potentially suffer from;
c) giving the final recommended diseases: recommendation results are screened from the expanded disease entity candidate set according to the similarity in the M aspects among sex, age, symptom keywords and symptom description, where the sex and age similarities are based on string matching, the symptom keyword similarity is based on set intersection, and the symptom description is converted into a vector representation by a term frequency-inverse document frequency (TF-IDF) model, with the symptom description similarity measured by the cosine similarity between vectors; finally, the several disease entities with the highest sum of similarities over the M aspects are selected as the recommendation result, the medical knowledge graph is queried, and the treating department of each disease entity is given according to the treating-department relation between disease entities and department entities;
d) for each disease entity, based on its finally recommended department obtained in step c), several hospitals whose corresponding department has the highest Wilson score from step 3) are selected. In the actual recommendation process, the invention also applies some empirical rules, such as giving priority to hospitals ranked near the top of the nationwide list in the Good Doctor website statistics and to hospitals of a higher grade, such as Grade III hospitals; the doctors with the highest scores in the corresponding department of the selected hospital are then recommended to the user.
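Steps a) and b) above can be sketched as follows. This is a minimal sketch under stated assumptions: the toy `DISEASES` table, the 2-D `EMBEDDINGS` (standing in for the trained vectors of step 2), the function names and the cut-off parameters are all hypothetical and only illustrate the flow of filtering by keyword overlap, then by incidence, then expanding by Euclidean distance.

```python
import math

# Hypothetical toy knowledge graph: disease -> (symptom keywords, incidence).
DISEASES = {
    "common cold": ({"cough", "runny nose", "fever"}, 0.30),
    "influenza": ({"fever", "cough", "muscle ache"}, 0.10),
    "pneumonia": ({"fever", "cough", "chest pain"}, 0.02),
    "migraine": ({"headache", "nausea"}, 0.05),
}
# Hypothetical 2-D embeddings standing in for the trained vectors of step 2).
EMBEDDINGS = {
    "common cold": (0.0, 0.0),
    "influenza": (0.1, 0.0),
    "pneumonia": (0.2, 0.1),
    "migraine": (5.0, 5.0),
}


def initial_candidates(keywords, top_overlap=2, top_incidence=1):
    """Step a): keep the diseases with the largest keyword overlap, then the highest incidence."""
    matched = [d for d in DISEASES if DISEASES[d][0] & keywords]
    by_overlap = sorted(
        matched, key=lambda d: len(DISEASES[d][0] & keywords), reverse=True
    )[:top_overlap]
    return sorted(by_overlap, key=lambda d: DISEASES[d][1], reverse=True)[:top_incidence]


def expand_candidates(candidates, k=1):
    """Step b): add the k nearest diseases (Euclidean distance) of each candidate."""
    expanded = list(candidates)
    for c in candidates:
        others = [d for d in EMBEDDINGS if d != c]
        others.sort(key=lambda d: math.dist(EMBEDDINGS[c], EMBEDDINGS[d]))
        for d in others[:k]:
            if d not in expanded:
                expanded.append(d)
    return expanded


seed = initial_candidates({"fever", "cough"})
expanded = expand_candidates(seed, k=1)
```

Here the expansion adds a disease that the incidence filter had dropped because its embedding lies close to the seed, which is the "potential disease" mining that step b) describes.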
Finally, it should be noted that the embodiments are disclosed to aid further understanding of the invention, but those skilled in the art will appreciate that various substitutions and modifications are possible without departing from the spirit and scope of the invention and the appended claims. Therefore, the invention should not be limited to the disclosed embodiments; its scope is defined by the appended claims.

Claims (6)

1. A medical inquiry recommendation method based on a knowledge graph and social media, characterized by comprising the following steps:
1) acquiring publicly available structured disease information from internet medical sources and extracting diseases and their related information, the related information comprising symptom keywords, incidence, susceptible population, complications, treating department and symptom description; further extracting age and gender information from the susceptible population; and constructing a disease-symptom medical knowledge graph comprising five kinds of entities and five kinds of relations, wherein the five entities are: disease entity, department entity, age entity, gender entity and symptom keyword entity, the disease entity having a symptom description attribute and an incidence attribute; and the five relations are: the complication relation between disease entities, the possession relation between disease entities and symptom keyword entities, the treating-department relation between disease entities and department entities, the susceptible-age relation between disease entities and age entities, and the susceptible-gender relation between disease entities and gender entities;
2) for the medical knowledge graph constructed in step 1), training a knowledge graph embedding with a distance translation model, which maps the entities and relations of the medical knowledge graph to representations in a vector space, to obtain the embedding vectors of the disease entities in the medical knowledge graph;
3) acquiring publicly available medical comment data from the Internet, the medical comment data comprising doctor names, the departments to which the doctors belong, the hospitals to which the departments belong, and patients' comment texts about the doctors; labeling the patient comment texts according to evaluation indexes of medical service quality, performing sentiment polarity analysis on each index dimension of the patient comment texts with a natural language processing model, and computing the favorable rating of each doctor; aggregating the patient comment texts of the same department according to the relation between doctors and departments to obtain the favorable rating of the corresponding department; and obtaining the Wilson scores of the doctors and of the departments respectively by the Wilson interval method;
4) receiving from the user M kinds of information among symptom keywords, sex, age and symptom description, wherein 1 ≤ M ≤ 4 and the symptom keywords are mandatory; querying the medical knowledge graph according to the M kinds of information input by the user to construct an initial disease entity candidate set; expanding the candidate set with the most similar disease entities according to the embedding vectors of the disease entities in the medical knowledge graph, thereby mining diseases the user may potentially suffer from; finally screening the recommended diseases according to the similarity in the corresponding M aspects among the sex, age, symptom keywords and symptom description input by the user; and recommending, according to the departments that treat the corresponding disease entities, the department and doctor with the highest Wilson score together with the hospital to which they belong.
2. The medical inquiry recommendation method of claim 1, wherein in step 2), the knowledge graph embedding is trained using one of the distance translation models.
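The claim leaves the concrete model open. As an illustration only, the sketch below assumes TransE, one well-known distance translation model, in which a true triple (head, relation, tail) should satisfy head + relation ≈ tail. The vectors, entity names and function names are hypothetical; real embeddings would be learned by minimizing the margin loss over many triples.

```python
import math


def transe_distance(head, relation, tail):
    """TransE plausibility: L2 norm of (head + relation - tail); smaller is more plausible."""
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))


def margin_loss(positive, negative, margin=1.0):
    """Margin ranking loss for one true triple and one corrupted triple."""
    return max(0.0, margin + transe_distance(*positive) - transe_distance(*negative))


# Hypothetical 2-D vectors for a disease, a relation and two symptom keywords.
flu = (0.0, 1.0)
has_symptom = (1.0, 0.0)
fever = (1.0, 1.0)      # flu + has_symptom lands exactly on fever
headache = (3.0, -1.0)  # far from flu + has_symptom

true_triple = (flu, has_symptom, fever)
corrupted_triple = (flu, has_symptom, headache)
loss = margin_loss(true_triple, corrupted_triple)
```

The loss is zero once the true triple is closer than the corrupted one by at least the margin, so training only moves vectors that still violate this ordering.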
3. The medical inquiry recommendation method of claim 1, wherein in step 3), the natural language processing model is a deep learning model.
4. The medical inquiry recommendation method according to claim 1, wherein performing the sentiment polarity analysis and obtaining the service quality evaluation information in step 3) comprises the steps of:
a) manually labeling the sentiment polarity of patient comment texts according to the evaluation indexes of medical service quality, the sentiment polarity labeled for each service quality evaluation index dimension being one of three classes: positive, neutral and negative, and the number of manually labeled patient comment texts being not less than 6000;
b) converting the labeled comment text data into a numerical representation and inputting it into the natural language processing model for training;
c) performing sentiment polarity analysis on the remaining comment text data with the trained natural language processing model to obtain the favorable rating of each doctor, the favorable rating of a doctor being the ratio of the number of that doctor's positive patient comment texts to the total number of that doctor's patient comment texts;
d) aggregating the patient comment texts of the same department of the same hospital according to the relation between doctors and departments to obtain the favorable rating of that department of the hospital, the favorable rating of a department being the ratio of the number of positive patient comment texts of all doctors in the department to the total number of patient comment texts of all doctors in the department;
e) the Wilson score of doctors and departments in each service quality evaluation index dimension is calculated by the following formula:
Figure FDA0002570877380000021
wherein p represents the favorable rating, z_α represents the quantile of the normal distribution, and n represents the total number of comment texts;
f) according to the above formula, the Wilson score of each doctor and the Wilson score of each department are obtained.
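The formula in step e) survives in this text only as an image reference. For orientation, the lower bound of the Wilson score interval, written with the symbols defined above, is shown below. That the claimed formula is this lower bound, rather than the upper bound or the interval centre, is an assumption; the symbol S is introduced here for illustration.

```latex
\[
S = \frac{p + \dfrac{z_\alpha^{2}}{2n} - z_\alpha \sqrt{\dfrac{p(1-p)}{n} + \dfrac{z_\alpha^{2}}{4n^{2}}}}
         {1 + \dfrac{z_\alpha^{2}}{n}}
\]
```

As a worked example under that assumption, p = 0.9, n = 10 and z_α = 2 give S = (1.1 − 0.2757) / 1.4 ≈ 0.589, while the same p with n = 500 gives S ≈ 0.870, so a rating backed by more comments receives the higher score.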
5. The medical inquiry recommendation method according to claim 1, wherein in step 4), recommending the department and the doctor according to the information input by the user and giving the hospital to which they belong comprises the steps of:
a) constructing the initial disease entity candidate set: querying the medical knowledge graph with the symptom keywords input by the user; screening out, using the possession relation between disease entities and symptom keyword entities, several disease entities with the most similar symptom keywords; and selecting from these, according to the incidence attribute of the disease entities, several disease entities with the highest incidence to obtain the initial disease entity candidate set;
b) expanding the disease entity candidate set: based on the embedding vectors of the disease entities in the medical knowledge graph, selecting, for each disease entity in the candidate set, one or more disease entities whose embedding vectors have the smallest Euclidean distance to it, i.e. the most similar disease entities, and adding them to the initial candidate set to obtain the expanded disease entity candidate set, thereby further mining diseases the user may potentially suffer from;
c) giving the final recommended diseases: screening recommendation results from the expanded disease entity candidate set according to the similarity in the M aspects among sex, age, symptom keywords and symptom description, wherein the sex and age similarities are based on string matching, the symptom keyword similarity is based on set intersection, and the symptom description is converted into a vector representation by a term frequency-inverse document frequency model, with the symptom description similarity measured by the cosine similarity between vectors; finally selecting the several disease entities with the highest sum of similarities over the M aspects as the recommendation result, querying the medical knowledge graph, and giving the treating department of each disease entity according to the treating-department relation between disease entities and department entities;
d) for each disease entity, based on its finally recommended department obtained in step c), selecting the hospital whose corresponding department has the highest Wilson score from step 3), and then selecting the doctor with the highest score in that department of the hospital to recommend to the user.
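The similarity screening of step c) above can be sketched as follows for M = 2 (symptom keywords plus symptom description). This is a minimal sketch under stated assumptions: whitespace tokenization stands in for the word segmentation that Chinese text would need, the smoothed idf variant and the normalization of the keyword overlap are choices made here, and all data and function names are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: tf-idf weight} dict per whitespace-tokenized document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Smoothed idf keeps terms that occur in every document non-zero.
            term: (count / len(tokens)) * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def keyword_similarity(user_keywords, disease_keywords):
    """Set-intersection similarity, normalized by the number of user keywords (assumed)."""
    return len(user_keywords & disease_keywords) / len(user_keywords) if user_keywords else 0.0


# Hypothetical candidates: name -> (symptom keywords, symptom description).
candidates = {
    "influenza": ({"fever", "cough"}, "high fever with dry cough and muscle ache"),
    "migraine": ({"headache"}, "throbbing headache with nausea"),
}
user_keywords = {"fever", "cough"}
user_description = "fever and dry cough since yesterday"

names = list(candidates)
vectors = tfidf_vectors([user_description] + [candidates[n][1] for n in names])
totals = {
    name: keyword_similarity(user_keywords, candidates[name][0])
    + cosine(vectors[0], vectors[i + 1])
    for i, name in enumerate(names)
}
best = max(totals, key=totals.get)
```

When the user also supplies sex and age, their string-match results (1 for a match, 0 otherwise) would simply be added to the same sum.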
6. The medical inquiry recommendation method of claim 5, wherein in step d) of step 4), based on the department of the finally recommended disease entity obtained in step c), several hospitals whose corresponding department has the highest Wilson score from step 3) are selected; the highest-ranked hospital among them is chosen according to the ranking on the social networking site; and the doctor with the highest score in that department of the chosen hospital is recommended to the user.
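The selection order in this claim (shortlist by department Wilson score, pick by site ranking, then pick the doctor) can be sketched as follows. All hospital names, doctor names, scores and rankings are hypothetical, as are the function name `recommend` and the shortlist size `top_n`.

```python
# Hypothetical Wilson scores from step 3), keyed by (hospital, department).
department_scores = {
    ("Hospital A", "respiratory"): 0.91,
    ("Hospital B", "respiratory"): 0.88,
    ("Hospital C", "respiratory"): 0.75,
    ("Hospital A", "neurology"): 0.80,
}
# Hypothetical doctor Wilson scores, keyed by (hospital, department, doctor).
doctor_scores = {
    ("Hospital A", "respiratory", "Dr. Li"): 0.85,
    ("Hospital B", "respiratory", "Dr. Wang"): 0.93,
    ("Hospital B", "respiratory", "Dr. Zhao"): 0.79,
}
# Hypothetical ranking on the social networking site (1 = best).
site_rank = {"Hospital A": 12, "Hospital B": 3, "Hospital C": 40}


def recommend(department, top_n=2):
    """Shortlist top_n hospitals by department score, pick the best-ranked one,
    then pick the highest-scoring doctor in that hospital's department."""
    hospitals = [h for (h, d) in department_scores if d == department]
    shortlist = sorted(
        hospitals, key=lambda h: department_scores[(h, department)], reverse=True
    )[:top_n]
    # Hospitals missing from the ranking are treated as ranked last.
    hospital = min(shortlist, key=lambda h: site_rank.get(h, float("inf")))
    doctors = {
        doc: s
        for (h, d, doc), s in doctor_scores.items()
        if h == hospital and d == department
    }
    doctor = max(doctors, key=doctors.get) if doctors else None
    return hospital, doctor


result = recommend("respiratory")
```

In this toy data the hospital with the highest department score is not the one recommended, because the site ranking breaks the tie within the shortlist.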

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010639311.8A CN111897967A (en) 2020-07-06 2020-07-06 Medical inquiry recommendation method based on knowledge graph and social media

Publications (1)

Publication Number Publication Date
CN111897967A true CN111897967A (en) 2020-11-06

Family

ID=73193027

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010639311.8A Pending CN111897967A (en) 2020-07-06 2020-07-06 Medical inquiry recommendation method based on knowledge graph and social media

Country Status (1)

Country Link
CN (1) CN111897967A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200185102A1 (en) * 2018-12-11 2020-06-11 K Health Inc. System and method for providing health information
CN110085307A (en) * 2019-04-04 2019-08-02 华东理工大学 A kind of intelligent hospital guide's method and system based on the fusion of multi-source knowledge mapping
CN110489566A (en) * 2019-08-22 2019-11-22 上海软中信息系统咨询有限公司 A kind of hospital guide's method of intelligence hospital guide's service robot

Cited By (36)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112530576A (en) * 2020-11-30 2021-03-19 百度健康(北京)科技有限公司 Online doctor-patient matching method and device, electronic equipment and storage medium
CN112528153A (en) * 2020-12-22 2021-03-19 北京百度网讯科技有限公司 Content recommendation method, device, equipment, storage medium and program product
CN112528153B (en) * 2020-12-22 2024-03-08 北京百度网讯科技有限公司 Content recommendation method, device, apparatus, storage medium, and program product
CN112786194A (en) * 2021-01-28 2021-05-11 北京一脉阳光医学信息技术有限公司 Medical image diagnosis guide inspection system, method and equipment based on artificial intelligence
CN112786194B (en) * 2021-01-28 2024-11-01 北京一脉阳光医学信息技术有限公司 Medical image diagnosis guiding and guiding system, method and equipment based on artificial intelligence
CN113764080A (en) * 2021-01-29 2021-12-07 北京京东拓先科技有限公司 Resource allocation method, device and storage medium
CN113160954A (en) * 2021-04-07 2021-07-23 泰康保险集团股份有限公司 Medical resource allocation method and device, storage medium and electronic equipment
CN113160954B (en) * 2021-04-07 2023-08-01 泰康保险集团股份有限公司 Medical resource allocation method and device, storage medium and electronic equipment
CN113111162A (en) * 2021-04-21 2021-07-13 康键信息技术(深圳)有限公司 Department recommendation method and device, electronic equipment and storage medium
WO2022222943A1 (en) * 2021-04-21 2022-10-27 康键信息技术(深圳)有限公司 Department recommendation method and apparatus, electronic device and storage medium
CN113220905A (en) * 2021-05-27 2021-08-06 哈尔滨理工大学 Service recommendation method fusing knowledge graph
CN113535901B (en) * 2021-07-08 2023-08-18 北京航空航天大学 Method for constructing user side commodity knowledge graph based on e-commerce comments
CN113535901A (en) * 2021-07-08 2021-10-22 北京航空航天大学 E-commerce comment-based user-side commodity knowledge graph construction method
CN113611408A (en) * 2021-08-20 2021-11-05 泰康保险集团股份有限公司 Method, system, equipment and computer readable medium for interacting diagnosis and treatment information
CN113707335A (en) * 2021-09-06 2021-11-26 挂号网(杭州)科技有限公司 Method, device, electronic equipment and storage medium for determining target reception user
CN114093472A (en) * 2021-10-13 2022-02-25 阿里健康科技(杭州)有限公司 Triage information display method and client for Internet medical treatment
CN114201591A (en) * 2021-11-19 2022-03-18 北京三快在线科技有限公司 Method, device, equipment and storage medium for generating evaluation content of inquiry service
CN114220528A (en) * 2021-12-28 2022-03-22 深圳科卫机器人科技有限公司 Hospital department recommendation method and device, computer equipment and storage medium
CN114676390A (en) * 2022-05-27 2022-06-28 华南师范大学 Searching method, system, device and storage medium for persons with similar psychological characteristics
CN114783580A (en) * 2022-06-20 2022-07-22 武汉博科国泰信息技术有限公司 Medical data quality evaluation method and system
CN114840777A (en) * 2022-07-04 2022-08-02 杭州城市大脑有限公司 Multi-dimensional endowment service recommendation method and device and electronic equipment
CN115346654A (en) * 2022-07-14 2022-11-15 赵盛 Intelligent service system based on internet
CN115376668A (en) * 2022-08-30 2022-11-22 温州城市智慧健康有限公司 Big data business analysis method and system applied to intelligent medical treatment
CN115376668B (en) * 2022-08-30 2024-03-08 温州城市智慧健康有限公司 Big data business analysis method and system applied to intelligent medical treatment
CN115547471A (en) * 2022-10-13 2022-12-30 上海清赟医药科技有限公司 Medical information recommendation method based on SCRM
CN115512859A (en) * 2022-11-21 2022-12-23 北京左医科技有限公司 Internet-based in-clinic quality management method, management device and storage medium
CN115862830A (en) * 2023-02-21 2023-03-28 清华大学 Data processing method and device and electronic equipment
CN116798656A (en) * 2023-05-04 2023-09-22 华中科技大学同济医学院附属协和医院 Remote medical treatment and grading monitoring platform based on cloud-terminal cooperation
CN118471455A (en) * 2023-08-07 2024-08-09 温州医科大学 Intelligent recommending method, equipment and storage medium for hospital registration departments based on text mining technology
CN116976435B (en) * 2023-09-25 2023-12-15 浙江辰龙检测技术有限公司 Knowledge graph construction method based on network security
CN116976435A (en) * 2023-09-25 2023-10-31 浙江辰龙检测技术有限公司 Knowledge graph construction method based on network security
CN117476163B (en) * 2023-12-27 2024-03-08 万里云医疗信息科技(北京)有限公司 Method, apparatus and storage medium for determining disease conclusion
CN117476163A (en) * 2023-12-27 2024-01-30 万里云医疗信息科技(北京)有限公司 Method, apparatus and storage medium for determining disease conclusion
CN118155817A (en) * 2024-04-26 2024-06-07 旭辉卓越健康信息科技有限公司 Department and expert recommendation method and system based on GPT model
CN118116620A (en) * 2024-04-28 2024-05-31 支付宝(杭州)信息技术有限公司 Medical question answering method and device and electronic equipment
CN118588233A (en) * 2024-08-06 2024-09-03 中国石油大学(华东) Knowledge-graph-driven neurodegenerative disease drug recommendation method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20201106
