CN116665922A

CN116665922A - Doctor-patient communication method and system

Info

Publication number: CN116665922A
Application number: CN202310949211.9A
Authority: CN
Inventors: 叶桄希; 陈巧林; 杨林; 吕佳忆; 王璐
Original assignee: Sichuan Tianfu Zhilian Health Technology Co ltd
Current assignee: Sichuan Tianfu Zhilian Health Technology Co ltd
Priority date: 2023-07-31
Filing date: 2023-07-31
Publication date: 2023-08-29

Abstract

The invention relates to the field of doctor-patient communication, and in particular to a doctor-patient communication method and system that greatly improve the accuracy and stability of doctor-patient communication. The doctor-patient communication method comprises the following steps: collecting medical information of a patient, the medical information comprising the patient's basic information, medical history records, diagnosis results and treatment schemes; preprocessing the collected medical information, the preprocessing comprising cleaning, denoising and normalization; extracting specific features from the preprocessed medical information, the specific features comprising the patient's age, sex and severity of illness; training on the extracted specific features using a naive Bayes algorithm to establish a prediction model for doctor-patient communication; evaluating the established prediction model using a test data set to determine the prediction accuracy and stability of the model; and carrying out doctor-patient communication based on the evaluated prediction model. The invention is suitable for communication between doctors and patients.

Description

Doctor-patient communication method and system

Technical Field

The invention relates to the field of doctor-patient communication, in particular to a doctor-patient communication method and system.

Background

Traditional doctor-patient communication modes include the following:

1. Face-to-face communication: the patient visits a hospital or clinic in person and communicates face to face with a doctor, who makes a diagnosis and gives advice based on the patient's condition and symptoms.

2. Telephone consultation: the patient describes his or her condition and symptoms to the doctor by telephone, and the doctor makes a preliminary diagnosis and gives advice based on the information provided during the call.

3. Short message or mail consultation: the patient consults the doctor about his or her illness and symptoms through short messages or mail, and the doctor communicates and gives guidance by replying.

4. Online inquiry: the patient consults through an Internet platform, and the doctor diagnoses and advises the patient through text, voice, video and the like.

Face-to-face communication gives the doctor the most direct understanding of the patient's illness and symptoms, but it requires the patient to travel to a hospital or clinic, takes a great deal of time, and is inefficient and inconvenient. Telephone consultation and short message or mail consultation are convenient and quick, but patients may not describe their condition accurately and doctors may not be able to fully understand it, so the accuracy of communication is low. Online inquiry saves time and cost, but it requires certain network skills and equipment on the patient's side, and the doctor can only make a one-sided diagnosis and give advice based on the patient's description, so online inquiry has low efficiency and low accuracy.

In the prior art, for example as disclosed in CN105812376A, a doctor-patient multiparty instant messaging system is constructed using strophe. After the client program starts, the underlying communication module begins listening. A protocol analysis module converts each data packet into the strophe protocol format, and the request is then sent to the communication server for processing via the underlying communication protocol. The instant messaging platform provides various functional interfaces so that clients can extend the corresponding functions according to their own needs.

That scheme provides a strophe-based instant messaging technique; applying a strophe-based doctor-patient multiparty instant messaging management mechanism during communication greatly improves the success rate of message updates. This improves the efficiency of doctor-patient communication, but the accuracy and stability of the communication remain poor.

Disclosure of Invention

The invention aims to provide a doctor-patient communication method, which greatly improves the accuracy and stability of a doctor-patient communication system and ensures the efficiency of doctor-patient communication.

The invention adopts the following technical scheme to achieve the aim, and the doctor-patient communication method comprises the following steps:

step 1, acquiring medical information of a patient, wherein the medical information comprises basic information, medical history records, diagnosis results and treatment schemes of the patient;

step 2, preprocessing the acquired medical information, wherein the preprocessing comprises cleaning, denoising and normalizing;

step 3, extracting specific characteristics from the preprocessed medical information, wherein the specific characteristics comprise the age, sex and severity of the illness state of the patient;

step 4, training the extracted specific features by using a naive Bayesian algorithm, and establishing a prediction model of doctor-patient communication;

step 5, evaluating the established prediction model of the doctor-patient communication by using a test data set to determine the prediction accuracy and stability of the model;

step 6, performing doctor-patient communication based on the evaluated prediction model, specifically comprising: taking the patient's input information as the input of the evaluated prediction model and presenting the output information of the evaluated prediction model to a doctor, who replies to the patient according to the output information.

Further, step 2 specifically includes:

step 201, in a data cleaning stage, removing duplicate data, missing data and abnormal values, specifically including: removing duplicate data using the drop_duplicates function in the pandas library in Python, filling missing values using the fillna function, and detecting and processing abnormal values using an outlier function;

step 202, in a data denoising stage, removing noise and interference from the signals, specifically including: performing a first denoising pass using the signal module in the scipy library in Python; then performing median filtering using the medfilt function and low-pass filtering using the lfilter function, the median filtering and low-pass filtering providing a second pass of denoising and interference removal;

step 203, in a data normalization stage, scaling the data ranges of different features to the same range, specifically including: performing data normalization using the MinMaxScaler class in the sklearn library in Python, which for each feature subtracts the minimum value from the data and then divides by the difference between the maximum and minimum values, scaling the data range to between 0 and 1.

The repeated data, the missing data and the abnormal values are removed in the data cleaning stage, the noise and the interference in the signals are removed in the data denoising stage, and the data ranges of different characteristics are scaled to be within the same range in the data normalization stage, so that the data quality is greatly improved, the data format is unified, the subsequent analysis difficulty is reduced, and the subsequent modeling efficiency is improved.
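The data cleaning stage can be sketched as follows. This is a minimal illustration rather than the patented implementation: it uses the pandas methods `drop_duplicates` and `fillna`, and because pandas has no built-in outlier function, the 1.5 × IQR rule with clipping is an assumed stand-in. The helper name `clean_records`, the median fill strategy and the sample columns are hypothetical.

```python
import numpy as np
import pandas as pd


def clean_records(df: pd.DataFrame, numeric_cols: list) -> pd.DataFrame:
    """Remove duplicate rows, fill missing values, and clip outliers."""
    # 1. Drop exact duplicate rows.
    out = df.drop_duplicates().reset_index(drop=True)

    for col in numeric_cols:
        # 2. Fill missing values with the column median (assumed strategy).
        out[col] = out[col].fillna(out[col].median())

        # 3. Treat values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] as outliers
        #    and clip them to the nearest bound (assumed handling).
        q1, q3 = out[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        out[col] = out[col].clip(lower=q1 - 1.5 * iqr, upper=q3 + 1.5 * iqr)
    return out


# Hypothetical records: one duplicate row, one missing age, one implausible age.
records = pd.DataFrame(
    {
        "age": [34, 34, 51, np.nan, 47, 29, 300],
        "severity": [1, 1, 3, 2, 2, 1, 2],
    }
)
cleaned = clean_records(records, numeric_cols=["age", "severity"])
print(cleaned)
```

Clipping keeps every row, which may matter when samples are scarce; dropping the flagged rows instead would be an equally plausible reading of "processing" abnormal values.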

Further, extracting specific features from the preprocessed medical information specifically includes:

after the data normalization process is completed, performing feature selection using the SelectKBest class in the scikit-learn library in Python: a correlation coefficient or information gain index between each feature and the target variable is calculated to determine which features are the most important, and the top k features are selected and taken as the specific features, where k is an integer larger than 0. This scheme improves the accuracy of feature extraction.
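A minimal sketch of this feature selection step is shown below, assuming the ANOVA F-score (`f_classif`) as the scoring function; scikit-learn's `mutual_info_classif` would be the closer match for an information-gain style score. The synthetic data and the choice k = 1 are illustrative only.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif

rng = np.random.default_rng(0)
n = 300
y = rng.integers(0, 2, size=n)                  # binary target variable

# Three candidate features: only the middle one depends on the target.
noise_a = rng.standard_normal(n)
informative = y + 0.1 * rng.standard_normal(n)
noise_b = rng.standard_normal(n)
X = np.column_stack([noise_a, informative, noise_b])

# Score every feature against the target, then keep the k best.
selector = SelectKBest(score_func=f_classif, k=1)
X_selected = selector.fit_transform(X, y)

selected_idx = selector.get_support(indices=True)
print("scores:", selector.scores_)
print("selected column indices:", selected_idx)
```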

Further, training the extracted specific features by using a naive bayes algorithm specifically includes:

step 401, dividing the extracted specific features into a training set and a verification set, and determining the number of samples and the number of categories in the training set;

step 402, for each category, calculating the prior probability distribution according to the historical data, and calculating the posterior probability distribution of each category according to the characteristics and the category information by using the bayesian theorem, wherein the calculation formula is as follows:

$$P(y_i \mid x_i) = \frac{P(x_i \mid y_i)\,P(y_i)}{P(x_i)}$$

wherein $y_i$ represents the class of the sample, $x_i$ represents the features of the sample, $P(x_i \mid y_i)$ represents the probability that feature $x_i$ occurs given $y_i$, $P(y_i)$ represents the prior probability of class $y_i$, $P(x_i)$ represents the probability that the sample feature $x_i$ occurs in the training set, and $P(y_i \mid x_i)$ represents the posterior probability distribution;

step 403, building a classifier for prediction according to the calculated posterior probability distribution, specifically comprising: taking the posterior probability of each category as the weight of that category, computing a weighted sum over the features of all samples, mapping the result to between 0 and 1 through a softmax function to obtain the probability that a sample belongs to each category, and finally selecting the category with the highest probability as the prediction result.

Through the training process, for some data sets with a plurality of characteristics, the number of the characteristics can be effectively reduced, the efficiency and the accuracy of the model can be improved, and for the case of a few samples in certain categories, the data sets can be balanced by a weighting method and the like.
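The training steps can be approximated with scikit-learn's `GaussianNB`, which estimates class priors and per-class Gaussian likelihoods and returns posterior probabilities. This is a sketch under assumptions: the synthetic three-column data (scaled age, sex, scaled severity), the two classes and the 75/25 split are hypothetical, and the weighted-sum and softmax step of step 403 is not reproduced because `predict_proba` already returns normalized posteriors.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)
n_per_class = 100

# Hypothetical features: [scaled age, sex (0/1), scaled severity].
X0 = np.column_stack([
    rng.normal(0.3, 0.1, n_per_class),
    rng.integers(0, 2, n_per_class),
    rng.normal(0.2, 0.1, n_per_class),
])
X1 = np.column_stack([
    rng.normal(0.7, 0.1, n_per_class),
    rng.integers(0, 2, n_per_class),
    rng.normal(0.8, 0.1, n_per_class),
])
X = np.vstack([X0, X1])
y = np.repeat([0, 1], n_per_class)

# Step 401: split into training and validation sets.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Step 402: fit priors and Gaussian likelihoods, then compute posteriors.
model = GaussianNB().fit(X_train, y_train)
n_samples, n_classes = X_train.shape[0], model.classes_.size
posteriors = model.predict_proba(X_val)     # one row of P(y | x) per sample

# Step 403 (simplified): the class with the highest posterior is the prediction.
val_accuracy = model.score(X_val, y_val)
print(n_samples, n_classes, val_accuracy)
```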

Further, the evaluating the established predictive model of the doctor-patient communication using the test data set specifically includes:

step 501, inputting each sample in the test data set into the prediction model, calculating with the prediction model the probability that the sample belongs to each category, selecting the category with the highest probability as a first prediction result of the sample, and selecting the category with the smallest probability as a second prediction result of the sample;

step 502, comparing a first prediction result and a second prediction result of the prediction model on the test data set with the real labels respectively, and calculating the accuracy of the prediction model on the test data set to obtain a corresponding first accuracy and a corresponding second accuracy;

step 503, multiplying the first accuracy rate by a first weight to obtain a first accuracy rate comparison reference value; multiplying the second accuracy rate by a second weight to obtain a second accuracy rate comparison reference value; the first weight is the weight of the category with the largest probability, and the second weight is the weight of the category with the smallest probability;

step 504, taking the difference between the first accuracy comparison reference value and the second accuracy comparison reference value and comparing the difference with a set threshold range; if the difference is within the set threshold range, the accuracy of the prediction model is judged to be within a reasonable range.

By the above evaluation process, the accuracy of the evaluation process can be improved. By using the test dataset, the generalization ability of the model on unknown data can be verified, i.e., whether the model can correctly predict new data.
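Steps 501 to 504 can be sketched in a few lines of NumPy. The text does not say how the two weights or the threshold range are obtained, so they are passed in as parameters here; the function name `evaluate_first_second`, the example probabilities and the values 0.8, 0.2 and (0.5, 1.0) are all hypothetical.

```python
import numpy as np


def evaluate_first_second(proba, y_true, first_weight, second_weight,
                          threshold_range):
    """Compare weighted accuracies of the most and least probable classes."""
    proba = np.asarray(proba)
    y_true = np.asarray(y_true)

    # Step 501: most probable and least probable class for each sample.
    first_pred = proba.argmax(axis=1)
    second_pred = proba.argmin(axis=1)

    # Step 502: accuracy of each prediction against the true labels.
    first_acc = float(np.mean(first_pred == y_true))
    second_acc = float(np.mean(second_pred == y_true))

    # Step 503: weighted comparison reference values.
    first_ref = first_acc * first_weight
    second_ref = second_acc * second_weight

    # Step 504: the difference must fall inside the threshold range.
    diff = first_ref - second_ref
    low, high = threshold_range
    return {
        "first_accuracy": first_acc,
        "second_accuracy": second_acc,
        "difference": diff,
        "within_range": bool(low <= diff <= high),
    }


# Class probabilities for four test samples over three categories.
proba = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.2, 0.3, 0.5],
    [0.5, 0.4, 0.1],
])
y_true = np.array([0, 1, 2, 1])

report = evaluate_first_second(
    proba, y_true, first_weight=0.8, second_weight=0.2,
    threshold_range=(0.5, 1.0),
)
print(report)
```

For a well-behaved model the least probable class should rarely be correct, so a large difference between the two reference values is the expected outcome.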

A doctor-patient communication system, the doctor-patient communication system comprising:

the data acquisition module is used for acquiring medical information of a patient, wherein the medical information comprises basic information, medical history records, diagnosis results and treatment schemes of the patient;

the data preprocessing module is used for preprocessing acquired medical information, and the preprocessing operation comprises cleaning, denoising and normalization;

the characteristic extraction module is used for extracting specific characteristics from the preprocessed medical information, wherein the specific characteristics comprise the age, sex and severity of the illness state of the patient;

the model training module is used for training the extracted specific features by using a naive Bayesian algorithm and establishing a prediction model of doctor-patient communication;

the model evaluation module is used for evaluating the established prediction model of the doctor-patient communication by using the test data set so as to determine the prediction accuracy and stability of the model;

the communication module is used for carrying out doctor-patient communication based on the evaluated prediction model, specifically comprising: taking the patient's input information as the input of the evaluated prediction model and presenting the output information of the evaluated prediction model to a doctor, who replies to the patient according to the output information.
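One way to picture how the preprocessing, feature extraction and model training modules fit together is a scikit-learn `Pipeline`. This is only an illustrative arrangement: the step names, the synthetic four-column data and k = 2 are assumptions, and the cleaning, denoising, evaluation and communication modules are left out.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Chain normalization, feature selection and the naive Bayes classifier.
pipeline = Pipeline([
    ("normalize", MinMaxScaler()),
    ("select", SelectKBest(score_func=f_classif, k=2)),
    ("classify", GaussianNB()),
])

rng = np.random.default_rng(1)
y = np.repeat([0, 1], 50)
X = np.column_stack([
    30 + 30 * y + rng.normal(0, 5, 100),    # age-like column
    rng.integers(0, 2, 100),                # sex-like column
    1 + 2 * y + rng.normal(0, 0.5, 100),    # severity-like column
    rng.normal(0, 1, 100),                  # unrelated column
])

pipeline.fit(X, y)
kept = pipeline.named_steps["select"].get_support(indices=True)
proba = pipeline.predict_proba(X[:3])       # posteriors for three samples
print("kept columns:", kept)
print(proba)
```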

Further, the data preprocessing module is specifically configured to remove duplicate data, missing data and abnormal values in a data cleaning stage, specifically including: removing duplicate data using the drop_duplicates function in the pandas library in Python, filling missing values using the fillna function, and detecting and processing abnormal values using an outlier function;

in the data denoising stage, removing noise and interference from the signals, specifically including: performing a first denoising pass using the signal module in the scipy library in Python; then performing median filtering using the medfilt function and low-pass filtering using the lfilter function, the median filtering and low-pass filtering providing a second pass of denoising and interference removal;

in the data normalization stage, scaling the data ranges of different features to the same range, specifically including: performing data normalization using the MinMaxScaler class in the sklearn library in Python, which for each feature subtracts the minimum value from the data and then divides by the difference between the maximum and minimum values, scaling the data range to between 0 and 1.
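The denoising and normalization stages can be sketched with `scipy.signal` and scikit-learn's `MinMaxScaler`. The synthetic signal, the kernel size of 5, the second-order Butterworth design and the 0.1 normalized cutoff are illustrative assumptions; the text names only `medfilt` and `lfilter`, so `butter` is used here merely to obtain filter coefficients for `lfilter`.

```python
import numpy as np
from scipy import signal
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
clean = np.sin(2 * np.pi * 2 * t)                   # slow underlying signal
noisy = clean + 0.3 * rng.standard_normal(t.size)   # broadband noise
noisy[50] += 5.0                                    # one isolated spike

# Median filtering suppresses isolated spikes (kernel size must be odd).
despiked = signal.medfilt(noisy, kernel_size=5)

# Low-pass filtering: design coefficients, then apply them with lfilter.
b, a = signal.butter(N=2, Wn=0.1)
smoothed = signal.lfilter(b, a, despiked)

# Min-max normalization per column: (x - min) / (max - min).
features = np.column_stack([smoothed, 40 + 20 * t])  # second column: age-like
scaled = MinMaxScaler().fit_transform(features)
print(scaled.min(axis=0), scaled.max(axis=0))
```

Note that `lfilter` is causal and therefore delays the signal slightly; `scipy.signal.filtfilt` is a zero-phase alternative when the delay matters.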

Further, the feature extraction module is specifically configured to, after the data normalization process is completed, perform feature selection using the SelectKBest class in the scikit-learn library in Python: a correlation coefficient or information gain index between each feature and the target variable is calculated to determine which features are the most important, and the top k features are selected and taken as the specific features, where k is an integer greater than 0.

The model training module is specifically used for dividing the extracted specific features into a training set and a verification set, and determining the number of samples and the number of categories in the training set;

for each category, calculating the prior probability distribution according to the historical data, and calculating the posterior probability distribution of each category according to the characteristics and the category information by using the Bayesian theorem, wherein the calculation formula is as follows:

$$P(y_i \mid x_i) = \frac{P(x_i \mid y_i)\,P(y_i)}{P(x_i)}$$

establishing a classifier according to the posterior probability distribution obtained by calculation to predict, specifically comprising taking the posterior probability of each category as the weight of the category, weighting and summing the characteristics of all samples, mapping the result to between 0 and 1 through a softmax function to obtain the probability that the samples belong to each category, and finally, selecting the category with the maximum probability as a prediction result.
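A small worked example may make the posterior calculation concrete. The sketch below applies Bayes' theorem to a single categorical feature using frequency counts; the sample data, the labels "routine" and "urgent", and the function name `posterior` are hypothetical, and the weighted-sum and softmax step is omitted because the posteriors already sum to 1.

```python
from collections import Counter

# Hypothetical training pairs: (severity level, category label).
samples = [
    ("mild", "routine"), ("mild", "routine"), ("mild", "routine"),
    ("severe", "routine"),
    ("severe", "urgent"), ("severe", "urgent"), ("severe", "urgent"),
    ("mild", "urgent"),
]


def posterior(x, samples):
    """Return P(y | x) for every class y, estimated from counts.

    Assumes x occurs at least once in the samples.
    """
    n = len(samples)
    class_counts = Counter(label for _, label in samples)
    joint_counts = Counter(samples)
    evidence = sum(1 for feature, _ in samples if feature == x) / n   # P(x)

    result = {}
    for label, count in class_counts.items():
        prior = count / n                                # P(y)
        likelihood = joint_counts[(x, label)] / count    # P(x | y)
        result[label] = likelihood * prior / evidence    # P(y | x)
    return result


post = posterior("severe", samples)
predicted = max(post, key=post.get)   # class with the highest posterior
print(post, predicted)
```

With these counts, P(urgent | severe) = (3/4 × 1/2) / (1/2) = 0.75 and P(routine | severe) = 0.25, so "urgent" is selected.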

Further, the model evaluation module is specifically configured to input each sample in the test data set into the prediction model, calculate with the prediction model the probability that the sample belongs to each category, select the category with the highest probability as a first prediction result of the sample, and select the category with the smallest probability as a second prediction result of the sample;

comparing the first prediction result and the second prediction result of the prediction model on the test data set with the real labels respectively, and calculating the accuracy of the prediction model on the test data set to obtain corresponding first accuracy and second accuracy;

multiplying the first accuracy rate by first weight to obtain a first accuracy rate comparison reference value; multiplying the second accuracy rate by a second weight to obtain a second accuracy rate comparison reference value; the first weight is the weight of the category with the largest probability, and the second weight is the weight of the category with the smallest probability;

and taking the difference between the first accuracy comparison reference value and the second accuracy comparison reference value and comparing the difference with a set threshold range; if the difference is within the set threshold range, the accuracy of the prediction model is judged to be within a reasonable range.

The beneficial effects of the invention are as follows:

Naive Bayes-based doctor-patient communication enables rapid automatic classification and prediction, saving doctors time and energy.

The naive Bayes algorithm provided by the invention assumes that the sample data obey a Gaussian distribution, can effectively process unbalanced data sets, and has higher accuracy and stability.

Naive Bayes-based doctor-patient communication makes model training and application extension convenient, and supports large-scale data processing and analysis.

The naive Bayes-based doctor-patient communication method enables automatic classification and prediction, reduces the need for manual intervention, and improves the efficiency and accuracy of doctor-patient communication.

Drawings

Fig. 1 is a flowchart of a method for communicating between a doctor and a patient according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention more clear, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. It will be apparent that the described embodiments are some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

The invention provides a doctor-patient communication method, as shown in fig. 1, comprising the following steps:

S1, collecting medical information of a patient

Wherein the medical information includes basic information of the patient, medical history record, diagnosis result and treatment scheme.

In one embodiment of the invention, the manner in which medical information of a patient is acquired includes:

determining the type of the acquired information: depending on the purpose and scope of the physician-patient communication, it is determined which types of medical information, such as basic information, medical history, diagnostic results, treatment regimen, etc., need to be collected.

Collecting patient basic information: including name, gender, age, contact, etc. Such information may be obtained by way of patient filled forms, physician inquiries, and the like.

Collecting a patient's medical history: the past medical history, family medical history, allergic history, etc. of the patient are collected by querying the patient or looking up a record of medical records, etc.

Confirming the diagnosis result of the patient: if the patient has received an examination and diagnosis from a hospital or clinic, the doctor or medical institution may be queried for the patient's diagnosis.

Collecting the patient's treatment regimen: if the patient has begun to receive treatment, the patient's treatment regimen, medication, etc. may be collected from the doctor or medical facility.

Recording patient symptoms and feedback: during doctor-patient communication, the doctor may ask the patient about symptoms and feedback; this information helps the doctor better understand the patient's condition and the treatment effect.
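The categories of information listed above could be gathered into a single record type. The following is a minimal sketch; the class name `PatientRecord`, its field names and the `missing_fields` helper are illustrative assumptions and are not part of the described method.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PatientRecord:
    """Hypothetical container for the information collected in S1."""

    # Basic information
    name: str
    sex: str
    age: int
    contact: str
    # Medical history
    past_history: List[str] = field(default_factory=list)
    family_history: List[str] = field(default_factory=list)
    allergies: List[str] = field(default_factory=list)
    # Diagnosis and treatment (None until obtained from the institution)
    diagnosis: Optional[str] = None
    treatment_plan: Optional[str] = None
    # Symptoms and feedback recorded during communication
    symptoms: List[str] = field(default_factory=list)

    def missing_fields(self) -> List[str]:
        """Return the optional items that have not been collected yet."""
        return [
            name
            for name in ("diagnosis", "treatment_plan")
            if getattr(self, name) is None
        ]


record = PatientRecord(
    name="Patient A", sex="F", age=42, contact="n/a",
    allergies=["penicillin"], symptoms=["cough"],
)
print(record.missing_fields())   # ['diagnosis', 'treatment_plan']
```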

S2, preprocessing the acquired medical information

The preprocessing operation includes cleaning, denoising, normalization and the like.

In one embodiment of the present invention, the specific method for cleaning, denoising and normalizing the acquired medical information includes:

S3, extracting specific characteristics from the preprocessed medical information

Specific characteristics may include, among others, the age, sex, severity of the condition, etc. of the patient.

In one embodiment of the present invention, the method for extracting specific features from the preprocessed medical information specifically includes:

after the data normalization process is completed, the SelectKBest class in the scikit-learn library in Python is used for carrying out feature selection, the top k features are selected, the correlation coefficient or information gain index between the top k features and the target variable is calculated to determine which features are the most important, the most important features are taken as specific features, and k is an integer larger than 0.

S4, training the extracted specific features by using a naive Bayesian algorithm, and establishing a prediction model of doctor-patient communication

In one embodiment of the present invention, a method for training an extracted specific feature by using a naive bayes algorithm specifically includes:

S5, evaluating the established prediction model of the doctor-patient communication by using the test data set

The purpose of the evaluation is to determine the accuracy and stability of the predictive model.

In one embodiment of the present invention, the accuracy of the established predictive model of the doctor-patient communication using the test dataset may be evaluated as follows:

In this embodiment, the true label refers to the correct answer or target value corresponding to the test data, also called the true value or ground truth. In machine learning, true labels are used to evaluate the performance and accuracy of a model, and the model is optimized and improved according to the evaluation result.

In one embodiment of the present invention, the stability evaluation of the established predictive model of the doctor-patient communication using the test dataset may be performed in the following manner:

comparing the prediction results with the true labels and calculating the mean absolute error over the samples. The mean absolute error reflects how much the model's predictions fluctuate; a smaller mean absolute error indicates better prediction stability.

The prediction stability of the model can also be determined by plotting ROC curves and comparing the performance of the model under different thresholds.
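Both stability checks can be sketched with `sklearn.metrics`. The labels, scores and the 0.5 decision threshold below are made-up illustrative values for a binary case; the text does not specify how the ROC curves are to be compared, so the area under the curve is shown as one common summary.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, roc_auc_score, roc_curve

# Hypothetical true labels and predicted probabilities of the positive class.
y_true = np.array([0, 0, 1, 1, 0, 1, 1, 0])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.3])

# Mean absolute error between hard predictions and true labels.
y_pred = (y_score >= 0.5).astype(int)
mae = mean_absolute_error(y_true, y_pred)

# ROC curve: true and false positive rates across decision thresholds.
fpr, tpr, thresholds = roc_curve(y_true, y_score)
auc = roc_auc_score(y_true, y_score)

print("MAE:", mae)
print("AUC:", auc)
```

Here one of eight samples is misclassified at the 0.5 threshold, giving a mean absolute error of 0.125, and 15 of the 16 positive-negative pairs are ranked correctly, giving an AUC of 0.9375.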

S6, performing doctor-patient communication based on the estimated prediction model

During doctor-patient communication, the patient's input information is used as the input of the evaluated prediction model, the output information of the evaluated prediction model is presented to the doctor, and the doctor replies to the patient according to the output information.

For example, after the patient enters a name, age and a preliminary description of cold symptoms, the evaluated prediction model outputs to the doctor relevant information such as the patient's previous cold symptoms, treatment schemes and drug allergies. The doctor comprehensively analyzes the presented information and communicates with the patient according to the analysis result, which greatly saves the doctor's working time and improves communication efficiency and accuracy.

The invention also provides a doctor-patient communication system for implementing the doctor-patient communication method according to the embodiment of the invention, wherein the doctor-patient communication system comprises:

In one embodiment of the present invention, the data preprocessing module is specifically configured to remove duplicate data, missing data and abnormal values during a data cleaning stage, specifically including: removing duplicate data using the drop_duplicates function in the pandas library in Python, filling missing values using the fillna function, and detecting and processing abnormal values using an outlier function;

In one embodiment of the present invention, the feature extraction module is specifically configured to, after the data normalization process is completed, perform feature selection using a SelectKBest class in a scikit-learn library in Python, select the top k features, and calculate correlation coefficients or information gain indexes between them and the target variable to determine which features are the most important features, and take the most important features as specific features, where k is an integer greater than 0.

In one embodiment of the present invention, the model training module is specifically configured to divide the extracted specific features into a training set and a verification set, and determine the number of samples and the number of categories in the training set;

In one embodiment of the present invention, the model evaluation module is specifically configured to input each sample in the test data set into the prediction model, calculate with the prediction model the probability that the sample belongs to each category, select the category with the highest probability as a first prediction result of the sample, and select the category with the smallest probability as a second prediction result of the sample;

In conclusion, the accuracy and the stability of the doctor-patient communication system are greatly improved, and meanwhile, the efficiency of doctor-patient communication is guaranteed.

Claims

1. A doctor-patient communication method, characterized in that the doctor-patient communication method comprises:

2. The doctor-patient communication method according to claim 1, wherein step 2 specifically comprises:

3. The doctor-patient communication method according to claim 2, wherein extracting specific features from the preprocessed medical information specifically comprises:

4. The doctor-patient communication method according to claim 1, wherein training the extracted specific features using a naive bayes algorithm specifically includes:

5. The method of doctor-patient communication according to claim 1, wherein evaluating the established predictive model of the doctor-patient communication using the test data set specifically comprises:

6. A doctor-patient communication system for implementing the doctor-patient communication method according to any one of claims 1-5, wherein the doctor-patient communication system includes:

7. The doctor-patient communication system according to claim 6, wherein the data preprocessing module is specifically configured to remove duplicate data, missing data and abnormal values during the data cleaning stage, specifically including: removing duplicate data using the drop_duplicates function in the pandas library in Python, filling missing values using the fillna function, and detecting and processing abnormal values using an outlier function;

8. The doctor-patient communication system according to claim 7, wherein the feature extraction module is specifically configured to, after the data normalization process is completed, perform feature selection using a SelectKBest class in a scikit-learn library in Python, select the top k features, and calculate correlation coefficients or information gain indexes between the top k features and the target variable to determine which features are most important, and take the most important features as specific features, where k is an integer greater than 0.

9. The doctor-patient communication system of claim 6, wherein the model training module is specifically configured to divide the extracted specific features into a training set and a verification set, and determine the number of samples and the number of categories in the training set;

10. The doctor-patient communication system according to claim 6, wherein the model evaluation module is specifically configured to input each sample in the test data set into the prediction model, calculate with the prediction model the probability that the sample belongs to each category, select the category with the highest probability as a first prediction result of the sample, and select the category with the smallest probability as a second prediction result of the sample;