CN118072975A - Cough statistics and assessment method, system and storage medium

Info

Publication number: CN118072975A
Application number: CN202410272629.5A
Authority: CN
Inventors: 郗亚薇
Original assignee: Liangxiang Hospital Of Fangshan District Beijing
Current assignee: Liangxiang Hospital Of Fangshan District Beijing
Priority date: 2024-03-11
Filing date: 2024-03-11
Publication date: 2024-05-24

Abstract

The invention provides a cough statistics and assessment method, system and storage medium. The method comprises the following steps: S1, data acquisition: constructing a training set, a test and verification set, and a feature sequence library; S2, constructing a CNN-LSTM evaluation model; S3, cleaning and preprocessing the data. An acquisition statistics unit collects the patient's daily cough data, and a data set is built from the collected data. When the patient's cough needs to be evaluated or a treatment strategy needs to be formulated, the data set is input into the CNN-LSTM evaluation model, which extracts features from the data and matches them against the features in the feature sequence library. The patient's condition is then determined from the feature matching rate, and an evaluation report is generated by retrieving the corresponding data from the feature sequence library. Medical staff can use the evaluation report to analyze the subsequent course of the patient's condition and formulate a treatment strategy, thereby reducing the occurrence of misdiagnosis.

Description

Cough statistics and assessment method, system and storage medium

Technical Field

The invention relates to statistics and evaluation methods, in particular to a cough statistics and assessment method, and belongs to the technical field of disease prediction.

Background

Cough is a common respiratory symptom caused by inflammation, foreign bodies, or physical or chemical irritation of the trachea, bronchial mucosa or pleura. It consists of closure of the glottis, contraction of the respiratory muscles and a rise in intrapulmonary pressure, followed by opening of the glottis and expulsion of air from the lungs, usually accompanied by sound. Coughing has a protective function, clearing foreign bodies and secretions from the respiratory tract. However, if a cough persists and an acute cough becomes chronic, it often causes patients considerable distress, such as chest tightness, an itchy throat and wheezing.

In the conventional process of evaluating a cough-related disease and formulating a treatment strategy, medical staff generally determine the patient's condition from the self-reported medical history, symptoms, physical examination and any necessary laboratory or imaging examinations, and predict the subsequent course of the disease from personal experience. However, patients often cannot describe their medical history and symptoms accurately, which can mislead medical staff, and relying on personal experience alone can lead to misdiagnosis through misjudgment or insufficient experience. A cough statistics and assessment method, system and storage medium are therefore proposed.

Disclosure of Invention

In view of the foregoing, the present invention provides a cough statistics and assessment method, system and storage medium to solve or alleviate the technical problems of the prior art, at least providing a beneficial choice.

The technical scheme of the embodiment of the invention is realized as follows: a method of cough statistics and assessment comprising the steps of:

S1, data acquisition: constructing a training set, a test and verification set, and a feature sequence library;

S2, constructing a CNN-LSTM evaluation model;

S3, cleaning and preprocessing the data;

S4, model training, testing and evaluation verification: determining the final CNN-LSTM evaluation model;

S5, establishing an acquisition statistics unit: acquiring the patient's cough data in real time and establishing a data set;

S6, cleaning and preprocessing the data;

S7, importing the data into the CNN-LSTM evaluation model for feature extraction and feature matching;

S8, calculating the feature matching rate;

S9, retrieving the corresponding data from the feature sequence library according to the feature matching rate, and generating an evaluation report.

Further preferably, in S1, the data acquired are the daily cough data, disease type and disease progression data of different patients from 3-24 months of the hospital's historical records;

wherein the daily cough data includes cough frequency, duration, cough intensity and cough event distribution;

the feature sequence library comprises frequency sequence features, duration sequence features, intensity sequence features, event distribution sequence features, disease type sequence features and disease progression sequence features.

Further preferably, in S2, the CNN-LSTM evaluation model is built on an LSTM model combined with a convolutional neural network (CNN).

Further preferably, in the step S3 and the step S6, data in the data set, the training set, and the test and verification set are cleaned and preprocessed, respectively, so as to improve quality and usability of the data;

the cleaning and preprocessing comprises missing value processing, abnormal value processing, data type conversion, data standardization, feature dimension reduction and data balancing.

Further preferably, in S4, the data in the training set are imported into the CNN-LSTM evaluation model, and the model parameters are iteratively optimized with the back propagation algorithm so as to minimize the loss function;

the test and verification set is then imported into the trained CNN-LSTM evaluation model, which outputs an evaluation result according to the feature matching rate, and the evaluation result is verified against the evaluation indices;

the evaluation indices comprise accuracy, F1 score and recall.
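The training principle in S4, iteratively adjusting parameters to reduce a loss, can be illustrated on a much smaller model than a CNN-LSTM. The sketch below is an illustration under stated assumptions, not the training procedure of the invention: it fits a single logistic neuron by gradient descent on a cross-entropy loss. The function names, learning rate, epoch count and toy data are all hypothetical. Back propagation applies the same chain-rule gradient computation layer by layer in a deeper network.

```python
import math

def predict(x, w, b):
    """Probability output of a single logistic neuron."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

def mean_loss(xs, ys, w, b):
    """Mean binary cross-entropy over the samples."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = predict(x, w, b)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(xs)

def train(xs, ys, lr=0.5, epochs=200):
    """Gradient descent on the mean cross-entropy (hypothetical hyperparameters)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            error = predict(x, w, b) - y  # dLoss/d(w*x + b), from the chain rule
            grad_w += error * x
            grad_b += error
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

# Toy data: one feature per sample, label 1 for the larger values.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0, 0, 1, 1]
w, b = train(xs, ys)
print(mean_loss(xs, ys, 0.0, 0.0), mean_loss(xs, ys, w, b))
```

The loss after training is lower than the loss at the initial parameters, which is the only property this sketch is meant to show.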

Further preferably, in S5, the patient's cough data are collected by the acquisition statistics unit, and a data set is established from the collected data;

the patient's cough data are collected over a period of 6-48 hours, and the data collected comprise audio data and time data.

Further preferably, in S7, the cleaned and preprocessed data set is imported into the CNN-LSTM evaluation model, the CNN-LSTM evaluation model extracts features from the data in the data set, and the extracted features are matched against the features in the feature sequence library.

Further preferably, in the step S8, the feature matching rate is calculated according to the number of feature matches;

the expression of the feature matching rate is as follows:

Further preferably, in S9, the corresponding data in the feature sequence library are sorted by feature matching rate, the 3-7 entries with the highest feature matching rate are extracted, and an evaluation report is generated;

Wherein the evaluation report is in the form of a graph, and the correctly matched features are marked in the evaluation report.

The embodiment of the invention also provides a cough statistics and assessment system, which comprises an acquisition statistics unit, a data preprocessing unit, a central processing unit, a characteristic sequence library and an assessment display unit, wherein the acquisition statistics unit is connected with the data preprocessing unit, the data preprocessing unit is connected with the central processing unit, the central processing unit is connected with the assessment display unit, and the characteristic sequence library is interactively connected with the central processing unit;

the central processing unit is used for loading the final CNN-LSTM evaluation model;

the data preprocessing unit is used for cleaning and preprocessing the data acquired by the acquisition statistics unit;

The characteristic sequence library is used for storing frequency sequence characteristics, duration sequence characteristics, intensity sequence characteristics, event distribution sequence characteristics, disease type sequence characteristics and disease development sequence characteristic data;

The evaluation display unit is used for displaying an evaluation report generated by the CNN-LSTM evaluation model;

The acquisition statistics unit comprises an audio data acquisition module, a time data statistics module and a data set generation module;

the audio data acquisition module is used for acquiring cough data of a patient;

the time data statistics module is used for counting time;

The data set generation module is used for generating a data set by utilizing the data acquired by the audio data acquisition module and the time data statistics module.

The embodiment of the invention also provides a storage medium, which stores a computer program, and the program is executed by a processor to realize the cough statistics and assessment method.

By adopting the above technical scheme, the embodiments of the invention have the following advantages. The acquisition statistics unit collects the patient's daily cough data, and a data set is built from the collected data. When the patient's cough needs to be evaluated or a treatment strategy needs to be formulated, the data set is input into the CNN-LSTM evaluation model, which extracts features from the data and matches them against the features in the feature sequence library. The patient's condition is then determined from the feature matching rate, and an evaluation report is generated by retrieving the corresponding data from the feature sequence library. Medical staff can use the evaluation report to analyze the subsequent course of the patient's condition and formulate a treatment strategy, thereby reducing the occurrence of misdiagnosis.

The foregoing summary is for the purpose of the specification only and is not intended to be limiting in any way. In addition to the illustrative aspects, embodiments, and features described above, further aspects, embodiments, and features of the present invention will become apparent by reference to the drawings and the following detailed description.

Drawings

In order to more clearly illustrate the embodiments of the application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flow chart of the steps of the present invention;

FIG. 2 is a system block diagram of the present invention;

FIG. 3 is a block diagram of an acquisition statistics unit of the present invention.

Detailed Description

Hereinafter, only certain exemplary embodiments are briefly described. As will be recognized by those of skill in the pertinent art, the described embodiments may be modified in various different ways without departing from the spirit or scope of the present invention. Accordingly, the drawings and description are to be regarded as illustrative in nature and not as restrictive.

It should be noted that terms such as "first," "second," "symmetric" and "array" are used only to distinguish between descriptions and positions, and are not to be construed as indicating or implying relative importance or the number of the features referred to. Thus, a feature qualified by "first," "symmetric" or the like may explicitly or implicitly include one or more such features; likewise, where the number of a feature is not limited by words such as "two" or "three," the feature may explicitly or implicitly be present in any number.

Embodiments of the present invention will be described in detail below with reference to the accompanying drawings.

Example 1

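As shown in FIG. 1, an embodiment of the present invention provides a cough statistics and assessment method, including the following steps:

S1, data acquisition: constructing a training set, a test and verification set, and a feature sequence library;

S2, constructing a CNN-LSTM evaluation model;

S3, cleaning and preprocessing the data;

S4, model training, testing and evaluation verification: determining the final CNN-LSTM evaluation model;

S5, establishing an acquisition statistics unit: acquiring the patient's cough data in real time and establishing a data set;

S6, cleaning and preprocessing the data;

S7, importing the data into the CNN-LSTM evaluation model for feature extraction and feature matching;

S8, calculating the feature matching rate;

S9, retrieving the corresponding data from the feature sequence library according to the feature matching rate, and generating an evaluation report.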

In one embodiment, in S1, the data acquired are the daily cough data, disease type and disease progression data of different patients from 3 months of the hospital's historical records;

the feature sequence library comprises frequency sequence features, duration sequence features, intensity sequence features, event distribution sequence features, disease type sequence features and disease progression sequence features;

the feature sequence library stores these features so that the CNN-LSTM evaluation model can subsequently match features against them and the corresponding data can be retrieved according to the matching result.

In one embodiment, in S2, a CNN-LSTM evaluation model is built based on the LSTM model in combination with a convolutional neural network.
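To make the combination in S2 concrete, the following is a minimal pure-Python sketch of the data flow, not the model of this embodiment: a one-dimensional convolution with ReLU, max pooling, and a single-unit LSTM that consumes the pooled features as a time sequence. The one-dimensional input, the kernel, the gate weights and all function names are assumptions chosen for illustration; a practical model would use a deep learning framework with trained weights.

```python
import math

def conv1d(signal, kernel):
    """Valid 1-D cross-correlation (the usual CNN 'convolution') followed by ReLU."""
    k = len(kernel)
    return [max(0.0, sum(signal[i + j] * kernel[j] for j in range(k)))
            for i in range(len(signal) - k + 1)]

def max_pool(values, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_scalar(sequence, w):
    """Run a single-unit LSTM over a scalar sequence; return the final hidden state.

    w maps each gate name ('i', 'f', 'o', 'g') to a tuple of
    (input weight, recurrent weight, bias).
    """
    h, c = 0.0, 0.0
    for x in sequence:
        i = sigmoid(w["i"][0] * x + w["i"][1] * h + w["i"][2])    # input gate
        f = sigmoid(w["f"][0] * x + w["f"][1] * h + w["f"][2])    # forget gate
        o = sigmoid(w["o"][0] * x + w["o"][1] * h + w["o"][2])    # output gate
        g = math.tanh(w["g"][0] * x + w["g"][1] * h + w["g"][2])  # candidate state
        c = f * c + i * g
        h = o * math.tanh(c)
    return h

# Hypothetical per-frame audio energy and untrained, hand-picked weights.
energy = [0.1, 0.1, 0.9, 0.8, 0.2, 0.1, 0.7, 0.9, 0.3, 0.1]
weights = {"i": (0.5, 0.1, 0.0), "f": (0.5, 0.1, 1.0),
           "o": (0.5, 0.1, 0.0), "g": (1.0, 0.1, 0.0)}
local_features = max_pool(conv1d(energy, [-1.0, 0.0, 1.0]), 2)
summary = lstm_scalar(local_features, weights)
print(local_features, summary)
```

The convolution and pooling stages produce local features, and the LSTM folds them into a single state that depends on their order, which is the division of labor between the two parts of a CNN-LSTM.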

In one embodiment, in S3 and S6, the data in the data set, the training set, and the test and verification set are cleaned and preprocessed, respectively, to improve the quality and usability of the data;

The cleaning and preprocessing comprises missing value processing, abnormal value processing, data type conversion, data standardization, feature dimension reduction and data balancing;

data cleaning and preprocessing improve the quality and usability of the data, so that the data can be better used in subsequent model training, testing and evaluation and in the final model's data analysis;

wherein missing value processing consists of deleting samples that contain missing values, filling missing values with the mean or median, or filling missing values by interpolation;

outlier processing consists of identifying outliers with statistical or model-based methods and then deleting them or replacing them with reasonable substitute values;

data type conversion consists of converting the data to the correct data types to ensure consistency and usability;

data standardization consists of standardizing numerical data to eliminate differences in scale between features;

feature dimension reduction consists of selecting important features with statistical methods, model-based methods or feature importance assessment, to reduce dimensionality and improve model performance;

data balancing consists of balancing the data distribution by undersampling, oversampling, generating synthetic samples or similar methods, to prevent the model from being biased toward the majority class.
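Three of the preprocessing operations above can be sketched with the Python standard library. This is a minimal illustration under assumptions, not the preprocessing unit of the embodiment: median filling is one of the listed options for missing values, the median-absolute-deviation rule is one possible statistical outlier test (the text does not name a specific one), and the function names and the threshold `k` are hypothetical.

```python
from statistics import mean, median, pstdev

def fill_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def clip_outliers(values, k=3.0):
    """Clamp values lying more than k median absolute deviations from the median."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return list(values)  # no spread to measure against; leave the data unchanged
    low, high = med - k * mad, med + k * mad
    return [min(max(v, low), high) for v in values]

def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical coughs-per-hour counts with one missing hour and one implausible spike.
raw = [10, 12, 11, None, 13, 95]
cleaned = standardize(clip_outliers(fill_missing(raw)))
print(cleaned)
```

Clamping rather than deleting the outlier keeps the sequence length unchanged, which matters when the values are later treated as a time series.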

In one embodiment, in S4, the data in the training set are imported into the CNN-LSTM evaluation model, and the model parameters are iteratively optimized with the back propagation algorithm so as to minimize the loss function;

the evaluation indices comprise accuracy, F1 score and recall;

the training set and the test and verification set are used to train, test and evaluate the CNN-LSTM evaluation model.
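The three evaluation indices can be computed directly from true and predicted labels. The sketch below assumes a binary labeling, which the text does not specify; the function name and the sample labels are hypothetical. Precision is computed internally because the F1 score is the harmonic mean of precision and recall.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, recall and F1 score for a binary labeling."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "recall": recall, "f1": f1}

# Hypothetical labels from a test and verification set.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
print(classification_metrics(y_true, y_pred))
# {'accuracy': 0.75, 'recall': 0.75, 'f1': 0.75}
```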

In one embodiment, in S5, the patient's cough data are collected by the acquisition statistics unit, and a data set is established from the collected data;

the patient's cough data are collected over a period of 6 hours, and the data collected comprise audio data and time data;

the acquisition statistics unit collects the patient's daily cough data so that the CNN-LSTM evaluation model can analyze characteristics such as cough frequency, duration of each cough, cough intensity and cough event distribution from the collected data.
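The four cough characteristics named above can be derived from timestamped cough events. The sketch below assumes that cough events have already been detected in the audio and are available as `(start_s, end_s, peak)` tuples; that representation, the field meanings and the function name are hypothetical and are not taken from the embodiment.

```python
from collections import Counter

def summarize_coughs(events, window_hours):
    """Summarize cough events recorded over window_hours.

    events: list of (start_s, end_s, peak) tuples, with times in seconds
    from the start of the recording and peak as a relative intensity.
    """
    n = len(events)
    durations = [end - start for start, end, _ in events]
    per_hour = Counter(int(start // 3600) for start, _, _ in events)
    return {
        "frequency_per_hour": n / window_hours,
        "mean_duration_s": sum(durations) / n if n else 0.0,
        "mean_intensity": sum(peak for _, _, peak in events) / n if n else 0.0,
        "hourly_distribution": [per_hour.get(h, 0) for h in range(window_hours)],
    }

# Hypothetical events from a 6-hour recording.
events = [(10.0, 10.5, 0.8), (3700.0, 3701.0, 0.6),
          (3900.0, 3900.5, 0.4), (20000.0, 20001.0, 0.2)]
print(summarize_coughs(events, window_hours=6))
```

The hourly distribution is what the time data contribute: the same number of coughs spread evenly or clustered in one hour yields the same frequency but a different distribution.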

In one embodiment, in S7, the cleaned and preprocessed data set is imported into the CNN-LSTM evaluation model, the CNN-LSTM evaluation model extracts features from the data in the data set, and the extracted features are matched against the features in the feature sequence library;

the CNN extracts the local and global features of an image: the image is processed by convolution layers and pooling layers to obtain a high-dimensional feature representation, these features are treated as time-series data and passed to the LSTM model for sequence modeling, which captures the temporal relationships among the image features, and the extracted features are then matched against the features in the feature sequence library with a feature matching algorithm;

the feature matching uses a nearest neighbor matching algorithm: for a given feature point or feature descriptor, the distances to the other feature points or feature descriptors are calculated, and the nearest one is selected as the match.
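Nearest neighbor matching as described above can be sketched in a few lines. The sketch assumes Euclidean distance and fixed-length numeric descriptors; the text does not state the distance measure, and the library keys and vectors are hypothetical.

```python
import math

def nearest_neighbor(query, library):
    """Return (key, distance) of the library descriptor closest to query."""
    best_key, best_dist = None, math.inf
    for key, descriptor in library.items():
        d = math.dist(query, descriptor)  # Euclidean distance (an assumption)
        if d < best_dist:
            best_key, best_dist = key, d
    return best_key, best_dist

# Hypothetical two-dimensional descriptors in the feature sequence library.
library = {"A": [0.0, 0.0], "B": [3.0, 4.0], "C": [10.0, 10.0]}
print(nearest_neighbor([3.0, 3.0], library))  # ('B', 1.0)
```

This linear scan compares the query with every library entry, which is simple and exact but grows in cost with the size of the library.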

In one embodiment, in S8, the feature matching rate is calculated from the number of matched features;

the expression of the feature matching rate is as follows:

in S9, the corresponding data in the feature sequence library are sorted by feature matching rate, the 3 entries with the highest feature matching rate are extracted, and an evaluation report is generated;

the evaluation report is generated from the 3 entries with the highest feature matching rate in order to ensure the accuracy of the evaluation report.

The evaluation report takes the form of a chart on which the correctly matched features are marked, so that medical staff can quickly judge the current stage of the patient's disease and identify its likely future course.
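The ranking in S9 can be sketched as a sort followed by taking the first few entries. The expression for the feature matching rate is not reproduced in this text, so the sketch assumes the rate is the number of matched features divided by the number of features compared; that definition, the record fields and the function names are hypothetical.

```python
def matching_rate(matched, total):
    """Assumed definition: fraction of compared features that matched."""
    return matched / total if total else 0.0

def top_records(records, k=3):
    """Return the k library records with the highest matching rate."""
    ranked = sorted(records,
                    key=lambda r: matching_rate(r["matched"], r["total"]),
                    reverse=True)
    return ranked[:k]

# Hypothetical comparison results against four library records.
records = [
    {"id": "a", "matched": 5, "total": 6},
    {"id": "b", "matched": 3, "total": 6},
    {"id": "c", "matched": 6, "total": 6},
    {"id": "d", "matched": 1, "total": 6},
]
print([r["id"] for r in top_records(records, k=3)])  # ['c', 'a', 'b']
```

The parameter `k` corresponds to the number of entries extracted for the report, which is 3 in this example.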

Example 2

As in Example 1, the cough statistics and assessment method of this embodiment comprises steps S1 to S9; the details that differ are given below.

In one embodiment, in S1, the data acquisition object is daily cough, disease type and disease development data of different patients within 24 months of the hospital history;

The time for collecting cough data of a patient is 48 hours, and the data collection comprises audio data and time data collection;

the feature matching uses a ratio test matching algorithm, which builds on the nearest neighbor matching algorithm and judges the reliability of a match by calculating the ratio of the distance to the nearest neighbor to the distance to the second-nearest neighbor.
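A ratio test can be sketched as follows: a match is accepted only when the nearest neighbor is clearly closer than the second-nearest one. The sketch assumes Euclidean distance; the threshold `max_ratio=0.75`, the library contents and the function name are hypothetical, and the text does not give the threshold used in this embodiment.

```python
import math

def ratio_test_match(query, library, max_ratio=0.75):
    """Return the nearest key if it passes the ratio test, otherwise None."""
    ranked = sorted((math.dist(query, descriptor), key)
                    for key, descriptor in library.items())
    if not ranked:
        return None
    if len(ranked) == 1:
        return ranked[0][1]  # no second neighbor to compare against
    (d1, nearest_key), (d2, _) = ranked[0], ranked[1]
    if d2 == 0:
        return None  # two exact matches: ambiguous
    return nearest_key if d1 / d2 <= max_ratio else None

# Hypothetical two-dimensional descriptors in the feature sequence library.
library = {"A": [0.0, 0.0], "B": [3.0, 4.0], "C": [10.0, 10.0]}
print(ratio_test_match([3.0, 3.0], library))  # B is clearly nearest
print(ratio_test_match([1.5, 2.0], library))  # A and B are equally near, so None
```

Rejecting ambiguous matches trades a lower number of matches for higher reliability of the matches that remain.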

the expression of the feature matching rate is as follows:

in S9, the corresponding data in the feature sequence library are sorted by feature matching rate, the 7 entries with the highest feature matching rate are extracted, and an evaluation report is generated;

the evaluation report is generated from the 7 entries with the highest feature matching rate in order to ensure the accuracy of the evaluation report.

Example 3

As in Example 1, the cough statistics and assessment method of this embodiment comprises steps S1 to S9; the details that differ are given below.

In one embodiment, in S1, the data acquisition object is daily cough, disease type and disease development data of different patients within 12 months of the hospital history;

The time for collecting cough data of a patient is 24 hours, and the data collection comprises audio data and time data collection;

the feature matching uses a locality-sensitive hashing algorithm, which achieves efficient approximate matching by mapping similar data into the same hash bucket.
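The bucket idea can be sketched with random hyperplane hashing, one common locality-sensitive hashing family for cosine similarity. The text does not say which hashing family this embodiment uses, so the choice of family, the number of bits, the seed and all names below are assumptions.

```python
import random

def make_signature(dim, n_bits, seed=0):
    """Build a hash function from n_bits random hyperplanes in dim dimensions."""
    rng = random.Random(seed)
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

    def signature(vector):
        # One bit per hyperplane: which side of the plane the vector lies on.
        return tuple(int(sum(p * v for p, v in zip(plane, vector)) >= 0)
                     for plane in planes)

    return signature

def build_buckets(library, signature):
    """Group library keys by the signature of their descriptor."""
    buckets = {}
    for key, descriptor in library.items():
        buckets.setdefault(signature(descriptor), []).append(key)
    return buckets

# Hypothetical descriptors; "a2" points in the same direction as "a".
library = {"a": [1.0, 2.0, 3.0], "a2": [2.0, 4.0, 6.0], "b": [-3.0, 0.5, -1.0]}
signature = make_signature(dim=3, n_bits=8)
buckets = build_buckets(library, signature)
candidates = buckets.get(signature([1.0, 2.0, 3.0]), [])
print(candidates)
```

A query is compared only with the candidates in its own bucket, so the search avoids scanning the whole library at the cost of occasionally missing a near neighbor that fell into a different bucket.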

the expression of the feature matching rate is as follows:

in S9, the corresponding data in the feature sequence library are sorted by feature matching rate, the 5 entries with the highest feature matching rate are extracted, and an evaluation report is generated;

the evaluation report is generated from the 5 entries with the highest feature matching rate in order to ensure the accuracy of the evaluation report.

As shown in fig. 2-3, the embodiment of the invention further provides a cough statistics and assessment system, which comprises an acquisition statistics unit, a data preprocessing unit, a central processing unit, a feature sequence library and an assessment display unit, wherein the acquisition statistics unit is connected with the data preprocessing unit, the data preprocessing unit is connected with the central processing unit, the central processing unit is connected with the assessment display unit, and the feature sequence library is interactively connected with the central processing unit;

the time data statistics module is used for counting time;

The embodiment of the invention also provides a storage medium, wherein the storage medium stores a computer program, and the program is executed by a processor to realize the cough statistics and assessment method.

The foregoing is merely illustrative of the present invention, and the present invention is not limited thereto, and any person skilled in the art will readily recognize that various modifications and substitutions are possible within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims

1. A method of cough statistics and assessment comprising the steps of:

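S1, data acquisition: constructing a training set, a test and verification set, and a feature sequence library;

S2, constructing a CNN-LSTM evaluation model;

S3, cleaning and preprocessing the data;

S4, model training, testing and evaluation verification: determining the final CNN-LSTM evaluation model;

S5, establishing an acquisition statistics unit: acquiring the patient's cough data in real time and establishing a data set;

S6, cleaning and preprocessing the data;

S7, importing the data into the CNN-LSTM evaluation model for feature extraction and feature matching;

S8, calculating the feature matching rate;

S9, retrieving the corresponding data from the feature sequence library according to the feature matching rate, and generating an evaluation report.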

2. The cough statistics and assessment method according to claim 1, wherein: in S1, the data acquired are the daily cough data, disease type and disease progression data of different patients from 3-24 months of the hospital's historical records;

3. The cough statistics and assessment method according to claim 1, wherein: in S2, the CNN-LSTM evaluation model is built on an LSTM model combined with a convolutional neural network.

4. The cough statistics and assessment method according to claim 1, wherein: in S3 and S6, the data in the data set, the training set, and the test and verification set are respectively cleaned and preprocessed to improve the quality and usability of the data;

5. The cough statistics and assessment method according to claim 3, wherein: in S4, the data in the training set are imported into the CNN-LSTM evaluation model, and the model parameters are iteratively optimized with the back propagation algorithm so as to minimize the loss function;

6. The cough statistics and assessment method according to claim 1, wherein: in S5, the patient's cough data are collected by the acquisition statistics unit, and a data set is established from the collected data;

7. The cough statistics and assessment method according to claim 5, wherein: in S7, the cleaned and preprocessed data set is imported into the CNN-LSTM evaluation model, the CNN-LSTM evaluation model extracts features from the data in the data set, and the extracted features are matched against the features in the feature sequence library.

8. The cough statistics and assessment method according to claim 7, wherein: in S8, the feature matching rate is calculated from the number of matched features;

the expression of the feature matching rate is as follows:

9. The cough statistics and evaluation system comprises an acquisition statistics unit, a data preprocessing unit, a central processing unit, a characteristic sequence library and an evaluation display unit, and is characterized in that: the collection statistics unit is connected with the data preprocessing unit, the data preprocessing unit is connected with the central processing unit, the central processing unit is connected with the evaluation display unit, and the characteristic sequence library is interactively connected with the central processing unit;

the time data statistics module is used for counting time;

10. A storage medium, characterized by: the storage medium stores a computer program for execution by a processor to implement the cough statistics and assessment method of any one of claims 1-8.