WO2021229600A1 - Auscultation system for guiding a user to perform auscultation on a subject

Auscultation system for guiding a user to perform auscultation on a subject

Info

Publication number
WO2021229600A1
Authority
WO
WIPO (PCT)
Prior art keywords
auscultation
stethoscope
subject
sites
guide device
Prior art date
Application number
PCT/IN2021/050448
Other languages
French (fr)
Inventor
Vikram ARIA NARAYAN
Original Assignee
Aria Narayan Vikram
Priority date
Filing date
Publication date
Application filed by Aria Narayan Vikram
Publication of WO2021229600A1

Classifications

    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B7/00 Instruments for auscultation
    • A61B7/02 Stethoscopes

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Acoustics & Sound (AREA)
  • Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Molecular Biology (AREA)
  • Surgery (AREA)
  • Animal Behavior & Ethology (AREA)
  • General Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Veterinary Medicine (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

The invention relates to an auscultation system for guiding a user to perform auscultation on a subject. The auscultation system includes a guide device which employs a computer vision model to identify one or more auscultation sites on the subject, and an augmented reality (AR) module to display the one or more auscultation sites on a screen of the guide device. The auscultation system also includes a stethoscope communicatively coupled to the guide device, to capture sound signals obtained from positioning the stethoscope on the subject at the one or more auscultation sites. The guide device is configured to recalculate position of the one or more auscultation sites using the computer vision model if the sound signals captured are not satisfactory, and to further direct the user to readjust the position of the stethoscope until the sound signals captured by the stethoscope are satisfactory.

Description

AUSCULTATION SYSTEM FOR GUIDING A USER TO PERFORM
AUSCULTATION ON A SUBJECT
FIELD OF THE INVENTION
[0001] The invention generally relates to auscultation in healthcare. Specifically, the invention relates to an Artificial Intelligence (AI)-enabled auscultation system for guiding a user to perform auscultation on a subject using Augmented Reality (AR) and to return a diagnosis and/or a screening report to the user.
BACKGROUND OF THE INVENTION
[0002] Auscultation is the act of listening to sounds of a human body using a stethoscope, to diagnose diseases. Auscultation can be used to diagnose a wide variety of diseases. Additionally, auscultation is non-invasive and inexpensive compared to other diagnostic methods such as, but not limited to, electrocardiograms (ECGs or EKGs) and X-rays.
[0003] Auscultation is an art that requires substantial tacit knowledge that can only be gained with practical experience. However, due to the growing population, there is a paucity of healthcare workers in many parts of the world today. Furthermore, there is an uneven distribution of health workers such that regions of higher poverty have lower numbers of healthcare workers. Owing to this, not many experienced professional healthcare workers are available to perform auscultation on patients.
[0004] Conventional auscultation methods employ traditional acoustic stethoscopes. Such acoustic stethoscopes transmit a signal that can sometimes be corrupted by noise, leading to a faulty diagnosis. Further improvements in this area have led to the rise of digital stethoscopes, which help overcome the problem of faulty diagnosis. Additionally, the reliability of a diagnosis inferred after auscultation depends on the expertise and hearing of the doctor. The diagnostician must be well-trained in positioning the stethoscope at various spots on the body. Moreover, the diagnostician must have significant experience in classifying these sounds. This skill requires significant time and mentorship to refine.
[0005] Recently, there has been a development of machine learning algorithms to automatically diagnose diseases when these models are supplied with digital signals from the stethoscope. However, the drawback with these solutions is that they still require a trained healthcare worker to operate the stethoscope.
[0006] Therefore, there exists a need in the field for a novel auscultation system which can be operated by any untrained volunteer, thereby democratizing access to quality healthcare.
BRIEF DESCRIPTION OF THE FIGURES
[0007] The accompanying figures where like reference numerals refer to identical or functionally similar elements throughout the separate views and which together with the detailed description below are incorporated in and form part of the specification, serve to further illustrate various embodiments and to explain various principles and advantages all in accordance with the invention.
[0008] FIG. 1 illustrates an auscultation system for guiding a user to perform auscultation on a subject in accordance with an embodiment of the invention.
[0009] FIG. 2 illustrates a flow diagram of various method steps involved in the working of the auscultation system in accordance with an embodiment of the invention.
[0010] Skilled artisans will appreciate that elements in the figures are illustrated for simplicity and clarity and have not necessarily been drawn to scale. For example, the dimensions of some of the elements in the figures may be exaggerated relative to other elements to help to improve understanding of embodiments of the present invention.
DETAILED DESCRIPTION OF THE INVENTION
[0011] Before describing in detail embodiments that are in accordance with the invention, it should be observed that the embodiments reside primarily in combinations of method steps and system components for an Artificial Intelligence (AI)-enabled auscultation system for guiding a user to perform auscultation on a subject using Augmented Reality (AR) and to return a diagnosis and/or a screening report to the user.
[0012] Accordingly, the system components and method steps have been represented where appropriate by conventional symbols in the drawings, showing only those specific details that are pertinent to understanding the embodiments of the invention so as not to obscure the disclosure with details that will be readily apparent to those of ordinary skill in the art having the benefit of the description herein.
[0013] The terms “a” or “an”, as used herein, are defined as one or more than one. The term plurality, as used herein, is defined as two or more than two. The term another, as used herein, is defined as at least a second or more. The terms including and/or having, as used herein, are defined as comprising (i.e., open language). The term coupled, as used herein, is defined as connected, although not necessarily directly, and not necessarily mechanically. The terms program, software application, and the like as used herein, are defined as a sequence of instructions designed for execution on a computer system. A program, computer program, or software application may include a subroutine, a function, a procedure, an object method, an object implementation, an executable application, an applet, a servlet, a source code, an object code, a shared library/dynamic load library and/or other sequence of instructions designed for execution on a computer system.
[0014] Various embodiments of the invention disclose an auscultation system for guiding a user to perform auscultation on a subject. The auscultation system includes a guide device which further includes a computer vision model configured to identify one or more auscultation sites on the subject. The guide device also includes an augmented reality (AR) module configured to display the one or more auscultation sites on a screen of the guide device. In an embodiment, the AR module employs a deep learning model, TensorFlow Lite PoseNet, to detect key locations on the subject’s body to identify the one or more auscultation sites. The AR module overlays the one or more auscultation sites on the subject’s body and the one or more auscultation sites are highlighted on the screen of the guide device.
[0015] The auscultation system also includes a stethoscope communicatively coupled to the guide device. The stethoscope can be, but need not be limited to, a digital stethoscope and an acoustic stethoscope. The stethoscope is configured to capture sound signals obtained from positioning the stethoscope on the subject at the one or more auscultation sites. Further, the guide device is configured to check if sound signals captured using the stethoscope are satisfactory based on location of the one or more auscultation sites. If the sound signals captured are not satisfactory, the guide device recalculates position of the one or more auscultation sites using the computer vision model, and further directs the user to readjust the position of the stethoscope until the sound signals captured by the stethoscope are satisfactory.
[0016] In accordance with an embodiment, the guide device further directs the user to place the stethoscope at the one or more auscultation sites again to test for vocal resonance after performing a preliminary auscultation on the subject. Voice commands are used to direct the subject to pronounce a plurality of phrases and the corresponding sounds from the subject are recorded using the stethoscope. The one or more auscultation sites are recalibrated and customized for accurate placement of the stethoscope based on movement of the stethoscope and the subject’s body dimensions.
[0017] The auscultation system further includes a diagnostic module configured to interpret sound signals collected from the stethoscope at the one or more auscultation sites using one or more machine learning models. The one or more machine learning models classify the sound signals and predict various medical conditions/disorders. The medical conditions/disorders can be, but need not be limited to, lung disorders, cardiovascular and gastrointestinal diseases or conditions. In an embodiment, the diagnostic module employs a K-nearest neighbors machine learning model to identify abnormalities in the sound signals recorded by the stethoscope and returns a diagnosis and/or a screening report to the user.
[0018] FIG. 1 illustrates an auscultation system 100 for guiding a user to perform auscultation on a subject in accordance with an embodiment of the invention.
[0019] As illustrated in FIG. 1, auscultation system 100, inter alia, comprises a memory 102 (such as, but not limited to, a non-transitory or a machine readable memory), and a processor 104 (such as, but not limited to, a programmable electronic microprocessor, microcontroller, or similar device) communicatively coupled to memory 102. Memory 102 and processor 104 further communicate with various components of auscultation system 100 via a communication module 106.
[0020] Communication module 106 may be configured to transmit data between modules, engines, databases, memories, and other components of auscultation system 100 for use in performing the functions discussed herein. Communication module 106 may include one or more communication types and utilizes various communication methods for communication within auscultation system 100.
[0021] Auscultation system 100 includes a guide device 108 that can be, but need not be limited to, a smartphone, and an augmented reality (AR) device such as mixed reality smart glasses. Guide device 108 includes a computer vision model 110 configured to identify one or more auscultation sites on the subject.
[0022] Guide device 108 also includes an AR module 112 configured to display the one or more auscultation sites on a screen of guide device 108. AR module 112 overlays the one or more auscultation sites on the subject’s body and the one or more auscultation sites are highlighted on the screen of guide device 108.
[0023] Auscultation system 100 includes a stethoscope 114 communicatively coupled to guide device 108. Stethoscope 114 can be, but need not be limited to, a digital stethoscope and an acoustic stethoscope. Stethoscope 114 is configured to capture sound signals obtained from positioning stethoscope 114 on the subject at the one or more auscultation sites using AR module 112 which guides placement of stethoscope 114 on the subject’s body.
[0024] Once the process of auscultation using stethoscope 114 is executed, the information/sound signals collected are relayed back to guide device 108 via appropriate communication technologies such as wired communication and wireless communication including, but not limited to, Bluetooth, Wi-Fi, and Near-Field Communication (NFC).
[0025] Guide device 108 is further configured to check if sound signals captured using stethoscope 114 are satisfactory based on location of the one or more auscultation sites. If the sound signals captured are not satisfactory, guide device 108 recalculates position of the one or more auscultation sites using computer vision model 110, and further directs the user to readjust the position of stethoscope 114 until the sound signals captured by stethoscope 114 are satisfactory.
[0026] In accordance with an embodiment, AR module 112 employs a deep learning model, TensorFlow Lite PoseNet, to detect key locations such as, but not limited to, hips and shoulders on the subject’s body, to identify the one or more auscultation sites. For instance, AR module 112, based on detecting the key locations on the subject’s body, identifies nine lung auscultation sites. This helps in guiding the user or a volunteer in placing stethoscope 114 on the subject’s body and collecting lung sounds using stethoscope 114.
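By way of illustration only, the sketch below shows how such key-point detection could be wired up on the guide device. The model file name, the input preprocessing, and the assumption that the pose model returns 17 COCO-ordered keypoints directly (as MoveNet-style single-pose TFLite models do) are not taken from this disclosure; a PoseNet model proper outputs heatmaps and offsets that would require an additional decoding step, which is omitted here.

```python
# Minimal sketch, not the disclosed implementation. Assumes a single-pose
# TFLite model ("pose_model.tflite", a hypothetical file) whose output is a
# [1, 1, 17, 3] array of (y, x, score) keypoints in COCO order.
import numpy as np
import tensorflow as tf

COCO_KEYPOINTS = {"left_shoulder": 5, "right_shoulder": 6,
                  "left_hip": 11, "right_hip": 12}

interpreter = tf.lite.Interpreter(model_path="pose_model.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

def detect_torso_keypoints(frame_rgb: np.ndarray) -> dict:
    """Return pixel (x, y) coordinates of the shoulders and hips in a frame."""
    h, w, _ = frame_rgb.shape
    in_h, in_w = input_details[0]["shape"][1], input_details[0]["shape"][2]
    img = tf.image.resize(frame_rgb, (in_h, in_w))          # resize to model input
    img = tf.cast(img[tf.newaxis, ...], input_details[0]["dtype"])
    interpreter.set_tensor(input_details[0]["index"], img.numpy())
    interpreter.invoke()
    kps = interpreter.get_tensor(output_details[0]["index"])[0, 0]  # (17, 3)
    return {name: (float(kps[idx, 1]) * w, float(kps[idx, 0]) * h)  # (x, y) px
            for name, idx in COCO_KEYPOINTS.items()}
```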
[0027] Guide device 108 further directs the user to place stethoscope 114 at the one or more auscultation sites again to test for vocal resonance after performing a preliminary auscultation on the subject. Voice commands are used to direct the subject to pronounce a plurality of phrases (such as, but not limited to, “ninety-nine” and “blue balloons”) and the corresponding sounds from the subject are recorded using stethoscope 114.
[0028] The one or more auscultation sites are recalibrated and customized for accurate placement of stethoscope 114 based on movement of stethoscope 114 and the subject’s body dimensions. In accordance with an embodiment, the one or more auscultation sites are recalibrated using the deep learning model which recalculates positions of the one or more auscultation sites or points for each frame of a live video feed.
[0029] Auscultation system 100 further includes a diagnostic module 116 configured to interpret sound signals collected from stethoscope 114 at the one or more auscultation sites using one or more machine learning models 118. One or more machine learning models 118 classify the sound signals and predict various medical conditions/disorders. The medical conditions/disorders can be, but need not be limited to, lung disorders, cardiovascular and gastrointestinal diseases or conditions.
[0030] In accordance with an embodiment, one or more machine learning models 118 classify the sounds, and make a diagnosis of “bronchophony”, “egophony” or “normal” and return this diagnosis to the user.
[0031] In accordance with another embodiment, diagnostic module 116 employs a K-nearest neighbors machine learning model to identify abnormalities in the sound signals recorded by stethoscope 114 and returns the diagnosis to the user. For instance, the K-nearest neighbors (KNN) machine learning model is used to identify abnormalities such as, but not limited to, wheezes and crackles in the lung sounds and this diagnosis is returned to the user.
[0032] In accordance with an exemplary embodiment, diagnosis of lung diseases using diagnostic module 116 is disclosed. Once lung sounds are collected by stethoscope 114 and sent to diagnostic module 116 (implemented as an application (app), for example), diagnostic module 116 performs a screening of the sounds captured and classifies the sounds as being normal or abnormal.
[0033] In order to perform the screening and classification of the sounds, one or more machine learning models 118 are trained using training data from an open sourced respiratory sounds database available in a web-based data science environment. For instance, the training data includes 920 annotated recordings of varying length (10 seconds to 90 seconds) taken from 126 patients. The database has a total of 6898 respiratory cycles, including normal breath sounds and adventitious sounds (crackles and wheezes). The data also includes clear recordings as well as recordings with background noise in order to simulate real-life conditions. The patients span all age groups, including children, adults and the elderly.
[0034] In accordance with an embodiment, a KNN machine learning model or classifier is used which separates sounds based on their proximity to other sounds. This proximity is determined on the basis of statistics derived from Mel Frequency Cepstral Coefficients (MFCCs) which represent perceptually meaningful sound features. In order to analyze respiratory sounds, statistical features (mean and standard deviation) are taken from the extracted MFCCs. These act as features for the KNN classifier.
[0035] An audio data preprocessing pipeline for training the model is as follows.
[0036] Each sound file has an associated label file. The label file contains the following information: start time of breath cycle, end time of breath cycle, whether crackles are present (represented by 0 or 1), and whether wheezes are present (represented by 0 or 1).
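As a concrete illustration of the label format just described, the small parser below assumes each annotation file stores one breath cycle per line as whitespace-separated columns (start time, end time, crackles flag, wheezes flag); the delimiter and file layout are assumptions chosen for the sketch, not requirements of the disclosure.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List

@dataclass
class BreathCycle:
    start: float     # start time of breath cycle, seconds
    end: float       # end time of breath cycle, seconds
    crackles: int    # 0 or 1
    wheezes: int     # 0 or 1

def read_label_file(path: Path) -> List[BreathCycle]:
    """Parse one annotation file: one 'start end crackles wheezes' row per cycle."""
    cycles = []
    for line in path.read_text().splitlines():
        if not line.strip():
            continue
        start, end, crackles, wheezes = line.split()
        cycles.append(BreathCycle(float(start), float(end),
                                  int(crackles), int(wheezes)))
    return cycles
```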
[0037] The sound files are then loaded into a numpy array format using Librosa, a Python package/audio library for music and audio analysis.
[0038] The sound files are then split based on breath cycles. The sound clips and associated labels are split up into training and validation data (the training data consists of 70% of the total data and the validation data consists of 30% of the total data). The validation and training data are split randomly. 50 MFCCs are obtained for each sound clip, using a built-in Librosa function.
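A hedged sketch of this loading and splitting step follows. It reuses read_label_file from the previous sketch and assumes each recording <name>.wav sits next to its <name>.txt label file in a folder named "respiratory_db"; the folder name and pairing convention are illustrative assumptions.

```python
import librosa
import numpy as np
from pathlib import Path
from sklearn.model_selection import train_test_split

def breath_cycle_clips(wav_path: Path):
    """Yield (clip, sample_rate, is_abnormal) tuples, one per annotated cycle."""
    y, sr = librosa.load(wav_path, sr=None)          # keep native sample rate
    for cyc in read_label_file(wav_path.with_suffix(".txt")):
        clip = y[int(cyc.start * sr):int(cyc.end * sr)]
        abnormal = int(cyc.crackles or cyc.wheezes)  # adventitious sound present
        yield clip, sr, abnormal

clips, rates, labels = [], [], []
for wav in sorted(Path("respiratory_db").glob("*.wav")):   # hypothetical folder
    for clip, sr, abnormal in breath_cycle_clips(wav):
        clips.append(clip); rates.append(sr); labels.append(abnormal)

# Random 70/30 split into training and validation data, as described above.
idx_train, idx_val = train_test_split(np.arange(len(clips)),
                                      test_size=0.3, random_state=0)
```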
[0039] The statistical mean and standard deviation measures are then derived from the MFCCs obtained above in order to reduce the time dependent frequencies into a single vector with 100 components. The feature vector thus obtained is standardized by removing the mean and scaling the vector to unit variance.
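Continuing from the previous sketch, the feature step below computes 50 MFCCs per clip with Librosa, reduces them to a single 100-component mean/standard-deviation vector, and standardizes the result with scikit-learn's StandardScaler (consistent with the "Standard Scaler" mentioned later in this description).

```python
import librosa
import numpy as np
from sklearn.preprocessing import StandardScaler

def mfcc_feature_vector(clip: np.ndarray, sr: int, n_mfcc: int = 50) -> np.ndarray:
    """50 MFCCs -> per-coefficient mean and std -> single 100-component vector."""
    mfcc = librosa.feature.mfcc(y=clip, sr=sr, n_mfcc=n_mfcc)     # (50, frames)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])  # (100,)

# Build design matrices for the split produced earlier, then standardize
# (zero mean, unit variance). The fitted scaler is reused at prediction time.
X = np.stack([mfcc_feature_vector(c, r) for c, r in zip(clips, rates)])
y = np.asarray(labels)
scaler = StandardScaler().fit(X[idx_train])
X_train, X_val = scaler.transform(X[idx_train]), scaler.transform(X[idx_val])
y_train, y_val = y[idx_train], y[idx_val]
```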
[0040] Once the KNN model (with nearest neighbors parameter of 3) is trained on the training data and validated on the validation data, the model is used to make predictions on respiratory sounds recorded through stethoscope 114.
[0041] Below is a brief description of the windowing method used for each recording to avoid the need for annotation of breath cycles.
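The corresponding training step, under the same assumptions as the preceding sketches, could be as simple as the following.

```python
from sklearn.neighbors import KNeighborsClassifier

# KNN classifier with the nearest-neighbors parameter set to 3, trained on the
# standardized training features and checked against the held-out 30%.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)
print("validation accuracy:", knn.score(X_val, y_val))
```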
[0042] An audio sample is converted into a numpy array format using the Librosa audio library.
[0043] The audio file is then split into smaller time chunk windows disregarding length of breath cycles. The time chunk windows vary in time lengths to simulate real-life breath cycle durations.
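One way such windowing could look is sketched below; the 2-5 second range and the uniform random draw of window lengths are illustrative assumptions, since the description only states that the window lengths vary to mimic breath-cycle durations.

```python
import numpy as np

def split_into_windows(y: np.ndarray, sr: int,
                       min_s: float = 2.0, max_s: float = 5.0,
                       seed: int = 0) -> list:
    """Split a recording into variable-length chunks that mimic breath-cycle
    durations. The 2-5 s range and random length choice are assumptions."""
    rng = np.random.default_rng(seed)
    windows, pos = [], 0
    while pos < len(y):
        length = int(rng.uniform(min_s, max_s) * sr)
        windows.append(y[pos:pos + length])
        pos += length
    return windows
```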
[0044] The audio clips are then passed through the following preprocessing pipeline which includes generating MFCCs, obtaining means and standard deviations in a single vector for these MFCCs, and transforming these MFCC statistics using the Standard Scaler created during training.
[0045] The vectors representing the sound clips, created through the preprocessing pipeline above, are then passed to the KNN model. If the KNN model predicts any of the clips as containing an adventitious breath sound, the whole recording is classified as abnormal.
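Putting the prediction-time pieces together, and reusing the helpers and fitted objects from the sketches above, the screening of a new recording might look like this:

```python
import librosa
import numpy as np

def screen_recording(wav_path, knn, scaler) -> str:
    """Window a recording, featurize each window, and flag the whole recording
    as abnormal if any window is predicted to contain an adventitious sound."""
    y, sr = librosa.load(wav_path, sr=None)
    windows = [w for w in split_into_windows(y, sr) if len(w) > 0]
    feats = np.stack([mfcc_feature_vector(w, sr) for w in windows])
    preds = knn.predict(scaler.transform(feats))   # reuse the training-time scaler
    return "abnormal" if preds.any() else "normal"
```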
[0046] FIG. 2 illustrates a flow diagram of various method steps involved in the working of auscultation system 100 in accordance with an embodiment of the invention.
[0047] A volunteer uses stethoscope 114 (for example, a digital stethoscope) in conjunction with guide device 108. At step 202, the volunteer positions guide device 108 towards a patient. In accordance with an embodiment of the invention, using AR module 112, various auscultation points are overlaid on the patient’s body. The auscultation points are highlighted on the screen of guide device 108 as well.
[0048] In order to identify the auscultation points, the deep learning model of AR module 112 identifies key points such as, but not limited to, the shoulders and hips, on a supplied human image. For instance, using these pre-generated points, nine lung auscultation points are generated using calculations derived through consultation with medical professionals. The specific procedure is as follows.
[0049] Distances of the auscultation points from the pre-generated key points (such as hips and shoulders) are obtained by analyzing data from medical auscultation procedures. Statistical measures such as mean and standard deviation across auscultation sites, marked by different medical professionals, are then calculated. Finally, the auscultation sites or points are generated using the pre-generated key points in conjunction with the above statistical measures.
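The mechanics of that mapping could be expressed as below. The offset table is placeholder data invented purely for illustration; in the disclosure, the real offsets are statistical measures derived from sites marked by medical professionals. The sketch also assumes a roughly upright subject facing the camera and reuses the keypoint dictionary produced by the pose-detection sketch earlier.

```python
import numpy as np

# Placeholder offsets: lateral fraction of shoulder width and fraction of the
# shoulder-to-hip axis, measured from the shoulder midpoint. These numbers are
# illustrative only; the disclosed system derives them from clinical data.
LUNG_SITE_OFFSETS = [(-0.25, 0.15), (0.25, 0.15),
                     (-0.30, 0.35), (0.30, 0.35),
                     (-0.30, 0.55), (0.30, 0.55),
                     (-0.25, 0.75), (0.25, 0.75),
                     (0.00, 0.45)]  # nine sites

def lung_auscultation_points(kp: dict) -> list:
    """Map torso keypoints (from the pose model) to nine overlay coordinates."""
    ls, rs = np.array(kp["left_shoulder"]), np.array(kp["right_shoulder"])
    lh, rh = np.array(kp["left_hip"]), np.array(kp["right_hip"])
    shoulder_mid = (ls + rs) / 2
    hip_mid = (lh + rh) / 2
    torso_axis = hip_mid - shoulder_mid          # scales with body dimensions
    shoulder_width = np.linalg.norm(rs - ls)
    return [tuple(shoulder_mid
                  + np.array([dx * shoulder_width, 0.0])
                  + dy * torso_axis)
            for dx, dy in LUNG_SITE_OFFSETS]
```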
[0050] In an ensuing step 204, the sound signals collected by stethoscope 114 are sent to guide device 108. At step 206, guide device 108 checks if the sound signals captured by stethoscope 114 are satisfactory based on the location. If the sound signals captured are not satisfactory, at step 208, the position of an auscultation point is recalculated, and the volunteer is directed to readjust the position of stethoscope 114 until the sound signals received are satisfactory. The auscultation points are recalibrated using the deep learning model which recalculates positions of the auscultation points for each frame of a live video feed.
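The disclosure does not state how the guide device decides that a capture is "satisfactory"; purely as a placeholder, a crude signal-level check such as the following could stand in for that decision at step 206.

```python
import numpy as np

def signal_is_satisfactory(clip: np.ndarray, rms_floor: float = 0.01,
                           clip_ceiling: float = 0.99) -> bool:
    """Placeholder quality check: reject near-silent or heavily clipped captures.
    The heuristic and its thresholds are assumptions for illustration only and
    presume audio normalized to the [-1, 1] range."""
    rms = float(np.sqrt(np.mean(np.square(clip))))
    clipped_fraction = float(np.mean(np.abs(clip) >= clip_ceiling))
    return rms >= rms_floor and clipped_fraction < 0.01
```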
[0051] At step 210, guide device 108 checks if all the auscultation points have been examined. If all the auscultation points have not been examined, the volunteer is directed to move on to the next auscultation point highlighted on the screen of guide device 108 for continuing the process of auscultation.
[0052] Finally, if all the auscultation points have been examined, at step 212, one or more machine learning models 118 are employed for processing the information to provide a diagnosis by predicting the presence or absence of medical conditions such as, but not limited to, heart murmurs, pneumonia, and abdominal bruits.
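Tying the steps of FIG. 2 together, a loose end-to-end sketch is given below. Here record_at_site is a hypothetical callback standing in for the stethoscope and guide-device plumbing, the recalculation of site positions at step 208 is omitted for brevity, and the helpers come from the earlier sketches.

```python
import numpy as np

def run_screening_session(sites, record_at_site, knn, scaler) -> str:
    """Loose sketch of the FIG. 2 loop: visit every highlighted site, re-record
    until the capture passes the quality check, then screen the pooled audio."""
    captured = []
    for site in sites:                        # steps 202/210: iterate over points
        while True:
            clip, sr = record_at_site(site)   # step 204: sound sent to guide device
            if signal_is_satisfactory(clip):  # step 206: satisfactory?
                break                         # else step 208: readjust and retry
        captured.append((clip, sr))
    feats = np.stack([mfcc_feature_vector(c, sr) for c, sr in captured])
    preds = knn.predict(scaler.transform(feats))        # step 212: classify
    return "abnormal" if preds.any() else "normal"
```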
[0053] The present invention is advantageous in that it provides a system for AI-enabled auscultation using AR. The use of AR with AI ensures that the spots for auscultation can be found easily, thus making the process of auscultation faster and more efficient. Moreover, the use of AR to guide a volunteer while he or she is performing auscultation on the patient allows even an untrained layperson to operate the system. Using a stethoscope connected to a smartphone app which uses AR and AI, even untrained volunteers can perform screening for lung disorders with an accuracy comparable to that of a medical professional.
[0054] Furthermore, extraction of key points such as hips and shoulders before applying the algorithm for stethoscope placement allows for extremely accurate collection of sounds as auscultation sites are customized according to the subject’s body dimensions. This can be used by medical students for training and educational purposes. It can also be used by untrained volunteers as the entire end-to-end process of auscultation, from stethoscope placement to diagnosis, is largely automated by the system of the present invention.
[0055] Additionally, the invention allows for a wide range of diseases to be diagnosed with high accuracy. As additional data is collected, the machine learning model accuracy improves, and diagnosis becomes better over time. Therefore, the invention provides access to quality healthcare, with an accuracy of diagnosis that is independent of the skill and expertise of the person performing auscultation.
[0056] Those skilled in the art will realize that the above recognized advantages and other advantages described herein are merely exemplary and are not meant to be a complete rendering of all of the advantages of the various embodiments of the present invention.
[0057] The system as described in the invention, or any of its components, may be embodied in the form of a computing device. The computing device can be, for example, but not limited to, a general-purpose computer, a programmed microprocessor, a micro-controller, a peripheral integrated circuit element, and other devices or arrangements of devices, which are capable of implementing the steps that constitute the method of the invention. The computing device includes a processor, a memory, a nonvolatile data storage, a display, and a user interface.
[0058] In the foregoing specification, specific embodiments of the present invention have been described. However, one of ordinary skill in the art appreciates that various modifications and changes can be made without departing from the scope of the present invention. Accordingly, the specification and figures are to be regarded in an illustrative rather than a restrictive sense, and all such modifications are intended to be included within the scope of the present invention.

Claims

I/We Claim:
1. An auscultation system (100) for guiding a user to perform auscultation on a subject, the auscultation system (100) comprising:
   a guide device (108), wherein the guide device (108) comprises:
      a computer vision model (110) configured to identify one or more auscultation sites on the subject; and
      an augmented reality module (112) configured to display the one or more auscultation sites on a screen of the guide device (108); and
   a stethoscope (114) communicatively coupled to the guide device (108), wherein the stethoscope (114) is configured to capture sound signals obtained from positioning the stethoscope (114) on the subject at the one or more auscultation sites,
   wherein the guide device (108) is configured to:
      check if sound signals captured using the stethoscope (114) are satisfactory based on location of the one or more auscultation sites;
      recalculate position of the one or more auscultation sites using the computer vision model (110) if the sound signals captured are not satisfactory; and
      direct the user to readjust the position of the stethoscope (114) until the sound signals captured by the stethoscope (114) are satisfactory.
2. The auscultation system (100) as claimed in claim 1, wherein the augmented reality module (112) employs a deep learning model, TensorFlow Lite PoseNet, to detect key locations on the subject’s body to identify the one or more auscultation sites.
3. The auscultation system (100) as claimed in claim 1, wherein the augmented reality module (112) overlays the one or more auscultation sites on the subject’s body and the one or more auscultation sites are highlighted on the screen of the guide device (108).
4. The auscultation system (100) as claimed in claim 1, wherein the stethoscope (114) is one of a digital stethoscope and an acoustic stethoscope.
5. The auscultation system (100) as claimed in claim 1, wherein the guide device (108) further directs the user to place the stethoscope (114) at the one or more auscultation sites again to test for vocal resonance after performing a preliminary auscultation on the subject, wherein voice commands are used to direct the subject to pronounce a plurality of phrases and the corresponding sounds from the subject are recorded using the stethoscope (114).
6. The auscultation system (100) as claimed in claim 1, wherein the one or more auscultation sites are recalibrated and customized for accurate placement of the stethoscope (114) based on at least one of movement of the stethoscope (114) and the subject’s body dimensions.
7. The auscultation system (100) as claimed in claim 1 further comprises a diagnostic module (116) configured to interpret sound signals collected from the stethoscope (114) at the one or more auscultation sites using one or more machine learning models (118), wherein the one or more machine learning models (118) classify the sound signals and predict various medical conditions/disorders.
8. The auscultation system (100) as claimed in claim 7, wherein the diagnostic module (116) employs a K-nearest neighbors machine learning model to identify abnormalities in the sound signals recorded by the stethoscope (114) and returns a diagnosis and/or a screening report to the user.
PCT/IN2021/050448 2020-05-12 2021-05-10 Auscultation system for guiding a user to perform auscultation on a subject WO2021229600A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN202041019944 2020-05-12
IN202041019944 2020-05-12

Publications (1)

Publication Number Publication Date
WO2021229600A1 (en) 2021-11-18

Family

ID=78525405

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IN2021/050448 WO2021229600A1 (en) 2020-05-12 2021-05-10 Auscultation system for guiding a user to perform auscultation on a subject

Country Status (1)

Country Link
WO (1) WO2021229600A1 (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2019040000A (en) * 2017-08-24 2019-03-14 国立大学法人千葉大学 Auscultation training system
TW202011896A (en) * 2017-09-28 2020-04-01 聿信醫療器材科技股份有限公司 Electronic stethoscope systems, input unit and method for monitoring a biometric characteristic
US20190279768A1 (en) * 2018-03-06 2019-09-12 James Stewart Bates Systems and methods for audio medical instrument patient measurements

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21802914

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21802914

Country of ref document: EP

Kind code of ref document: A1