WO2016154139A1 - Sound-based spirometric devices, systems, and methods using audio data transmitted over a voice communication channel - Google Patents

Info

Publication number
WO2016154139A1
Authority
WO
WIPO (PCT)
Prior art keywords
audio data
lung function
indication
regression
communication device
Application number
PCT/US2016/023468
Other languages
English (en)
Inventor
Shwetak N. Patel
Mayank Goel
Elliot N. SABA
Original Assignee
University Of Washington
Application filed by University of Washington
Publication of WO2016154139A1

Classifications

    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/08 Detecting, measuring or recording devices for evaluating the respiratory organs
    • A61B5/091 Measuring volume of inspired or expired gases, e.g. to determine lung capacity
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/08 Detecting, measuring or recording devices for evaluating the respiratory organs
    • A61B5/087 Measuring breath flow
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/68 Arrangements of detecting, measuring or recording means, e.g. sensors, in relation to patient
    • A61B5/6887 Arrangements of detecting, measuring or recording means, e.g. sensors, in relation to patient mounted on external non-worn devices, e.g. non-medical devices
    • A61B5/6898 Portable consumer electronic devices, e.g. music players, telephones, tablet computers
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B2562/00 Details of sensors; Constructional details of sensor housings or probes; Accessories for sensors
    • A61B2562/02 Details of sensors specially adapted for in-vivo measurements
    • A61B2562/0204 Acoustic sensors

Definitions

  • Examples described herein relate to testing lung function; examples of evaluating lung function through a phone call are described.
  • Spirometry is a mainstay for measuring lung function and is central for the diagnosis of chronic lung impairments, such as asthma, chronic obstructive pulmonary disease (COPD), and cystic fibrosis.
  • the spirometer measures the instantaneous flow and cumulative volume of exhaled air. It then calculates various lung function measures, such as levels of obstruction or restriction, to help diagnose and manage various pulmonary conditions.
  • Spirometry is a widely employed pulmonary function test.
  • There are many different types of spirometers available, ranging from large clinical spirometers to portable home spirometers. Their cost generally ranges from $1,000 USD to $5,000 USD.
  • the spirometer measures the amount and speed of airflow and calculates various indications of lung function on the basis of the test.
  • Four example indications of lung function are:
  • a healthy individual's lung function measures are at least 80% of the values predicted based on their age, height, and gender.
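  • As a small illustration of the 80% rule of thumb above, a measured value can be expressed as a percentage of its predicted value and compared against a threshold. This is only a sketch: the function names and the example numbers are hypothetical, and the predicted value (derived from age, height, and gender) is assumed to be supplied from elsewhere.

```python
def percent_of_predicted(measured: float, predicted: float) -> float:
    """Return a measured value as a percentage of its predicted value."""
    if predicted <= 0:
        raise ValueError("predicted value must be positive")
    return 100.0 * measured / predicted


def meets_threshold(measured: float, predicted: float,
                    threshold_pct: float = 80.0) -> bool:
    """True when the measure is at least `threshold_pct` percent of predicted."""
    return percent_of_predicted(measured, predicted) >= threshold_pct


# Hypothetical example values in liters; not taken from the source.
fev1_pct = percent_of_predicted(measured=2.4, predicted=4.0)
fev1_ok = meets_threshold(measured=2.4, predicted=4.0)
```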
  • Abnormal values of FEV1% may be (expressed as a percent of predicted value):
  • spirometers may also generate Flow vs. Time (FT), Flow vs. Volume (FV), and Volume vs. Time (VT) plots.
  • Figure 1 is an example flow vs. volume plot for normal, obstructive, and restrictive cases.
  • the plot 102 includes normal case 104, obstructive case 106, and restrictive case 108.
  • the plot 102 shows flow on the y-axis in liters/second, and volume on the x-axis in liters.
  • the plot 102 also illustrates the PEF indication of lung function (e.g. shown as a peak amplitude of the plot in any given case), the FEV1 indication (e.g. shown as a volume reached in certain cases after one second), and the FVC indication (e.g. shown as total volume expelled in certain cases).
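  • The three indications read off the plot can be sketched numerically. The example below assumes uniformly sampled expiratory flow in liters/second starting at the onset of exhalation, which is an assumption made for illustration rather than something the source specifies; the function name and the synthetic flow curve are likewise hypothetical.

```python
def spirometry_measures(flow_lps, sample_rate_hz):
    """Compute PEF, FEV1, and FVC from uniformly sampled expiratory flow.

    flow_lps: flow samples in liters/second, starting at onset of exhalation.
    sample_rate_hz: number of flow samples per second.
    """
    dt = 1.0 / sample_rate_hz
    # Cumulative exhaled volume by trapezoidal integration of flow.
    volume = [0.0]
    for prev, cur in zip(flow_lps, flow_lps[1:]):
        volume.append(volume[-1] + 0.5 * (prev + cur) * dt)

    pef = max(flow_lps)                       # peak of the flow curve
    fvc = volume[-1]                          # total volume expelled
    one_second = min(int(round(sample_rate_hz)), len(volume) - 1)
    fev1 = volume[one_second]                 # volume reached after one second
    ratio = fev1 / fvc if fvc else float("nan")
    return {"PEF": pef, "FEV1": fev1, "FVC": fvc, "FEV1/FVC": ratio}


# Synthetic effort: flow falls linearly from 8 L/s to 0 over two seconds.
synthetic_flow = [8.0 * (1.0 - i / 200.0) for i in range(201)]
measures = spirometry_measures(synthetic_flow, sample_rate_hz=100)
```

For this synthetic curve the flow integrates to 8 L in total, of which 6 L is expelled in the first second.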
  • the normal case 104 illustrates that a descending limb of the FV plot is almost a straight line.
  • the flow rate decreases faster than exponentially after reaching its maximum value (PEF). Therefore, it attains a curved or "scooped" slope in the obstructive case 106.
  • in a restrictive lung disease such as cystic fibrosis, the respiratory muscles weaken and the patient's lung capacity (FVC) decreases, as shown in the restrictive case 108.
  • Some example methods include receiving, from a communication device, through a voice communication channel, audio data obtained by a microphone of the communication device during inhalation and exhalation of a user, and transmitting, to the communication device, an indication of lung function of the user based on the audio data.
  • the audio data is compressed by the voice communication channel.
  • the audio data is compressed by the voice communication channel using linear predictive coding.
  • the voice communication channel is a GSM channel.
  • the indication of lung function is provided using audio or short messaging service.
  • Some example methods further include extracting features from the audio data and determining the indication of lung function based, at least in part, on the features.
  • the features include linear predictive coding features, signal envelope, and resonance tracking features.
  • determining the indication of lung function includes performing regression on the features.
  • performing regression includes performing a linear regression, a least-angle regression, an elastic net regression, and a kNN regression.
  • determining the indication of lung function includes taking a median of regressions performed.
  • Some examples further include estimating channel state information of the voice communication channel, and determining the indication of lung function based, at least in part, on the channel state information.
  • the indication of lung function includes forced vital capacity, forced expiratory volume in one second, peak expiratory flow, ratios of those, or combinations thereof.
  • Some example methods include holding a communication device at arm's length, fully inhaling and exhaling, transmitting audio data collected by the communication device during said fully inhaling and exhaling to a remote server through a voice communication channel, and receiving an indication of lung function corresponding to the audio data at the communication device through audio or short messaging service communication.
  • the voice communication channel is a GSM channel.
  • Some example methods further include dialing, with the communication device, a predetermined phone number for a call-in service prior to said transmitting audio data.
  • fully inhaling and exhaling includes performing a spirometry effort.
  • the indication of lung function is based, at least in part, on features extracted from the audio data.
  • the indication of lung function is based, at least in part, on a linear regression, a least-angle regression, an elastic net regression, and a kNN regression of the features.
  • Some examples further include transmitting a set of pilot tones through the voice communication channel.
  • Some example systems include a communication device, and a computing system, the computing system including at least one processing unit and computer readable media encoded with instructions which, when executed by the at least one processing unit, cause the computing system to perform actions including: receiving, from the communication device, through a voice communication channel, audio data obtained by a microphone of the communication device during inhalation and exhalation of a user, and transmitting, to the communication device, an indication of lung function of the user based on the audio data.
  • the computing system may be a first computing system and the communication device may be configured to provide the audio data to a second computing system through the voice communication channel.
  • the second computing system is configured to provide the audio data to the first computing system.
  • the second computing system includes a voice server of a telecommunications network.
  • the indication of lung function is provided using audio or short messaging service.
  • the actions further include extracting features from the audio data and determining the indication of lung function based, at least in part, on the features.
  • the actions further include performing regression on the features including performing a linear regression, a least-angle regression, an elastic net regression, and a kNN regression.
  • the actions further include estimating channel state information of the voice communication channel, and determining the indication of lung function based, at least in part, on the channel state information.
  • FIG. 1 is an example flow vs. volume plot for normal, obstructive, and restrictive cases.
  • FIG. 2 is a schematic illustration of a system arranged in accordance with examples described herein.
  • FIG. 3 is a flowchart of a method arranged in accordance with examples described herein.
  • FIG. 4 is a flowchart of a method arranged in accordance with examples described herein.
  • FIG. 5 is a flowchart of an example method arranged in accordance with examples described herein.
  • Local smartphone applications may be provided that sample the smartphone microphone and send digital sound generated by the patient's vocal tracts as well as relevant features captured during the maneuver to a central server.
  • the server may then calculate the expiratory flow rate using a physiological model of the vocal tract and a model of the reverberation of sound around the user's head.
  • Some example applications are described, for example, in co-pending U.S. Application Serial Number 14/400,064, filed May 10, 2013, entitled "Sound-based spirometric devices, systems, and methods," which application is hereby incorporated by reference in its entirety for any purpose, and U.S. Provisional Application No. 61/645,176, filed May 10, 2012, which application is incorporated herein by reference in its entirety for any purpose.
  • Examples described herein include a call-in service that measures lung function using any phone without the need for an application running on the phone itself. Examples described herein may enable the user to estimate their lung function by holding the phone at arm's length, fully inhaling, and forcefully exhaling until all the air is expelled.
  • the collected audio data may be transmitted to a server over a standard voice channel.
  • the server may calculate clinically relevant lung function measures, which can be reported back to the participants using audio during the call or over SMS.
  • the ability to use a server to analyze audio data transmitted from any mobile phone, be it a wired phone, feature phone or smartphone, may eliminate or reduce a need to develop a specialized application for every phone platform. This approach may make the client device, channel, and app agnostic and keep the intelligence on the central server.
  • examples described herein utilize a microphone on a user's communication device to obtain audio data relating to a user's spirometry maneuver (e.g. full and/or forced inhalation and exhalation).
  • No intermediate device or transducer may be used in some examples - e.g. the communication device may capture audio data resulting from an unaided inhalation and exhalation.
  • a whistle or other device may be used to convert the sounds of the spirometry maneuver into tones that are then captured by the microphone of the communication device.
  • Such a whistle or other device may be advantageous in users having a very low flow rate, such that the sound generated by the vocal tracts might be negligible.
  • Such patients may usually have very low lung function (e.g. FEV1% < 50%) and are generally willing to carry an additional lung function measurement device with them.
  • the whistle or other device may be provided as an attachment or accessory for a user's communication device.
  • FIG. 2 is a schematic illustration of a system arranged in accordance with examples described herein.
  • the system 214 includes a communication device 210, voice communication channel 218, optional voice server 212, optional connection 220, and computing system 202.
  • the computing system 202 includes one or more processing unit(s) 204 and memory 206 which may encode executable instructions for measuring lung function 208.
  • the communication device 210 may record audio data from a user's spirometry effort.
  • the audio data may be provided over the voice communication channel 218 to the computing system 202.
  • the audio data may be provided to the computing system 202 through use of one or more voice servers, such as the voice server 212.
  • the voice communication channel 218 may compress and/or encode the audio data, and accordingly the methods and algorithms employed by the computing system 202 (e.g. the instructions provided by the executable instructions for measuring lung function 208) are sufficiently robust in some examples to provide indications of lung function even in the presence of compression or other loss through the voice communication channel 218.
  • the communication device 210 may be implemented using generally any device suitable for placing a call over the voice communication channel 218. Examples include, but are not limited to, cell phones, smart phones, feature phones, mobile phones, landline phones, satellite phones, Internet phones, and combinations thereof. Other devices capable of communication over the voice communication channel 218 may be used including tablets, computers, servers, laptops, automobiles, appliances, wearable devices, virtual and augmented reality devices, and combinations thereof. While a single communication device 210 is shown in Figure 2, any number of communication devices may provide audio data to the computing system 202 over the voice communication channel 218 and/or other voice communication channels.
  • the communication device 210 generally includes a microphone to transduce analog sound waves, such as the sounds produced by a user performing a spirometry effort (e.g. full inhalation and exhalation and/or inhalation followed by forced exhalation), into electronic signals.
  • the communication device 210 generally further includes a transmitter for transmitting audio data (e.g. data indicative of the electronic signals provided by the microphone) over the voice communication channel 218.
  • the communication device 210 may obtain audio data during a spirometry effort of a user by capturing audio data during the spirometry effort.
  • the spirometry effort may in some examples be unaided by any further audio-generating apparatus (e.g. a mobile phone itself may be used to pick up the sounds of a user performing a spirometry effort).
  • the user may utilize a device, e.g. a whistle or other tone-producing device that may convert flow during a spirometry effort into audio tones that are then recorded by the communication device.
  • the voice communication channel 218 may be implemented using generally any communication channel used to carry voice communications.
  • Communication channels used in telephony networks may be used to implement the voice communication channel 218. Examples include, but are not limited to, GSM, UMTS, LTE, or PSTN communication channels. While voice communication channels associated with telephony networks may be used to implement the voice communication channel 218, in some examples, Wi-Fi may additionally or instead be used to transmit data between the communication device 210 and the computing system 202.
  • the computing system 202 may be implemented using, for example, a server (e.g. a remote server), a controller, a microcontroller, a desktop, a laptop, a tablet, or a smartphone.
  • the computing system 202 may include one or more processing unit(s) 204 (e.g. processors, circuitry, firmware) and computer readable media (e.g. memory 206) which may be encoded with executable instructions for measuring lung function 208.
  • the computing system 202 may be programmed with the executable instructions for measuring lung function 208.
  • the computer readable media may be implemented using generally any electronic memory including, but not limited to, Flash, SSD, RAM, ROM, or disk drives. It should be understood that the arrangement of the computing system 202 is quite flexible. For example, the processing unit(s) 204 and memory 206 may in some examples be distributed and in electronic communication. The memory 206 (or other computer readable media included in the computing system 202) may store data used during the lung function analysis described herein, including, but not limited to, lung function parameters.
  • the computing system 202 may include various other computing system components, including but not limited to input and/or output device(s) such as display(s), keyboard(s), mice, touchscreen(s), virtual or augmented reality displays, or the like.
  • the computing system 202 may include a receiver to receive audio data from the communication device 210 over the voice communication channel 218. In some examples, the computing system 202 may receive the audio data from the voice communication channel 218 directly.
  • the computing system 202 may include a transmitter to transmit indications of lung function back to the communication device 210 through the voice communication channel 218.
  • the indications of lung function may be provided, for example, in audio or using short messaging service (SMS).
  • an optional voice server 212 may mediate communication from the voice communication channel 218 to the computing system 202 over connection 220.
  • the voice server 212 may be a voice server of a telecommunication network in some examples.
  • the connection 220 may be implemented, for example, using a Wi-Fi, Internet, or wired connection.
  • another voice communication channel (e.g. using a telephony network) may be used to implement the connection 220.
  • the computing system 202 may receive audio data obtained by a microphone of the communication device 210 during performance of a spirometry effort by a user (e.g. an inhalation and exhalation).
  • the computing system 202 may process the audio data using the executable instructions for measuring lung function 208 to determine one or more indicators of lung function (e.g. PEF, FEV1, FVC) based on the audio data.
  • the computing system 202 may provide one or more indications of lung function back to the communication device 210 over the voice communication channel 218 (in some examples indirectly through the voice server 212 or other intermediate server).
  • the computing system 202 may generally be configured (e.g. programmed) to implement machine learning techniques to determine indications of lung function based on received audio data.
  • the executable instructions for measuring lung function 208 may include instructions for performing machine learning techniques described herein.
  • the executable instructions for measuring lung function 208 may, when executed by the processing unit(s) 204, cause the computing system 202 to perform functions described herein, including the receipt of audio data from the communication device 210 and transmission of one or more indicators of lung function to the communication device 210.
  • the executable instructions for measuring lung function 208 include instructions for extracting features from the audio data and determining the indication of lung function based, at least in part, on the features.
  • the executable instructions for measuring lung function 208 include instructions for performing regression on the features including performing a linear regression, a least-angle regression, an elastic net regression, and a kNN regression.
  • the instructions further include instructions for estimating channel state information of the voice communication channel 218 and determining the indication of lung function based, at least in part, on the channel state information.
  • the method 300 includes, at block 302, receiving, from a communication device, through a communication channel, audio data obtained by a microphone of the communication device during inhalation and exhalation of a user.
  • features are extracted from the audio data.
  • an indication of lung function is determined based, at least in part, on the features.
  • the indication of lung function is transmitted to the communication device.
  • the system 214 of Figure 2 may be used to implement the method 300.
  • the executable instructions for measuring lung function 208 may include instructions for causing the computing system 202 to perform the actions in blocks 302, 304, 306, and 308.
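  • The flow of blocks 302 through 308 can be pictured as a simple composition of stages. The sketch below is illustrative only: `run_method_300`, the toy feature extractor, the toy model, and the message format are all hypothetical stand-ins, not the patent's implementation, and the audio is assumed to have already been received (block 302).

```python
from typing import Callable, Dict, List

Features = Dict[str, float]


def run_method_300(
    audio: List[float],
    extract_features: Callable[[List[float]], Features],
    determine_indication: Callable[[Features], float],
    transmit: Callable[[str], None],
) -> float:
    """Apply blocks 304-308 to audio already received in block 302."""
    features = extract_features(audio)                # block 304
    indication = determine_indication(features)       # block 306
    transmit(f"Estimated FEV1: {indication:.2f} L")   # block 308 (audio or SMS)
    return indication


# Toy stand-ins so the sketch runs; real stages would be far richer.
def toy_features(audio: List[float]) -> Features:
    return {"peak": max(abs(s) for s in audio)}


def toy_model(features: Features) -> float:
    return 4.0 * features["peak"]   # arbitrary illustrative mapping


sent_messages: List[str] = []
result = run_method_300([0.1, -0.5, 0.25], toy_features, toy_model,
                        sent_messages.append)
```

Passing the stages in as callables keeps the server-side logic independent of how audio arrives or how the result is delivered, in line with the client-agnostic approach described earlier.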
  • the audio data is received through a voice communication channel, such as the voice communication channel 218 of Figure 2.
  • the voice communication channel may compress the audio data.
  • the audio data may be compressed using linear predictive coding.
  • the voice communication channel compresses, and in some examples also or instead encodes, the audio data
  • the methods used to determine an indication of lung function based on the audio data are advantageously sufficiently robust to operate on the compressed and/or encoded data on receipt through the communication channel.
  • LPC features may advantageously be preserved through the channel and used to determine indications of lung function.
  • GSM voice coding technologies use a source-filter model for speech, where representations of the filter created by the vocal tract and the source excitation created by the vocal cords are transmitted in lieu of the raw audio.
  • the most common method for separating out the source excitation from the vocal tract filter is to use LPC. It isolates all linear relationships up to a certain order in the signal, which include the resonances due to the vocal tract's filtering. These resonances are also used in methods described herein for determining indications of lung function.
  • the LPC encoding generally preserves much of the information in the signal used to determine indications of lung function in accordance with examples described herein.
  • the LPC features may be present in audio data received over a voice communication channel in some examples despite many smaller details being lost such as higher harmonics of the fundamental resonance and spectral energy above a threshold, such as 4 kHz.
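  • To make the LPC step concrete, the sketch below estimates predictor coefficients with the autocorrelation method and the Levinson-Durbin recursion, a standard textbook way of computing LPC. It is not claimed to be the codec's or the patent's exact procedure; the function names and the synthetic test signal are assumptions for illustration.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation of x for lags 0..max_lag (no normalization)."""
    return [sum(x[i] * x[i + lag] for i in range(len(x) - lag))
            for lag in range(max_lag + 1)]


def lpc_coefficients(x, order):
    """Return (a, error) where x[n] is predicted as sum(a[k-1] * x[n-k])."""
    r = autocorrelation(x, order)
    if r[0] == 0.0:
        raise ValueError("signal has no energy")
    a = [0.0] * (order + 1)   # a[k] multiplies x[n-k]; a[0] is unused
    error = r[0]              # prediction error energy at order 0
    for i in range(1, order + 1):
        reflection = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / error
        updated = a[:]
        updated[i] = reflection
        for j in range(1, i):
            updated[j] = a[j] - reflection * a[i - j]
        a = updated
        error *= 1.0 - reflection * reflection
    return a[1:], error


# A decaying exponential is well described by one linear relationship:
# each sample is 0.9 times the previous one.
signal = [0.9 ** n for n in range(200)]
coeffs, residual_energy = lpc_coefficients(signal, order=2)
```

For this signal the first coefficient comes out close to 0.9 and the second close to zero, reflecting that a single-tap predictor already captures the structure.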
  • the source excitation of the signal may be μ-law encoded and transmitted. This encoding may compand the amplitude of the signal (e.g. compress it before transmission and expand it after reception), causing a reduction in effective dynamic range at large signal values. Examples of methods described herein for determining indications of lung function may be sufficiently robust to operate in the presence of the reduction in effective dynamic range.
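  • The effect of companding on large amplitudes can be seen in a short sketch of the continuous μ-law curve. The value μ = 255 and the 127-level quantizer are assumptions chosen for illustration (they are common in 8-bit telephony), not parameters stated in the source.

```python
import math

MU = 255.0  # assumed; commonly used for 8-bit mu-law telephony


def mu_compress(x: float, mu: float = MU) -> float:
    """Map x in [-1, 1] to a companded value in [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def mu_expand(y: float, mu: float = MU) -> float:
    """Inverse of mu_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)


def quantization_step(level: int, levels: int = 127) -> float:
    """Amplitude gap between adjacent quantizer levels after expansion."""
    return mu_expand((level + 1) / levels) - mu_expand(level / levels)


roundtrip = mu_expand(mu_compress(0.3))
small_step = quantization_step(5)     # near zero amplitude
large_step = quantization_step(125)   # near full scale
```

Without quantization the round trip is lossless, but once the companded value is quantized the spacing between representable amplitudes is much coarser near full scale than near zero, which is the loss of resolution at large signal values noted above.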
  • low-energy components in the audio data may be suppressed by transmission through the voice communication channel.
  • the energy of the signal may abruptly cut off in patches. Directly estimating FVC from the signal may therefore be more difficult than if low-energy signal components were not sacrificed. However, measures such as FEV1 may still be easily estimated, particularly in examples where the signal stays above a noise floor for the initial segment of a spirometry effort.
  • features may be extracted from the audio data, e.g. using the computing system 202 of Figure 2.
  • the features extracted may generally fall in three categories: temporal envelope detection, spectrogram processing, and linear predictive coding (LPC). Accordingly, signal envelope features may be extracted, resonance tracking features may be extracted, linear predictive coding features may be extracted, or combinations thereof.
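  • As one illustration of the temporal envelope category, the sketch below rectifies the signal and smooths it with a moving average. This is a deliberately simple stand-in and not the Hilbert-transform envelope named elsewhere in this description; the function name and window length are hypothetical.

```python
def moving_average_envelope(samples, window):
    """Approximate the temporal envelope by rectifying and smoothing."""
    if window < 1:
        raise ValueError("window must be at least 1 sample")
    rectified = [abs(s) for s in samples]
    envelope = []
    running = 0.0
    for i, value in enumerate(rectified):
        running += value
        if i >= window:
            running -= rectified[i - window]   # drop the sample leaving the window
        envelope.append(running / min(i + 1, window))
    return envelope


steady = moving_average_envelope([1.0, -1.0] * 4, window=3)
rising = moving_average_envelope([0.0, 0.0, 2.0, -2.0], window=2)
```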
  • the voice communication channel may be implemented using a channel, such as a GSM channel, which uses LPC to encode voice. Accordingly, in some examples features extracted in block 304 based on LPC remain largely preserved.
  • an indication of lung function may be determined based, at least in part, on the features.
  • Machine learning techniques may be employed to determine one or more indications of lung function based on the extracted features. For example, forced vital capacity (FVC), forced expiratory volume in one second (FEV1), peak expiratory flow (PEF), ratios of those, or combinations of those may be determined based, at least in part, on the features extracted in block 304.
  • other indications of lung function may be determined (e.g. pass/fail in accordance with a selected metric).
  • Features may be weighted in performing a determination of lung function.
  • the weighting may be dependent on the voice communication channel through which the audio data was received. For example, GSM communication channels may preserve LPC features well, while other features may be more distorted by the communication channel. Accordingly, when the voice communication channel used to receive the audio data is a GSM channel (or other channel employing LPC encoding techniques), the LPC features extracted in block 304 may be weighted more heavily in a determination of lung function in block 306.
  • weights for the various features may be determined through a calibration procedure.
  • calibration spirometry efforts may be performed using a particular type (e.g. make, model) of communication device over a particular type (e.g. telephony network) of voice communication channel. Those efforts may be compared with spirometry efforts performed on a clinical device. The comparison may be used to determine a set of weights for extracted features which are best able to predict the indication of lung function given by the clinical device performance. Weightings may be stored (e.g. accessible to the computing system of Figure 2) and used for audio data received from a same or similar communication device type over the same or similar voice communication channel. In some examples, one or more regressions may be performed on the features.
  • regressions include but are not limited to, linear regression, least-angle regression, elastic net regression, and kNN regression.
  • a linear regression, a least-angle regression, an elastic net regression, and a kNN regression may be performed.
  • A median of the regressions may be taken to determine one or more indications of lung function.
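  • The idea of taking a median across several regressions can be sketched for a single feature. The code below is a simplified illustration under stated assumptions: it uses one input feature, a closed-form single-feature elastic net under a common parameterization (mean squared error divided by two, plus L1 and L2 penalties), and made-up training data. Least-angle regression is left out because, with one feature, its unregularized end point is the least-squares fit. All function names are hypothetical.

```python
from statistics import median


def _moments(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    return mx, my, sxx, sxy


def fit_linear(xs, ys):
    """Ordinary least squares with one feature."""
    mx, my, sxx, sxy = _moments(xs, ys)
    slope = sxy / sxx
    return lambda x: my + slope * (x - mx)


def fit_elastic_net(xs, ys, alpha=0.1, l1_ratio=0.5):
    """Closed-form single-feature elastic net: soft-threshold, then shrink."""
    mx, my, sxx, sxy = _moments(xs, ys)
    l1, l2 = alpha * l1_ratio, alpha * (1.0 - l1_ratio)
    shrunk = max(abs(sxy) - l1, 0.0) * (1.0 if sxy >= 0 else -1.0)
    slope = shrunk / (sxx + l2)
    return lambda x: my + slope * (x - mx)


def fit_knn(xs, ys, k=3):
    """k-nearest-neighbor regression: average the k closest targets."""
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)

    return predict


def ensemble_median(models, x):
    """Median of the individual model predictions for input x."""
    return median(model(x) for model in models)


# Made-up training data: feature value vs. reference lung function measure.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]
models = [fit_linear(xs, ys), fit_elastic_net(xs, ys), fit_knn(xs, ys)]
prediction = ensemble_median(models, 6.0)
```

Taking the median rather than the mean limits the influence of any single regressor that extrapolates poorly; here the kNN model cannot predict beyond the range of its training targets, and the median sides with the two linear models.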
  • the indication of lung function may be transmitted to the communication device.
  • the indication of lung function may be provided, for example, using audio or short messaging service.
  • the indication of lung function may be read back to a user of the communication device in a return call, or in a same call session as initiated to receive the audio data from the communication device.
  • an ensemble of machine learning algorithms may be used to process the extracted features into predicted values in block 306, the median of which may be taken to be the prediction for that particular indication (e.g. spirometry measure).
  • the machine learning algorithms generally may learn the relationship between the features and proper predicted values for the diagnostic measures; however, these relationships can change depending on the phone model, cell carrier, and/or call quality used to provide the audio data from the user to the computing system used to perform the machine learning techniques.
  • a dynamic "channel estimation" algorithm may send a predetermined signal (e.g. a prerecorded audio file of noise, or pilot tones played through the speakers of another device) over the voice communication channel (e.g. cell network) to probe it and determine relative weightings of the features being calculated.
  • the methods may include estimating channel state information of a voice communication channel.
  • the computing system 202 of Figure 2 may estimate channel state information of the voice communication channel 218.
  • Channel state information generally refers to information regarding perturbations that may be imposed on a signal by the channel itself.
  • Channel state information may, for example, refer to a frequency response of the voice communication channel.
  • the executable instructions for measuring lung function 208 may include instructions for estimating channel state information (e.g. estimating the channel).
  • the channel state information may be estimated, for example, by having a user of a communication device (or another transmitting device) transmit a known signal (e.g. a tone or sequence of tones) through the voice communication channel.
  • a pilot tone is swept across frequencies (e.g. 0 kHz to 20 kHz) at a constant amplitude.
  • the received signals may be measured by a computing system (e.g. computing system 202 of Figure 2), resulting in a measure of a frequency response of the voice communication channel.
  • the frequency response may be used to tune (e.g. adjust) frequency-domain-based features extracted from the audio data.
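  • The pilot-tone procedure can be sketched as follows. The example is a simulation under stated assumptions: an 8 kHz sample rate, three pilot frequencies chosen to complete whole cycles in the analysis window, and a made-up channel that simply scales each tone. The function names and the `floor` guard are hypothetical; a real channel would also add noise, delay, and nonlinear coding effects.

```python
import cmath
import math


def tone(freq_hz, sample_rate_hz, n_samples, amplitude=1.0):
    """Generate a constant-amplitude sine tone."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(n_samples)]


def tone_amplitude(signal, freq_hz, sample_rate_hz):
    """Amplitude at one frequency via a single-bin DFT (exact for whole cycles)."""
    acc = sum(s * cmath.exp(-2j * math.pi * freq_hz * n / sample_rate_hz)
              for n, s in enumerate(signal))
    return 2.0 * abs(acc) / len(signal)


def estimate_response(sent_by_freq, received_by_freq, sample_rate_hz):
    """Per-frequency gain: received pilot amplitude over sent pilot amplitude."""
    return {f: tone_amplitude(received_by_freq[f], f, sample_rate_hz)
               / tone_amplitude(sent_by_freq[f], f, sample_rate_hz)
            for f in sent_by_freq}


def equalize(spectral_features, response, floor=1e-3):
    """Undo the channel gain on frequency-domain features keyed by frequency."""
    return {f: v / max(response[f], floor) for f, v in spectral_features.items()}


FS, N = 8000, 800
assumed_gain = {500: 1.0, 1000: 0.5, 2000: 0.25}   # made-up channel
sent = {f: tone(f, FS, N) for f in assumed_gain}
received = {f: [g * s for s in sent[f]] for f, g in assumed_gain.items()}
response = estimate_response(sent, received, FS)
restored = equalize({1000: 0.2}, response)
```

The `floor` argument keeps the correction bounded at frequencies the channel nearly removes, where dividing by the measured gain would mostly amplify noise.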
  • the channel state information may be used, for example, to select machine learning techniques and/or associations to be used in determining indication(s) of lung function from extracted features of the audio data.
  • Figure 4 is a flowchart of a method arranged in accordance with examples described herein.
  • the method 400 describes how one or more indications of lung function (e.g. Lung function estimate 420) may be determined based on audio data 402.
  • the method 400 may be used to implement, for example, block 306 of Figure 3.
  • the executable instructions for measuring lung function 208 may include instructions for performing the method 400 in some examples.
  • Audio data 402 may be collected by a communication device, e.g. communication device 210, during a spirometry effort and transmitted over a voice communication channel.
  • the audio data (e.g. audio samples) originally collected by the communication device may be a function of pressure because the microphone may be considered essentially an uncalibrated, AC-coupled pressure sensor. After compensating for pressure losses between the mouth and microphone, flow may be approximated from pressure and the AC-coupled artifacts reduced and/or removed.
  • Sound is generally measured in pressure. Without being bound by theory, sound may also be generated during a subject's spirometry effort (e.g. forced expiratory maneuver) by the resulting expiratory airflow. For example, sound is generated by the expiratory airflow as it passes through the subject's vocal tract, through the subject's mouth, and through the environment surrounding the subject.
  • audio data of sound generated during a subject's spirometry effort e.g. forced expiratory maneuver
  • Features may be extracted from the audio data 402.
  • the features may include envelope features (e.g. Hilbert envelope 404), resonance tracking features (e.g. Resonance tracking 406), linear predictive coding features (e.g. Linear predictive coding 408), or combinations thereof. Examples of extraction of the features are described, for example, in co-pending U.S. Application Serial Number 14/400,064, filed May 10, 2013 entitled "Sound-based spirometric devices, systems, and methods," which application is hereby incorporated by reference in its entirety for any purpose, and U.S. Provisional Application No. 61/645,176, filed May 10, 2012, which application is incorporated herein by reference in its entirety for any purpose.
  • the sound of a subject's spirometry effort (e.g. forced expiratory maneuver) at the microphone may be influenced by the flow rate of air from the subject's lungs and may include superfluous sounds generated by the expiratory airflow, for example, as the expiratory airflow passes through the subject's vocal tract, through the subject's mouth, and through the subject's surrounding environment.
  • one or more superfluous sounds generated by the expiratory airflow may be modeled and removed from the audio data 402, thereby modifying the audio data 402 to remove or reduce pressure fluctuations that are less directly relatable to the rate of the spirometry effort (e.g. expiratory airflow).
  • These modeled and removed superfluous sounds may be referred to herein as a first class of feature.
  • the sound of a subject's spirometry effort (e.g. forced expiratory maneuver) at the microphone may also include additional sounds that, in many examples, may be used to infer expiratory flow rate because the intensity of these additional sounds may be related to the rate of airflow.
  • additional sounds include items such as wind shear, vocal tract resonances, wheezes, and nasal resonances.
  • one or more of such additional sounds may be isolated and the intensity of the additional sound may be used as a feature.
  • sound pressure reductions that occur as the sound travels between the subject's mouth and the microphone and/or reverberation of sound in the subject's environment may be accounted for during processing of the audio data 402.
  • inverse radiation modeling e.g., a model of a spherical baffle in an infinite plane—also known as Flanagan's sound production model
  • the distance may be approximated from user's height and arm length, and may be adjusted by readings from an accelerometer (e.g., if a communication device including the microphone and the accelerometer is moved closer in).
  • the expiratory airflow rate (distance/time— e.g., meters/second) may be modeled from the combined features using non-parametric regression.
  • the volume of the expiratory airflow may be calculated by estimating the area of the subject's mouth opening and integrating the flow. For example, the flow rate of air can be measured in m/sec. The estimated area of the subject's mouth opening can be used to convert the flow rate of air to a volumetric flow rate (e.g., liters/second). The volumetric flow rate can be directly integrated to get volume.
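The unit conversion and integration described above can be illustrated with a short Python sketch. The function name and the circular-mouth-opening assumption (area computed from a radius) are illustrative choices, and trapezoidal integration is just one reasonable way to integrate the sampled flow.

```python
import numpy as np


def volume_from_flow_speed(speed_m_per_s, fs, mouth_radius_m):
    """Convert air speed (m/s) to volumetric flow (L/s) and cumulative volume (L).

    Assumes a circular mouth opening of the given radius (an illustrative
    simplification) and uses trapezoidal integration at sampling rate fs.
    """
    speed = np.asarray(speed_m_per_s, dtype=float)
    area_m2 = np.pi * mouth_radius_m ** 2
    flow_l_per_s = speed * area_m2 * 1000.0  # 1 m^3 = 1000 L
    dt = 1.0 / fs
    increments = 0.5 * (flow_l_per_s[1:] + flow_l_per_s[:-1]) * dt
    volume_l = np.concatenate(([0.0], np.cumsum(increments)))
    return flow_l_per_s, volume_l
```

For instance, a constant 10 m/s through an opening of radius 1 cm corresponds to about 3.14 L/s, so one second of exhalation integrates to about 3.14 L.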
  • Audio data 402 may accordingly represent audio data as received from a communication device through a voice communication channel.
  • the received data may be uncalibrated, AC-coupled measures of pressure, p(t), at the microphone of the communication device, and may have been modified by the voice communication channel (e.g. compressed and/or encoded).
  • The audio data 402 may be processed to, for example, compensate for pressure losses as the sound travels from the user's mouth to the microphone, convert the pressure values to an approximation of flow, and remove and/or reduce the effects of AC coupling.
  • pressure losses may be approximated using an inverse model of the sound reverberation around the user's head.
  • Turbulent airflow, as it passes through a fixed opening (e.g., the user's mouth), has a characteristic pressure drop, which, in many examples, may be used for converting pressure into flow.
  • at least one of: (1) signal power and frequency characteristics, and (2) models of the vocal tract may be used to remove and/or reduce effects of AC-coupling and refine the flow approximations.
  • regression may be used to combine these flow approximations and remove and/or reduce non-linearity.
  • processing of the audio data 402 may include compensation and/or feature extraction and machine learning linear regression 410
  • a first stage in processing the audio data 402 may be to use inverse radiation modeling to compensate for pressure losses sustained over the distance from the user's mouth to microphone and for reverberation/reflections caused in and around the user's body.
  • Any suitable inverse radiation modeling can be used.
  • the transfer function from the microphone to the user's mouth can be approximated by equation 1, which corresponds to a spherical baffle in an infinite plane.
  • D_arm is the arm length (e.g., approximated from user's height); C_head is the head circumference (e.g., approximated from user's height); and c is the speed of sound.
  • the transfer function inverse is applied by converting it to the time domain, h_inv(t), and using Finite Impulse Response (FIR) filtering with the incoming audio data. Once applied, the output may be an approximation of the pressure at the lips, p_lips(t). The pressure at the lips (p_lips(t)) may then be converted to a flow rate.
  • equation (2) is a non-linear equation that can be used to convert pressure drop across the lips to flow rate through the lips.
  • r_lips is the radius of the user's mouth opening (e.g. a constant resistance across frequency).
  • Each measure, p(t), p_lips(t), and u_lips(t), may represent a high frequency, AC-coupled signal, from which a separate volumetric flow rate may be approximated.
  • approximating volumetric flow rate from these signals includes using three transformations of these signals. Referring again to Figure 4, these may include: (1) envelope detection, e.g. Hilbert envelope 404; (2) spectrogram processing, e.g. resonance tracking 406; and (3) linear predictive coding (LPC), e.g. linear predictive coding 408.
  • the envelope of the signal (e.g. of u_lips(t)) can be assumed to be a reasonable approximation of the flow rate because it is a measure of the overall signal power (or amplitude) at low frequency.
  • Spectrogram processing may be used to extract resonances. In the frequency domain, resonances may be assumed to be amplitudes excited by reflections in the vocal tract and mouth opening and therefore should be proportional to the flow rate that causes them.
  • Linear prediction may then be used as a flow approximation. Linear prediction generally assumes that a signal can be divided into a source and a shaping filter and it estimates the source power and shaping filter coefficients.
  • the "filter" in examples described herein may be an approximation of the vocal tract.
  • the "source variance" may be an estimate of the white noise process exciting the vocal tract filter and is an approximation of the power of the flow rate from the lungs.
  • Envelope features, such as Hilbert envelope 404, generally measure the energy of the audio data 402 over different frames.
  • the low frequency envelope of the audio data (proportional to power here) may be extracted by squaring the signal and low pass filtering at a sub-1 Hz cutoff.
  • the time domain envelope may also be taken using the Hilbert envelope.
  • the Hilbert transform of the signal may be taken and added back to the original signal. Low pass filtering may then be used to extract the envelope using cascaded second order system filters.
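The two envelope estimates described above (squaring followed by low-pass filtering, and the Hilbert envelope followed by low-pass filtering) can be sketched as below. The 0.9 Hz default cutoff, the filter order, and the use of zero-phase filtering are assumptions chosen for illustration; the text only specifies a sub-1 Hz cutoff and cascaded second-order filters.

```python
import numpy as np
from scipy.signal import butter, hilbert, sosfiltfilt


def lowpass(x, fs, cutoff_hz, order=2):
    """Zero-phase low-pass filter built from cascaded second-order sections."""
    sos = butter(order, cutoff_hz, btype="low", fs=fs, output="sos")
    return sosfiltfilt(sos, x)


def squared_envelope(x, fs, cutoff_hz=0.9):
    """Low-frequency power envelope: square the signal, then low-pass filter."""
    return lowpass(np.square(np.asarray(x, dtype=float)), fs, cutoff_hz)


def hilbert_envelope(x, fs, cutoff_hz=0.9):
    """Amplitude envelope from the analytic signal x + j*Hilbert{x}."""
    analytic = hilbert(np.asarray(x, dtype=float))
    return lowpass(np.abs(analytic), fs, cutoff_hz)
```

Varying `cutoff_hz` and `order` yields the slightly different envelope estimates mentioned in the text. The squared envelope tracks power, so for an amplitude-modulated tone it follows half the squared modulator rather than the modulator itself.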
  • Each signal (p(t), p_lips(t), and u_lips(t)) can be down-sampled to have the same sampling rate as the spectrogram and linear prediction models.
  • Different estimations of the envelopes can be ascertained from using slightly different low pass filters on the squared data and Hilbert transformed data. In one embodiment, 12 features are used.
  • the audio data 402 may be buffered, e.g. into 30 ms frames (with 50% overlap between frames).
  • a spirometry exhalation typically lasts from four to seven seconds, resulting in 250-500 frames per exhalation.
  • Each frame may then be windowed using a Hamming window and the Fast Fourier Transform
  • the resonances can be extracted (resonance tracking 406) using local maxima in each FFT frame, calculated over a sliding window, resulting in resonance tracking features. Any maximum that is greater than a suitable threshold, for example 20% of the global maximum, can be saved.
  • any resonance lasting less than a threshold duration, e.g. 300 ms, can be discarded as noise.
  • the average resonance magnitude in each frame may be calculated and saved as resonance tracking features.
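The resonance tracking steps above can be sketched as follows. This is a simplified illustration: resonance duration is measured per FFT bin, so a resonance that drifts between bins is treated as several short ones, which is an assumption of this sketch and not something the text specifies. All function names are hypothetical.

```python
import numpy as np


def frame_signal(x, fs, frame_ms=30.0, overlap=0.5):
    """Split x into overlapping frames (rows); returns (frames, hop in samples)."""
    frame_len = int(round(fs * frame_ms / 1000.0))
    hop = max(1, int(round(frame_len * (1.0 - overlap))))
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return np.asarray(x)[idx], hop


def keep_long_runs(mask, min_len):
    """Keep only runs of True along axis 0 that last at least min_len frames."""
    kept = np.zeros_like(mask)
    for col in range(mask.shape[1]):
        padded = np.concatenate(([False], mask[:, col], [False]))
        edges = np.flatnonzero(padded[1:] != padded[:-1])
        for start, stop in zip(edges[::2], edges[1::2]):
            if stop - start >= min_len:
                kept[start:stop, col] = True
    return kept


def resonance_features(x, fs, rel_threshold=0.2, min_duration_ms=300.0):
    """Average magnitude of the tracked spectral peaks in each frame."""
    frames, hop = frame_signal(x, fs)
    mag = np.abs(np.fft.rfft(frames * np.hamming(frames.shape[1]), axis=1))
    # Local maxima along frequency within each frame.
    is_peak = np.zeros(mag.shape, dtype=bool)
    is_peak[:, 1:-1] = (mag[:, 1:-1] > mag[:, :-2]) & (mag[:, 1:-1] >= mag[:, 2:])
    # Keep maxima above a fraction of the global maximum.
    is_peak &= mag > rel_threshold * mag.max()
    # Discard peaks that do not persist long enough.
    min_frames = int(np.ceil(min_duration_ms / 1000.0 * fs / hop))
    kept = keep_long_runs(is_peak, min_frames)
    counts = kept.sum(axis=1)
    sums = np.where(kept, mag, 0.0).sum(axis=1)
    features = np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)
    return features, kept
```

With 30 ms frames and 50% overlap the hop is 15 ms, so a 300 ms minimum duration corresponds to 20 consecutive frames.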
  • the audio data 402 may be again windowed into overlapping frames of, e.g. 30 ms.
  • a number of LPC models can be taken (e.g. Linear predictive coding 408), for example, with filter orders of 2, 4, 8, 16 and 32 (increasing vocal tract complexity).
  • the approximated "source power" that excites the filter can be saved for each frame as an approximation of the flow rate.
  • the LPC may be taken from using p(t), p_lips(t), and/or u_lips(t).
  • Various combinations of LPC order can be used to ascertain different estimates of the vocal tract. In one embodiment this results in 34 different estimates.
  • the audio signal can be filtered by the inverse LPC model, which may leave only noise from random noises, such as wind shear. Additionally, the bandwidth and magnitude of the largest resonance from the LPC model can be calculated and used as another spectral estimate of the signal.
  • the reverberation of the lips can be estimated as the lips typically resonate in this frequency range. In one embodiment, this adds to the number of spectral features, totaling 50.
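A minimal sketch of the per-frame LPC "source power" estimate is shown below. It uses the standard autocorrelation method, which is one common way to fit an LPC model and is assumed here for illustration; the text does not state which LPC estimator is used. The function names and the small regularization term are hypothetical.

```python
import numpy as np
from scipy.linalg import solve_toeplitz


def lpc_source_power(frame, order):
    """Autocorrelation-method LPC: returns (predictor coefficients, residual power).

    The predictor models x[n] as sum_k a[k] * x[n - 1 - k]; the residual power
    approximates the power of the source exciting the vocal tract filter.
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)]) / n
    r[0] = r[0] * (1.0 + 1e-9) + 1e-12  # tiny regularization for silent frames
    a = solve_toeplitz(r[:order], r[1 : order + 1])
    source_power = r[0] - np.dot(a, r[1 : order + 1])
    return a, source_power


def lpc_features(frames, orders=(2, 4, 8, 16, 32)):
    """One source-power estimate per frame (rows) and per LPC order (columns)."""
    return np.array([[lpc_source_power(f, p)[1] for p in orders] for f in frames])
```

Each column of the result is one uncalibrated flow approximation. Higher orders model a more detailed vocal tract and therefore leave less residual power.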
  • the approximated flow rates may be denoised using a Savitzky-Golay polynomial filter, e.g. of order 3 and size 11.
  • a third order polynomial can be fit inside a moving window and is robust to many types of noise while keeping the relative shape of the most prominent signal intact.
  • the filtered and non-filtered signals may individually or both be fed as features to the subsequent regression stage.
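The denoising step can be sketched with SciPy's `savgol_filter`, using the order 3 and window size 11 given above. The wrapper name and the choice to return the raw and filtered signals side by side are illustrative assumptions.

```python
import numpy as np
from scipy.signal import savgol_filter


def denoise_flow(flow, window_length=11, polyorder=3):
    """Return the raw and Savitzky-Golay-smoothed signals as two feature columns."""
    flow = np.asarray(flow, dtype=float)
    smoothed = savgol_filter(flow, window_length=window_length, polyorder=polyorder)
    return np.column_stack([flow, smoothed])
```

Because a third-order polynomial is fitted inside each window, any signal that is itself a cubic passes through unchanged, which is why the filter tends to preserve the overall shape of the flow curve while reducing noise.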
  • Feature extraction, e.g. using Hilbert envelope 404, resonance tracking 406, and linear predictive coding 408, results in a number of uncalibrated approximations of the flow rate. These features may be used in a number of different regressions. In Figure 4, four regressions are used. The regressions may be carried out using, for example, the scikit-learn toolkit in Python and a leave-one-out cross-validation may be used in some examples to avoid overfitting.
  • channel state information may be used to select a number, type, and/or parameters for regressions that are used to generate indication(s) of lung function.
  • One regression may be a linear regression 410 that tries to find a linear relationship between the features and ground truth lung function value.
  • a second regression may use a least-angle regression 412 (LARS).
  • LARS generally selects the most useful features using a variant of forward feature selection, but the underlying model is assumed linear.
  • the third regression, elastic net regression 414, uses the elastic net algorithm, which eliminates features in a slightly different way than LARS.
  • the elastic net regression 414 uses a combination of LASSO regression and ridge regression for regularization that is often more stable.
  • an encapsulated k-Nearest Neighbor regression (kNN regression 416) with k = 2 is used. This regression finds the convex hull of the data in the feature space and fits a locally linear regression.
  • the final regression estimate may be found by taking the median (e.g. Median 418) of these four regressions. While four regressions are shown in Figure 4, other numbers of regressions may be used in other examples, including 1, 2, 3, 5, 6, or more regressions.
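The four-regression median described above can be sketched with scikit-learn, which the text names as one possible toolkit. Several details here are assumptions made for illustration: the elastic net hyperparameters are arbitrary, and the plain `KNeighborsRegressor` with `n_neighbors=2` is only a stand-in for the encapsulated, locally linear kNN regression described in the text. The function names are hypothetical.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lars, LinearRegression
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.neighbors import KNeighborsRegressor


def make_regressors():
    """Fresh instances of the four regressors (hyperparameters are assumed)."""
    return [
        LinearRegression(),
        Lars(),
        ElasticNet(alpha=0.1, l1_ratio=0.5),
        KNeighborsRegressor(n_neighbors=2),  # stand-in for the encapsulated kNN
    ]


def median_ensemble_predict(X_train, y_train, X_new):
    """Fit each regressor and return the median of their predictions."""
    preds = [reg.fit(X_train, y_train).predict(X_new) for reg in make_regressors()]
    return np.median(np.vstack(preds), axis=0)


def leave_one_out_predictions(X, y):
    """Median-of-regressors prediction for each sample, with that sample held out."""
    per_model = [
        cross_val_predict(reg, X, y, cv=LeaveOneOut()) for reg in make_regressors()
    ]
    return np.median(np.vstack(per_model), axis=0)
```

In use, `X` would hold one row of extracted features per spirometry effort and `y` the corresponding ground truth value for a single indication (e.g. FEV1), with the procedure repeated per indication. Taking the median limits the influence of any single regressor that produces an outlying estimate.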
  • the feature extraction, regressions, median, and/or lung function estimate stages shown in Figure 4 may be repeated for each desired indication of lung function (e.g. FEV1, FVC, and/or PEF indications).
  • certain features may be weighted more heavily in making a lung function estimate.
  • LPC features may be weighted more heavily when the audio data was received through a GSM or other voice communication channel employing LPC encoding techniques.
  • weights for the various features may be determined through a calibration procedure, and used to adjust a combination of features used for estimating lung function.
  • FIG. 5 illustrates an example method arranged in accordance with examples described herein.
  • the method 500 includes optionally dialing a call-in service at block 502, performing a spirometry effort at block 504, transmitting audio data collected by a communication device during the spirometry effort to a server through a voice communication channel in block 506, and receiving an indication of lung function corresponding to the audio data at the communication device in block 508.
  • a user may dial a call-in service.
  • a user may dial a toll-free or other phone number specifying a call-in service that may perform spirometry testing.
  • the user may use a communication device (e.g. the communication device 210 of Figure 2) to call the call-in service.
  • the dialed number may place a user's communication device in communication with a computing system (e.g. a remote server, e.g. computing system 202) configured to determine indications of lung function.
  • the dialed number may place a user's communication device in communication with the computing system over a voice communication channel, and may be intermediated by other voice servers in some examples.
  • Providing a call-in service may in some examples improve accessibility of spirometry testing - e.g. make spirometry testing available to anyone with access to a GSM cell phone network.
  • a user may be prompted to enter, using their communication device, certain biographical information.
  • Biographical information that may be collected may include, but is not limited to, age, weight, gender, ethnicity, height, arm length, head circumference, health history, address, and combinations thereof.
  • the communication device may play one or more known sounds (e.g. prerecorded audio and/or pushing selected buttons on a phone designed to generate predetermined tones).
  • the known sounds may be provided to a server through the voice communication channel and may be used to estimate channel state information that may be used to guide the determination of one or more indication(s) of lung function as described herein.
  • the user may perform a spirometry effort.
  • the user may hold the communication device at arm's length and fully inhale and exhale.
  • the full inhalation and exhalation may in some examples include a forced exhalation.
  • Any suitable spirometry effort procedure may be used in some examples.
  • the communication device may capture audio data during the spirometry effort, e.g. using a microphone of the communication device.
  • the audio data collected by the communication device may be transmitted through the voice communication channel to a server.
  • the server may determine one or more indications of lung function based on the audio data, in accordance with examples described herein, such as those described with reference to Figures 2-4.
  • the user may receive, at their communication device, an indication of lung function.
  • the indication of lung function can, for example, be texted back to the communication device (e.g. using SMS) or read aloud during the call.
  • the communication device may receive feedback on the spirometry effort. For example, voice feedback may be provided if the amplitude of the audio data delivered by the spirometry effort was below a threshold, and generally not usable for the analysis described herein. In some examples, voice feedback may be provided that gives the user an indication of how to improve the spirometry effort (e.g. breathe out fully, exhale longer, inhale more sharply, hold the phone closer). In some examples, audio coaching may be provided during the call (e.g. prior to and/or during the spirometry effort). The audio coaching may be provided by a remote coach and/or pre-recorded message and delivered to the communication device over the voice communication channel.
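The amplitude check that triggers feedback could look like the sketch below. The RMS measure, the threshold value, and the message wording are all hypothetical; the text only states that feedback may be given when the amplitude falls below a threshold.

```python
import numpy as np


def effort_feedback(audio, min_rms=0.01):
    """Return a feedback message if the effort is too quiet to analyze, else None.

    min_rms is an assumed threshold for audio samples scaled to [-1, 1].
    """
    audio = np.asarray(audio, dtype=float)
    if audio.size == 0:
        return "No audio was received. Please try the effort again."
    rms = float(np.sqrt(np.mean(np.square(audio))))
    if rms < min_rms:
        return "The recording was too quiet. Please hold the phone closer and exhale fully."
    return None
```

A server could convert the returned message to speech and play it back over the same voice communication channel before prompting the user to repeat the effort.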
  • Examples described herein may facilitate implementation of sound-based spirometry over a telephone network, which may extend the availability of spirometry testing.
  • Indications of lung function determined using systems and methods described herein may be utilized to treat conditions including, but not limited to, chronic obstructive pulmonary disease (COPD) and asthma.
  • the ground truth for the participants was collected on two FDA-approved clinical spirometers, the nSpire Koko Legend and the NDD EasyWare spirometer.
  • the two spirometers were used to answer two questions: (1) whether the participants got fatigued as the session progressed, and (2) how much variability exists between the output of the two devices, so we could use it as a benchmark for the system's performance.
  • the participants performed at least 15 spirometry efforts (3 each for: 2 clinical spirometers, 2 whistles, and 1 without whistle). Spirometry measurements are completely effort-dependent and some fatigue can build up when performing this many efforts. Therefore, we recorded efforts on one clinical spirometer at the beginning of the session and on another spirometer at the end of the session. The order was randomized for each participant.
  • the spirometry effort (e.g. forced expiratory maneuver) was explained to the participants and they were asked to practice using the spirometer. Once the participants were able to perform an acceptable maneuver according to the ATS criteria for reproducibility, three efforts were recorded using the spirometer. Next, the participants were introduced to the system. The study used a within-subjects 2x2x3 factorial design. The factors and levels were:
  • Phone Type: iPhone and non-iPhone. Two non-iPhone devices were used: Samsung Note and Sony Ericsson W580i. The W580i is a feature phone and was used to evaluate the performance of the system using an approximately 10-year-old device.
  • Channel Type: local recording or GSM.
  • the iPhone was kept consistent in both channels to analyze the performance of the system if only the channel is changed.


Abstract

Examples described herein include a call-in service that measures lung function using any phone, without requiring an application running on the phone itself. Examples described herein may allow a user to estimate their lung function by performing a spirometry effort while a communication device collects audio data of the effort. The collected audio data may be transmitted to a server over a standard voice channel. The server may compute clinically relevant lung function measures, which may be reported to participants using audio during the call or by SMS.
PCT/US2016/023468 2015-03-20 2016-03-21 Sound-based spirometric devices, systems, and methods using audio data transmitted over a voice communication channel WO2016154139A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201562136095P 2015-03-20 2015-03-20
US62/136,095 2015-03-20

Publications (1)

Publication Number Publication Date
WO2016154139A1 true WO2016154139A1 (fr) 2016-09-29

Family

ID=56978631

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2016/023468 WO2016154139A1 (en) 2015-03-20 2016-03-21 Sound-based spirometric devices, systems, and methods using audio data transmitted over a voice communication channel

Country Status (1)

Country Link
WO (1) WO2016154139A1 (fr)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018085583A1 (en) * 2016-11-02 2018-05-11 Sparo, Inc. Apparatuses, methods, and systems for motivating quality home-based spirometry maneuvers and automated evaluation and coaching
US10028675B2 (en) 2012-05-10 2018-07-24 University Of Washington Through Its Center For Commercialization Sound-based spirometric devices, systems and methods
CN110545359A (en) * 2019-08-02 2019-12-06 国家计算机网络与信息安全管理中心 Communication line feature extraction method, communication line identification method, and device
CN114373373A (en) * 2022-01-10 2022-04-19 北京易优联科技有限公司 Examination method and system for lung function testing personnel
WO2023237881A1 (en) * 2022-06-07 2023-12-14 Eupnoos Ltd Lung function test

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20010055297A1 (en) * 2000-03-23 2001-12-27 Mathilde Benveniste Asymmetric measurement-based dynamic packet assignment system and method for wireless data services
US20020128804A1 (en) * 1998-03-03 2002-09-12 Jacob Geva Personal ambulatory cellular health monitor
US20030115051A1 (en) * 2001-12-14 2003-06-19 Microsoft Corporation Quantization matrices for digital audio
WO2013170131A1 (en) * 2012-05-10 2013-11-14 University Of Washington Through Its Center For Commercialization Sound-based spirometric devices, systems, and methods
US20130317379A1 (en) * 2012-05-22 2013-11-28 Sparo Labs Spirometer system and methods of data analysis
WO2014037843A1 (en) * 2012-09-05 2014-03-13 Countingapp Medical Ltd. System and method for measuring the lung capacity and stamina of a patient
US20140155708A1 (en) * 2012-05-14 2014-06-05 Lionsgate Technologies, Inc. Systems, methods and related apparatus for determining physiological parameters
US20140213925A1 (en) * 2011-09-20 2014-07-31 Isonea Limited Systems, methods and kits for measuring respiratory rate and dynamically predicting respiratory episodes
US20150005176A1 (en) * 2013-06-21 2015-01-01 Sequenom, Inc. Methods and processes for non-invasive assessment of genetic variations



Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16769516

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 26.01.2018)

122 Ep: pct application non-entry in european phase

Ref document number: 16769516

Country of ref document: EP

Kind code of ref document: A1