WO2022177501A1 - A system and method for measuring vital body signs - Google Patents

A system and method for measuring vital body signs

Info

Publication number
WO2022177501A1
WO2022177501A1 PCT/SG2021/050366 SG2021050366W WO2022177501A1 WO 2022177501 A1 WO2022177501 A1 WO 2022177501A1 SG 2021050366 W SG2021050366 W SG 2021050366W WO 2022177501 A1 WO2022177501 A1 WO 2022177501A1
Authority
WO
WIPO (PCT)
Prior art keywords
nir
ppg signal
person
estimating
ppg
Prior art date
Application number
PCT/SG2021/050366
Other languages
French (fr)
Inventor
Venkata Raghava Krishna Vivek SAMAYAMANTRI
Sau Sheong CHANG
Original Assignee
Space Pte. Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Space Pte. Ltd.
Publication of WO2022177501A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/0059 Measuring for diagnostic purposes; Identification of persons using light, e.g. diagnosis by transillumination, diascopy, fluorescence
    • A61B5/0077 Devices for viewing the surface of the body, e.g. camera, magnifying lens
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/02 Detecting, measuring or recording pulse, heart rate, blood pressure or blood flow; Combined pulse/heart-rate/blood pressure determination; Evaluating a cardiovascular condition not otherwise provided for, e.g. using combinations of techniques provided for in this group with electrocardiography or electroauscultation; Heart catheters for measuring blood pressure
    • A61B5/0205 Simultaneously evaluating both cardiovascular conditions and different types of body conditions, e.g. heart and respiratory condition
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/02 Detecting, measuring or recording pulse, heart rate, blood pressure or blood flow; Combined pulse/heart-rate/blood pressure determination; Evaluating a cardiovascular condition not otherwise provided for, e.g. using combinations of techniques provided for in this group with electrocardiography or electroauscultation; Heart catheters for measuring blood pressure
    • A61B5/024 Detecting, measuring or recording pulse rate or heart rate
    • A61B5/02416 Detecting, measuring or recording pulse rate or heart rate using photoplethysmograph signals, e.g. generated by infrared radiation
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/72 Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7235 Details of waveform analysis
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/0059 Measuring for diagnostic purposes; Identification of persons using light, e.g. diagnosis by transillumination, diascopy, fluorescence
    • A61B5/0075 Measuring for diagnostic purposes; Identification of persons using light, e.g. diagnosis by transillumination, diascopy, fluorescence by spectroscopy, i.e. measuring spectra, e.g. Raman spectroscopy, infrared absorption spectroscopy
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/02 Detecting, measuring or recording pulse, heart rate, blood pressure or blood flow; Combined pulse/heart-rate/blood pressure determination; Evaluating a cardiovascular condition not otherwise provided for, e.g. using combinations of techniques provided for in this group with electrocardiography or electroauscultation; Heart catheters for measuring blood pressure
    • A61B5/021 Measuring pressure in heart or blood vessels
    • A61B5/02108 Measuring pressure in heart or blood vessels from analysis of pulse wave characteristics
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/08 Detecting, measuring or recording devices for evaluating the respiratory organs
    • A61B5/0816 Measuring devices for examining respiratory frequency
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/145 Measuring characteristics of blood in vivo, e.g. gas concentration, pH value; Measuring characteristics of body fluids or tissues, e.g. interstitial fluid, cerebral tissue
    • A61B5/1455 Measuring characteristics of blood in vivo, e.g. gas concentration, pH value; Measuring characteristics of body fluids or tissues, e.g. interstitial fluid, cerebral tissue using optical sensors, e.g. spectral photometrical oximeters
    • A61B5/14551 Measuring characteristics of blood in vivo, e.g. gas concentration, pH value; Measuring characteristics of body fluids or tissues, e.g. interstitial fluid, cerebral tissue using optical sensors, e.g. spectral photometrical oximeters for measuring blood gases

Definitions

  • This invention relates to a system and method of estimating body vitals like Heart Rate, Respiration Rate, Blood Oxygen Saturation and Blood Pressure of a person. Specifically, this invention relates to a system and method of estimating body vitals like Heart Rate, Respiration Rate, Blood Oxygen Saturation and Blood Pressure of a person based on images from a colour camera and an active depth sensor simultaneously.
  • Prior Art
  • Photoplethysmography is an optically obtained plethysmogram used to detect blood volume changes in the microvascular bed of tissue.
  • Remote Photoplethysmography allows estimation of blood volume changes without skin contact by using ambient light sources and video cameras. Under proper lighting conditions, minute variations in skin colour and temperature due to blood volume changes can be observed. This allows estimating blood volume changes without skin contact.
  • Because the technique relies on ambient light sources, a slight change in the surroundings would inevitably lead to inaccurate results from the plethysmogram.
  • A first advantage of the system and method in accordance with this invention is that the system and method allow non-contact measurement of vital signs.
  • A second advantage of the system and method in accordance with this invention is that the system and method are able to estimate vital signs within a short period of time, in less than 5 seconds, with an accuracy of more than 90%.
  • A third advantage of the system and method in accordance with this invention is that the system and method are independent of the ambient light.
  • A fourth advantage of the system and method in accordance with this invention is that the system and method utilise devices that are readily available. This allows remote monitoring of a patient's vital signs with ease.
  • a first aspect of the disclosure describes a system for estimating vitals of a person.
  • the system comprises an image sensor adapted to capture a sequence of color (RGB) and near infrared (NIR) images, a processing unit comprising a processor, memory and instructions stored on the memory and executable by the processor to: estimating vitals of the person from the sequence of RGB and NIR images.
  • RGB color
  • NIR near infrared
  • the instruction to estimating vitals of the person from the sequence of RGB and NIR images comprises instructions to: time synchronising the sequence of RGB and NIR images; detecting region of interest within the sequence of RGB and NIR images; extracting raw signals from the detected region of interest; extracting photoplethysmography (PPG) signals from the raw signals; and estimating vitals of the person from the extracted PPG signals.
  • PPG photoplethysmography
  • the instruction to detecting region of interest within the sequence of RGB and NIR images comprises instructions to: detecting a face of the person within each of the sequence of NIR images; in response to detecting the face of the person, detecting at least one region of interest within the detected face of the person; and mapping the detected region of interest in each of the sequence of NIR images to the corresponding sequence of RGB images.
  • the instruction to detecting at least one region of interest within the detected face of the person comprises instructions to: applying facial landmark detection where 81 prominent feature points on the face of the person are detected; and overlaying a box at a top of the detected 81 prominent feature points between facial edge and eyebrows, wherein the box is one region of interest.
  • the instruction to detecting at least one region of interest within the detected face of the person further comprises instructions to: overlaying two boxes between facial edge and nose, wherein the two boxes are two other regions of interest.
  • the instruction to detecting region of interest within the sequence of RGB and NIR images comprises instructions to: computing an average pixel intensity of the detected region of interest for each of a plurality of frequency bands including Red, Green and Blue from the sequence of RGB images and NIR frequency band from the sequence of NIR images; and appending the average pixel intensity value of each frequency band to the pixel intensity value of corresponding frequency band in respective RGB and NIR images to form a time series for each frequency band.
  • the instruction to detecting region of interest within the sequence of RGB and NIR images further comprises instructions to: determining if a length of the time series is equal to a predetermined length, L; in response to the length of the time series being equal to the predetermined length, L, normalising the pixel intensity value in the time series for each frequency band.
  • ICA Independent Component Analysis
  • the instruction to applying Independent Component Analysis (ICA) to extract PPG signal from the detrended pixel intensity value in the time series for each frequency band comprises instructions to: grouping the detrended pixel intensity value in the time series for each frequency band into a set of Red PPG signal and a set of NIR PPG signals wherein the Red PPG signal is derived from the frequency bands of Red, Green and Blue and the NIR PPG signal is derived from the frequency bands of NIR, Green and Blue; applying a band pass filter with cutoff frequency of between 0.16 and 2 Hz to the Red PPG signal; applying a band pass filter with cutoff frequency of between 0.16 and 2 Hz to the NIR PPG signal.
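The 0.16 to 2 Hz band-pass step can be illustrated with a minimal sketch. The source does not say how the filter is designed, so the version below is an assumption: it swaps in a simple frequency-domain mask (a plain DFT, zeroing every bin outside the pass band, then an inverse DFT). The function name `bandpass_dft`, the 30 FPS sampling rate and the synthetic test signal are all hypothetical.

```python
import cmath
import math

def bandpass_dft(signal, fs, f_lo=0.16, f_hi=2.0):
    """Zero every DFT bin whose frequency lies outside [f_lo, f_hi] Hz."""
    n = len(signal)
    # Forward DFT, O(n^2); adequate for a few seconds of video samples.
    spectrum = [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    for k in range(n):
        # Bins above n/2 hold the negative frequencies; fold them back.
        freq = min(k, n - k) * fs / n
        if not (f_lo <= freq <= f_hi):
            spectrum[k] = 0
    # Inverse DFT; the imaginary part is only rounding noise for real input.
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
             for k in range(n)) / n).real
        for t in range(n)
    ]

# Hypothetical input: 5 s at 30 FPS, a 1.2 Hz pulse plus an offset and 6 Hz noise.
fs = 30.0
n = 150
times = [i / fs for i in range(n)]
pulse = [math.sin(2 * math.pi * 1.2 * x) for x in times]
raw = [p + 3.0 + 0.5 * math.sin(2 * math.pi * 6.0 * x)
       for p, x in zip(pulse, times)]
filtered = bandpass_dft(raw, fs)
```

In this sketch the constant offset and the 6 Hz component fall outside the pass band and are removed, while the 1.2 Hz component is kept. A production filter would more likely be a conventional IIR or FIR design.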
  • the instruction to estimating vitals of the person from the extracted PPG signals comprises instructions to: computing Ratio (R) of time-domain range for NIR PPG signal and Red PPG signal wherein the Ratio (R) can be expressed with the following expression, where amplitude range of Red PPG signal relates to difference between minimum and maximum values of the amplitude of the Red PPG signal in time domain, arithmetic mean of Red PPG signal relates to arithmetic mean of the Red PPG signal in between minimum and maximum values of the amplitude of the Red PPG signal in time domain, amplitude range of NIR PPG signal relates to difference between minimum and maximum values of the amplitude of the NIR PPG signal in time domain and arithmetic mean of NIR PPG signal relates to arithmetic mean of the NIR PPG signal in between minimum and maximum values of the amplitude of the NIR PPG signal in time domain; estimating the Blood Oxygen Saturation with the following expression,
  • the instruction to estimating vitals of the person from the extracted PPG signals comprises instructions to: detecting Systolic Peak (P_s), Diastolic peak (P_d), and global minima (P_m) from the time-domain NIR PPG signal; estimating the Systolic and Diastolic Blood Pressure with the following expressions,
  • Y_1, Y_2, Y_3, Y_4, Y_5, Y_6 are parameters of the regression curve which maps the
  • Y_1 and Y_4 are frequency-slope parameters for systolic and diastolic blood pressures respectively
  • Y_2 and Y_5 are amplitude-slope parameters for systolic and diastolic blood pressures respectively
  • Y_3 and Y_6 are off-set parameters for systolic and diastolic blood pressures respectively.
  • the instruction to detecting Systolic Peak (P_s), Diastolic peak (P_d), and global minima (P_m) from the time-domain NIR PPG signal comprises instructions to: applying a peak detection algorithm on the NIR PPG signal to obtain a second order differential; identifying a global maxima in the NIR PPG signal as the Systolic Peak (P_s), a peak in the NIR PPG signal corresponding to a minima following the Systolic Peak (P_s) in the second order derivative as Diastolic peak (P_d), a maxima immediately after the systolic peak in the second order derivative as the Dicrotic notch, a minima of the entire NIR PPG signal as the global minima (P_m).
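The landmark detection described above can be sketched as follows. This is an illustrative reading, not the patented algorithm: it assumes the "second order differential" is a central second-order finite difference, that the notch is the first local maximum of that difference after the systolic peak, and that the diastolic peak is the first local minimum after the notch. All function names and the two-Gaussian test pulse are hypothetical.

```python
import math

def second_difference(signal):
    """Central second-order difference, zero-padded so indices match the signal."""
    inner = [signal[i + 1] - 2 * signal[i] + signal[i - 1]
             for i in range(1, len(signal) - 1)]
    return [0.0] + inner + [0.0]

def first_local_extremum(values, start, kind):
    """Index of the first local 'max' or 'min' at or after start, else None."""
    for i in range(max(start, 1), len(values) - 1):
        if kind == "max" and values[i - 1] < values[i] >= values[i + 1]:
            return i
        if kind == "min" and values[i - 1] > values[i] <= values[i + 1]:
            return i
    return None

def pulse_landmarks(ppg):
    """Locate P_s, the notch, P_d and P_m as sample indices."""
    d2 = second_difference(ppg)
    p_s = max(range(len(ppg)), key=ppg.__getitem__)   # global maximum
    p_m = min(range(len(ppg)), key=ppg.__getitem__)   # global minimum
    # Assumption: notch = first local maximum of d2 after the systolic peak.
    notch = first_local_extremum(d2, p_s + 1, "max")
    # Assumption: diastolic peak = first local minimum of d2 after the notch.
    p_d = first_local_extremum(d2, notch + 1, "min") if notch is not None else None
    return {"P_s": p_s, "notch": notch, "P_d": p_d, "P_m": p_m}

def gaussian(t, centre, width, height):
    return height * math.exp(-((t - centre) ** 2) / (2 * width ** 2))

# Hypothetical single pulse: large systolic bump at 0.25 s, smaller one at 0.55 s.
ppg = [gaussian(i / 100, 0.25, 0.07, 1.0) + gaussian(i / 100, 0.55, 0.07, 0.4)
       for i in range(100)]
marks = pulse_landmarks(ppg)
```

Real PPG traces are noisier than this synthetic pulse, so some smoothing before differencing would likely be needed in practice.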
  • the instruction to estimating vitals of the person from the extracted PPG signals comprises instructions to: applying fast fourier transform to the NIR PPG signal to transform the NIR PPG signal from time-domain to frequency domain; computing a plurality of local maxima of a power spectral density (PSD) of the NIR PPG signal in frequency domain; in response to a local maxima between the frequency range of 0.8 and 2 Hz, estimating a Heart Rate (HR) with the following expression,
  • Heart Rate: $\mathrm{HR} = 60 \cdot f_{hr}$
  • where $f_{hr}$ refers to the local maxima that is between the frequency range of 0.8 and 2 Hz
  • Respiration Rate: $\mathrm{RR} = 60 \cdot f_{rr}$
  • where $f_{rr}$ refers to the local maxima that is between the frequency range of 0.1 and 0.5 Hz.
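The two rate formulas can be sketched end to end under a few stated assumptions. The source uses a fast Fourier transform; the sketch below substitutes a plain DFT so that it needs only the standard library. It also assumes that, when several local maxima fall inside a band, the strongest one is used. The function names and the synthetic signal are hypothetical.

```python
import cmath
import math

def power_spectrum(signal, fs):
    """Return (frequencies, power) for the non-negative DFT bins."""
    n = len(signal)
    mean = sum(signal) / n
    centred = [s - mean for s in signal]   # remove the DC offset first
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(centred[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2 / n)
    return freqs, power

def dominant_frequency(freqs, power, f_lo, f_hi):
    """Frequency of the strongest local maximum inside [f_lo, f_hi], or None."""
    best = None
    for k in range(1, len(power) - 1):
        is_peak = power[k - 1] < power[k] >= power[k + 1]
        if is_peak and f_lo <= freqs[k] <= f_hi:
            if best is None or power[k] > power[best]:
                best = k
    return None if best is None else freqs[best]

def estimate_rates(ppg, fs):
    """HR = 60 * f_hr (0.8 to 2 Hz band), RR = 60 * f_rr (0.1 to 0.5 Hz band)."""
    freqs, power = power_spectrum(ppg, fs)
    f_hr = dominant_frequency(freqs, power, 0.8, 2.0)
    f_rr = dominant_frequency(freqs, power, 0.1, 0.5)
    hr = None if f_hr is None else 60.0 * f_hr
    rr = None if f_rr is None else 60.0 * f_rr
    return hr, rr

# Hypothetical 5 s signal at 30 FPS: 1.2 Hz cardiac and 0.4 Hz respiratory parts.
fs = 30.0
ppg = [math.sin(2 * math.pi * 1.2 * i / fs) + 0.5 * math.sin(2 * math.pi * 0.4 * i / fs)
       for i in range(150)]
hr, rr = estimate_rates(ppg, fs)
```

With a 5 second window the frequency resolution is 0.2 Hz, which corresponds to steps of 12 per minute in both estimates; longer windows or spectral interpolation would be needed for finer resolution.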
  • the instruction to estimating vitals of the person from the extracted PPG signals comprises instruction to applying a machine learning model based on a linear regression model.
  • a second aspect of the disclosure describes a method for estimating vitals of a person.
  • the method comprises estimating vitals of the person from a sequence of RGB and NIR images.
  • the step of estimating vitals of the person from the sequence of RGB and NIR images comprises: time synchronising the sequence of RGB and NIR images; detecting region of interest within the sequence of RGB and NIR images; extracting raw signals from the detected region of interest; extracting photoplethysmography (PPG) signals from the raw signals; and estimating vitals of the person from the extracted PPG signals.
  • the step of detecting region of interest within the sequence of RGB and NIR images comprises: detecting a face of the person within each of the sequence of NIR images; in response to detecting the face of the person, detecting at least one region of interest within the detected face of the person; and mapping the detected region of interest in each of the sequence of NIR images to the corresponding sequence of RGB images.
  • the step of detecting at least one region of interest within the detected face of the person comprises instructions to: applying facial landmark detection where 81 prominent feature points on the face of the person are detected; and overlaying a box at a top of the detected 81 prominent feature points between facial edge and eyebrows, wherein the box is one region of interest.
  • the step of detecting at least one region of interest within the detected face of the person further comprises instructions to: overlaying two boxes between facial edge and nose, wherein the two boxes are two other regions of interest.
  • the step of detecting region of interest within the sequence of RGB and NIR images comprises instructions to: computing an average pixel intensity of the detected region of interest for each of a plurality of frequency bands including Red, Green and Blue from the sequence of RGB images and NIR frequency band from the sequence of NIR images; and appending the average pixel intensity value of each frequency band to the pixel intensity value of corresponding frequency band in respective RGB and NIR images to form a time series for each frequency band.
  • the step of detecting region of interest within the sequence of RGB and NIR images further comprises instructions to: determining if a length of the time series is equal to a predetermined length, L; in response to the length of the time series being equal to the predetermined length, L, normalising the pixel intensity value in the time series for each frequency band.
  • the instruction to estimating vitals of the person from the extracted PPG signals comprises instructions to: computing Ratio (R) of time-domain range for NIR PPG signal and Red PPG signal wherein the Ratio (R) can be expressed with the following expression, where amplitude range of Red PPG signal relates to difference between minimum and maximum values of the amplitude of the Red PPG signal in time domain, arithmetic mean of Red PPG signal relates to arithmetic mean of the Red PPG signal in between minimum and maximum values of the amplitude of the Red PPG signal in time domain, amplitude range of NIR PPG signal relates to difference between minimum and maximum values of the amplitude of the NIR PPG signal in time domain and arithmetic mean of NIR PPG signal relates to arithmetic mean of the NIR PPG signal in between minimum and maximum values of the amplitude of the NIR PPG signal in time domain; estimating the Blood Oxygen Saturation with the following expression,
  • the step of estimating vitals of the person from the extracted PPG signals comprises instructions to: detecting Systolic Peak (P_s), Diastolic peak (P_d), and global minima (P_m) from the time-domain NIR PPG signal; estimating the Systolic and Diastolic Blood Pressure with the following expressions,
  • the step of detecting Systolic Peak (P_s), Diastolic peak (P_d), and global minima (P_m) from the time-domain NIR PPG signal comprises instructions to: applying a peak detection algorithm on the NIR PPG signal to obtain a second order differential; identifying a global maxima in the NIR PPG signal as the Systolic Peak (P_s), a peak in the NIR PPG signal corresponding to a minima following the Systolic Peak (P_s) in the second order derivative as Diastolic peak (P_d), a maxima immediately after the systolic peak in the second order derivative as the Dicrotic notch, a minima of the entire NIR PPG signal as the global minima (P_m).
  • the step of estimating vitals of the person from the extracted PPG signals comprises instructions to: applying fast fourier transform to the NIR PPG signal to transform the NIR PPG signal from time-domain to frequency domain; computing a plurality of local maxima of a power spectral density (PSD) of the NIR PPG signal in frequency domain; in response to a local maxima between the frequency range of 0.8 and 2 Hz, estimating a Heart Rate (HR) with the following expression,
  • Heart Rate: $\mathrm{HR} = 60 \cdot f_{hr}$
  • where $f_{hr}$ refers to the local maxima that is between the frequency range of 0.8 and 2 Hz
  • RR Respiration Rate
  • Respiration Rate: $\mathrm{RR} = 60 \cdot f_{rr}$
  • where $f_{rr}$ refers to the local maxima that is between the frequency range of 0.1 and 0.5 Hz.
  • the step of estimating vitals of the person from the extracted PPG signals comprises applying a machine learning model based on a linear regression model.
  • Figure 1 illustrating a system for estimating vitals of a person in accordance with an embodiment of this invention
  • Figure 2 illustrating a process flow that is executable by a processing unit of the system in figure 1 in accordance with an embodiment of this invention
  • Figure 3 illustrating a sample colour image and NIR image in accordance with an embodiment of this invention
  • Figure 4 illustrating a process flow for extracting raw signal in accordance with an embodiment of this invention
  • Figure 5 illustrating a process flow for extracting PPG signal in accordance with an embodiment of this invention
  • Figure 6 illustrating a process flow for estimating blood oxygen saturation of a person from the extracted PPG signal in accordance with this invention
  • Figure 7 illustrating a process flow for estimating Systolic and Diastolic Blood Pressure of a person from the extracted PPG signal in accordance with this invention
  • Figure 8 illustrating a process flow for estimating Heart Rate and Respiration Rate of a person from the extracted PPG signal in accordance with this invention
  • Figure 9 illustrating the NIR PPG signal and the second order differential of the NIR
  • Figure 11 illustrating a linear regression curve that can be used for estimation of blood pressure.
  • This invention relates to a system and method of estimating body vitals like Heart Rate, Respiration Rate, Blood Oxygen Saturation and Blood Pressure of a person. Specifically, this invention relates to a system and method of estimating body vitals like Heart Rate, Respiration Rate, Blood Oxygen Saturation and Blood Pressure of a person based on images from a Colour camera and depth sensor simultaneously.
  • Figure 1 illustrates a system 100 for estimating body vitals such as Heart Rate, Respiration Rate, Blood Oxygen Saturation and Blood Pressure of a person.
  • the system 100 comprises a depth sensor 110, an image sensor 120 and a processing unit 130.
  • the depth sensor 110 is an active depth sensor that works in the near infrared (NIR) region.
  • Absorption of light by Hemoglobin is lowest in the NIR region.
  • The skin cells and Hemoglobin in blood absorb less NIR light (800-1000 nm) than visible light, and hence the blood flow changes underneath the skin are sensed with greater contrast than with visible light. Due to the low absorption, the PPG signal will have a higher signal-to-noise ratio. Since the active depth sensor works in the NIR region, the active depth sensor 110 can be used to shine NIR light on the skin of a person and measure its absorption to assist in estimating the vitals of the person.
  • The image sensor 120 is a typical colour camera for capturing a sequence of images. Such an image sensor 120 is commonly known in the art and a detailed description of the image sensor 120 is omitted for brevity.
  • the processing unit 130 is a typical computing system that comprises a processor, memory and instructions stored on the memory and executable by the processor.
  • the processor may be a processor, microprocessor, microcontroller, application specific integrated circuit, digital signal processor (DSP), programmable logic circuit, or other data processing device that executes instructions to perform the processes in accordance with the present invention.
  • the processor has the capability to execute various applications that are stored in the memory.
  • The memory may include read-only memory (ROM), random-access memory (RAM), electrically erasable programmable ROM (EEPROM), flash cards, or any storage medium.
  • Instructions are computing codes or software applications that are stored on the memory and executable by the processor to perform the processes in accordance with this invention. Such a computing system is well known in the art and hence only briefly described herein.
  • The instructions can be developed in the C++ language (or any other known programming language) and can be run on a System on Chip (SoC) like Raspberry Pi and/or mobile devices like cell phones or tablet PCs.
  • SoC System on Chip
  • FIG. 2 illustrates a process flow 200 that is executed by the instructions of the processing unit to estimate vital signs of a person in accordance with this invention.
  • Both the active depth sensor 110 and the image sensor 120 would be pointing in the same direction. Images captured by both sensors 110 and 120 would be transmitted to the processing unit 130.
  • Process 200 begins with step 205 where the images from the active depth sensor 110 and image sensor 120 are time synchronised. Essentially, there should not be any time shift between the images from both sensors.
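The source does not describe how the two image streams are synchronised. One simple possibility, sketched below purely as an assumption, is to pair each NIR frame with the RGB frame whose capture timestamp is nearest, and to drop the frame when the gap exceeds a tolerance. The function name `pair_frames`, the 10 ms tolerance and the timestamps are all hypothetical.

```python
def pair_frames(nir_stamps, rgb_stamps, tolerance=0.010):
    """Pair each NIR timestamp with the nearest RGB timestamp.

    Both lists are assumed sorted, in seconds. Returns (nir_index, rgb_index)
    tuples for the pairs whose time gap is within tolerance.
    """
    pairs = []
    j = 0
    for i, t_nir in enumerate(nir_stamps):
        # Advance while the next RGB stamp is at least as close to t_nir.
        while (j + 1 < len(rgb_stamps)
               and abs(rgb_stamps[j + 1] - t_nir) <= abs(rgb_stamps[j] - t_nir)):
            j += 1
        if rgb_stamps and abs(rgb_stamps[j] - t_nir) <= tolerance:
            pairs.append((i, j))
    return pairs

# Hypothetical capture times; the third RGB frame arrives 18 ms late.
nir_times = [0.000, 0.036, 0.072, 0.108]
rgb_times = [0.002, 0.035, 0.090, 0.109]
matched = pair_frames(nir_times, rgb_times)
```

Here the third NIR frame has no RGB frame within 10 ms, so it is left unpaired rather than matched with a shifted image.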
  • process 200 detects the region of interest.
  • The region of interest includes the forehead, cheek and any other exposed skin.
  • the region of interest would include the forehead and cheek.
  • Figure 3 illustrates an example of an image taken from the active depth sensor 110 and image sensor 120. Specifically, the left image 310 is taken from the active depth sensor 110 and the right image 320 is taken from the image sensor 120.
  • the process first detects the face 311 and thereafter detects whether the person is wearing a mask. If the person is not wearing a mask, the process would detect the forehead region 312 and the cheek regions 313. If the person is wearing a mask, the process would detect the forehead region 312 only.
  • One method of detecting the RoI is by using facial landmark detection where 81 prominent feature points on the face are detected.
  • the 81 prominent feature points include eyebrow, eye, nose, mouth and facial edge.
  • the process would be able to determine if the person of interest is facing the camera (i.e. frontal, profile, or semi-frontal) and whether the person is wearing a mask (i.e. mouth and part of nose and bottom facial edges would be obscured).
  • the region of interest is computed for all the images from the active depth sensor 110 and mapped to the images from image sensor 120 using the calibration.
  • The RoI detection is executed on the captured images to form a set of time synchronised images.
  • The set of time synchronised images has a length of no more than 5 seconds.
  • both active depth sensor 110 and image sensor 120 are calibrated using standard Tsai camera calibration technique to know the pixel mapping between them so that when we detect the region of interest in one image, it can be mapped to the other image without additional computations.
  • In step 215, process 200 extracts raw signals within the detected region(s) of the images from both the active depth sensor 110 and image sensor 120. Further details will be described below with reference to figure 4.
  • In step 220, process 200 extracts the PPG signal from the raw signals. Further details will be described below with reference to figure 5.
  • In step 225, process 200 estimates the vital signs including Respiration Rate (RR),
  • Process 200 ends after step 225.
  • Figure 4 illustrates a process flow 400 that is executed by the instructions of the processing unit to extract raw signals within the detected region(s) of the images from both the active depth sensor and the image sensor in step 215 of process 200 in accordance with this invention.
  • the raw signals are pixel intensity values of each frequency band in the RGB images and NIR images.
  • Process 400 begins with step 405 to compute the average pixel intensity of the detected region(s) for each of the frequency bands including Red, Green and Blue from the image of the image sensor 120 and NIR from the depth sensor 110.
  • Each detected Region of Interest (RoI) is a rectangular matrix of pixel intensity values, identified as 312 and 313 in Figure 3.
  • the arithmetic mean of all the pixels (average pixel intensity) within the region of interest 312 and 313 is computed in step 405.
  • the average pixel intensity value for each frequency band computed in step 405 is then appended to the time series of the corresponding frequency band in step 410.
  • Process 400 appends the average pixel intensity value to form a time series corresponding to the frequency band. Specifically, process 400 appends the average pixel intensity value of each frequency band to the pixel intensity value of the corresponding frequency band in the respective RGB and NIR images to form a time series for each frequency band. The frequency bands are R, G, B and NIR.
  • the time series relate to the time synchronised video for estimating the vital signs. Each set of time series is no longer than 5 seconds of video clip.
  • the average pixel intensity values are appended in the following manner:
  • 1. the average pixel intensity values of the Red frequency band computed in step 405 are appended to the pixel intensity values of the Red frequency band in each image from the video clip to form the red time series;
  • 2. the average pixel intensity values of the Green frequency band computed in step 405 are appended to the pixel intensity values of the Green frequency band in each image from the video clip to form the green time series;
  • 3. the average pixel intensity values of the Blue frequency band computed in step 405 are appended to the pixel intensity values of the Blue frequency band in each image from the video clip to form the blue time series; and
  • 4. the average pixel intensity values of the NIR frequency band computed in step 405 are appended to the pixel intensity values of the NIR frequency band in each image from the video clip to form the NIR time series.
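Steps 405 and 410 can be illustrated with a minimal sketch: for each frame, take the arithmetic mean of the pixels inside the region of interest for every band and append it to that band's running time series. The data layout (channels as nested lists, a region given as top, left, height, width) and the function names are assumptions made for the example, and a single region is used where the description averages over regions 312 and 313.

```python
def roi_mean(channel, roi):
    """Arithmetic mean of a 2-D channel (list of rows) inside a rectangle.

    roi is (top, left, height, width) in pixels.
    """
    top, left, height, width = roi
    total = sum(sum(row[left:left + width])
                for row in channel[top:top + height])
    return total / (height * width)

def append_frame(series, rgb_frame, nir_frame, roi_rgb, roi_nir):
    """Append one averaged sample per frequency band to the time series."""
    for band in ("R", "G", "B"):
        series[band].append(roi_mean(rgb_frame[band], roi_rgb))
    series["NIR"].append(roi_mean(nir_frame, roi_nir))
    return series

# Hypothetical 4x4 frame with a 2x2 region of interest at row 1, column 1.
ramp = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
rgb = {
    "R": ramp,
    "G": [[100] * 4 for _ in range(4)],
    "B": [[50] * 4 for _ in range(4)],
}
nir = [[2 * v for v in row] for row in ramp]
series = {"R": [], "G": [], "B": [], "NIR": []}
append_frame(series, rgb, nir, roi_rgb=(1, 1, 2, 2), roi_nir=(1, 1, 2, 2))
```

Calling `append_frame` once per synchronised frame pair grows the four lists in lockstep, which gives the four columns of the time series described above.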
  • process 400 generates a table as illustrated below.
  • Each column of Red, Blue, Green and NIR corresponds to a time series where the average pixel intensity value has been appended to respective frequency band.
  • Below table only illustrates four time stamps, at 0 second, 0.036 second, 0.072 second and 0.108 second.
  • the time series should contain no more than 5 seconds worth of data.
  • the minimum would be 104 frames; and if camera frame capture rate is more than 30 FPS, the minimum would be 128 frames.
  • In step 415, process 400 determines if the time series length is equal to a predetermined length, L.
  • the predetermined length, L is no more than 5 seconds, which is computed based on the frame rate of the image sensor. If the length of the time series is equal to the predetermined length, L, process 400 proceeds to step 425. If the time series is not equal to the predetermined length, L, process 400 repeats from step 205. Essentially, in order to accurately estimate the vital signs, the system requires a minimum length of data and this length is no more than 5 seconds.
  • Step 415 may be modified to determine if the time series length is less than or not more than the predetermined length, L.
  • process 400 normalises the pixel intensity value in the time series for each frequency band. Specifically, the pixel intensity value from each frequency band in the time series are normalised to restrict the amplitude to be within [0, 1].
  • The pixel intensity values in each frequency band of the time series are normalised by subtracting their arithmetic mean and dividing the result by their standard deviation.
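The normalisation step can be sketched as below, assuming the population standard deviation is meant (the source does not say which form is used). The function names are hypothetical. Note that subtracting the mean and dividing by the standard deviation yields a zero-mean, unit-variance series rather than values strictly bounded to [0, 1]; a separate min-max rescale would be needed if hard bounds are required.

```python
import math

def zscore(series):
    """Subtract the arithmetic mean and divide by the standard deviation."""
    n = len(series)
    mean = sum(series) / n
    # Assumption: population standard deviation (divide by n, not n - 1).
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    if std == 0:
        return [0.0] * n   # a flat series carries no pulsatile information
    return [(x - mean) / std for x in series]

def normalise_if_full(series, length):
    """Normalise only once the series holds exactly `length` samples."""
    if len(series) != length:
        return None
    return zscore(series)

samples = [2, 4, 4, 4, 5, 5, 7, 9]          # mean 5, standard deviation 2
normalised = normalise_if_full(samples, length=8)
too_short = normalise_if_full(samples[:5], length=8)
```

The `None` return mirrors the branch in which the process goes back to collect more frames before normalising.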
  • Process 400 ends after step 425.
  • Figure 5 illustrates a process flow 500 that is executed by the instructions of the processing unit to extract PPG signal from the raw signals of step 220 in process 200 in accordance with this invention.
  • Process 500 begins with step 505 by applying a moving average filter for each frequency band. Specifically, step 505 applies a moving average filter to the pixel intensity value in the time series for each frequency band.
  • the moving average filter has a window length of α1 * L, to remove sudden amplitude changes in the raw signal caused by movement of the person across the time series. The choices of α1 and L are shown in the table below.
  • process 500 applies a detrending algorithm to the up-sampled pixel intensity value in the time series for each frequency band.
  • a detrending algorithm based on smoothness priors’ formulation is applied to remove noise in the up-sampled raw signal while preserving the frequency bands corresponding to respiration rate and heart rate. Further details of the detrending algorithm can be found in Tarvainen, M.P., Ranta-Aho, P.O. and Karjalainen, P.A., 2002. An advanced detrending method with application to HRV analysis. IEEE Transactions on Biomedical Engineering, 49(2), pp.172-175.
  • the expression λ = α3 * L2 automates the tuning of the smoothness factor, λ, and eliminates the need for re-tuning the algorithm when the length of the raw signal is varied.
  • process 500 applies Independent Component Analysis (ICA) to extract PPG signal from the detrended raw signal (i.e. detrended pixel intensity value in the time series for each frequency band).
  • the detrended raw signal is grouped into two sets based on the frequency bands to extract Red and NIR PPG signals using ICA.
  • the frequency bands of Red, Green and Blue are used to extract Red PPG signal in step 530 and the frequency bands of NIR, Green and Blue are used to extract NIR PPG signal in step 540.
  • ICA is a computational method for separating a multivariate signal into additive subcomponents. It is a standard technique well known in the field of signal processing and hence details of ICA are omitted for brevity.
  • a band pass filter with cutoff frequency between 0.16 and 2 Hz is applied to the Red PPG signal to remove noise.
  • a band pass filter with cutoff frequency between 0.16 and 2 Hz is applied to the NIR PPG signal to remove noise.
  • the cutoff frequency is set to 0.16 and 2 Hz which covers the frequency range of average human respiration rate and heart rate.
  • the band pass filter is a Finite Impulse Response (FIR) filter which is a standard time domain filtering technique.
  • the cutoff frequencies can be inverted to translate them to the time domain.
  • the values of α1, α2, α3 and L are calibrated based on the imaging devices used to capture video.
  • the imaging device's video capture frame rate plays a key role in the calibration process, and this is dependent on the device and the processing unit 130.
  • These calibrated values are used to automate the up-sampling and detrending algorithms. Below are two examples of known devices with the calibrated values.
  • Process 500 ends after steps 535 and 545.
  • FIG. 6 illustrates a process flow 600 that is executed by the instructions of the processing unit to estimate Blood Oxygen Saturation (SPO2) in step 225 in process 200 in accordance with this invention.
  • Process 600 begins with step 605 by computing Ratio (R) of time-domain range for NIR PPG signal and Red PPG signal.
  • Ratio (R) can be computed with the following expression,
  • amplitude range of Red PPG signal refers to difference between minimum and maximum values of the amplitude of the Red PPG signal in time domain
  • arithmetic mean of Red PPG signal refers to arithmetic mean of the Red PPG signal in between minimum and maximum values of the amplitude of the Red PPG signal in time domain
  • amplitude range of NIR PPG signal refers to difference between minimum and maximum values of the amplitude of the NIR PPG signal in time domain
  • arithmetic mean of NIR PPG signal refers to arithmetic mean of the NIR PPG signal in between minimum and maximum values of the amplitude of the NIR PPG signal in time domain.
  • step 610 process 600 estimates the Blood Oxygen Saturation with the following expression.
  • Blood Oxygen Saturation (SPO2) = β1 + β2 * R
  • β1 refers to off-set parameter and β2 refers to slope parameter of the linear regression curve which maps the Ratio (R) to the blood oxygen saturation (SPO2).
  • β1 and β2 are variables that can be derived from the linear regression curve.
  • Figure 10 illustrates a linear regression curve that can be used for estimation of SPO2 from Ratio (R).
  • Process 600 ends after step 610.
  • FIG. 7 illustrates a process flow 700 that is executed by the instructions of the processing unit to estimate Systolic and Diastolic Blood Pressure in step 225 in process 200 in accordance with this invention.
  • Process 700 begins with step 705 by detecting Systolic Peak (Ps), Diastolic peak (Pd), and global minima (Pm) from the time-domain NIR PPG signal.
  • the Systolic peak times can be detected accurately from the blood volume pulse (BVP) waveform as they are maxima within the signal.
  • BVP waveform refers to the NIR PPG signal.
  • Maxima refers to global maxima which is the maxima of the entire NIR PPG signal.
  • the amplitude of the global maxima is identified as the Systolic Peak (Ps) 910. From figure 9, the largest minima within the second order derivative correspond to the Systolic Peaks (Ps) 910 and the minima following these typically correspond to the diastolic peaks/inflections. Based on the location of the diastolic peak in the second order derivative, the amplitude of the corresponding point on the NIR PPG signal is identified as Diastolic peak (Pd) 930. The location corresponding to the Dicrotic notch 920 is identified as the maxima immediately after the location identified as the systolic peak in the second order derivative. The minima of the entire NIR PPG signal would be the global minima (Pm) 940. In step 710, process 700 estimates the Systolic and Diastolic Blood Pressure with the following expressions,
  • Y1, Y2, Y3, Y4, Y5, Y6 are parameters of the regression curve which maps the Heart Rate and NIR PPG signal amplitude features to the systolic and diastolic blood pressure
  • Y1 and Y4 are frequency-slope parameters for systolic and diastolic blood pressures respectively
  • Y2 and Y5 are amplitude-slope parameters for systolic and diastolic blood pressures respectively
  • Y3 and Y6 are off-set parameters for systolic and diastolic blood pressures respectively.
  • Figure 11 illustrates a linear regression curve that can be used for estimation of blood pressure.
  • the log in figure 11 refers to logarithm base-10 and ln refers to logarithm base-e (natural log).
  • Process 700 ends after step 710.
  • FIG. 8 illustrates a process flow 800 that is executed by the instructions of the processing unit to estimate Heart Rate (HR) and Respiration Rate (RR) in step 225 in process 200 in accordance with this invention.
  • Process 800 begins with step 805 by applying a Fast Fourier Transform (FFT) to the NIR PPG signal to transform the NIR PPG signal from the time domain to the frequency domain.
  • process 800 computes the local maxima of the power spectral density (PSD). Specifically, PSD is computed by dividing the NIR PPG signal in the frequency domain by the length of the NIR PPG signal (L). The local maxima of the PSD are then computed using any generic peak finder algorithm.
  • process 800 determines the frequency range of the local maxima to use for computing the heart rate and respiration rate.
  • for the heart rate, the local maxima within the frequency range of between 0.8 and 2 Hz is used.
  • for the respiration rate, the local maxima within the frequency range of between 0.1 and 0.5 Hz is used.
  • process 800 proceeds to step 830 to estimate the Heart Rate (HR) using the local maxima that is within the frequency range of between 0.8 and 2 Hz and to step 840 to estimate the Respiration Rate (RR) using the local maxima that is within the frequency range of between 0.1 and 0.5 Hz.
  • process 800 estimates the Heart Rate (HR) with the following expression,
  • Heart Rate (HR) = 60 * fhr
  • fhr refers to the local maxima that is within the frequency range of between 0.8 and 2 Hz.
  • in step 840, process 800 estimates the Respiration Rate (RR) with the following expression,
  • Respiration Rate (RR) = 60 * frr
  • frr refers to the local maxima that is within the frequency range of between 0.1 and 0.5 Hz.
  • Process 800 ends after steps 830 and 840.
  • the Heart rate, Respiration rate and Blood Pressure are estimated using the NIR PPG signal derived from data captured by an active depth camera. These estimates do not vary with the ambient lighting conditions.
  • the blood pressure model works on the premise that the systolic and diastolic peaks identified correspond to the respective heart activities. This method gives greater accuracy compared to just using the global maxima in the model.
  • the estimation of Blood Oxygen Saturation is based on the ratio of range of Red PPG signal to range of NIR PPG signal.
  • the method according to this invention involves using an active depth camera to capture simultaneous, time synchronized, video frames of both Color (RGB) and NIR wavelengths. From the video frames, a person’s face is detected and the region corresponding to their forehead is extracted (region of interest) and the pixel intensity values are averaged to form a time domain raw signal across multiple video frames. The time domain raw signal is detrended to remove higher order harmonics and resampled and filtered using Independent Component Analysis (ICA) and FIR bandpass filter to obtain the PPG signal. Vitals such as Pulse rate, Respiration Rate, Blood Oxygen saturation and Blood Pressure are estimated from the PPG signal using frequency analysis using FFT for the former two vitals and mathematical models for the latter two vitals. Processes 600-800 illustrate one embodiment of estimating Pulse rate, Respiration Rate, Blood Oxygen saturation and Blood Pressure based on the Red PPG signal and NIR PPG signal.
  • Another embodiment of estimating Pulse rate, Respiration Rate, Blood Oxygen saturation and Blood Pressure based on the Red PPG signal and NIR PPG signal is the use of a machine learning model to estimate vitals of the person from the Red PPG signal and NIR PPG signal.
  • machine learning model is the linear regression model.
  • processes 600-800 would be used for estimating Pulse rate, Respiration Rate, Blood Oxygen saturation and Blood Pressure based on the Red PPG signal and NIR PPG signal until at least a predetermined number of datasets is obtained to train the machine learning model, after which the machine learning model takes over from processes 600-800.

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Animal Behavior & Ethology (AREA)
  • Cardiology (AREA)
  • Physiology (AREA)
  • Biophysics (AREA)
  • Pathology (AREA)
  • Biomedical Technology (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Medical Informatics (AREA)
  • Veterinary Medicine (AREA)
  • Surgery (AREA)
  • Public Health (AREA)
  • Artificial Intelligence (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Psychiatry (AREA)
  • Pulmonology (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Measuring Pulse, Heart Rate, Blood Pressure Or Blood Flow (AREA)

Abstract

This invention relates to a system and method for measuring vital body signs. The method comprises time synchronising a sequence of RGB and NIR images, detecting region of interest within the sequence of RGB and NIR images, extracting raw signals from the detected region of interest, extracting photoplethysmography (PPG) signals from the raw signals and estimating vitals of the person from the extracted PPG signals.

Description

A System And Method For Measuring Vital Body Signs
Field of the Invention
This invention relates to a system and method of estimating body vitals like Heart Rate, Respiration Rate, Blood Oxygen Saturation and Blood Pressure of a person. Specifically, this invention relates to a system and method of estimating body vitals like Heart Rate, Respiration Rate, Blood Oxygen Saturation and Blood Pressure of a person based on images from a colour camera and an active depth sensor simultaneously.
Prior Art
Photoplethysmography (PPG) is an optically obtained plethysmogram used to detect blood volume changes in the microvascular bed of tissue. Remote Photoplethysmography allows estimation of blood volume changes without skin contact by using ambient light sources and video cameras. Under proper lighting conditions, minute variations in skin colour and temperature due to blood volume changes can be observed. This allows estimating blood volume changes without skin contact. However, due to the use of ambient light sources, slight change in the surrounding would inevitably lead to inaccurate results from the plethysmogram.
Therefore, those skilled in the art are striving to provide an improved method and system of using PPG to remotely obtain vital signs of patients accurately.
Summary of the Invention
The above and other problems are solved and an advance in the art is made by a system and method in accordance with this invention. A first advantage of the system and method in accordance with this invention is that the system and method allows non-contact measurement of vital signs. A second advantage of the system and method in accordance with this invention is that the system and method is able to estimate vital signs within a short period of time, at less than 5 seconds, with an accuracy of more than 90%. A third advantage of the system and method in accordance with this invention is that the system and method is independent of the ambient light. A fourth advantage of the system and method in accordance with this invention is that the system and method utilises devices that are readily available. This allows remote monitoring of patient's vital signs with ease.
A first aspect of the disclosure describes a system for estimating vitals of a person. The system comprises an image sensor adapted to capture a sequence of color (RGB) and near infrared (NIR) images, a processing unit comprising a processor, memory and instructions stored on the memory and executable by the processor to: estimating vitals of the person from the sequence of RGB and NIR images.
In an embodiment of the first aspect of the disclosure, the instruction to estimating vitals of the person from the sequence of RGB and NIR images comprises instructions to: time synchronising the sequence of RGB and NIR images; detecting region of interest within the sequence of RGB and NIR images; extracting raw signals from the detected region of interest; extracting photoplethysmography (PPG) signals from the raw signals; and estimating vitals of the person from the extracted PPG signals.
In an embodiment of the first aspect of the disclosure, the instruction to detecting region of interest within the sequence of RGB and NIR images comprises instructions to: detecting a face of the person within each of the sequence of NIR images; in response to detecting the face of the person, detecting at least one region of interest within the detected face of the person; and mapping the detected region of interest in each of the sequence of NIR images to the corresponding sequence of RGB images.
In an embodiment of the first aspect of the disclosure, the instruction to detecting at least one region of interest within the detected face of the person comprises instructions to: applying facial landmark detection where 81 prominent feature points on the face of the person are detected; and overlaying a box at a top of the detected 81 prominent feature points between facial edge and eyebrows, wherein the box is one region of interest.
In an embodiment of the first aspect of the disclosure, the instruction to detecting at least one region of interest within the detected face of the person further comprises instructions to: overlaying two boxes between facial edge and nose, wherein the two boxes are two other regions of interest.
In an embodiment of the first aspect of the disclosure, the instruction to detecting region of interest within the sequence of RGB and NIR images comprises instructions to: computing an average pixel intensity of the detected region of interest for each of a plurality of frequency bands including Red, Green and Blue from the sequence of RGB images and NIR frequency band from the sequence of NIR images; and appending the average pixel intensity value of each frequency band to the pixel intensity value of corresponding frequency band in respective RGB and NIR images to form a time series for each frequency band.
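As a rough illustration of the averaging and appending step, the sketch below computes the mean intensity of a region of interest for each frequency band and appends it to a per-band time series. The frame layout (one 2-D list of intensities per band), the box format and the helper names `roi_mean` and `append_frame` are assumptions made for this example and are not taken from the disclosure.

```python
from statistics import fmean

BANDS = ("red", "green", "blue", "nir")

def roi_mean(plane, box):
    """Mean intensity of a 2-D plane (list of rows) inside box = (top, left, bottom, right)."""
    top, left, bottom, right = box
    return fmean(v for row in plane[top:bottom] for v in row[left:right])

def append_frame(series, planes, box):
    """Append the ROI mean of each band's plane to that band's time series."""
    for band in BANDS:
        series[band].append(roi_mean(planes[band], box))
    return series

# Hypothetical 4x4 frame: one intensity plane per band (values are made up).
series = {band: [] for band in BANDS}
frame = {
    "red":   [[10, 10, 10, 10], [10, 20, 30, 10], [10, 40, 50, 10], [10, 10, 10, 10]],
    "green": [[0] * 4, [0, 8, 8, 0], [0, 8, 8, 0], [0] * 4],
    "blue":  [[1] * 4 for _ in range(4)],
    "nir":   [[5] * 4, [5, 100, 102, 5], [5, 98, 100, 5], [5] * 4],
}
append_frame(series, frame, box=(1, 1, 3, 3))  # ROI covers the central 2x2 pixels
```

Calling `append_frame` once per time-synchronised frame grows the four time series in step, so that all bands share the same time stamps.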
In an embodiment of the first aspect of the disclosure, the instruction to detecting region of interest within the sequence of RGB and NIR images further comprises instructions to: determining if a length of the time series is equal to a predetermined length, L; in response to the length of the time series being equal to the predetermined length, L, normalising the pixel intensity value in the time series for each frequency band.
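The length check and normalisation can be sketched as follows. The sketch follows the description of normalisation as subtracting the arithmetic mean and dividing by the standard deviation; the use of the population standard deviation and the function names are assumptions of this example.

```python
from statistics import fmean, pstdev

def normalise(values):
    """Zero-mean, unit-variance normalisation of one band's time series."""
    mean = fmean(values)
    std = pstdev(values)  # population standard deviation (an assumption)
    if std == 0:
        raise ValueError("constant signal cannot be normalised")
    return [(v - mean) / std for v in values]

def normalise_if_ready(series, length):
    """Return normalised bands once every series holds `length` samples, else None."""
    if any(len(values) != length for values in series.values()):
        return None  # not enough frames yet: keep collecting
    return {band: normalise(values) for band, values in series.items()}
```

Returning `None` while the series is still short mirrors the process returning to frame capture until the predetermined length, L, is reached.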
In an embodiment of the first aspect of the disclosure, the instruction to extracting PPG signals from the raw signals comprises instructions to: applying a moving average filter to the pixel intensity value in the time series for each frequency band, the moving average filter has a window length of α1 * L; up-sampling the pixel intensity value in the time series for each frequency band using cubic spline interpolation method to form a new length, L2, where L2 = α2 * L; applying a detrending algorithm based on smoothness priors’ formulation for the upsampled pixel intensity value in the time series for each frequency band wherein a smoothness factor, λ, of the detrending algorithm is set based on the following expression, λ = α3 * L2; and applying Independent Component Analysis (ICA) to extract PPG signal from the detrended pixel intensity value in the time series for each frequency band, wherein values of α1, α2, α3 & L are calibrated based on the imaging sensor.
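A minimal sketch of the moving average and detrending steps is given below, using only the Python standard library. The detrending follows the smoothness priors formulation of Tarvainen et al., in which the trend is (I + λ² D2ᵀD2)⁻¹ z with D2 the second-difference matrix. The edge handling of the moving average, the dense solver, and the values chosen for α1, α3 and L are assumptions for illustration; the cubic spline up-sampling and ICA steps are omitted here.

```python
def moving_average(values, window):
    """Centred moving average; the window shrinks near the edges (an assumption)."""
    half = window // 2
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def solve_spd(a, b):
    """Solve a @ x = b by Gaussian elimination (a is symmetric positive definite)."""
    n = len(b)
    a = [row[:] for row in a]
    x = b[:]
    for k in range(n):
        for i in range(k + 1, n):
            f = a[i][k] / a[k][k]
            if f:
                for j in range(k, n):
                    a[i][j] -= f * a[k][j]
                x[i] -= f * x[k]
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (x[i] - s) / a[i][i]
    return x

def detrend(values, lam):
    """Smoothness priors detrending: z - (I + lam^2 * D2'D2)^-1 z."""
    n = len(values)
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    coeffs = (1.0, -2.0, 1.0)  # one row of the second-difference matrix D2
    for r in range(n - 2):
        for p, cp in enumerate(coeffs):
            for q, cq in enumerate(coeffs):
                a[r + p][r + q] += lam * lam * cp * cq
    trend = solve_spd(a, list(values))
    return [v - t for v, t in zip(values, trend)]

# Hypothetical calibration values, for illustration only.
L, alpha1, alpha3 = 20, 0.15, 0.5
raw = [0.5 * i + (1.0 if i % 2 else -1.0) for i in range(L)]
smoothed = moving_average(raw, window=max(1, round(alpha1 * L)))
detrended = detrend(smoothed, lam=alpha3 * L)  # up-sampling skipped, so L2 == L here
```

Because a straight line has zero second difference, a purely linear input is treated entirely as trend and detrends to zero, which is a convenient sanity check for the implementation.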
In an embodiment of the first aspect of the disclosure, the instruction to applying Independent Component Analysis (ICA) to extract PPG signal from the detrended pixel intensity value in the time series for each frequency band comprises instructions to: grouping the detrended pixel intensity value in the time series for each frequency band into a set of Red PPG signal and a set of NIR PPG signals wherein the Red PPG signal is derived from the frequency bands of Red, Green and Blue and the NIR PPG signal is derived from the frequency bands of NIR, Green and Blue; applying a band pass filter with cutoff frequency of between 0.16 and 2 Hz to the Red PPG signal; applying a band pass filter with cutoff frequency of between 0.16 and 2 Hz to the NIR PPG signal.
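One possible way to realise a 0.16 to 2 Hz FIR band pass filter is a windowed-sinc design, sketched below. The Hamming window, the tap count of 301, the zero-padded convolution and the 30 Hz sample rate used in the usage line are assumptions of this example; the disclosure only specifies that the filter is an FIR band pass filter with these cutoff frequencies.

```python
import cmath
import math

def bandpass_taps(low_hz, high_hz, fs, numtaps=301):
    """Hamming-windowed sinc band-pass: low-pass(high_hz) minus low-pass(low_hz)."""
    def sinc(x):
        return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)
    mid = (numtaps - 1) / 2  # numtaps is assumed to be odd
    taps = []
    for n in range(numtaps):
        k = n - mid
        ideal = ((2 * high_hz / fs) * sinc(2 * high_hz / fs * k)
                 - (2 * low_hz / fs) * sinc(2 * low_hz / fs * k))
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (numtaps - 1))
        taps.append(ideal * window)
    return taps

def fir_filter(signal, taps):
    """Zero-padded convolution that returns an output of the same length as signal."""
    mid = (len(taps) - 1) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, t in enumerate(taps):
            idx = i + mid - j
            if 0 <= idx < len(signal):
                acc += t * signal[idx]
        out.append(acc)
    return out

def gain(taps, freq_hz, fs):
    """Magnitude of the filter's frequency response at freq_hz."""
    w = 2 * math.pi * freq_hz / fs
    return abs(sum(t * cmath.exp(-1j * w * n) for n, t in enumerate(taps)))

taps = bandpass_taps(0.16, 2.0, fs=30.0)  # 30 Hz sample rate is an assumption
```

The `gain` helper can be used to confirm that frequencies inside the pass band are kept while the DC component and higher frequencies are attenuated. The tap count would in practice need to be matched to the length of the up-sampled signal.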
In an embodiment of the first aspect of the disclosure, the instruction to estimating vitals of the person from the extracted PPG signals comprises instructions to: computing Ratio (R) of time-domain range for NIR PPG signal and Red PPG signal wherein the Ratio (R) can be expressed with the following expression,
Figure imgf000005_0001
where amplitude range of Red PPG signal relates to difference between minimum and maximum values of the amplitude of the Red PPG signal in time domain, arithmetic mean of Red PPG signal relates to arithmetic mean of the Red PPG signal in between minimum and maximum values of the amplitude of the Red PPG signal in time domain, amplitude range of NIR PPG signal relates to difference between minimum and maximum values of the amplitude of the NIR PPG signal in time domain and arithmetic mean of NIR PPG signal relates to arithmetic mean of the NIR PPG signal in between minimum and maximum values of the amplitude of the NIR PPG signal in time domain; estimating the Blood Oxygen Saturation with the following expression,
Blood Oxygen Saturation (SPO2) = β1 + β2 * R
where β1 refers to off-set parameter and β2 refers to slope parameter of the linear regression curve which maps the Ratio (R) to the blood oxygen saturation (SPO2).
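The sketch below illustrates how the linear mapping from Ratio (R) to SPO2 might be evaluated. The exact expression for R is given as a figure that is not reproduced in the text, so the arrangement used here (the Red amplitude range divided by the Red mean, over the NIR amplitude range divided by the NIR mean) is an assumption based on the four terms defined above. The signal values and the β1 and β2 values are hypothetical.

```python
from statistics import fmean

def normalised_range(signal):
    """Amplitude range (max - min) divided by the arithmetic mean."""
    mean = fmean(signal)
    if mean == 0:
        raise ValueError("mean of the signal is zero")
    return (max(signal) - min(signal)) / mean

def ratio_r(red_ppg, nir_ppg):
    """ASSUMED arrangement: Red normalised range over NIR normalised range."""
    return normalised_range(red_ppg) / normalised_range(nir_ppg)

def spo2(r, beta1, beta2):
    """SPO2 = beta1 + beta2 * R, as stated in the text."""
    return beta1 + beta2 * r

# Hypothetical signals and regression parameters, for illustration only.
red = [9.0, 10.0, 11.0, 10.0]
nir = [18.0, 20.0, 22.0, 20.0]
r = ratio_r(red, nir)
estimate = spo2(r, beta1=110.0, beta2=-25.0)
```

In practice β1 and β2 would come from the calibrated regression curve rather than from the placeholder values used here.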
In an embodiment of the first aspect of the disclosure, the instruction to estimating vitals of the person from the extracted PPG signals comprises instructions to: detecting Systolic Peak (Ps), Diastolic peak (Pd), and global minima (Pm) from the time-domain NIR PPG signal; estimating the Systolic and Diastolic Blood Pressure with the following expressions,
Systolic Blood Pressure (SBP) =
Figure imgf000006_0001
Diastolic Blood Pressure (DBP) =
Figure imgf000006_0002
where Y1, Y2, Y3, Y4, Y5, Y6 are parameters of the regression curve which maps the Heart Rate and NIR PPG signal amplitude features to the systolic and diastolic blood pressure, Y1 and Y4 are frequency-slope parameters for systolic and diastolic blood pressures respectively, Y2 and Y5 are amplitude-slope parameters for systolic and diastolic blood pressures respectively, Y3 and Y6 are off-set parameters for systolic and diastolic blood pressures respectively.
In an embodiment of the first aspect of the disclosure, the instruction to detecting Systolic Peak (Ps), Diastolic peak (Pd), and global minima (Pm) from the time-domain NIR PPG signal comprises instructions to: applying a peak detection algorithm on the NIR PPG signal to obtain a second order differential; identifying a global maxima in the NIR PPG signal as the Systolic Peak (Ps), a peak in the NIR PPG signal corresponding to a minima following the Systolic Peak (Ps) in the second order derivative as Diastolic peak (Pd), a maxima immediately after the systolic peak in the second order derivative as the Dicrotic notch, and a minima of the entire NIR PPG signal as the global minima (Pm).
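The identification rules above can be sketched for a single pulse as follows. The finite-difference approximation of the second order derivative, the helper names and the synthetic two-bump test pulse are assumptions of this example; a real NIR PPG signal contains several pulses and noise, which this sketch does not handle.

```python
import math

def second_difference(signal):
    """Discrete second order derivative, zero-padded to the signal length."""
    d2 = [0.0] * len(signal)
    for i in range(1, len(signal) - 1):
        d2[i] = signal[i + 1] - 2 * signal[i] + signal[i - 1]
    return d2

def first_local(d2, start, kind):
    """Index of the first local 'max' or 'min' of d2 after index start, or None."""
    for i in range(start + 1, len(d2) - 1):
        if kind == "max" and d2[i - 1] < d2[i] >= d2[i + 1]:
            return i
        if kind == "min" and d2[i - 1] > d2[i] <= d2[i + 1]:
            return i
    return None

def pulse_features(signal):
    """Return Ps, Pd, Pm and the indices used, for a single pulse."""
    d2 = second_difference(signal)
    sys_idx = min(range(len(d2)), key=d2.__getitem__)  # largest minimum of d2
    notch_idx = first_local(d2, sys_idx, "max")         # notch: next maximum of d2
    dia_idx = first_local(d2, notch_idx, "min") if notch_idx is not None else None
    return {
        "Ps": max(signal),                               # global maximum
        "Pd": signal[dia_idx] if dia_idx is not None else None,
        "Pm": min(signal),                               # global minimum
        "indices": (sys_idx, notch_idx, dia_idx),
    }

# Synthetic single pulse (two Gaussian bumps at 0.3 s and 0.6 s), for illustration only.
fs = 100
pulse = [
    math.exp(-((i / fs - 0.3) ** 2) / (2 * 0.07 ** 2))
    + 0.4 * math.exp(-((i / fs - 0.6) ** 2) / (2 * 0.08 ** 2))
    for i in range(fs)
]
features = pulse_features(pulse)
```

On the synthetic pulse, the systolic location falls at the larger bump and the diastolic location at the smaller one, with the notch location between them.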
In an embodiment of the first aspect of the disclosure, the instruction to estimating vitals of the person from the extracted PPG signals comprises instructions to: applying fast fourier transform to the NIR PPG signal to transform the NIR PPG signal from time-domain to frequency domain; computing a plurality of local maxima of a power spectral density (PSD) of the NIR PPG signal in frequency domain; in response to a local maxima between the frequency range of 0.8 and 2 Hz, estimating a Heart Rate (HR) with the following expression,
Heart Rate (HR) = 60 * fhr
where fhr refers to the local maxima that is between the frequency range of 0.8 and 2 Hz; and in response to a local maxima between the frequency range of 0.1 and 0.5 Hz, estimating a Respiration Rate (RR) with the following expression,
Respiration Rate (RR) = 60 * frr
where frr refers to the local maxima that is between the frequency range of 0.1 and 0.5 Hz.
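The frequency-domain estimation can be sketched with a plain discrete Fourier transform, as below. Taking the PSD as |X[k]|² divided by the signal length, choosing the largest local maximum inside each band, and the synthetic 5 second test signal sampled at 30 Hz are assumptions of this example. A direct DFT is used instead of an FFT only to keep the sketch free of external libraries.

```python
import cmath
import math

def psd(signal, fs, max_hz):
    """Return (freqs, power) for DFT bins up to max_hz; power = |X[k]|^2 / N."""
    n = len(signal)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        f = k * fs / n
        if f > max_hz:
            break
        x_k = sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(signal))
        freqs.append(f)
        power.append(abs(x_k) ** 2 / n)
    return freqs, power

def dominant_peak(freqs, power, low_hz, high_hz):
    """Frequency of the largest local maximum of power within [low_hz, high_hz]."""
    best = None
    for k in range(1, len(power) - 1):
        is_peak = power[k - 1] < power[k] >= power[k + 1]
        if is_peak and low_hz <= freqs[k] <= high_hz:
            if best is None or power[k] > power[best]:
                best = k
    return None if best is None else freqs[best]

# Synthetic NIR PPG: 1.2 Hz cardiac component plus 0.2 Hz respiratory component.
fs, n = 30.0, 150
nir_ppg = [
    math.sin(2 * math.pi * 1.2 * i / fs) + 0.5 * math.sin(2 * math.pi * 0.2 * i / fs)
    for i in range(n)
]
freqs, power = psd(nir_ppg, fs, max_hz=2.5)
f_hr = dominant_peak(freqs, power, 0.8, 2.0)
f_rr = dominant_peak(freqs, power, 0.1, 0.5)
heart_rate = 60 * f_hr        # beats per minute
respiration_rate = 60 * f_rr  # breaths per minute
```

The frequency resolution of this approach is fs / N, which is 0.2 Hz for 5 seconds of data, so the estimates are quantised to multiples of 12 per minute unless the signal is up-sampled or zero-padded.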
In an embodiment of the first aspect of the disclosure, the instruction to estimating vitals of the person from the extracted PPG signals comprises instruction to applying a machine learning model based on a linear regression model.
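As an illustration of how a linear regression model of this kind could be trained, the sketch below fits an off-set and a slope by ordinary least squares, here mapping Ratio (R) to reference SPO2 readings. The training pairs are made-up values and the function name `fit_line` is hypothetical; the disclosure does not specify the training procedure.

```python
from statistics import fmean

def fit_line(xs, ys):
    """Ordinary least squares fit of y = offset + slope * x; returns (offset, slope)."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope

# Hypothetical training pairs: (Ratio R, reference SPO2 reading).
ratios = [0.4, 0.5, 0.6, 0.7]
reference = [100.0, 97.5, 95.0, 92.5]
beta1, beta2 = fit_line(ratios, reference)
```

The fitted `beta1` and `beta2` then play the role of the off-set and slope parameters of the regression curve.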
A second aspect of the disclosure describes a method for estimating vitals of a person. The method comprises estimating vitals of the person from a sequence of RGB and NIR images.
In an embodiment of the second aspect of the disclosure, the step of estimating vitals of the person from the sequence of RGB and NIR images comprises: time synchronising the sequence of RGB and NIR images; detecting region of interest within the sequence of RGB and NIR images; extracting raw signals from the detected region of interest; extracting photoplethysmography (PPG) signals from the raw signals; and estimating vitals of the person from the extracted PPG signals.
In an embodiment of the second aspect of the disclosure, the step of detecting region of interest within the sequence of RGB and NIR images comprises: detecting a face of the person within each of the sequence of NIR images; in response to detecting the face of the person, detecting at least one region of interest within the detected face of the person; and mapping the detected region of interest in each of the sequence of NIR images to the corresponding sequence of RGB images.
In an embodiment of the second aspect of the disclosure, the step of detecting at least one region of interest within the detected face of the person comprises instructions to: applying facial landmark detection where 81 prominent feature points on the face of the person are detected; and overlaying a box at a top of the detected 81 prominent feature points between facial edge and eyebrows, wherein the box is one region of interest.
In an embodiment of the second aspect of the disclosure, the step of detecting at least one region of interest within the detected face of the person further comprises instructions to: overlaying two boxes between facial edge and nose, wherein the two boxes are two other regions of interest.
In an embodiment of the second aspect of the disclosure, the step of detecting region of interest within the sequence of RGB and NIR images comprises instructions to: computing an average pixel intensity of the detected region of interest for each of a plurality of frequency bands including Red, Green and Blue from the sequence of RGB images and NIR frequency band from the sequence of NIR images; and appending the average pixel intensity value of each frequency band to the pixel intensity value of corresponding frequency band in respective RGB and NIR images to form a time series for each frequency band.
In an embodiment of the second aspect of the disclosure, the step of detecting region of interest within the sequence of RGB and NIR images further comprises instructions to: determining if a length of the time series is equal to a predetermined length, L; in response to the length of the time series being equal to the predetermined length, L, normalising the pixel intensity value in the time series for each frequency band.
In an embodiment of the second aspect of the disclosure, the step of extracting PPG signals from the raw signals comprises instructions to: applying a moving average filter to the pixel intensity value in the time series for each frequency band, the moving average filter has a window length of α1 * L; up-sampling the pixel intensity value in the time series for each frequency band using cubic spline interpolation method to form a new length, L2, where L2 = α2 * L; applying a detrending algorithm based on smoothness priors’ formulation for the upsampled pixel intensity value in the time series for each frequency band wherein a smoothness factor, λ, of the detrending algorithm is set based on the following expression, λ = α3 * L2; and applying Independent Component Analysis (ICA) to extract PPG signal from the detrended pixel intensity value in the time series for each frequency band, wherein values of α1, α2, α3 & L are calibrated based on the imaging sensor.
In an embodiment of the second aspect of the disclosure, the step of applying Independent Component Analysis (ICA) to extract PPG signal from the detrended pixel intensity value in the time series for each frequency band comprises instructions to: grouping the detrended pixel intensity value in the time series for each frequency band into a set of Red PPG signal and a set of NIR PPG signals wherein the Red PPG signal is derived from the frequency bands of Red, Green and Blue and the NIR PPG signal is derived from the frequency bands of NIR, Green and Blue; applying a band pass filter with cutoff frequency of between 0.16 and 2 Hz to the Red PPG signal; applying a band pass filter with cutoff frequency of between 0.16 and 2 Hz to the NIR PPG signal.
In an embodiment of the second aspect of the disclosure, the step of estimating vitals of the person from the extracted PPG signals comprises instructions to: computing Ratio (R) of time-domain range for NIR PPG signal and Red PPG signal wherein the Ratio (R) can be expressed with the following expression,
Figure imgf000009_0001
where amplitude range of Red PPG signal relates to difference between minimum and maximum values of the amplitude of the Red PPG signal in time domain, arithmetic mean of Red PPG signal relates to arithmetic mean of the Red PPG signal in between minimum and maximum values of the amplitude of the Red PPG signal in time domain, amplitude range of NIR PPG signal relates to difference between minimum and maximum values of the amplitude of the NIR PPG signal in time domain and arithmetic mean of NIR PPG signal relates to arithmetic mean of the NIR PPG signal in between minimum and maximum values of the amplitude of the NIR PPG signal in time domain; estimating the Blood Oxygen Saturation with the following expression,
Blood Oxygen Saturation (SPO2) = β1 + β2 * R
where β1 refers to off-set parameter and β2 refers to slope parameter of the linear regression curve which maps the Ratio (R) to the blood oxygen saturation (SPO2).
In an embodiment of the second aspect of the disclosure, the step of estimating vitals of the person from the extracted PPG signals comprises instructions to: detecting Systolic Peak (Ps), Diastolic peak (Pd), and global minima (Pm) from the time-domain NIR PPG signal; estimating the Systolic and Diastolic Blood Pressure with the following expressions,
Systolic Blood Pressure (SBP) =
Diastolic Blood Pressure (DBP) =
Figure imgf000010_0001
where Y1, Y2, Y3, Y4, Y5, Y6 are parameters of the regression curve which maps the Heart Rate and NIR PPG signal amplitude features to the systolic and diastolic blood pressure, Y1 and Y4 are frequency-slope parameters for systolic and diastolic blood pressures respectively, Y2 and Y5 are amplitude-slope parameters for systolic and diastolic blood pressures respectively, Y3 and Y6 are off-set parameters for systolic and diastolic blood pressures respectively.
In an embodiment of the second aspect of the disclosure, the step of detecting Systolic Peak (Ps), Diastolic peak (Pd), and global minima (Pm) from the time-domain NIR PPG signal comprises instructions to: applying a peak detection algorithm on the NIR PPG signal to obtain a second order differential; identifying a global maxima in the NIR PPG signal as the Systolic Peak (Ps), a peak in the NIR PPG signal corresponding to a minima following the Systolic Peak (Ps) in the second order derivative as Diastolic peak (Pd), a maxima immediately after the systolic peak in the second order derivative as the Dicrotic notch, and a minima of the entire NIR PPG signal as the global minima (Pm).
In an embodiment of the second aspect of the disclosure, the step of estimating vitals of the person from the extracted PPG signals comprises instructions to: applying fast fourier transform to the NIR PPG signal to transform the NIR PPG signal from time-domain to frequency domain; computing a plurality of local maxima of a power spectral density (PSD) of the NIR PPG signal in frequency domain; in response to a local maxima between the frequency range of 0.8 and 2 Hz, estimating a Heart Rate (HR) with the following expression,
Heart Rate (HR) = 60 * fhr
where fhr refers to the local maxima that is between the frequency range of 0.8 and 2 Hz; and in response to a local maxima between the frequency range of 0.1 and 0.5 Hz, estimating a Respiration Rate (RR) with the following expression,
Respiration Rate (RR) = 60 * frr
where frr refers to the local maxima that is between the frequency range of 0.1 and 0.5 Hz.
In an embodiment of the second aspect of the disclosure, the step of estimating vitals of the person from the extracted PPG signals comprises applying a machine learning model based on a linear regression model.
Brief Description of the Drawings
The above and other features and advantages in accordance with this invention are described in the following detailed description and are shown in the following drawings:
Figure 1 illustrating a system for estimating vitals of a person in accordance with an embodiment of this invention;
Figure 2 illustrating a process flow that is executable by a processing unit of the system in figure 1 in accordance with an embodiment of this invention;
Figure 3 illustrating a sample colour image and a sample NIR image in accordance with an embodiment of this invention;
Figure 4 illustrating a process flow for extracting raw signal in accordance with an embodiment of this invention;
Figure 5 illustrating a process flow for extracting PPG signal in accordance with an embodiment of this invention;
Figure 6 illustrating a process flow for estimating blood oxygen saturation of a person from the extracted PPG signal in accordance with an embodiment of this invention;
Figure 7 illustrating a process flow for estimating Systolic and Diastolic Blood Pressure of a person from the extracted PPG signal in accordance with an embodiment of this invention;
Figure 8 illustrating a process flow for estimating Heart Rate and Respiration Rate of a person from the extracted PPG signal in accordance with an embodiment of this invention;
Figure 9 illustrating the NIR PPG signal and the second order differential of the NIR PPG signal in accordance with an embodiment of this invention;
Figure 10 illustrating a linear regression curve that can be used for estimation of SPO2 from Ratio (R); and
Figure 11 illustrating a linear regression curve that can be used for estimation of blood pressure.
Detailed Description
This invention relates to a system and method of estimating body vitals like Heart Rate, Respiration Rate, Blood Oxygen Saturation and Blood Pressure of a person. Specifically, this invention relates to a system and method of estimating body vitals like Heart Rate, Respiration Rate, Blood Oxygen Saturation and Blood Pressure of a person based on images from a colour camera and depth sensor simultaneously.
Figure 1 illustrates a system 100 for estimating body vitals such as Heart Rate, Respiration Rate, Blood Oxygen Saturation and Blood Pressure of a person. The system 100 comprises a depth sensor 110, an image sensor 120 and a processing unit 130.
The depth sensor 110 is an active depth sensor that works in the near infrared (NIR) region. By using the active depth sensor 110, we can eliminate error due to variations in ambient light. It is also observed that absorption of light by Hemoglobin is lowest in the NIR region. Specifically, the skin cells and Hemoglobin in blood absorb less NIR light (800-1000 nm) compared to visible light and hence the blood flow changes underneath the skin are sensed with greater contrast compared to visible light. Due to the low absorption, the photoplethysmography (PPG) signal will have a higher signal to noise ratio. Since the active depth sensor works in the NIR region, we can use the active depth sensor 110 to shine NIR light on the skin of a person and measure its absorption to assist in estimating the vitals of the person.
The image sensor 120 is a typical colour camera for capturing a sequence of images. Such an image sensor 120 is commonly known in the art and a detailed description of the image sensor 120 is omitted for brevity.
The processing unit 130 is a typical computing system that comprises a processor, memory and instructions stored on the memory and executable by the processor. The processor may be a processor, microprocessor, microcontroller, application specific integrated circuit, digital signal processor (DSP), programmable logic circuit, or other data processing device that executes instructions to perform the processes in accordance with the present invention. The processor has the capability to execute various applications that are stored in the memory. The memory may include read-only memory (ROM), random-access memory (RAM), electrically erasable programmable ROM (EEPROM), flash cards, or any storage medium. Instructions are computing codes or software applications that are stored on the memory and executable by the processor to perform the processes in accordance with this invention. Such a computing system is well known in the art and hence only briefly described herein. The instructions can be developed in C++ language (or any other known programming language) and can be run on a System on Chip (SoC) like Raspberry Pi and/or mobile devices like cell phones or tablet PCs.
Figure 2 illustrates a process flow 200 that is executed by the instructions of the processing unit to estimate vital signs of a person in accordance with this invention. To use the system 100, both the active depth sensor 110 and the image sensor 120 would be pointing in the same direction. Images captured by both sensors 110 and 120 would be transmitted to the processing unit 130.
Process 200 begins with step 205 where the images from the active depth sensor 110 and image sensor 120 are time synchronised. Essentially, there should not be any time shift between the images from both sensors.
In step 210, process 200 detects the region of interest. The region of interest (RoI) includes the forehead, cheek and any other exposed skin. In one embodiment, when the system is arranged in a manner to capture the face of the person, the region of interest would include the forehead and cheek. Figure 3 illustrates an example of an image taken from the active depth sensor 110 and image sensor 120. Specifically, the left image 310 is taken from the active depth sensor 110 and the right image 320 is taken from the image sensor 120. In step 210, the process first detects the face 311 and thereafter detects whether the person is wearing a mask. If the person is not wearing a mask, the process would detect the forehead region 312 and the cheek regions 313. If the person is wearing a mask, the process would detect the forehead region 312 only. One method of detecting the RoI is by using facial landmark detection where 81 prominent feature points on the face are detected. The 81 prominent feature points include eyebrow, eye, nose, mouth and facial edge. Based on the 81 prominent detected points, the process would be able to determine the orientation of the person of interest relative to the camera (i.e. frontal, profile, or semi-frontal) and whether the person is wearing a mask (i.e. mouth and part of nose and bottom facial edges would be obscured). Using the 81 prominent feature points detected, we can simply overlay a box on the top of the detected 81 prominent feature points between the facial edge and eyebrows. If the person is not wearing a mask, another two boxes would be overlaid at the cheeks which are between the facial edge and nose.
Based on the detected region, a similar region would be marked on the image from the image sensor. The region of interest is computed for all the images from the active depth sensor 110 and mapped to the images from image sensor 120 using the calibration. The RoI detection is executed on the captured images to form a set of time synchronised images. The set of time synchronised images has a length of no more than 5 seconds. In one embodiment, both the active depth sensor 110 and the image sensor 120 are calibrated using the standard Tsai camera calibration technique to know the pixel mapping between them so that when we detect the region of interest in one image, it can be mapped to the other image without additional computations. For example, if we detect forehead and cheek regions in the image from the active depth camera, we can map the regions corresponding to the forehead and cheek regions in the image from the image sensor by using the calibration.
In step 215, process 200 extracts raw signals within the detected region(s) of the images from both the active depth sensor 110 and image sensor 120. Further details will be described below with reference to figure 4.
In step 220, process 200 extracts PPG signal from the raw signals. Further details will be described below with reference to figure 5.
In step 225, process 200 estimates the vital signs including Respiration Rate (RR), Heart Rate (HR), Systolic Blood Pressure (SBP), Diastolic Blood Pressure (DBP) and Blood Oxygen Saturation (SPO2). Further details will be described below with reference to figures 6 to 8.
Process 200 ends after step 225.
Figure 4 illustrates a process flow 400 that is executed by the instructions of the processing unit to extract raw signals within the detected region(s) of the images from both the active depth sensor and the image sensor in step 215 of process 200 in accordance with this invention. The raw signals are pixel intensity values of each frequency band in the RGB images and NIR images. Process 400 begins with step 405 to compute the average pixel intensity of the detected region(s) for each of the frequency bands including Red, Green and Blue from the image of the image sensor 120 and NIR from the depth sensor 110. Each detected Region of Interest (RoI) is a rectangular matrix of the pixel intensity values, identified as 312 and 313 in Figure 3. For each of the frequency bands in the images (Red, Green, Blue and NIR), the arithmetic mean of all the pixels (average pixel intensity) within the region of interest 312 and 313 is computed in step 405. The average pixel intensity value for each frequency band computed in step 405 is then appended to the time series of the corresponding frequency band in step 410.
In step 410, process 400 appends the average pixel intensity value to form a time series corresponding to the frequency band. Specifically, process 400 appends the average pixel intensity value of each frequency band to the pixel intensity value of corresponding frequency band in respective RGB and NIR images to form a time series for each frequency band, where the frequency bands include R, G, B and NIR. The time series relate to the time synchronised video for estimating the vital signs. Each set of time series covers no more than 5 seconds of video clip. The average pixel intensity values are appended in the following manner:
1. For Red time series, the average pixel intensity value of the Red frequency band computed in step 405 is appended to the pixel intensity values of Red frequency band in each image from the video clip to form the red time series;
2. For Green time series, the average pixel intensity value of the Green frequency band computed in step 405 is appended to the pixel intensity values of Green frequency band in each image from the video clip to form the green time series;
3. For Blue time series, the average pixel intensity value of the Blue frequency band computed in step 405 is appended to the pixel intensity values of Blue frequency band in each image from the video clip to form the blue time series; and
4. For NIR time series, the average pixel intensity value of the NIR frequency band computed in step 405 is appended to the pixel intensity values of NIR frequency band in each image from the video clip to form the NIR time series.
At the end of the step 410, process 400 generates a table as illustrated below. Each column of Red, Blue, Green and NIR corresponds to a time series where the average pixel intensity value has been appended to the respective frequency band. The table below illustrates only four time stamps, at 0 second, 0.036 second, 0.072 second and 0.108 second. One skilled in the art will recognise that more time stamps are required for each successful estimation of the vital signs. As mentioned previously, the time series should contain no more than 5 seconds worth of data. Preferably, if the camera frame capture rate is less than or equal to 30 FPS, the minimum would be 104 frames; and if the camera frame capture rate is more than 30 FPS, the minimum would be 128 frames.
Figure imgf000017_0001
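Steps 405 and 410 can be sketched as follows. This is a minimal illustration only, assuming a single rectangular region of interest, frames held as nested lists of pixel intensities, and hypothetical helper names (`roi_mean`, `append_frame`) that do not appear in this disclosure; the frame values are synthetic.

```python
from statistics import fmean

BANDS = ("red", "green", "blue", "nir")


def roi_mean(channel, roi):
    """Arithmetic mean of one band over a rectangular RoI (top, left, bottom, right)."""
    top, left, bottom, right = roi
    pixels = [value for row in channel[top:bottom] for value in row[left:right]]
    return fmean(pixels)


def append_frame(series, timestamp, rgb_frame, nir_frame, roi):
    """Append one time synchronised RGB/NIR frame pair to the per-band time series."""
    series["time"].append(timestamp)
    for band in ("red", "green", "blue"):
        series[band].append(roi_mean(rgb_frame[band], roi))
    series["nir"].append(roi_mean(nir_frame, roi))


# Two tiny synthetic 4x4 frames, made up purely for illustration.
series = {"time": [], **{band: [] for band in BANDS}}
roi = (1, 1, 3, 3)  # hypothetical forehead box: rows 1-2, columns 1-2
for i, timestamp in enumerate((0.0, 0.036)):
    flat = [[10 * i + r + c for c in range(4)] for r in range(4)]
    rgb = {"red": flat, "green": flat, "blue": flat}
    append_frame(series, timestamp, rgb, flat, roi)
```

Each key of `series` then plays the role of one column of the table, with one entry per time stamp.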
In step 415, process 400 determines if the time series length is equal to a predetermined length, L. The predetermined length, L, is no more than 5 seconds, and is computed based on the frame rate of the image sensor. If the length of the time series is equal to the predetermined length, L, process 400 proceeds to step 425. If the time series is not equal to the predetermined length, L, process 400 repeats from step 205. Essentially, in order to accurately estimate the vital signs, the system requires a minimum length of data and this length is no more than 5 seconds. One skilled in the art will recognise that other conditions may be applied in step 415 without departing from the invention. For example, step 415 may be modified to determine if the time series length is less than or not more than the predetermined length, L.
In step 425, process 400 normalises the pixel intensity value in the time series for each frequency band. Specifically, the pixel intensity values from each frequency band in the time series are normalised to restrict the amplitude to be within [0, 1]. One example of normalising the pixel intensity values in each frequency band in the time series is by subtracting from the pixel intensity values of each frequency band their arithmetic mean and dividing the result by their standard deviation.
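The normalisation example of step 425 can be sketched as below. This is a hedged illustration: the function name `normalise` is hypothetical, the population standard deviation is an assumption (the disclosure does not say which form is used), and the sample values are made up.

```python
from statistics import fmean, pstdev


def normalise(series):
    """Subtract the arithmetic mean and divide by the standard deviation."""
    mean = fmean(series)
    std = pstdev(series)  # population standard deviation (assumption)
    if std == 0:
        raise ValueError("constant series cannot be normalised")
    return [(value - mean) / std for value in series]


red = [120.0, 122.0, 121.0, 119.0, 118.0]  # illustrative values only
red_norm = normalise(red)
```

Note that this form centres each series at zero with unit standard deviation; if a strict [0, 1] range is required, a min-max rescaling would be one alternative.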
Process 400 ends after step 425.
Figure 5 illustrates a process flow 500 that is executed by the instructions of the processing unit to extract PPG signal from the raw signals of step 220 in process 200 in accordance with this invention. Process 500 begins with step 505 by applying a moving average filter for each frequency band. Specifically, process 500 applies a moving average filter to the pixel intensity value in the time series for each frequency band. The moving average filter has a window length of a1 * L, to remove sudden amplitude changes in the raw signal caused by movement of the person across the time series. The choices of a1 and L are shown in the table below.
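A moving average filter of the kind used in step 505 can be sketched as follows. The values of `A1` and `L` here are hypothetical placeholders, not the calibrated values from the table, and the centred window with shrinking edges is one possible design choice rather than the one specified in this disclosure.

```python
def moving_average(series, window):
    """Centred moving average over 2 * (window // 2) + 1 samples.

    The window shrinks at both ends of the series so the output keeps
    the same length as the input.
    """
    if window < 1:
        raise ValueError("window must be at least 1 sample")
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        lo = max(0, i - half)
        hi = min(len(series), i + half + 1)
        smoothed.append(sum(series[lo:hi]) / (hi - lo))
    return smoothed


A1 = 0.05                        # hypothetical calibration value
L = 100                          # hypothetical time series length in samples
window = max(1, round(A1 * L))   # window length a1 * L, here 5 samples
raw = [0.0, 0.0, 0.0, 10.0, 0.0, 0.0, 0.0]  # single-sample spike, e.g. from motion
smooth = moving_average(raw, window)
```

The spike of amplitude 10 is spread over neighbouring samples and reduced in height, which is the intended effect on sudden amplitude changes.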
In step 510, process 500 up-samples the raw signal using cubic spline interpolation method to a new length, L2, where L2 = a2 * L. Specifically, process 500 up-samples the pixel intensity value in the time series for each frequency band using cubic spline interpolation method to form the new length. Up-sampling would involve adding zero samples between the raw signal to increase the sampling rate. By using cubic spline interpolation method, the added zero samples would be reconstructed to non-zero samples which are typically approximate to adjacent samples of the raw signal. This increases the resolution of the respiration rate and heart rate and thus the accuracy of the estimate.
In step 515, process 500 applies a detrending algorithm to the up-sampled pixel intensity value in the time series for each frequency band. For example, an advanced detrending algorithm based on the smoothness priors formulation is applied to remove noise in the up-sampled raw signal while preserving the frequency bands corresponding to respiration rate and heart rate. Further details of the detrending algorithm can be found in Tarvainen, M.P., Ranta-Aho, P.O. and Karjalainen, P.A., 2002. An advanced detrending method with application to HRV analysis. IEEE Transactions on Biomedical Engineering, 49(2), pp.172-175. The smoothness factor, λ, of the detrending algorithm is set based on the following expression,
λ = a3 * L2
The above expression automates the tuning of the smoothness factor, λ, and eliminates the need for re-tuning the algorithm when the length of raw signal is varied.
In step 520, process 500 applies Independent Component Analysis (ICA) to extract PPG signal from the detrended raw signal (i.e. detrended pixel intensity value in the time series for each frequency band). The detrended raw signal is grouped into two sets based on the frequency bands to extract Red and NIR PPG signals using ICA. Specifically, the frequency bands of Red, Green and Blue are used to extract Red PPG signal in step 530 and the frequency bands of NIR, Green and Blue are used to extract NIR PPG signal in step 540. ICA is a computational method for separating a multivariate signal into additive subcomponents. It is a standard technique well known in the field of signal processing and hence details of ICA are omitted for brevity. In step 535, a band pass filter with cutoff frequency between 0.16 and 2 Hz is applied to the Red PPG signal to remove noise. Similarly in step 545, a band pass filter with cutoff frequency between 0.16 and 2 Hz is applied to the NIR PPG signal to remove noise. The cutoff frequency is set to 0.16 and 2 Hz which covers the frequency range of average human respiration rate and heart rate. The band pass filter is a Finite Impulse Response (FIR) filter which is a standard time domain filtering technique. The cutoff frequency can be inversed to translate to time domain.
The values of a1, a2, a3 and L are calibrated based on the imaging devices used to capture video. The imaging device’s video capture frame rate plays a key role in the calibration process, and this is dependent on the device and the processing unit 130. These calibrated values are used to automate the up-sampling and detrending algorithms. Below are two examples of known devices with the calibrated values.
Figure imgf000020_0002
Process 500 ends after steps 535 and 545.
Figure 6 illustrates a process flow 600 that is executed by the instructions of the processing unit to estimate Blood Oxygen Saturation (SPO2) in step 225 in process 200 in accordance with this invention. Process 600 begins with step 605 by computing Ratio (R) of time-domain range for NIR PPG signal and Red PPG signal. Ratio (R) can be computed with the following expression,
Figure imgf000020_0001
Where amplitude range of Red PPG signal refers to difference between minimum and maximum values of the amplitude of the Red PPG signal in time domain, arithmetic mean of Red PPG signal refers to arithmetic mean of the Red PPG signal in between minimum and maximum values of the amplitude of the Red PPG signal in time domain, amplitude range of NIR PPG signal refers to difference between minimum and maximum values of the amplitude of the NIR PPG signal in time domain and arithmetic mean of NIR PPG signal refers to arithmetic mean of the NIR PPG signal in between minimum and maximum values of the amplitude of the NIR PPG signal in time domain.
In step 610, process 600 estimates the Blood Oxygen Saturation with the following expression,
Blood Oxygen Saturation ( SPO2 ) = β1 + β2 * R
Where β1 refers to off-set parameter and β2 refers to slope parameter of the linear regression curve which maps the Ratio (R) to the blood oxygen saturation (SPO2). β1 and β2 are variables that can be derived from the linear regression curve. Figure 10 illustrates a linear regression curve that can be used for estimation of SPO2 from Ratio (R).
Process 600 ends after step 610.
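Steps 605 and 610 can be sketched as below. Because the expression for Ratio (R) is only available as a figure, this sketch assumes the conventional ratio-of-ratios form built from the four quantities defined above, with the Red term in the numerator; that ordering, the function names, the coefficients `BETA1` and `BETA2`, and the signal values are all illustrative assumptions rather than values from this disclosure.

```python
from statistics import fmean


def range_over_mean(ppg):
    """Amplitude range (max - min) divided by the arithmetic mean of a PPG signal."""
    return (max(ppg) - min(ppg)) / fmean(ppg)


def ratio_r(red_ppg, nir_ppg):
    """Assumed ratio of ratios: (range / mean of Red) over (range / mean of NIR)."""
    return range_over_mean(red_ppg) / range_over_mean(nir_ppg)


def spo2(r, beta1, beta2):
    """Linear regression curve: SPO2 = beta1 + beta2 * R."""
    return beta1 + beta2 * r


# Placeholder coefficients and signals, for illustration only.
BETA1, BETA2 = 110.0, -25.0
red_ppg = [0.98, 1.00, 1.02, 1.00]
nir_ppg = [0.96, 1.00, 1.04, 1.00]
r = ratio_r(red_ppg, nir_ppg)
estimate = spo2(r, BETA1, BETA2)
```

In practice β1 and β2 would be fitted to reference measurements, as indicated by the linear regression curve of Figure 10.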
Figure 7 illustrates a process flow 700 that is executed by the instructions of the processing unit to estimate Systolic and Diastolic Blood Pressure in step 225 in process 200 in accordance with this invention. Process 700 begins with step 705 by detecting Systolic Peak (Ps), Diastolic peak (Pd), and global minima (Pm) from the time-domain NIR PPG signal. From figure 9, the Systolic peak times can be detected accurately from the blood volume pulse (BVP) waveform as they are maxima within the signal. Here BVP waveform refers to the NIR PPG signal. Maxima refers to global maxima which is the maxima of the entire NIR PPG signal. The amplitude of the global maxima is identified as the Systolic Peak (Ps) 910. From figure 9, the largest minima within the second order derivative correspond to the Systolic Peaks (Ps) 910 and the minima following these typically correspond to the diastolic peaks/inflections. Based on the location of the diastolic peak in the second order derivative, the amplitude of the corresponding point on the NIR PPG signal is identified as Diastolic peak (Pd) 930. The location corresponding to the dicrotic notch 920 is identified as the maxima immediately after the location identified as the systolic peak in the second order derivative. The minima of the entire NIR PPG signal would be the global minima (Pm) 940.
In step 710, process 700 estimates the Systolic and Diastolic Blood Pressure with the following expressions,
Systolic Blood Pressure
Figure imgf000022_0002
Diastolic Blood Pressure
Figure imgf000022_0001
Where Y1, Y2, Y3, Y4, Y5, Y6 are parameters of the regression curve which maps the Heart Rate and NIR PPG signal amplitude features to the systolic and diastolic blood pressure, Y1 and Y4 are frequency-slope parameters for systolic and diastolic blood pressures respectively, Y2 and Y5 are amplitude-slope parameters for systolic and diastolic blood pressures respectively, and Y3 and Y6 are off-set parameters for systolic and diastolic blood pressures respectively. Figure 11 illustrates a linear regression curve that can be used for estimation of blood pressure. The log in figure 11 refers to logarithm base-10 and ln refers to logarithm base-e (natural log).
Process 700 ends after step 710.
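The peak detection of step 705 can be sketched as follows for a single pulse. This is a minimal illustration under stated assumptions: the second order derivative is approximated by a central second difference, the pulse is a synthetic sum of two Gaussians invented for the example, and the names `second_difference`, `detect_fiducials` and `synthetic_pulse` are hypothetical.

```python
import math


def second_difference(signal):
    """Discrete second order derivative (central difference, unit spacing)."""
    return [signal[i - 1] - 2 * signal[i] + signal[i + 1]
            for i in range(1, len(signal) - 1)]


def detect_fiducials(ppg):
    """Return sample indices of Ps, dicrotic notch, Pd and Pm for one pulse."""
    d2 = second_difference(ppg)                      # d2[j] belongs to sample j + 1
    ps = max(range(len(ppg)), key=ppg.__getitem__)   # global maximum
    pm = min(range(len(ppg)), key=ppg.__getitem__)   # global minimum
    notch = pd = None
    for j in range(max(ps, 1), len(d2) - 1):
        if notch is None:
            if d2[j - 1] < d2[j] >= d2[j + 1]:       # first maximum after Ps
                notch = j + 1
        elif d2[j - 1] > d2[j] <= d2[j + 1]:         # next minimum after the notch
            pd = j + 1
            break
    return ps, notch, pd, pm


def synthetic_pulse(n=200, fs=200.0):
    """One made-up pulse: a systolic Gaussian plus a smaller diastolic Gaussian."""
    pulse = []
    for i in range(n):
        t = i / fs
        systolic = math.exp(-((t - 0.25) ** 2) / (2 * 0.06 ** 2))
        diastolic = 0.45 * math.exp(-((t - 0.55) ** 2) / (2 * 0.08 ** 2))
        pulse.append(systolic + diastolic)
    return pulse


ppg = synthetic_pulse()
ps, notch, pd, pm = detect_fiducials(ppg)
```

The amplitudes `ppg[ps]`, `ppg[pd]` and `ppg[pm]` would then serve as the NIR PPG signal amplitude features used by the blood pressure expressions.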
Figure 8 illustrates a process flow 800 that is executed by the instructions of the processing unit to estimate Heart Rate (HR) and Respiration Rate (RR) in step 225 in process 200 in accordance with this invention. Process 800 begins with step 805 by applying fast Fourier transform to the NIR PPG signal to transform the NIR PPG signal from time-domain to frequency domain.
In step 810, process 800 computes the local maxima of power spectral density (PSD). Specifically, PSD is computed by dividing the NIR PPG in frequency domain by the length of the NIR PPG signal (L). The local maxima of PSD are then computed by using any generic peak finder algorithm.
In step 820, process 800 determines the frequency range of the local maxima to use for computing the heart rate and respiration rate. For Heart Rate computations, the local maxima within the frequency range of between 0.8 and 2 Hz is used. For Respiration Rate computations, the local maxima within the frequency range of between 0.1 and 0.5 Hz is used. Hence, process 800 proceeds to step 830 to estimate the Heart Rate (HR) using the local maxima that is within the frequency range of between 0.8 and 2 Hz and to step 840 to estimate the Respiration Rate (RR) using the local maxima that is within the frequency range of between 0.1 and 0.5 Hz.
In step 830, process 800 estimates the Heart Rate (HR) with the following expression,
Heart Rate ( HR ) = 60 * fhr
Where fhr refers to the local maxima that is within the frequency range of between 0.8 and 2 Hz.
In step 840, process 800 estimates the Respiration Rate (RR) with the following expression,
Respiration Rate ( RR ) = 60 * frr
Where frr refers to the local maxima that is within the frequency range of between 0.1 and 0.5 Hz.
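Steps 805 to 840 can be sketched as below, taking fhr and frr as the frequencies in Hz of the selected local maxima. To keep the example self-contained, a direct discrete Fourier transform stands in for the fast Fourier transform, the PSD is scaled as the squared magnitude divided by the signal length, and the strongest local maximum in each range is selected; these choices, the sampling rate `FS`, the function names and the synthetic signal are all assumptions for illustration.

```python
import cmath
import math


def power_spectrum(signal, fs):
    """One-sided periodogram via a direct DFT (stand-in for an FFT)."""
    n = len(signal)
    freqs, psd = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(signal))
        freqs.append(k * fs / n)
        psd.append(abs(coeff) ** 2 / n)
    return freqs, psd


def dominant_peak(freqs, psd, f_lo, f_hi):
    """Frequency of the strongest local maximum of the PSD within [f_lo, f_hi]."""
    peaks = [k for k in range(1, len(psd) - 1)
             if psd[k - 1] < psd[k] > psd[k + 1] and f_lo <= freqs[k] <= f_hi]
    return freqs[max(peaks, key=psd.__getitem__)] if peaks else None


FS = 50.0                                  # assumed sampling rate in Hz
t = [i / FS for i in range(250)]           # 5 seconds of synthetic data
nir_ppg = [math.sin(2 * math.pi * 1.2 * ti) + 0.5 * math.sin(2 * math.pi * 0.2 * ti)
           for ti in t]
freqs, psd = power_spectrum(nir_ppg, FS)
f_hr = dominant_peak(freqs, psd, 0.8, 2.0)
f_rr = dominant_peak(freqs, psd, 0.1, 0.5)
heart_rate = 60 * f_hr                     # beats per minute
respiration_rate = 60 * f_rr               # breaths per minute
```

With a 5 second window the frequency bins are 0.2 Hz apart, so the estimates are quantised in steps of 12 per minute unless the signal is lengthened or interpolated first.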
Process 800 ends after steps 830 and 840.
The Heart rate, Respiration rate and Blood Pressure are estimated using the NIR PPG signal derived from data captured by the active depth camera. These estimates do not vary with the ambient lighting conditions. The blood pressure model works on the premise that the identified systolic and diastolic peaks correspond to the respective heart activities. This method gives greater accuracy compared to just using the global maxima in the model. The estimation of Blood Oxygen Saturation is based on the ratio of range of Red PPG signal to range of NIR PPG signal.
In short, the method according to this invention involves using an active depth camera to capture simultaneous, time synchronized, video frames of both Color (RGB) and NIR wavelengths. From the video frames, a person’s face is detected and the region corresponding to their forehead is extracted (region of interest) and the pixel intensity values are averaged to form a time domain raw signal across multiple video frames. The time domain raw signal is detrended to remove higher order harmonics, resampled, and filtered using Independent Component Analysis (ICA) and an FIR bandpass filter to obtain the PPG signal. Vitals such as Pulse rate, Respiration Rate, Blood Oxygen saturation and Blood Pressure are estimated from the PPG signal using FFT-based frequency analysis for the former two vitals and mathematical models for the latter two vitals.
Processes 600-800 illustrate one embodiment of estimating Pulse rate, Respiration Rate, Blood Oxygen saturation and Blood Pressure based on the Red PPG signal and NIR PPG signal. Another embodiment of estimating Pulse rate, Respiration Rate, Blood Oxygen saturation and Blood Pressure based on the Red PPG signal and NIR PPG signal is the use of a machine learning model to estimate vitals of the person from the Red PPG signal and NIR PPG signal. One example of such a machine learning model is the linear regression model. In another embodiment, processes 600-800 would be used for estimating Pulse rate, Respiration Rate, Blood Oxygen saturation and Blood Pressure based on the Red PPG signal and NIR PPG signal until at least a predetermined number of datasets is obtained to train the machine learning model, after which the machine learning model takes over from processes 600-800.
The above is a description of exemplary embodiments of a system and method in accordance with this disclosure. It is foreseeable that those skilled in the art can and will design alternative systems and methods based on this disclosure.

Claims
1. A system for estimating vitals of a person comprising: an image sensor adapted to capture a sequence of color (RGB) and near infrared (NIR) images; a processing unit comprising a processor, memory and instructions stored on the memory and executable by the processor to: estimating vitals of the person from the sequence of RGB and NIR images.
2. The system according to claim 1 wherein the instruction to estimating vitals of the person from the sequence of RGB and NIR images comprises instructions to: time synchronising the sequence of RGB and NIR images; detecting region of interest within the sequence of RGB and NIR images; extracting raw signals from the detected region of interest; extracting photoplethysmography (PPG) signals from the raw signals; and estimating vitals of the person from the extracted PPG signals.
3. The system according to claim 2 wherein the instruction to detecting region of interest within the sequence of RGB and NIR images comprises instructions to: detecting a face of the person within each of the sequence of NIR images; in response to detecting the face of the person, detecting at least one region of interest within the detected face of the person; and mapping the detected region of interest in each of the sequence of NIR images to the corresponding sequence of RGB images.
4. The system according to claim 3 wherein the instruction to detecting at least one region of interest within the detected face of the person comprises instructions to: applying facial landmark detection where 81 prominent feature points on the face of the person are detected; and overlaying a box at a top of the detected 81 prominent feature points between facial edge and eyebrows, wherein the box is one region of interest.
5. The system according to claim 4 wherein the instruction to detecting at least one region of interest within the detected face of the person further comprises instructions to: overlaying two boxes between facial edge and nose, wherein the two boxes are two other regions of interest.
6. The system according to any one of claims 3-5 wherein the instruction to detecting region of interest within the sequence of RGB and NIR images comprises instructions to: computing an average pixel intensity of the detected region of interest for each of a plurality of frequency bands including Red, Green and Blue from the sequence of RGB images and NIR frequency band from the sequence of NIR images; and appending the average pixel intensity value of each frequency band to the pixel intensity value of corresponding frequency band in respective RGB and NIR images to form a time series for each frequency band.
7. The system according to claim 6 wherein the instruction to detecting region of interest within the sequence of RGB and NIR images further comprises instructions to: determining if a length of the time series is equal to a predetermined length, L; in response to the length of the time series being equal to the predetermined length, L, normalising the pixel intensity value in the time series for each frequency band.
8. The system according to claim 7 wherein the instruction to extracting PPG signals from the raw signals comprises instructions to: applying a moving average filter to the pixel intensity value in the time series for each frequency band, the moving average filter has a window length of a1 * L; up-sampling the pixel intensity value in the time series for each frequency band using cubic spline interpolation method to form a new length, L2, where L2 = a2 * L; applying a detrending algorithm based on smoothness priors formulation for the upsampled pixel intensity value in the time series for each frequency band wherein a smoothness factor, λ, of the detrending algorithm is set based on the following expression, λ = a3 * L2; and applying Independent Component Analysis (ICA) to extract PPG signal from the detrended pixel intensity value in the time series for each frequency band, wherein values of a1, a2, a3 and L are calibrated based on the imaging sensor.
9. The system according to claim 8 wherein the instruction to applying Independent Component Analysis (ICA) to extract PPG signal from the detrended pixel intensity value in the time series for each frequency band comprises instructions to: grouping the detrended pixel intensity value in the time series for each frequency band into a set of Red PPG signal and a set of NIR PPG signals wherein the Red PPG signal is derived from the frequency bands of Red, Green and Blue and the NIR PPG signal is derived from the frequency bands of NIR, Green and Blue; applying a band pass filter with cutoff frequency of between 0.16 and 2 Hz to the Red PPG signal; and applying a band pass filter with cutoff frequency of between 0.16 and 2 Hz to the NIR PPG signal.
10. The system according to claim 9 wherein the instruction to estimating vitals of the person from the extracted PPG signals comprises instructions to use a machine learning model based on a linear regression model.
11. The system according to claim 9 wherein the instruction to estimating vitals of the person from the extracted PPG signals comprises instructions to: computing Ratio (R) of time-domain range for NIR PPG signal and Red PPG signal wherein the Ratio (R) can be expressed with the following expression,
Figure imgf000028_0001
where amplitude range of Red PPG signal relates to difference between minimum and maximum values of the amplitude of the Red PPG signal in time domain, arithmetic mean of Red PPG signal relates to arithmetic mean of the Red PPG signal in between minimum and maximum values of the amplitude of the Red PPG signal in time domain, amplitude range of NIR PPG signal relates to difference between minimum and maximum values of the amplitude of the NIR PPG signal in time domain and arithmetic mean of NIR PPG signal relates to arithmetic mean of the NIR PPG signal in between minimum and maximum values of the amplitude of the NIR PPG signal in time domain; and estimating the Blood Oxygen Saturation with the following expression,
Blood Oxygen Saturation ( SPO2 ) = β1 + β2 * R
where β1 refers to off-set parameter and β2 refers to slope parameter of the linear regression curve which maps the Ratio (R) to the blood oxygen saturation (SPO2).
12. The system according to claim 9 or 10 wherein the instruction to estimating vitals of the person from the extracted PPG signals comprises instructions to: detecting Systolic Peak (Ps), Diastolic peak (Pd), and global minima (Pm) from the time-domain NIR PPG signal; estimating the Systolic and Diastolic Blood Pressure with the following expressions, Systolic Blood Pressure (SBP)
Figure imgf000028_0002
Diastolic Blood Pressure (DBP)
Figure imgf000028_0003
where γ1, γ2, γ3, γ4, γ5, γ6 are parameters of the regression curve which maps the Heart Rate and NIR PPG signal amplitude features to the systolic and diastolic blood pressure, γ1 and γ4 are frequency-slope parameters for systolic and diastolic blood pressures respectively, γ2 and γ5 are amplitude-slope parameters for systolic and diastolic blood pressures respectively, γ3 and γ6 are off-set parameters for systolic and diastolic blood pressures respectively.
13. The system according to claim 12 wherein the instruction to detecting Systolic Peak (Ps), Diastolic peak (Pd), and global minima (Pm) from the time-domain NIR PPG signal comprises instructions to: applying a peak detection algorithm on the NIR PPG signal to obtain a second order differential; identifying a global maxima in the NIR PPG signal as the Systolic Peak (Ps), a peak in the NIR PPG signal corresponding to a minima following the Systolic Peak (Ps) in the second order derivative as Diastolic peak (Pd), a maxima immediately after the systolic peak in the second order derivative as the Dicrotic notch, a minima of the entire NIR PPG signal as the global minima (Pm).
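The fiducial-point search recited above can be sketched as follows. This is one possible reading of the claim rather than the applicant's implementation: the second order derivative is approximated by a discrete second difference, the dicrotic notch is taken as the first local maximum of that difference after the systolic peak, and the diastolic peak as the first local minimum after the notch. The two-Gaussian synthetic beat and all function names are hypothetical.

```python
import math


def second_difference(x):
    """Discrete second derivative; the two end samples are padded with 0.0."""
    d2 = [0.0] * len(x)
    for i in range(1, len(x) - 1):
        d2[i] = x[i - 1] - 2.0 * x[i] + x[i + 1]
    return d2


def first_local_extremum(values, start, stop, kind):
    """Index of the first local 'max' or 'min' in values[start:stop], or None."""
    for i in range(start, stop):
        left, mid, right = values[i - 1], values[i], values[i + 1]
        if kind == "max" and left < mid >= right:
            return i
        if kind == "min" and left > mid <= right:
            return i
    return None


def detect_fiducials(ppg):
    n = len(ppg)
    d2 = second_difference(ppg)
    ps = max(range(n), key=lambda i: ppg[i])  # systolic peak: global maximum
    pm = min(range(n), key=lambda i: ppg[i])  # global minimum of the signal
    notch = first_local_extremum(d2, ps + 1, n - 2, "max")
    pd_idx = None
    if notch is not None:
        pd_idx = first_local_extremum(d2, notch + 1, n - 2, "min")
    return {"Ps": ps, "notch": notch, "Pd": pd_idx, "Pm": pm}


# Synthetic single beat sampled at an assumed 100 Hz: a systolic wave centred at
# 0.25 s plus a smaller diastolic wave centred at 0.55 s.
fs = 100
beat = [
    math.exp(-((i / fs - 0.25) ** 2) / (2 * 0.06 ** 2))
    + 0.4 * math.exp(-((i / fs - 0.55) ** 2) / (2 * 0.08 ** 2))
    for i in range(fs)
]
points = detect_fiducials(beat)
```

On real camera data the second difference amplifies noise, so some smoothing before differentiation would usually be needed.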
14. The system according to any one of claims 7-13 wherein the instruction to estimating vitals of the person from the extracted PPG signals comprises instructions to: applying Fast Fourier Transform to the NIR PPG signal to transform the NIR PPG signal from time-domain to frequency domain; computing a plurality of local maxima of a power spectral density (PSD) of the NIR PPG signal in frequency domain; in response to a local maxima between the frequency range of 0.8 and 2 Hz, estimating a Heart Rate (HR) with the following expression,
Heart Rate ( HR ) = 60 * fhr
where fhr refers to the local maxima that is between the frequency range of 0.8 and 2 Hz; and in response to a local maxima between the frequency range of 0.1 and 0.5 Hz, estimating a Respiration Rate (RR) with the following expression,
Respiration Rate ( RR ) = 60 * frr
where frr refers to the local maxima that is between the frequency range of 0.1 and 0.5 Hz.
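A minimal sketch of the spectral estimate in this claim is given below. It uses a direct DFT in place of an FFT library call so that it runs on the standard library alone, and it assumes that when several local maxima fall inside a band the strongest one is used; the claim does not say how ties are resolved. The 10 Hz sampling rate and the synthetic signal are hypothetical.

```python
import cmath
import math


def power_spectrum(x, fs):
    """One-sided periodogram of the mean-removed signal: (frequencies, power)."""
    n = len(x)
    mean = sum(x) / n
    centred = [v - mean for v in x]
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(centred[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2 / n)
    return freqs, power


def strongest_peak(freqs, power, low_hz, high_hz):
    """Frequency of the largest local maximum of the PSD inside [low_hz, high_hz]."""
    best = None
    for i in range(1, len(power) - 1):
        is_local_max = power[i - 1] < power[i] >= power[i + 1]
        if is_local_max and low_hz <= freqs[i] <= high_hz:
            if best is None or power[i] > power[best]:
                best = i
    return None if best is None else freqs[best]


def estimate_rates(nir_ppg, fs):
    freqs, power = power_spectrum(nir_ppg, fs)
    f_hr = strongest_peak(freqs, power, 0.8, 2.0)  # heart-rate band
    f_rr = strongest_peak(freqs, power, 0.1, 0.5)  # respiration band
    hr = None if f_hr is None else 60.0 * f_hr
    rr = None if f_rr is None else 60.0 * f_rr
    return hr, rr


fs = 10.0  # assumed sampling rate in Hz
n = 300    # 30 s window
signal = [
    5.0
    + math.sin(2 * math.pi * 1.2 * i / fs)        # cardiac component, 1.2 Hz
    + 0.3 * math.sin(2 * math.pi * 0.3 * i / fs)  # respiratory component, 0.3 Hz
    for i in range(n)
]
hr, rr = estimate_rates(signal, fs)
```

For this synthetic input the peaks sit at 1.2 Hz and 0.3 Hz, giving 72 beats per minute and 18 breaths per minute.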
15. A method of estimating vitals of a person from a sequence of RGB and NIR images comprising: time synchronising the sequence of RGB and NIR images; detecting region of interest within the sequence of RGB and NIR images; extracting raw signals from the detected region of interest; extracting photoplethysmography (PPG) signals from the raw signals; and estimating vitals of the person from the extracted PPG signals.
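The claim does not state how the two image sequences are time synchronised. One simple possibility, shown purely as an assumption, is to pair each RGB frame with the NIR frame whose capture timestamp is closest and to discard pairs that are too far apart. The function name, the timestamps and the 10 ms tolerance are hypothetical.

```python
from bisect import bisect_left


def synchronise(rgb_times, nir_times, tolerance):
    """Pair each RGB frame index with the nearest-in-time NIR frame index.

    Both timestamp lists must be sorted ascending (seconds). Pairs whose time
    difference exceeds `tolerance` are dropped.
    """
    pairs = []
    for i, t in enumerate(rgb_times):
        j = bisect_left(nir_times, t)
        # The nearest NIR frame is either just before or just after position j.
        candidates = [k for k in (j - 1, j) if 0 <= k < len(nir_times)]
        if not candidates:
            continue
        best = min(candidates, key=lambda k: abs(nir_times[k] - t))
        if abs(nir_times[best] - t) <= tolerance:
            pairs.append((i, best))
    return pairs


rgb_times = [0.000, 0.033, 0.066, 0.100]
nir_times = [0.005, 0.030, 0.070, 0.200]
pairs = synchronise(rgb_times, nir_times, tolerance=0.010)
```

Here the last RGB frame has no NIR frame within 10 ms and is left unpaired, so `pairs` is `[(0, 0), (1, 1), (2, 2)]`.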
16. The method according to claim 15 wherein the step of detecting region of interest within the sequence of RGB and NIR images comprises: detecting a face of the person within each of the sequence of NIR images; in response to detecting the face of the person, detecting at least one region of interest within the detected face of the person; and mapping the detected region of interest in each of the sequence of NIR images to the corresponding sequence of RGB images.
17. The method according to claim 16 wherein the step of detecting at least one region of interest within the detected face of the person comprises: applying facial landmark detection where 81 prominent feature points on the face of the person are detected; and overlaying a box at a top of the detected 81 prominent feature points between facial edge and eyebrows, wherein the box is one region of interest.
18. The method according to claim 17 wherein the step of detecting at least one region of interest within the detected face of the person further comprises: overlaying two boxes between facial edge and nose, wherein the two boxes are two other regions of interest.
19. The method according to any one of claims 16-18 wherein the step of detecting region of interest within the sequence of RGB and NIR images comprises: computing an average pixel intensity of the detected region of interest for each of a plurality of frequency bands including Red, Green and Blue from the sequence of RGB images and NIR frequency band from the sequence of NIR images; and appending the average pixel intensity value of each frequency band to the pixel intensity value of corresponding frequency band in respective RGB and NIR images to form a time series for each frequency band.
20. The method according to claim 19 wherein the step of detecting region of interest within the sequence of RGB and NIR images further comprises: determining if a length of the time series is equal to a predetermined length, L; and in response to the length of the time series being equal to the predetermined length, L, normalising the pixel intensity value in the time series for each frequency band.
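The raw-signal construction in claims 19 and 20 can be sketched as below: the mean intensity inside the region of interest is appended to a per-band time series, and once the series reaches the predetermined length it is normalised. The claims do not say which normalisation is used, so zero-mean, unit-variance scaling is an assumption. The frame layout (a dict holding one 2-D list per band), the box format and the window length of 4 are hypothetical simplifications.

```python
from statistics import mean, pstdev

BANDS = ("red", "green", "blue", "nir")


def roi_mean(channel, box):
    """Average intensity of one channel inside box = (top, left, bottom, right)."""
    top, left, bottom, right = box
    pixels = [channel[r][c] for r in range(top, bottom) for c in range(left, right)]
    return mean(pixels)


def append_frame(series, frame, box):
    """Append the ROI average of every band to that band's time series."""
    for band in BANDS:
        series[band].append(roi_mean(frame[band], box))


def normalise(values):
    # ASSUMPTION: zero-mean, unit-variance scaling; the claim only says "normalising".
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]


WINDOW_L = 4  # hypothetical predetermined length L
series = {band: [] for band in BANDS}
normalised = None

for step in range(WINDOW_L):
    # Synthetic 2x2 frames: each band is a flat image whose level rises per frame.
    frame = {
        band: [[10.0 * (i + 1) + step] * 2 for _ in range(2)]
        for i, band in enumerate(BANDS)
    }
    append_frame(series, frame, box=(0, 0, 2, 2))
    if len(series["red"]) == WINDOW_L:
        normalised = {band: normalise(series[band]) for band in BANDS}
```

In a real pipeline the RGB bands and the NIR band come from separate, time-synchronised images, with the region of interest mapped from the NIR image onto the RGB image as claim 16 describes.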
21. The method according to claim 20 wherein the step of extracting PPG signals from the raw signals comprises: applying a moving average filter to the pixel intensity value in the time series for each frequency band, the moving average filter has a window length of α1 * L; up-sampling the pixel intensity value in the time series for each frequency band using cubic spline interpolation method to form a new length, L2, where L2 = α2 * L; applying a detrending algorithm based on smoothness priors' formulation for the upsampled pixel intensity value in the time series for each frequency band wherein a smoothness factor, λ, of the detrending algorithm is set based on the following expression, λ = α3 * L2; and applying Independent Component Analysis (ICA) to extract PPG signal from the detrended pixel intensity value in the time series for each frequency band, wherein values of α1, α2, α3 & λ are calibrated based on the imaging sensor.
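Two of the steps in this claim, the moving average and the smoothness-priors detrending, can be sketched with the standard library alone. The detrending follows the formulation in the cited Tarvainen et al. paper, in which the trend is the solution of (I + λ² D2ᵀ D2) · trend = z and D2 is the second-difference operator. The edge handling of the moving average, the dense Gaussian-elimination solver and the parameter name `lam` are choices made for this example; cubic spline up-sampling and ICA are omitted because they would normally rely on a numerical library.

```python
def moving_average(x, window):
    """Centred moving average; the window shrinks near the two edges."""
    half = window // 2
    out = []
    for i in range(len(x)):
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def detrend_smoothness_priors(z, lam):
    """Return z minus its smooth trend, trend = (I + lam^2 * D2^T D2)^-1 z."""
    n = len(z)
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    # Accumulate lam^2 * D2^T D2 one second-difference row [1, -2, 1] at a time.
    for r in range(n - 2):
        stencil = ((r, 1.0), (r + 1, -2.0), (r + 2, 1.0))
        for i, vi in stencil:
            for j, vj in stencil:
                a[i][j] += lam * lam * vi * vj
    trend = solve(a, z)
    return [zi - ti for zi, ti in zip(z, trend)]


# A straight line has zero second difference, so it is removed entirely as trend.
ramp = [0.5 * i + 3.0 for i in range(30)]
residual = detrend_smoothness_priors(ramp, lam=10.0)
```

Larger values of `lam` make the estimated trend smoother, so less low-frequency content is removed from the signal. The dense solver is cubic in the window length; a banded or sparse solver would be the usual choice for longer windows.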
22. The method according to claim 21 wherein the step of applying Independent Component Analysis (ICA) to extract PPG signal from the detrended pixel intensity value in the time series for each frequency band comprises: grouping the detrended pixel intensity value in the time series for each frequency band into a set of Red PPG signal and a set of NIR PPG signals wherein the Red PPG signal is derived from the frequency bands of Red, Green and Blue and the NIR PPG signal is derived from the frequency bands of NIR, Green and Blue; applying a band pass filter with cutoff frequency of between 0.16 and 2 Hz to the Red PPG signal; and applying a band pass filter with cutoff frequency of between 0.16 and 2 Hz to the NIR PPG signal.
23. The method according to claim 22 wherein the step of estimating vitals of the person from the extracted PPG signals comprises: computing Ratio (R) of time-domain range for NIR PPG signal and Red PPG signal wherein the Ratio (R) can be expressed with the following expression,
Figure imgf000033_0001
where amplitude range of Red PPG signal relates to difference between minimum and maximum values of the amplitude of the Red PPG signal in time domain, arithmetic mean of Red PPG signal relates to arithmetic mean of the Red PPG signal in between minimum and maximum values of the amplitude of the Red PPG signal in time domain, amplitude range of NIR PPG signal relates to difference between minimum and maximum values of the amplitude of the NIR PPG signal in time domain and arithmetic mean of NIR PPG signal relates to arithmetic mean of the NIR PPG signal in between minimum and maximum values of the amplitude of the NIR PPG signal in time domain; estimating the Blood Oxygen Saturation with the following expression,
Blood Oxygen Saturation ( SPO2 ) = β1 + β2 * R where β1 refers to off-set parameter and β2 refers to slope parameter of the linear regression curve which maps the Ratio (R) to the blood oxygen saturation (SPO2).
24. The method according to claim 22 or 23 wherein the step of estimating vitals of the person from the extracted PPG signals comprises: detecting Systolic Peak (Ps), Diastolic peak (Pd), and global minima (Pm) from the time-domain NIR PPG signal; estimating the Systolic and Diastolic Blood Pressure with the following expressions, Systolic Blood Pressure (SBP)
Figure imgf000033_0002
Diastolic Blood Pressure (DBP)
Figure imgf000033_0003
where γ1, γ2, γ3, γ4, γ5, γ6 are parameters of the regression curve which maps the Heart Rate and NIR PPG signal amplitude features to the systolic and diastolic blood pressure, γ1 and γ4 are frequency-slope parameters for systolic and diastolic blood pressures respectively, γ2 and γ5 are amplitude-slope parameters for systolic and diastolic blood pressures respectively, γ3 and γ6 are off-set parameters for systolic and diastolic blood pressures respectively.
25. The method according to claim 24 wherein the step of detecting Systolic Peak (Ps), Diastolic peak (Pd), and global minima (Pm) from the time-domain NIR PPG signal comprises: applying a peak detection algorithm on the NIR PPG signal to obtain a second order differential; identifying a global maxima in the NIR PPG signal as the Systolic Peak (Ps), a peak in the NIR PPG signal corresponding to a minima following the Systolic Peak (Ps) in the second order derivative as Diastolic peak (Pd), a maxima immediately after the systolic peak in the second order derivative as the Dicrotic notch, a minima of the entire NIR PPG signal as the global minima (Pm).
26. The method according to any one of claims 20-25 wherein the step of estimating vitals of the person from the extracted PPG signals comprises: applying Fast Fourier Transform to the NIR PPG signal to transform the NIR PPG signal from time-domain to frequency domain; computing a plurality of local maxima of a power spectral density (PSD) of the NIR PPG signal in frequency domain; in response to a local maxima between the frequency range of 0.8 and 2 Hz, estimating a Heart Rate (HR) with the following expression,
Heart Rate ( HR ) = 60 * fhr
where fhr refers to the local maxima that is between the frequency range of 0.8 and 2 Hz; and in response to a local maxima between the frequency range of 0.1 and 0.5 Hz, estimating a Respiration Rate (RR) with the following expression,
Respiration Rate ( RR ) = 60 * frr
where frr refers to the local maxima that is between the frequency range of 0.1 and 0.5 Hz.
27. The method according to claim 22 wherein the step of estimating vitals of the person from the extracted PPG signals comprises applying a machine learning model based on a linear regression model.
PCT/SG2021/050366 2021-02-16 2021-06-24 A system and method for measuring vital body signs WO2022177501A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
SG10202101561W 2021-02-16

Publications (1)

Publication Number Publication Date
WO2022177501A1 (en) 2022-08-25

Family

ID=82932307

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/SG2021/050366 WO2022177501A1 (en) 2021-02-16 2021-06-24 A system and method for measuring vital body signs

Country Status (1)

Country Link
WO (1) WO2022177501A1 (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110251493A1 (en) * 2010-03-22 2011-10-13 Massachusetts Institute Of Technology Method and system for measurement of physiological parameters
KR20120083994A (en) * 2011-01-19 2012-07-27 임경근 Hairstyle simulation system and method
CN109008964A (en) * 2018-06-27 2018-12-18 浏阳市安生智能科技有限公司 A kind of method and device that physiological signal extracts
US20190008402A1 (en) * 2014-10-04 2019-01-10 Government Of The United States, As Represented By The Secretary Of The Air Force Non-Contact Assessment of Cardiovascular Function using a Multi-Camera Array
WO2019145142A1 (en) * 2018-01-24 2019-08-01 Koninklijke Philips N.V. Device, system and method for determining at least one vital sign of a subject
CN110279406A (en) * 2019-05-06 2019-09-27 苏宁金融服务(上海)有限公司 A kind of touchless pulse frequency measurement method and device based on camera
US20200178809A1 (en) * 2017-08-08 2020-06-11 Koninklijke Philips N.V. Device, system and method for determining a physiological parameter of a subject
WO2020126713A1 (en) * 2018-12-19 2020-06-25 Koninklijke Philips N.V. System and method for determining at least one vital sign of a subject
US20200367773A1 (en) * 2017-08-08 2020-11-26 Koninklijke Philips N.V. Device, system and method for determining a physiological parameter of a subject
WO2021100994A1 (en) * 2019-11-21 2021-05-27 주식회사 지비소프트 Non-contact method for measuring biological index

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
ALGHOUL KARIM; ALHARTHI SAEED; AL OSMAN HUSSEIN; EL SADDIK ABDULMOTALEB: "Heart Rate Variability Extraction From Videos Signals: ICA vs. EVM Comparison", IEEE ACCESS, IEEE, USA, vol. 5, 1 January 1900 (1900-01-01), pages 4711-4719, XP011646924, DOI: 10.1109/ACCESS.2017.2678521 *
F. TSALAKANIDOU; S. MALASSIOTIS: "Robust facial action recognition from real-time 3D streams", COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, 2009. CVPR WORKSHOPS 2009. IEEE COMPUTER SOCIETY CONFERENCE ON, IEEE, PISCATAWAY, NJ, USA, 20 June 2009 (2009-06-20), pages 4-11, XP031606940, ISBN: 978-1-4244-3994-2 *
TARVAINEN M. P. ET AL.: "An advanced detrending method with application to HRV analysis", IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, vol. 49, no. 2, 7 August 2002 (2002-08-07), pages 172-175, [retrieved on 20211006], DOI: 10.1109/10.979357 *
XIONG P. ET AL.: "Combining Local and Global Features for 3D Face Tracking", PROCEEDINGS OF THE IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 29 October 2017 (2017-10-29), pages 2529-2536, XP033303724, [retrieved on 20211006], DOI: 10.1109/ICCVW.2017.297 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
SE2250253A1 (en) * 2022-02-25 2023-08-26 Detectivio Ab Non-contact oxygen saturation estimation
WO2023163644A1 (en) * 2022-02-25 2023-08-31 Detectivio Ab Non-contact oxygen saturation estimation using ambient light
SE545755C2 (en) * 2022-02-25 2024-01-02 Detectivio Ab Non-contact oxygen saturation estimation

Similar Documents

Publication Publication Date Title
Tasli et al. Remote PPG based vital sign measurement using adaptive facial regions
McDuff et al. Remote detection of photoplethysmographic systolic and diastolic peaks using a digital camera
AU2016201690B2 (en) Method and system for noise cleaning of photoplethysmogram signals
US9659229B2 (en) Method and system for signal analysis
US10448900B2 (en) Method and apparatus for physiological monitoring
US20110251493A1 (en) Method and system for measurement of physiological parameters
Sinhal et al. An overview of remote photoplethysmography methods for vital sign monitoring
KR101738278B1 (en) Emotion recognition method based on image
KR102215557B1 (en) Heart rate estimation based on facial color variance and micro-movement
US20200015688A1 (en) Blood pressure measurement method, device and storage medium
JP6717424B2 (en) Heart rate estimation device
Alnaggar et al. Video-based real-time monitoring for heart rate and respiration rate
WO2022177501A1 (en) A system and method for measuring vital body signs
EP3318179B1 (en) Method for measuring respiration rate and heart rate using dual camera of smartphone
Kurita et al. Non-contact video based estimation for heart rate variability spectrogram using ambient light by extracting hemoglobin information
Jensen et al. Camera-based heart rate monitoring
US20230000376A1 (en) System and method for physiological measurements from optical data
US20230397826A1 (en) Operation method for measuring biometric index of a subject
US20200155008A1 (en) Biological information detecting apparatus and biological information detecting method
Wu et al. Peripheral oxygen saturation measurement using an RGB camera
Wuerich et al. A feature-based approach on contact-less blood pressure estimation from video data
CN114869260A (en) Heart rate detection method for compressed video
Ozawa et al. Improving the accuracy of noncontact blood pressure sensing using near-infrared light
CN112784731A (en) Method for detecting physiological indexes of driver and establishing model
JP2021023490A (en) Biological information detection device

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 21926958; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 EP: PCT application non-entry in European phase. Ref document number: 21926958; Country of ref document: EP; Kind code of ref document: A1