CN111597891B - Heart rate detection method based on multi-scale video - Google Patents

Heart rate detection method based on multi-scale video

Info

Publication number
CN111597891B
CN111597891B (application CN202010285626.7A)
Authority
CN
China
Prior art keywords
scale
heart rate
signal
roi
video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010285626.7A
Other languages
Chinese (zh)
Other versions
CN111597891A (en)
Inventor
赵昶辰
韩蔚然
冯远静
赵志明
梅培义
居峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT filed Critical Zhejiang University of Technology ZJUT
Priority to CN202010285626.7A priority Critical patent/CN111597891B/en
Publication of CN111597891A publication Critical patent/CN111597891A/en
Application granted granted Critical
Publication of CN111597891B publication Critical patent/CN111597891B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/02 Detecting, measuring or recording pulse, heart rate, blood pressure or blood flow; Combined pulse/heart-rate/blood pressure determination; Evaluating a cardiovascular condition not otherwise provided for, e.g. using combinations of techniques provided for in this group with electrocardiography or electroauscultation; Heart catheters for measuring blood pressure
    • A61B5/024 Detecting, measuring or recording pulse rate or heart rate
    • A61B5/02416 Detecting, measuring or recording pulse rate or heart rate using photoplethysmograph signals, e.g. generated by infrared radiation
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/72 Signal processing specially adapted for physiological signals or for diagnostic purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4084 Scaling of whole images or parts thereof, e.g. expanding or contracting in the transform domain, e.g. fast Fourier transform [FFT] domain scaling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/56 Extraction of image or video features relating to colour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168 Feature extraction; Face representation
    • G06V40/171 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing
    • G06T2207/30101 Blood vessel; Artery; Vein; Vascular
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30241 Trajectory
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Surgery (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • Medical Informatics (AREA)
  • Human Computer Interaction (AREA)
  • Biophysics (AREA)
  • Animal Behavior & Ethology (AREA)
  • Cardiology (AREA)
  • Public Health (AREA)
  • Veterinary Medicine (AREA)
  • Artificial Intelligence (AREA)
  • Pathology (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Biomedical Technology (AREA)
  • Physiology (AREA)
  • Signal Processing (AREA)
  • Psychiatry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)
  • Ultra Sonic Diagnosis Equipment (AREA)
  • Measuring Pulse, Heart Rate, Blood Pressure Or Blood Flow (AREA)

Abstract

A heart rate detection method based on multi-scale video comprises the following steps. Step 1, establishing a video pyramid: starting from the original tracking box, the region of interest (ROI) is scaled down to form lower pyramid levels and scaled up to form a higher level. Step 2, blood volume pulse (BVP) signal extraction: the blood volume pulse signals are extracted from the multi-scale channels; the multi-scale signal fusion algorithm is a generalization and optimization of the plane-orthogonal-to-skin (POS) method, and when the pyramid has only one level the proposed heart rate extraction method reduces to POS. Step 3, multi-scale signal fusion: the heart rate signal features of the multi-scale channels are fused by a Gaussian-prior convex combination, and the heart rate value is finally obtained by signal processing of the fused multi-scale signal. By varying the scale, the invention extracts rich heart rate features at different scales and fuses them to improve heart rate detection accuracy.

Description

Heart rate detection method based on multi-scale video
Technical Field
The invention relates to the fields of video heart rate detection, computer vision and signal processing.
Background
In recent years, remote photoplethysmography (rPPG), which rests on optical and physiological principles, has developed rapidly. It measures the blood volume pulse (BVP) and heart rate in a non-contact manner and has a very wide range of applications. Under visible light, rPPG measures heart rate with a consumer-grade digital camera, which broadens the application range of pulse measurement. The selection of the facial region of interest (ROI) is a key issue for such a system: the location of the ROI directly affects the quality of the measured signal. Existing studies indicate that the cheek, lip and chin regions of the face contain richer capillaries and stronger pulse signals than other regions. However, regions of higher pulsation intensity are not necessarily suitable for raw signal extraction in rPPG, as they may be disturbed by non-rigid motions such as blinking, speaking and smiling. Existing video heart rate detection methods all extract the heart rate signal from a single region of interest, yet the heart rate signal features of a single region of interest are inherently limited.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a heart rate detection method based on multi-scale video. The basic idea is to extract, by varying the scale, heart rate signal features at different scales in the video and to fuse these features so as to improve heart rate detection accuracy.
The invention adopts the technical scheme that:
a heart rate detection method based on multi-scale video, the method comprising the steps of:
step 1, establishing a video pyramid
Starting from the original tracking box, the region of interest (ROI) is on the one hand scaled down to form the lower pyramid levels and on the other hand scaled up to form a higher level;
step 2, blood volume pulse BVP Signal extraction
The blood volume pulse signals need to be extracted from the multi-scale channels; the multi-scale signal fusion algorithm is a generalization and optimization of the plane-orthogonal-to-skin method POS, and when the pyramid has only one level the proposed heart rate extraction method reduces to POS;
step 3, multi-scale signal fusion
The heart rate signal features of the multi-scale channels are fused by a Gaussian-prior convex combination, and the heart rate value is finally obtained by signal processing of the fused multi-scale signal.
Further, in step 1, the key to building the video pyramid is the multi-scale facial region of interest. Let ω_l and h_l be the width and height of the level-l facial ROI; the multi-scale facial ROI is then defined as:

ω_l = ω_0 / 2^l, h_l = h_0 / 2^l, l = -1, 0, 1, …   (1)

where l is the pyramid level and ω_0, h_0 are the width and height of the level-0 ROI. Facial ROIs of different scales share the same center point (C_x, C_y). The level-0 ROI is defined as the bounding box tightly enclosing the face contour, which can be obtained with commonly used face detectors and trackers. To construct the multi-scale facial ROI, on the one hand the initial ROI size is halved stepwise, i.e. for l = 1, 2, 3, …; these ROIs cover skin areas and involve no background pixels, and since these regions of interest cover different areas of the face, they exhibit correspondingly different color changes. On the other hand, the initial ROI is enlarged, i.e. l = -1, so that the pixel gray levels of the moving background in the video are taken into account when the signal extraction algorithm is applied;
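As an illustrative sketch (not part of the patent text; the function name and the rule ω_l = ω_0/2^l, h_l = h_0/2^l are our reading of the halving/enlarging description), the multi-scale ROI geometry can be written in Python as:

```python
def multiscale_rois(cx, cy, w0, h0, levels=(-1, 0, 1, 2)):
    """Level-l ROI shares the center (cx, cy) with the level-0 box and
    halves (l > 0) or doubles (l = -1) its width and height per level."""
    rois = {}
    for l in levels:
        w, h = w0 / (2 ** l), h0 / (2 ** l)
        rois[l] = (cx - w / 2.0, cy - h / 2.0, w, h)  # (x, y, width, height)
    return rois
```

For a level-0 face box of 40x40 pixels centered at (50, 50), level 1 is the concentric 20x20 box and level -1 the concentric 80x80 box.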
the pixels in the facial multiscale ROI are converted into the original rpg trajectory by spatial averaging, calculated by:
the number of pyramid levels in the formula, i= -1,0,1 …, where I c (x, y, t) represents the gray scale of the pixel of the t-th frame with coordinates (x, y), c ε { R, G, B } represents the color channel, R l (t) represents a region of interest ROI of the t-th frame of the first layer, area (R) l (t)) is denoted as R l Total number of pixels in (t). The ROI tracking frame of each frame of the video is determined by the face tracker, and the original rpg trajectory is obtained by averaging the pixel gray levels I l,C (t) is calculated by linking to the whole t and is denoted as I l (t)。
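The spatial averaging of equation (2) can be sketched as follows (a NumPy rendering; the H x W x 3 frame layout and the (x, y, w, h) ROI tuple are illustrative assumptions, not part of the patent):

```python
import numpy as np

def spatial_average(frame, roi):
    """One sample of the raw rPPG trajectory I_l(t): the mean R, G, B
    gray level over the ROI (x, y, w, h) of a single H x W x 3 frame."""
    x, y, w, h = (int(round(v)) for v in roi)
    patch = frame[y:y + h, x:x + w, :]
    return patch.reshape(-1, 3).mean(axis=0)  # [mean_R, mean_G, mean_B]
```

Calling this per frame and per pyramid level, then stacking the results over t, yields the trajectories I_l(t).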
Still further, in step 2, the blood volume pulse BVP signal of each pyramid level is extracted from the remote photoplethysmography rPPG trajectory, and with the help of the multi-scale facial ROI the motion artifacts are partially separated, as follows:

The plane-orthogonal-to-skin method POS is used. It defines a projection plane orthogonal to the vector [1,1,1]^T to eliminate the dependence on skin tone; to separate the BVP signal from motion artifacts, the raw trajectory is projected onto two vectors spanning this plane, and the projections are combined by alpha-tuning to finally obtain the BVP signal. POS is applied to the raw tracking trajectory I_l(t) of each level of the video pyramid; the level index l of I_l(t) is therefore temporarily omitted in the following formulas, which write simply I(t);
the original tracking is first processed by a time normalization,
I n (t)=N·I(t) (3)
wherein the method comprises the steps ofIs a diagonal matrix whose ith diagonal gives the inverse of the ith row mean of I, namely:
N ii =1/μ(I i ) (4)
The time-normalized trajectory is then projected onto the two vectors defined by the projection matrix P_p = [0, 1, -1; -2, 1, 1], each row of which represents one of two mutually orthogonal projection axes; the projected signals are:

S_1(t) = I_nG(t) - I_nB(t)   (5)

S_2(t) = I_nG(t) + I_nB(t) - 2·I_nR(t)   (6)
To separate the specular reflection component from the pulse signal component, S_1(t) and S_2(t) are combined by alpha-tuning,

x(t) = S_1(t) + α·S_2(t)   (7)

where α = σ(S_1(t)) / σ(S_2(t)) and σ(·) denotes the standard deviation. x(t) is the extracted BVP signal, also called the POS feature. Applying the POS algorithm to each level of the video pyramid yields L BVP signals, where L is the total number of pyramid levels and x_l(t) denotes the POS feature x(t) extracted at level l;
linear POS operation can only extract limited pure BVP signals in all sports and recording environments. While with the help of the multiscale facial ROI, motion artifacts are partially separated.
Furthermore, the blood volume pulse BVP signals of each pyramid level obtained in step 1 and step 2 need to be fused through a Gaussian-prior convex combination, as follows:
the final pulse signal is calculated by fusing the candidate POS features extracted from the multi-scale trajectory, since the candidate POS features are assumed to be complementary, we attribute the signal fusion problem to feature combinations instead of feature selection, and for this purpose, convex combinations are used:
λ l representing the weight of the first hierarchical scale and satisfying the relation of the hierarchical weights and equal to 1, i.e. Σ l λ l The next most critical step, i.e., =1, is to determine the weight of each level, POS features of different levels have different heart rate energies, larger weights should be assigned to those levels with stronger heart rate energies, weights are determined using gaussian priors,
l= -1,0,1,2,3..represents the first layer scale, μ 0 Sum sigma 0 The exponential levels representing center and standard deviation, respectively, the gaussian priors are based on the pulse intensity in the middle layer being greater than the intensities in the lower and higher layers;
for long-time detection videos, windowing outputs are connected in series to obtain a long-time heart rate signal, a video with the length of N is given, a sequence is firstly divided into segments with the length of T, a proposed algorithm is applied to obtain windowed outputs, and an overlap-add method is applied to obtain final heart rate outputs.
The beneficial effect of the invention is improved heart rate detection accuracy.
Detailed Description
The present invention will be described in further detail below.
A heart rate detection method based on multi-scale video, the method comprising the steps of:
step 1, establishing a video pyramid
The key to creating the video pyramid is the multi-scale facial region of interest. Let ω_l and h_l be the width and height of the level-l facial ROI; the multi-scale facial ROI is then defined as:

ω_l = ω_0 / 2^l, h_l = h_0 / 2^l, l = -1, 0, 1, …   (1)

where l is the pyramid level and ω_0, h_0 are the width and height of the level-0 ROI. Facial ROIs of different scales share the same center point (C_x, C_y), and the level-0 ROI is defined as the bounding box tightly enclosing the face contour, obtainable with commonly used face detectors and trackers. To construct the multi-scale facial ROI, on the one hand the initial ROI size is halved stepwise, i.e. for l = 1, 2, 3, …; these ROIs mainly cover skin areas and involve no background pixels, and since these regions of interest cover different areas of the face they exhibit correspondingly different color changes. On the other hand, the initial ROI is enlarged, i.e. l = -1, so that the pixel gray levels of the moving background in the video are taken into account when the signal extraction algorithm is applied;
the pixels in the facial multiscale ROI are converted into the original rpg trajectory by spatial averaging, calculated by:
the number of pyramid levels in the formula, i= -1,0,1 …, where I c (x, y, t) represents the gray scale of the pixel of the t-th frame with coordinates (x, y), c ε { R, G, B } represents the color channel, R l (t) represents a region of interest ROI of the t-th frame of the first layer, area (R) l (t)) is denoted as R l Total number of pixels in (t). The ROI tracking frame of each frame of the video is determined by the face tracker, and the original rpg trajectory is obtained by averaging the pixel gray levels I l,C (t) is calculated by linking to the whole t and is denoted as I l (t);
Step 2 Blood Volume Pulse (BVP) Signal extraction
The Blood Volume Pulse (BVP) signal needs to be extracted from the multi-scale remote photoplethysmography (rPPG) trajectories. Here we use the plane-orthogonal-to-skin method POS, which has been widely adopted in rPPG research. It defines a projection plane orthogonal to the vector [1,1,1]^T to eliminate the dependence on skin tone; to separate the BVP signal from motion artifacts, the raw trajectory is projected onto two vectors spanning this plane, and the projections are combined by alpha-tuning to finally obtain the BVP signal. POS is applied to the raw tracking trajectory I_l(t) of each level of the video pyramid; the level index l of I_l(t) is therefore temporarily omitted in the following formulas, which write simply I(t).
The raw trajectory is first temporally normalized,

I_n(t) = N · I(t)   (3)

where N is a 3x3 diagonal matrix whose i-th diagonal entry is the inverse of the mean of the i-th row of I, namely:

N_ii = 1 / μ(I_i)   (4)
The time-normalized trajectory is then projected onto the two vectors defined by the projection matrix P_p = [0, 1, -1; -2, 1, 1], each row of which represents one of two mutually orthogonal projection axes. The projected signals can be expressed as:

S_1(t) = I_nG(t) - I_nB(t)   (5)

S_2(t) = I_nG(t) + I_nB(t) - 2·I_nR(t)   (6)
To separate the specular reflection component from the pulse signal component, S_1(t) and S_2(t) are combined by alpha-tuning,

x(t) = S_1(t) + α·S_2(t)   (7)

where α = σ(S_1(t)) / σ(S_2(t)) and σ(·) denotes the standard deviation. x(t) is the extracted BVP signal, also called the POS feature. Applying the POS algorithm to each level of the video pyramid yields L BVP signals, where L is the total number of pyramid levels and x_l(t) denotes the POS feature x(t) extracted at level l;
the application of POS in multi-scale tracking is popularization of an original POS algorithm, and the purpose of multi-scale POS feature extraction is to facilitate pulse extraction. In conventional single-scale extraction, all motion artifacts are accompanied by the BVP signal. Linear POS operation can only extract limited pure BVP signals in all sports and recording environments. While with the help of a multi-scale facial ROI, motion artifacts may be partially separated;
step 3. Multi-Scale Signal fusion
The final pulse signal is computed by fusing the candidate POS features extracted from the multi-scale trajectories. Since the candidate POS features are assumed to be complementary, the signal fusion problem is cast as feature combination rather than feature selection; to this end a convex combination is used:

x(t) = Σ_l λ_l · x_l(t)   (8)

where λ_l is the weight of level l and the level weights sum to 1, i.e. Σ_l λ_l = 1. The next critical step is to determine the weight of each level: POS features of different levels carry different heart rate energies, and larger weights should be assigned to levels with stronger heart rate energy. The weights are determined with a Gaussian prior,

λ_l ∝ exp(-(l - μ_0)² / (2σ_0²)), l = -1, 0, 1, 2, 3, …   (9)

where μ_0 and σ_0 are the center and standard deviation of the Gaussian prior, which is based on the pulse intensity of the middle levels being greater than that of the lower and higher levels;
the operations discussed above are all within the scope of a time window. For long-time detection videos, window outputs are connected in series to obtain a long-time heart rate signal; specifically, given a video of length N, we first split the sequence into segments of length T, apply the proposed algorithm to obtain windowed output, and apply the Overlap-add (overlay) method to obtain the final heart rate output.

Claims (3)

1. A heart rate detection method based on multi-scale video, the method comprising the steps of:
step 1, establishing a video pyramid
Starting from the original tracking box, the region of interest (ROI) is on the one hand scaled down to form the lower pyramid levels and on the other hand scaled up to form a higher level;
step 2, blood volume pulse BVP Signal extraction
The blood volume pulse signals need to be extracted from the multi-scale channels; the multi-scale signal fusion algorithm is a generalization and optimization of the plane-orthogonal-to-skin method POS, and when the pyramid has only one level the proposed heart rate extraction method is POS;
step 3, multi-scale signal fusion
The heart rate signal features of the multi-scale channels are fused by a Gaussian-prior convex combination, and the heart rate value is finally obtained by signal processing of the fused multi-scale signal;
the blood volume pulse BVP signals of each layer scale obtained in the step 1 and the step 2 need to be fused through a Gaussian prior convex combination, and the process is as follows:
the final pulse signal is calculated by fusing the candidate POS features extracted from the multi-scale trajectory, and since the candidate POS features are assumed to be complementary, the signal fusion problem is attributed to feature combinations rather than feature selection, and for this purpose, convex combinations are used:
l is the number of layers of the pyramid, lambda l Representing the weight of the first hierarchical scale and satisfying the relation of the hierarchical weights and equal to 1, i.e. Σ l λ l =1,x l (t) POS features x (t) extracted for the first tier, the next most critical step being to determine the weights of each level, POS features of different tiers having different heart rate energies, larger weights being assigned to those tiers having stronger heart rate energies, weights being determined using Gaussian priors,
l= -1,0,1,2,3 … represents the layer l scale, μ 0 Sum sigma 0 The exponential levels representing center and standard deviation, respectively, the gaussian priors are based on the pulse intensity in the middle layer being greater than the intensities in the lower and higher layers;
for long-time detection videos, windowing outputs are connected in series to obtain a long-time heart rate signal, a video with the length of N is given, a sequence is firstly divided into segments with the length of T, a proposed algorithm is applied to obtain windowed outputs, and an overlap-add method is applied to obtain final heart rate outputs.
2. The multi-scale video-based heart rate detection method of claim 1, wherein: in step 1, the key to establishing the video pyramid is the multi-scale facial region of interest. Let ω_l and h_l be the width and height of the level-l facial ROI; the multi-scale facial ROI is defined as:

ω_l = ω_0 / 2^l, h_l = h_0 / 2^l, l = -1, 0, 1, …   (1)

where l is the pyramid level, ω_0 and h_0 are the width and height of the level-0 ROI, facial ROIs of different scales share the same center point (C_x, C_y), and the level-0 ROI is defined as the bounding box tightly enclosing the face contour, which can be obtained by commonly used face detectors and trackers; to construct the multi-scale facial ROI, on the one hand the initial ROI size is halved stepwise, i.e. for l = 1, 2, 3, …, these ROIs cover skin areas and involve no background pixels, and since these regions of interest cover different areas of the face there are correspondingly different color changes; on the other hand, the initial ROI is enlarged, i.e. l = -1, so that the pixel gray levels of the moving background in the video are taken into account when the signal extraction algorithm is applied;
the pixels in the multi-scale facial ROI are converted into the raw rPPG trajectory by spatial averaging, computed as:

I_{l,c}(t) = (1 / Area(R_l(t))) · Σ_{(x,y)∈R_l(t)} I_c(x,y,t), l = -1, 0, 1, …   (2)

where l is the pyramid level, I_c(x,y,t) is the gray level of the pixel at coordinates (x,y) in the t-th frame, c ∈ {R, G, B} is the color channel, R_l(t) is the level-l region of interest ROI of the t-th frame, and Area(R_l(t)) is the total number of pixels in R_l(t); the ROI tracking box of each frame of the video is determined by the face tracker, and the raw rPPG trajectory I_l(t) is obtained by concatenating the spatially averaged pixel gray levels I_{l,c}(t) over all t.
3. The multi-scale video-based heart rate detection method as claimed in claim 1 or 2, wherein: in step 2, the blood volume pulse BVP signal of each pyramid level is extracted from the remote photoplethysmography rPPG trajectory, and the motion artifacts are partially separated with the help of the multi-scale facial ROI, as follows:

the plane-orthogonal-to-skin method POS is used, which defines a projection plane orthogonal to the vector [1,1,1]^T to eliminate the dependence on skin tone; to separate the BVP signal from motion artifacts, the raw trajectory is projected onto two vectors spanning this plane, and the projections are combined by alpha-tuning to finally obtain the BVP signal; POS is applied to the raw tracking trajectory I_l(t) of each level of the video pyramid, and the level index l of I_l(t) is therefore temporarily omitted in the following formulas, which write simply I(t);
the raw trajectory is first temporally normalized,

I_n(t) = N · I(t)   (3)

where N is a 3x3 diagonal matrix whose i-th diagonal entry is the inverse of the mean of the i-th row of I, namely:

N_ii = 1 / μ(I_i)   (4)
the time-normalized trajectory is then projected onto the two vectors defined by the projection matrix P_p = [0, 1, -1; -2, 1, 1], each row of which represents one of two mutually orthogonal projection axes; the projected signals are expressed as:

S_1(t) = I_nG(t) - I_nB(t)   (5)

S_2(t) = I_nG(t) + I_nB(t) - 2·I_nR(t)   (6)
to separate the specular reflection component from the pulse signal component, S_1(t) and S_2(t) are combined by alpha-tuning,

x(t) = S_1(t) + α·S_2(t)   (7)

where α = σ(S_1(t)) / σ(S_2(t)) and σ(·) denotes the standard deviation; x(t) is the extracted BVP signal, also called the POS feature; applying the POS algorithm to each level of the video pyramid yields L BVP signals, where L is the total number of pyramid levels and x_l(t) denotes the POS feature x(t) extracted at level l;
the linear POS operation alone cannot extract a pure BVP signal under all motion and recording conditions, whereas with the help of the multi-scale facial ROI the motion artifacts are partially separated.
CN202010285626.7A 2020-04-13 2020-04-13 Heart rate detection method based on multi-scale video Active CN111597891B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010285626.7A CN111597891B (en) 2020-04-13 2020-04-13 Heart rate detection method based on multi-scale video

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010285626.7A CN111597891B (en) 2020-04-13 2020-04-13 Heart rate detection method based on multi-scale video

Publications (2)

Publication Number Publication Date
CN111597891A CN111597891A (en) 2020-08-28
CN111597891B true CN111597891B (en) 2023-07-25

Family

ID=72190634

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010285626.7A Active CN111597891B (en) 2020-04-13 2020-04-13 Heart rate detection method based on multi-scale video

Country Status (1)

Country Link
CN (1) CN111597891B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113892930B (en) * 2021-12-10 2022-04-22 之江实验室 Facial heart rate measuring method and device based on multi-scale heart rate signals

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB201601140D0 (en) * 2016-01-21 2016-03-09 Oxehealth Ltd Method and apparatus for estimating heart rate
CN109793506A (en) * 2019-01-18 2019-05-24 合肥工业大学 A kind of contactless radial artery Wave shape extracting method
CN110084085A (en) * 2018-11-06 2019-08-02 天津工业大学 RPPG high-precision heart rate detection method based on shaped signal
CN110353646A (en) * 2019-07-29 2019-10-22 苏州市高事达信息科技股份有限公司 Contactless heart rate detection method

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
US8360986B2 (en) * 2006-06-30 2013-01-29 University Of Louisville Research Foundation, Inc. Non-contact and passive measurement of arterial pulse through thermal IR imaging, and analysis of thermal IR imagery
US10448846B2 (en) * 2014-12-16 2019-10-22 Oxford University Innovation Limited Method and apparatus for measuring and displaying a haemodynamic parameter


Non-Patent Citations (5)

Title
De Haan G et al. Robust pulse rate from chrominance-based rPPG. IEEE Transactions on Biomedical Engineering, 2013, 60(10). *
Gambi E et al. Heart rate detection using Microsoft Kinect: validation and comparison to wearable devices. Sensors, 2017, 17(8). *
Patil O R et al. A camera-based continuous PPG monitoring system using Laplacian pyramid. Smart Health, 2018, 9. *
Wang W et al. Robust heart rate from fitness videos. Physiological Measurement, 2017, 38(6). *
Lu Xue. Multimedia playback control system based on facial video. China Master's Theses Full-text Database, 2018. *


Similar Documents

Publication Publication Date Title
Bobbia et al. Unsupervised skin tissue segmentation for remote photoplethysmography
CN111797716B (en) Single target tracking method based on Siamese network
Zeng et al. Background subtraction using multiscale fully convolutional network
CN108062525B (en) Deep learning hand detection method based on hand region prediction
Shi et al. C 2 G 2 FSnake: automatic tongue image segmentation utilizing prior knowledge
CN104050488B (en) A kind of gesture identification method of the Kalman filter model based on switching
CN109934224B (en) Small target detection method based on Markov random field and visual contrast mechanism
US20140341421A1 (en) Method for Detecting Persons Using 1D Depths and 2D Texture
CN107944431A (en) A kind of intelligent identification Method based on motion change
CN106485735A (en) Human body target recognition and tracking method based on stereovision technique
EP2061008A1 (en) Method and device for continuous figure-ground segmentation in images from dynamic visual scenes
CN107609571B (en) Adaptive target tracking method based on LARK features
Li et al. Learning motion-robust remote photoplethysmography through arbitrary resolution videos
CN107564035B (en) Video tracking method based on important area identification and matching
CN112613565B (en) Anti-occlusion tracking method based on multi-feature fusion and adaptive learning rate updating
CN112270697A (en) Satellite sequence image moving target detection method combined with super-resolution reconstruction
Zoidi et al. Stereo object tracking with fusion of texture, color and disparity information
CN116129129A (en) Character interaction detection model and detection method
CN111597891B (en) Heart rate detection method based on multi-scale video
CN117133032A (en) Personnel identification and positioning method based on RGB-D image under face shielding condition
Ma et al. MSMA-Net: An Infrared Small Target Detection Network by Multi-scale Super-resolution Enhancement and Multi-level Attention Fusion
CN110309729A (en) Tracking and re-detection method based on anomaly peak detection and twin network
Nicodemou et al. Learning to infer the depth map of a hand from its color image
Décombas et al. Spatio-temporal saliency based on rare model
CN111429479A (en) Space target identification method based on image integral mean value

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant