CN111597891B - Heart rate detection method based on multi-scale video - Google Patents
- Publication number
- CN111597891B (application CN202010285626.7A)
- Authority
- CN
- China
- Prior art keywords
- scale
- heart rate
- signal
- roi
- video
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/02—Detecting, measuring or recording pulse, heart rate, blood pressure or blood flow; Combined pulse/heart-rate/blood pressure determination; Evaluating a cardiovascular condition not otherwise provided for, e.g. using combinations of techniques provided for in this group with electrocardiography or electroauscultation; Heart catheters for measuring blood pressure
- A61B5/024—Detecting, measuring or recording pulse rate or heart rate
- A61B5/02416—Detecting, measuring or recording pulse rate or heart rate using photoplethysmograph signals, e.g. generated by infrared radiation
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/72—Signal processing specially adapted for physiological signals or for diagnostic purposes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/40—Scaling of whole images or parts thereof, e.g. expanding or contracting
- G06T3/4084—Scaling of whole images or parts thereof, e.g. expanding or contracting in the transform domain, e.g. fast Fourier transform [FFT] domain scaling
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/56—Extraction of image or video features relating to colour
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/168—Feature extraction; Face representation
- G06V40/171—Local features and components; Facial parts ; Occluding parts, e.g. glasses; Geometrical relationships
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30101—Blood vessel; Artery; Vein; Vascular
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30241—Trajectory
Abstract
A heart rate detection method based on multi-scale video comprises the following steps. Step 1, build a video pyramid: starting from the original tracking box, the size of the region of interest (ROI) is reduced for the lower levels on the one hand and enlarged for the upper level on the other. Step 2, blood volume pulse (BVP) signal extraction: the BVP signal is extracted from each scale channel; the multi-scale signal fusion algorithm is a generalization and refinement of the plane-orthogonal-to-skin (POS) method, and when the pyramid has only a single level, the proposed heart rate extraction method reduces to POS. Step 3, multi-scale signal fusion: the heart rate signal features of the multiple scale channels are fused by a convex combination with a Gaussian prior, and the heart rate value is finally obtained by signal processing of the fused multi-scale signal. The invention extracts rich heart rate features at different scales of the video and fuses them, thereby improving heart rate detection accuracy.
Description
Technical Field
The invention relates to the fields of video heart rate detection, computer vision and signal processing.
Background
In recent years, remote photoplethysmography (rPPG), based on optical and physiological principles, has developed rapidly. It measures the blood volume pulse (BVP) and heart rate in a non-contact manner and has a very wide range of applications. Under visible light, rPPG can measure heart rate with a consumer-grade digital camera, which broadens the application range of pulse measurement. The selection of a facial region of interest (ROI) is a key issue for such a system, as the location of the ROI directly affects the quality of the measured signal. Existing studies indicate that the cheek, lip, and chin regions of the face contain more abundant capillaries and yield stronger pulse signals than other regions. However, areas of high pulsation intensity are not necessarily suitable for raw rPPG signal extraction, because they may be disturbed by non-rigid motion such as blinking, speaking, and smiling. Existing video heart rate detection methods all extract the heart rate signal from a single region of interest, yet the heart rate signal features available in a single ROI are inherently limited.
Disclosure of Invention
To overcome the shortcomings of the prior art, the invention provides a heart rate detection method based on multi-scale video. Its basic idea is to extract heart rate signal features at different scales of the video and fuse them, thereby improving heart rate detection accuracy.
The invention adopts the technical scheme that:
a heart rate detection method based on multi-scale video, the method comprising the steps of:
step 1, establishing a video pyramid
Starting from the original tracking box, the size of the region of interest (ROI) is, on the one hand, reduced level by level and, on the other hand, enlarged by one level;
step 2, blood volume pulse BVP Signal extraction
The blood volume pulse signal is extracted from each scale channel. The multi-scale signal fusion algorithm is a generalization and refinement of the plane-orthogonal-to-skin method (POS): when the pyramid contains only a single level, the proposed heart rate extraction method reduces to POS;
step 3, multi-scale signal fusion
The heart rate signal features of the multiple scale channels are fused by a convex combination with a Gaussian prior, and the heart rate value is finally obtained by signal processing of the fused multi-scale signal.
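The patent leaves this final signal-processing step unspecified. A common concrete choice, shown here only as an illustrative assumption (the function name and band limits are not from the patent), is to take the dominant spectral peak of the fused BVP signal inside a plausible pulse band (roughly 0.7-4 Hz, i.e. 42-240 bpm):

```python
import numpy as np

def estimate_heart_rate(bvp, fs, lo=0.7, hi=4.0):
    """Estimate heart rate (bpm) from a fused BVP signal by locating the
    dominant spectral peak inside the pulse band [lo, hi] Hz."""
    bvp = np.asarray(bvp, dtype=float)
    bvp = bvp - bvp.mean()                       # remove DC component
    spectrum = np.abs(np.fft.rfft(bvp))          # one-sided magnitude spectrum
    freqs = np.fft.rfftfreq(len(bvp), d=1.0 / fs)
    band = (freqs >= lo) & (freqs <= hi)         # restrict to pulse band
    peak = freqs[band][np.argmax(spectrum[band])]
    return 60.0 * peak                           # Hz -> beats per minute
```

For example, a 10-second clip sampled at 30 fps containing a clean 1.2 Hz pulse component yields 72 bpm.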
Further, in step 1 the key to building the video pyramid is the multi-scale facial region of interest. Let ω_l, h_l be the width and height of the level-l facial ROI; the multi-scale facial ROI is then defined as:

(ω_l, h_l) = (ω_0 / 2^l, h_0 / 2^l), l = -1, 0, 1, … (1)

where l indexes the pyramid levels and ω_0, h_0 are the width and height of the level-0 ROI. The facial ROIs of all scales share the same center point (C_x, C_y), and the level-0 ROI is defined as the bounding box tightly enclosing the face contour, which can be obtained with common face detectors and trackers. To construct the multi-scale facial ROI, on the one hand the initial ROI is halved step by step, i.e. l = 1, 2, 3, …; these ROIs cover only skin areas and involve no background pixels, and because they cover different areas of the face they exhibit correspondingly different color variations. On the other hand, the initial ROI is enlarged, i.e. l = -1, so that the pixel gray levels of the moving background in the video are taken into account when the signal extraction algorithm is applied;
the pixels in the facial multiscale ROI are converted into the original rpg trajectory by spatial averaging, calculated by:
the number of pyramid levels in the formula, i= -1,0,1 …, where I c (x, y, t) represents the gray scale of the pixel of the t-th frame with coordinates (x, y), c ε { R, G, B } represents the color channel, R l (t) represents a region of interest ROI of the t-th frame of the first layer, area (R) l (t)) is denoted as R l Total number of pixels in (t). The ROI tracking frame of each frame of the video is determined by the face tracker, and the original rpg trajectory is obtained by averaging the pixel gray levels I l,C (t) is calculated by linking to the whole t and is denoted as I l (t)。
Still further, in step 2 the blood volume pulse (BVP) signal of each pyramid level is extracted from its remote-photoplethysmography (rPPG) trajectory, and with the help of the facial multi-scale ROI the motion artifacts are partially separated, as follows:

The plane-orthogonal-to-skin method (POS) is used. It defines a projection plane orthogonal to the skin-tone vector [1, 1, 1]^T to eliminate the dependence on skin tone. To separate the BVP signal from motion artifacts, the raw trajectory is projected onto two vectors spanning this plane, and signal processing by α-tuning finally yields the BVP signal. POS is applied to the raw tracking trajectory I_l(t) of every level of the video pyramid; the level index l is therefore temporarily omitted in the following formulas, and the trajectory is simply written I(t);
the original tracking is first processed by a time normalization,
I n (t)=N·I(t) (3)
wherein the method comprises the steps ofIs a diagonal matrix whose ith diagonal gives the inverse of the ith row mean of I, namely:
N ii =1/μ(I i ) (4)
the time-normalized trajectory is then projected onto two vectors defined by a projection matrix, P p =[0,1,-1;-2,1,1]Wherein each row represents a mutually orthogonal projection axis, the projection signal is expressed as:
S 1 (t)=I nG (t)-I nB (t) (5)
S 2 (t)=I nG (t)+I nB (t)-2I nR (t) (6)
in order to separate the specular reflection and the pulse signal component, S 1 (t) and S 2 (t) alpha-tuning treatment is required,
x(t)=S 1 (t)+αS 2 (t) (7)
wherein α=σ (S 1 (t))/σ(S 2 (t)), while σ (·) represents the standard deviation, x (t) is the extracted BVP signal, also known as the POS feature, applying a POS algorithm to each level of the video pyramid results in an L-level BVP signal, where L represents the total number of layers of the video pyramid, and x is l (t) POS features x (t) extracted for the first layer;
linear POS operation can only extract limited pure BVP signals in all sports and recording environments. While with the help of the multiscale facial ROI, motion artifacts are partially separated.
Furthermore, the per-level blood volume pulse (BVP) signals obtained in steps 1 and 2 are fused by a convex combination with a Gaussian prior, as follows:

The final pulse signal is computed by fusing the candidate POS features extracted from the multi-scale trajectories. Since the candidate POS features are assumed to be complementary, the signal fusion problem is cast as a feature combination rather than a feature selection, for which a convex combination is used:

x(t) = Σ_l λ_l · x_l(t) (8)

where λ_l is the weight of the level-l scale and the level weights sum to 1, i.e. Σ_l λ_l = 1. The next, most critical step is to determine the weight of each level: POS features of different levels carry different heart rate energies, and larger weights should be assigned to the levels with stronger heart rate energy. The weights are determined with a Gaussian prior,

λ_l ∝ exp(-(l - μ_0)² / (2σ_0²)), l = -1, 0, 1, 2, 3, … (9)

where μ_0 and σ_0 are the center and standard deviation of the Gaussian over the level index. The Gaussian prior reflects the observation that the pulse intensity in the middle levels is greater than in the lower and higher levels;

For long detection videos, the windowed outputs are concatenated to obtain a long-term heart rate signal: given a video of length N, the sequence is first split into segments of length T, the proposed algorithm is applied to obtain the windowed outputs, and the overlap-add method yields the final heart rate output.
The beneficial effect of the invention is improved heart rate detection accuracy.
Detailed Description
The present invention will be described in further detail below.
A heart rate detection method based on multi-scale video, the method comprising the steps of:
step 1, establishing a video pyramid
The key to creating the video pyramid is the multi-scale facial region of interest. Let ω_l, h_l be the width and height of the level-l facial ROI; the multi-scale facial ROI is then defined as:

(ω_l, h_l) = (ω_0 / 2^l, h_0 / 2^l), l = -1, 0, 1, … (1)

where l indexes the pyramid levels and ω_0, h_0 are the width and height of the level-0 ROI. The facial ROIs of all scales share the same center point (C_x, C_y), and the level-0 ROI is defined as the bounding box tightly enclosing the face contour, which can be obtained with common face detectors and trackers. To construct the multi-scale facial ROI, on the one hand the initial ROI is halved step by step, i.e. l = 1, 2, 3, …; these ROIs mainly cover skin areas and involve no background pixels, and because they cover different areas of the face they exhibit correspondingly different color variations. On the other hand, the initial ROI is enlarged, i.e. l = -1, so that the pixel gray levels of the moving background in the video are taken into account when the signal extraction algorithm is applied;

The pixels in the facial multi-scale ROI are converted into the raw rPPG trajectory by spatial averaging, computed as:

I_{l,c}(t) = (1 / Area(R_l(t))) · Σ_{(x,y)∈R_l(t)} I_c(x, y, t), l = -1, 0, 1, … (2)

where I_c(x, y, t) is the gray level of the pixel at coordinates (x, y) in the t-th frame, c ∈ {R, G, B} is the color channel, R_l(t) is the level-l ROI of the t-th frame, and Area(R_l(t)) is the total number of pixels in R_l(t). The ROI tracking box of each frame is determined by the face tracker, and the raw rPPG trajectory is obtained by concatenating the averaged pixel gray levels I_{l,c}(t) over all t; it is denoted I_l(t);
Step 2 Blood Volume Pulse (BVP) Signal extraction
The Blood Volume Pulse (BVP) signal is extracted from the trajectory of multi-scale remote photoplethysmography (rPPG). Here we use the plane-orthogonal-to-skin method (POS), which has been widely adopted in rPPG research. POS defines a projection plane orthogonal to the skin-tone vector [1, 1, 1]^T to eliminate the dependence on skin tone. To separate the BVP signal from motion artifacts, the raw trajectory is projected onto two vectors spanning this plane, and signal processing by α-tuning finally yields the BVP signal. POS is applied to the raw tracking trajectory I_l(t) of every level of the video pyramid; the level index l is therefore temporarily omitted in the following formulas, and the trajectory is simply written I(t).
The raw trajectory is first temporally normalized,

I_n(t) = N · I(t) (3)

where N is a 3×3 diagonal matrix whose i-th diagonal element is the inverse of the mean of the i-th row of I, namely:

N_ii = 1 / μ(I_i) (4)

The temporally normalized trajectory is then projected onto the two vectors defined by the projection matrix P_p = [0, 1, -1; -2, 1, 1], whose rows are mutually orthogonal projection axes. The projected signals are:

S_1(t) = I_nG(t) - I_nB(t) (5)

S_2(t) = I_nG(t) + I_nB(t) - 2·I_nR(t) (6)

To separate the specular reflection and pulse signal components, S_1(t) and S_2(t) undergo α-tuning,

x(t) = S_1(t) + α·S_2(t) (7)

where α = σ(S_1(t)) / σ(S_2(t)) and σ(·) denotes the standard deviation. x(t) is the extracted BVP signal, also called the POS feature. Applying the POS algorithm to every level of the video pyramid yields L BVP signals, where L is the total number of pyramid levels and x_l(t) is the POS feature x(t) extracted at level l;
the application of POS in multi-scale tracking is popularization of an original POS algorithm, and the purpose of multi-scale POS feature extraction is to facilitate pulse extraction. In conventional single-scale extraction, all motion artifacts are accompanied by the BVP signal. Linear POS operation can only extract limited pure BVP signals in all sports and recording environments. While with the help of a multi-scale facial ROI, motion artifacts may be partially separated;
step 3. Multi-Scale Signal fusion
The final pulse signal is computed by fusing the candidate POS features extracted from the multi-scale trajectories. Since the candidate POS features are assumed to be complementary, we cast the signal fusion problem as a feature combination rather than a feature selection, for which a convex combination is used:

x(t) = Σ_l λ_l · x_l(t) (8)

where λ_l is the weight of the level-l scale and the level weights sum to 1, i.e. Σ_l λ_l = 1. The next, most critical step is to determine the weight of each level: POS features of different levels carry different heart rate energies, and larger weights should be assigned to the levels with stronger heart rate energy. The weights are determined with a Gaussian prior,

λ_l ∝ exp(-(l - μ_0)² / (2σ_0²)), l = -1, 0, 1, 2, 3, … (9)

where μ_0 and σ_0 are the center and standard deviation of the Gaussian over the level index; the prior reflects the observation that the pulse intensity in the middle levels is greater than in the lower and higher levels;

The operations discussed above all apply within one time window. For long detection videos, the windowed outputs are concatenated to obtain a long-term heart rate signal. Specifically, given a video of length N, we first split the sequence into segments of length T, apply the proposed algorithm to obtain the windowed outputs, and apply the overlap-add method to obtain the final heart rate output.
Claims (3)
1. A heart rate detection method based on multi-scale video, the method comprising the steps of:
step 1, establishing a video pyramid
Starting from the original tracking box, the size of the region of interest (ROI) is, on the one hand, reduced level by level and, on the other hand, enlarged by one level;
step 2, blood volume pulse BVP Signal extraction
The blood volume pulse signal is extracted from each scale channel. The multi-scale signal fusion algorithm is a generalization and refinement of the plane-orthogonal-to-skin method (POS): when the pyramid contains only a single level, the proposed heart rate extraction method reduces to POS;
step 3, multi-scale signal fusion
The heart rate signal features of the multiple scale channels are fused by a convex combination with a Gaussian prior, and the heart rate value is finally obtained by signal processing of the fused multi-scale signal;
the blood volume pulse BVP signals of each layer scale obtained in the step 1 and the step 2 need to be fused through a Gaussian prior convex combination, and the process is as follows:
the final pulse signal is calculated by fusing the candidate POS features extracted from the multi-scale trajectory, and since the candidate POS features are assumed to be complementary, the signal fusion problem is attributed to feature combinations rather than feature selection, and for this purpose, convex combinations are used:
l is the number of layers of the pyramid, lambda l Representing the weight of the first hierarchical scale and satisfying the relation of the hierarchical weights and equal to 1, i.e. Σ l λ l =1,x l (t) POS features x (t) extracted for the first tier, the next most critical step being to determine the weights of each level, POS features of different tiers having different heart rate energies, larger weights being assigned to those tiers having stronger heart rate energies, weights being determined using Gaussian priors,
l= -1,0,1,2,3 … represents the layer l scale, μ 0 Sum sigma 0 The exponential levels representing center and standard deviation, respectively, the gaussian priors are based on the pulse intensity in the middle layer being greater than the intensities in the lower and higher layers;
for long-time detection videos, windowing outputs are connected in series to obtain a long-time heart rate signal, a video with the length of N is given, a sequence is firstly divided into segments with the length of T, a proposed algorithm is applied to obtain windowed outputs, and an overlap-add method is applied to obtain final heart rate outputs.
2. The multi-scale video-based heart rate detection method of claim 1, wherein: in step 1 the key to building the video pyramid is the multi-scale facial region of interest; let ω_l, h_l be the width and height of the level-l facial ROI; the multi-scale facial ROI is then defined as:

(ω_l, h_l) = (ω_0 / 2^l, h_0 / 2^l), l = -1, 0, 1, … (1)

where l indexes the pyramid levels and ω_0, h_0 are the width and height of the level-0 ROI; the facial ROIs of all scales share the same center point (C_x, C_y), and the level-0 ROI is defined as the bounding box tightly enclosing the face contour, which can be obtained with common face detectors and trackers; to construct the multi-scale facial ROI, on the one hand the initial ROI is halved step by step, i.e. l = 1, 2, 3, …; these ROIs cover only skin areas and involve no background pixels, and because they cover different areas of the face they exhibit correspondingly different color variations; on the other hand, the initial ROI is enlarged, i.e. l = -1, so that the pixel gray levels of the moving background in the video are taken into account when the signal extraction algorithm is applied;

the pixels in the facial multi-scale ROI are converted into the raw rPPG trajectory by spatial averaging, computed as:

I_{l,c}(t) = (1 / Area(R_l(t))) · Σ_{(x,y)∈R_l(t)} I_c(x, y, t), l = -1, 0, 1, … (2)

where I_c(x, y, t) is the gray level of the pixel at coordinates (x, y) in the t-th frame, c ∈ {R, G, B} is the color channel, R_l(t) is the level-l ROI of the t-th frame, and Area(R_l(t)) is the total number of pixels in R_l(t); the ROI tracking box of each frame is determined by the face tracker, and the raw rPPG trajectory is obtained by concatenating the averaged pixel gray levels I_{l,c}(t) over all t; it is denoted I_l(t).
3. A multi-scale video-based heart rate detection method as claimed in claim 1 or 2, wherein: in step 2 the blood volume pulse (BVP) signal of each pyramid level is extracted from its remote-photoplethysmography (rPPG) trajectory, and with the help of the facial multi-scale ROI the motion artifacts are partially separated, as follows:

the plane-orthogonal-to-skin method (POS) is used; it defines a projection plane orthogonal to the skin-tone vector [1, 1, 1]^T to eliminate the dependence on skin tone; to separate the BVP signal from motion artifacts, the raw trajectory is projected onto two vectors spanning this plane, and signal processing by α-tuning finally yields the BVP signal; POS is applied to the raw tracking trajectory I_l(t) of every level of the video pyramid, so the level index l is temporarily omitted in the following formulas and the trajectory is simply written I(t);

the raw trajectory is first temporally normalized,

I_n(t) = N · I(t) (3)

where N is a 3×3 diagonal matrix whose i-th diagonal element is the inverse of the mean of the i-th row of I, namely:

N_ii = 1 / μ(I_i) (4)

the temporally normalized trajectory is then projected onto the two vectors defined by the projection matrix P_p = [0, 1, -1; -2, 1, 1], whose rows are mutually orthogonal projection axes; the projected signals are:

S_1(t) = I_nG(t) - I_nB(t) (5)

S_2(t) = I_nG(t) + I_nB(t) - 2·I_nR(t) (6)

to separate the specular reflection and pulse signal components, S_1(t) and S_2(t) undergo α-tuning,

x(t) = S_1(t) + α·S_2(t) (7)

where α = σ(S_1(t)) / σ(S_2(t)) and σ(·) denotes the standard deviation; x(t) is the extracted BVP signal, also called the POS feature; applying the POS algorithm to every level of the video pyramid yields L BVP signals, where L is the total number of pyramid levels and x_l(t) is the POS feature x(t) extracted at level l;

the linear POS operation alone can extract only a limited amount of clean BVP signal across all motions and recording environments, while with the help of the multi-scale facial ROI the motion artifacts are partially separated.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010285626.7A CN111597891B (en) | 2020-04-13 | 2020-04-13 | Heart rate detection method based on multi-scale video |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010285626.7A CN111597891B (en) | 2020-04-13 | 2020-04-13 | Heart rate detection method based on multi-scale video |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111597891A CN111597891A (en) | 2020-08-28 |
CN111597891B true CN111597891B (en) | 2023-07-25 |
Family
ID=72190634
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010285626.7A Active CN111597891B (en) | 2020-04-13 | 2020-04-13 | Heart rate detection method based on multi-scale video |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111597891B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113892930B (en) * | 2021-12-10 | 2022-04-22 | 之江实验室 | Facial heart rate measuring method and device based on multi-scale heart rate signals |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB201601140D0 (en) * | 2016-01-21 | 2016-03-09 | Oxehealth Ltd | Method and apparatus for estimating heart rate |
CN109793506A (en) * | 2019-01-18 | 2019-05-24 | 合肥工业大学 | A kind of contactless radial artery Wave shape extracting method |
CN110084085A (en) * | 2018-11-06 | 2019-08-02 | 天津工业大学 | RPPG high-precision heart rate detection method based on shaped signal |
CN110353646A (en) * | 2019-07-29 | 2019-10-22 | 苏州市高事达信息科技股份有限公司 | Contactless heart rate detection method |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8360986B2 (en) * | 2006-06-30 | 2013-01-29 | University Of Louisville Research Foundation, Inc. | Non-contact and passive measurement of arterial pulse through thermal IR imaging, and analysis of thermal IR imagery |
US10448846B2 (en) * | 2014-12-16 | 2019-10-22 | Oxford University Innovation Limited | Method and apparatus for measuring and displaying a haemodynamic parameter |
Non-Patent Citations (5)
Title |
---|
De Haan G et al. Robust pulse rate from chrominance-based rPPG. IEEE Transactions on Biomedical Engineering. 2013, Vol. 60, No. 10. * |
Gambi E et al. Heart rate detection using Microsoft Kinect: validation and comparison to wearable devices. Sensors. 2017, Vol. 17, No. 8. * |
Patil O R et al. A camera-based continuous PPG monitoring system using Laplacian pyramid. Smart Health. 2018, Vol. 9. * |
Wang W et al. Robust heart rate from fitness videos. Physiological Measurement. 2017, Vol. 38, No. 6. * |
Lu Xue. A multimedia playback control system based on facial video. China Master's Theses Full-text Database. 2018. * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||