WO2021184620A1 - Camera-based non-contact heart rate and body temperature measurement method - Google Patents

Camera-based non-contact heart rate and body temperature measurement method

Info

Publication number
WO2021184620A1
Authority
WO
WIPO (PCT)
Prior art keywords
heart rate
body temperature
image
model
camera
Prior art date
Application number
PCT/CN2020/103087
Other languages
French (fr)
Chinese (zh)
Inventor
谢世朋
袁柱柱
Original Assignee
南京昊眼晶睛智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 南京昊眼晶睛智能科技有限公司 filed Critical 南京昊眼晶睛智能科技有限公司
Publication of WO2021184620A1 publication Critical patent/WO2021184620A1/en

Classifications

    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/02 Detecting, measuring or recording pulse, heart rate, blood pressure or blood flow; Combined pulse/heart-rate/blood pressure determination; Evaluating a cardiovascular condition not otherwise provided for, e.g. using combinations of techniques provided for in this group with electrocardiography or electroauscultation; Heart catheters for measuring blood pressure
    • A61B5/0205 Simultaneously evaluating both cardiovascular conditions and different types of body conditions, e.g. heart and respiratory condition
    • A61B5/02055 Simultaneously evaluating both cardiovascular condition and temperature
    • A61B5/0059 Measuring for diagnostic purposes using light, e.g. diagnosis by transillumination, diascopy, fluorescence
    • A61B5/0077 Devices for viewing the surface of the body, e.g. camera, magnifying lens
    • A61B5/01 Measuring temperature of body parts; Diagnostic temperature sensing, e.g. for malignant or inflamed tissue
    • A61B5/24 Detecting, measuring or recording bioelectric or biomagnetic signals of the body or parts thereof
    • A61B5/316 Modalities, i.e. specific diagnostic methods
    • A61B5/318 Heart-related electrical modalities, e.g. electrocardiography [ECG]
    • A61B5/72 Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7235 Details of waveform analysis
    • A61B5/7264 Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
    • A61B5/7267 Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems involving training the classification device

Definitions

  • The invention relates to the technical field of non-contact vital sign monitoring and image processing, and more specifically to a camera-based non-contact heart rate and body temperature measurement method.
  • Heart rate is one of the important physiological parameters of human metabolism and functional activity.
  • The most accurate traditional heart rate measurement method is electrocardiography, but it requires electrodes to be attached to the subject's skin. This is relatively complicated and inconvenient, and because it requires direct skin contact its use is limited in scenarios such as measuring the heart rate and body temperature of infants, or of athletes during exercise.
  • For this reason, image PPG (photoplethysmography) technology emerged. Photoplethysmography uses photoelectric means to non-invasively detect changes in blood volume in living tissue: it measures the intensity of light reflected after absorption by the tissue, traces the blood volume pulse (BVP) signal, and computes the heart rate from it. Fu Mingzhe et al. first proposed a non-contact heart rate detection method using an ordinary web camera; the method applies independent component analysis (ICA) to separate the three averaged color traces into three base source signals and estimates the heart rate from the power spectrum of the second source signal, as sketched below.
  • The above methods all require a cooperative subject and sufficient light. Under weak light it is difficult to extract a clean BVP signal; the residual noise severely affects the detection result. Owing to these harsh measurement conditions and the resulting measurement errors, the above methods have not been widely adopted.
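A minimal sketch of the ICA-based webcam method described above, assuming Python with scikit-learn and a precomputed T x 3 array of per-frame mean R, G, B values over the face region (the function name and frequency band are illustrative, not from the patent):

```python
import numpy as np
from sklearn.decomposition import FastICA

def hr_from_rgb_traces(rgb_traces: np.ndarray, fps: float) -> float:
    """Estimate heart rate from mean RGB traces of the face region.

    ICA separates the three color traces into three base source signals;
    the heart rate is read off the power-spectrum peak of the second one.
    """
    sources = FastICA(n_components=3, random_state=0).fit_transform(rgb_traces)
    s = sources[:, 1] - sources[:, 1].mean()       # second base source signal
    power = np.abs(np.fft.rfft(s)) ** 2            # power spectrum
    freqs = np.fft.rfftfreq(len(s), d=1.0 / fps)
    band = (freqs >= 0.7) & (freqs <= 4.0)         # plausible 42-240 bpm band
    peak_hz = freqs[band][np.argmax(power[band])]
    return peak_hz * 60.0                          # convert Hz to bpm
```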
  • In view of this, the present invention provides a camera-based non-contact heart rate and body temperature measurement method. The measurement of heart rate and body temperature is only weakly affected by lighting, places low demands on the measurement conditions, and yields more accurate results, solving the problems of harsh measurement conditions and large measurement errors that affect existing non-contact heart rate measurement methods.
  • A camera-based non-contact heart rate and body temperature measurement method includes:
  • S1: Under ordinary visible light, collect video images of the subject's facial region with a camera, and perform color correction on the collected video images;
  • S2: Perform face recognition on each color-corrected video frame, and crop the face contour image from the recognized face region;
  • S3: Apply deep learning to the face contour images cropped from a continuous video segment, and derive the electrocardiogram (ECG) curve;
  • S4: Remove baseline drift from the obtained ECG curve, enhance the R waves, and compute the subject's heart rate from the number of R waves appearing per minute;
  • S5: Calculate the subject's body temperature from the relationship between the normal human heart rate baseline and the obtained heart rate value.
  • The beneficial effect of the present invention is that color-correcting the original video images removes the interference caused by lighting and minimizes the influence of light intensity on the measurement results. The method obtains the ECG curve with deep learning: there is no need to locate key facial parts; the face contour image is simply fed into the trained model and the ECG curve is produced, so the whole measurement process is simple and convenient. On the basis of the measured heart rate, the heart rate value can further be used to compute the subject's body temperature. The method as a whole not only greatly improves measurement accuracy but is also more complete in function and better meets practical heart rate and body temperature measurement needs.
  • In step S1, the color correction of the collected video images specifically includes:
  • S101: Establish an achromatic model, assuming that the average image is achromatic;
  • S102: Obtain the RGB value of each video frame and substitute it into the achromatic model for correction, where the scale factor k depends on the maximum pixel value $V = 2^N - 1$, $0 < N < 225$.
  • To avoid the influence of illumination changes in the measurement environment on the measurement results, the method removes the effect of illumination variation by converting the RGB value of every pixel in the image, as sketched below.
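A minimal sketch of per-frame achromatic (gray-world) correction. The patent's exact expression for k is given only as a formula image, so the choice below, k as the ratio of the mean gray level to each channel's mean, is an assumption:

```python
import numpy as np

def gray_world_correct(frame: np.ndarray) -> np.ndarray:
    """Color-correct one RGB video frame under the achromatic assumption.

    frame: H x W x 3 uint8 array. Assumed scale factor: k_c = mean(gray)
    / mean(channel c), which pulls the average image toward achromatic gray.
    """
    f = frame.astype(np.float64)
    channel_means = f.reshape(-1, 3).mean(axis=0)    # mean R, G, B
    gray_mean = channel_means.mean()                 # achromatic target level
    k = gray_mean / np.maximum(channel_means, 1e-6)  # per-channel scale factor
    return np.clip(f * k, 0, 255).astype(np.uint8)
```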
  • Step S2 specifically includes:
  • S201: Separately construct a SegNet semantic segmentation model, a U-net semantic segmentation model, and a semantic segmentation model coupling Faster-RCNN with digital matting;
  • S202: Use the three constructed models to perform face recognition and semantic segmentation on each color-corrected video frame, obtaining three sets of recognition results;
  • S203: Take a weighted average of the three sets of recognition results to obtain the final face contour image.
  • The beneficial effect of this technical solution is that the face contour image is obtained by a weighted average of three segmentation models; compared with obtaining the face contour directly by edge detection, the contour obtained by the present invention is closer to the actual face shape. A sketch of the fusion step follows.
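A minimal sketch of step S203, assuming each model outputs a per-pixel face-probability map; the weights are placeholders for the learned weights described later, and the threshold echoes the 80% example mentioned below:

```python
import numpy as np

def fuse_face_masks(p_segnet, p_unet, p_frcnn_matting,
                    weights=(0.4, 0.3, 0.3), threshold=0.8):
    """Fuse three H x W face-probability maps by weighted average.

    weights: placeholder values; the patent learns them from each model's
    historical recognition accuracy. A pixel counts as face when the fused
    probability exceeds the threshold (e.g. 80%).
    """
    w = np.asarray(weights, dtype=np.float64)
    w = w / w.sum()                                  # normalize the weights
    fused = w[0] * p_segnet + w[1] * p_unet + w[2] * p_frcnn_matting
    return fused > threshold                         # boolean face mask
```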
  • Step S3 specifically includes:
  • S301: Construct a feature fusion residual network; select, as training data, the ECG recordings of multiple testers wearing ECG acquisition equipment together with the face contour images obtained via step S2 from video shot at the same time, and train the feature fusion residual network on these pairs to obtain the ECG detection model;
  • S302: Input the face contour images from a video segment obtained in step S2 into the ECG detection model, and output the ECG curve.
  • The benefit of this further scheme is that the feature fusion residual network is trained on multiple groups of paired data to obtain an ECG detection model whose input is a continuous sequence of face contour images and whose output is an ECG curve. During acquisition of the ECG curve, no key parts need to be extracted from the face contour; the ECG curve is obtained directly from the face contour images.
  • Step S5 specifically includes:
  • S501: Construct a deep learning network; select multiple groups of paired heart rate and body temperature data from different experimenters under the same conditions, and train the deep learning network to obtain a heart rate to body temperature conversion model;
  • S502: Input the measured heart rate value of the subject into the conversion model, and output the subject's body temperature value.
  • By training the deep learning network on multiple sets of paired heart rate and body temperature data, the heart rate to body temperature conversion relationship is obtained; the subject's heart rate is then fed into the model and the corresponding body temperature value is output, realizing the temperature measurement. A sketch of such a conversion model follows.
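The patent does not specify the conversion network's architecture; a minimal sketch, assuming a small PyTorch MLP regressor mapping a heart rate value (bpm) to a body temperature value, with illustrative layer sizes:

```python
import torch
import torch.nn as nn

# Assumed architecture: a tiny MLP regressor; layer sizes are illustrative.
model = nn.Sequential(
    nn.Linear(1, 16), nn.ReLU(),
    nn.Linear(16, 16), nn.ReLU(),
    nn.Linear(16, 1),
)

def train(model, hr, temp, epochs=500, lr=1e-3):
    """hr, temp: 1-D float tensors of paired heart-rate/temperature data."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    x, y = hr.unsqueeze(1), temp.unsqueeze(1)
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(x), y)   # mean-squared regression error
        loss.backward()
        opt.step()
    return model
```

After training, `model(torch.tensor([[72.0]]))` would return the temperature predicted for a 72 bpm heart rate.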
  • Fig. 1 is a schematic flowchart of the camera-based non-contact heart rate and body temperature measurement method provided by the present invention.
  • Fig. 2 is a schematic flowchart of the process of performing color correction on the collected video images in an embodiment of the present invention.
  • Fig. 3 is a schematic flowchart of the process of acquiring the face contour image in an embodiment of the present invention.
  • Fig. 4 is a schematic diagram of the SegNet network structure in an embodiment of the present invention.
  • Fig. 5 is a schematic diagram of the U-Net network structure in an embodiment of the present invention.
  • Fig. 6 is a schematic flowchart of the process of obtaining the ECG curve in an embodiment of the present invention.
  • Fig. 7 is a schematic diagram of the feature fusion residual network structure in an embodiment of the present invention.
  • Fig. 8 is a schematic diagram of the EDSR and WDSR network structures in an embodiment of the present invention.
  • Fig. 9 is a schematic diagram of the convolution kernel sizes used in RSDB and in WDSR in an embodiment of the present invention.
  • Fig. 10 is a schematic flowchart of the process of calculating the subject's body temperature from the relationship between the normal human heart rate baseline and the obtained heart rate value in an embodiment of the present invention.
  • Referring to Fig. 1, an embodiment of the present invention discloses a camera-based non-contact heart rate and body temperature measurement method, which includes:
  • S1: Under ordinary visible light, collect video images of the subject's facial region with a camera, and perform color correction on the collected video images;
  • S2: Perform face recognition on each color-corrected video frame, and crop the face contour image from the recognized face region;
  • S3: Apply deep learning to the face contour images cropped from a continuous video segment, and derive the ECG curve;
  • S4: Remove baseline drift from the obtained ECG curve, enhance the R waves, and compute the subject's heart rate from the number of R waves appearing per minute;
  • S5: Calculate the subject's body temperature from the relationship between the normal human heart rate baseline and the obtained heart rate value.
  • Referring to Fig. 2, in a specific embodiment the color correction of the collected video images in step S1 specifically includes:
  • S101: Establish an achromatic model, assuming that the average image is achromatic;
  • S102: Obtain the RGB value of each video frame and substitute it into the achromatic model for correction, where the scale factor k depends on the maximum pixel value $V = 2^N - 1$, $0 < N < 225$.
  • The method of this embodiment removes the effect of illumination variation by converting the RGB value of every pixel in the image.
  • Referring to Fig. 3, step S2 specifically includes:
  • S201: Separately construct a SegNet semantic segmentation model, a U-net semantic segmentation model, and a semantic segmentation model coupling Faster-RCNN with digital matting;
  • S202: Use the three constructed models to perform face recognition and semantic segmentation on each color-corrected video frame, obtaining three sets of recognition results;
  • S203: Take a weighted average of the three sets of recognition results to obtain the final face contour image. The three segmentation models are described below.
  • SegNet is an open-source deep network for image semantic segmentation proposed at Cambridge, based on the caffe framework. SegNet is a semantic segmentation network derived from FCN by modifying the VGG-16 network. The network structure is clear and easy to understand, training is fast, and there are few pitfalls. SegNet has an encoder-decoder structure; when SegNet performs semantic segmentation, a CRF module is usually appended for post-processing, to further refine the edge segmentation results.
  • The novelty of SegNet lies in the way the decoder upsamples its lower-resolution input feature maps. Specifically, the decoder performs non-linear upsampling using the pooling indices computed in the max pooling step of the corresponding encoder, which removes the need to learn upsampling. The upsampled feature maps are sparse, so trainable convolution kernels are then applied to generate dense feature maps. SegNet thus uses unpooling in the decoder to upsample feature maps and preserves the integrity of high-frequency detail in the segmentation. The encoder uses no fully connected layers (it convolves throughout, like FCN), so it is a lightweight network with few parameters. The indices of every max pooling layer in the encoder are stored and later used in the decoder to unpool the corresponding feature maps; this helps preserve high-frequency information, but neighboring information is ignored when unpooling low-resolution feature maps.
  • The SegNet network structure is shown in Fig. 4, and the index-based upsampling is sketched below.
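A minimal sketch of SegNet-style index-based upsampling, assuming PyTorch (the patent itself mentions only the caffe framework); the encoder's max-pool indices are reused by the decoder, and a trainable convolution densifies the sparse result:

```python
import torch
import torch.nn as nn

pool = nn.MaxPool2d(kernel_size=2, stride=2, return_indices=True)
unpool = nn.MaxUnpool2d(kernel_size=2, stride=2)
densify = nn.Conv2d(64, 64, kernel_size=3, padding=1)  # trainable kernel

x = torch.randn(1, 64, 32, 32)    # an encoder feature map
pooled, indices = pool(x)         # encoder: downsample, remember max indices
sparse = unpool(pooled, indices)  # decoder: non-learned, sparse upsampling
dense = densify(sparse)           # convolution yields a dense feature map
```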
  • The SegNet semantic segmentation model comprises convolutional layers, batch normalization layers, activation layers, pooling layers, upsampling layers, and a Softmax layer. The convolutional and activation layers are the same as in a patch-based CNN classification model; the pooling and upsampling layers compensate for information loss, and the Softmax function is used for classification.
  • The Batch Normalization (BN) operation, through transformation and reconstruction, speeds up model convergence, greatly accelerates training, and improves the generalization ability of the network, suppressing overfitting. Applied before the activation function, it standardizes the output of the previous layer so that each output dimension has zero mean and unit variance.
  • The essence of pooling is sampling: it compresses the input feature map, shrinking it and simplifying the network's computational complexity, and it adapts better to small pixel offsets, making the network more robust. A common pooling operation is max pooling, which takes the maximum value within each region. Upsampling is the inverse of pooling: using the index positions recorded in the pooling layer, the feature map data are put back into the positions they occupied during pooling, and the remaining positions are filled with the value 0.
  • The U-Net network structure is very simple: the first half performs feature extraction and the second half performs upsampling, so it too can be called an encoder-decoder structure. Because the overall structure of the network resembles the capital letter U, it is named U-Net.
  • U-Net differs from other common segmentation networks in one important way: it uses a completely different feature fusion method, concatenation. U-Net concatenates features along the channel dimension to form thicker features, whereas the pointwise addition used in FCN fusion does not form thicker features.
  • By its structure, U-Net can combine low-level and high-level information. Low-level (deep) information is the low-resolution information obtained after multiple downsampling steps; it provides contextual semantic information about the segmentation target within the whole image and can be understood as features reflecting the relationship between the target and its environment. Such features help determine the object's category (which is why classification problems usually need only low-resolution, deep information and involve no multi-scale fusion). High-level (shallow) information is the high-resolution information passed directly from the encoder to the decoder at the same level via the concatenate operation; it provides finer features for segmentation, such as gradients.
  • U-Net has many advantages. Its most distinctive feature is that it can train a good model even on a small data set, which shortens the training-sample labeling process for this task; moreover, U-Net also trains very fast.
  • The U-Net network structure is shown in Fig. 5. As the figure shows, the original U-Net contains 18 3×3 convolutional layers, 1 1×1 convolutional layer, 4 2×2 downsampling layers, and 4 2×2 upsampling layers, using ReLU as the activation function.
  • In general, pooling loses the high-frequency components of an image, produces dull, blurred image blocks, and discards position information. To recover the structural features of the original image, U-Net uses 4 skip connections to link the low-level and high-level feature maps.
  • U-Net is in fact a fully convolutional neural network: the input and output are both images, and the fully connected layers are omitted. The shallower layers solve the pixel localization problem, and the deeper layers solve the pixel classification problem. Following the standard convolutional network framework, the transformation proceeds layer by layer; the last layer of the structure is a prediction output map of the same size as the original image, in which each pixel is an integer value representing a category.
  • Compared with the original U-Net structure, the network adopted in this embodiment has more convolutional layers and performs batch normalization before each convolutional and deconvolutional layer; max pooling is adopted, and the activation function is ELU. The consecutive operation "batch normalization + convolution/deconvolution + ELU activation" in the network is called one "super convolution"; the entire network is in fact composed of a series of super convolutions, pooling, connection, and a final pixel-level classification operation.
  • In the convolution operations the filter size is 3×3×64, with unit stride and zero padding; in the deconvolution operations the filter size is 2×2×64, the output size is twice the input size, the stride is 2, with zero padding; in the pooling operations the filter size is 2×2 and the stride is also 2. The weights of all filters are initialized with random values drawn from a truncated Gaussian distribution with zero mean and variance 0.1, and all biases are initialized to 0.1. Notably, in the original U-Net the filter depth increases layer by layer from 64 to 1024, whereas the network disclosed in this embodiment sets the filter depth uniformly to 64; with the original U-Net filter depths the network does not converge easily and the segmentation accuracy is low.
  • The improved network of this embodiment has the following advantage: the data set contains few categories and few features to be recognized, and the information lost in the network's pooling operations can be recovered through "deconvolution" and "skip connections". One super-convolution unit is sketched below.
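A minimal sketch of one "super convolution" unit under the stated hyperparameters, assuming PyTorch (the framework for this network is not specified in the text):

```python
import torch.nn as nn

def super_conv(in_ch: int = 64, out_ch: int = 64) -> nn.Sequential:
    """Batch normalization + 3x3 convolution (depth 64, unit stride,
    zero padding) + ELU activation, as described in the text."""
    conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=1, padding=1)
    # truncated-Gaussian init: zero mean, variance 0.1 (std = sqrt(0.1))
    nn.init.trunc_normal_(conv.weight, mean=0.0, std=0.1 ** 0.5)
    nn.init.constant_(conv.bias, 0.1)   # all biases initialized to 0.1
    return nn.Sequential(nn.BatchNorm2d(in_ch), conv, nn.ELU())
```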
  • The third model couples Faster-RCNN with digital matting; its construction is as follows. The feature extraction network is based on the ZF network, a region proposal network generates the region proposal boxes, and the Faster-RCNN network serves as the detection framework.
  • The RCNN approach can be divided into four steps: candidate region generation, feature extraction, proposal region classification, and coordinate regression. RCNN uses Selective Search to generate candidate regions, then a convolutional network for feature extraction; the extracted features are finally classified by an SVM, and the locations are refined by a regression network. In Fast RCNN, feature extraction, the SVM, and the regression network are combined into a single convolutional neural network, which greatly improves the running speed, but convolutional feature extraction is still required for every candidate region, causing a great deal of repeated computation. In Faster RCNN, candidate region generation is also completed by a convolutional network, and the feature extraction network of the candidate-region-generation part is merged with the feature extraction network of the classification part. Faster RCNN uses ROI pooling to map the generated candidate regions onto the last feature layer, eliminating much repeated computation. In terms of network structure, Faster RCNN can be regarded as the combination of an RPN network and a Fast RCNN network.
  • This scheme uses the standard Faster RCNN multi-task loss (the formula appears only as an image in the original; the standard form is)

$$L(\{p_i\},\{t_i\}) = \frac{1}{N_{cls}}\sum_i L_{cls}(p_i, p_i^*) + \lambda\,\frac{1}{N_{reg}}\sum_i p_i^*\,L_{reg}(t_i, t_i^*)$$

where $i$ is the index of a proposal box, $p_i$ is the probability that the proposal box contains the target (here, a face), $p_i^*$ is the ground-truth label, and $t_i$, $t_i^*$ are the predicted and ground-truth box coordinates. The proposal box regression loss is $L_{reg}(t_i, t_i^*) = R(t_i - t_i^*)$, where R is the robust smooth L1 loss:

$$\mathrm{smooth}_{L1}(x) = \begin{cases} 0.5x^2, & |x| < 1 \\ |x| - 0.5, & \text{otherwise.} \end{cases}$$
  • For the digital matting part, a cost function is introduced based on the local smoothness of the foreground and background brightness; from it the foreground and background brightness can be eliminated to obtain a quadratic cost function. The principle is as follows.
  • The image matting algorithm takes image I as input; the brightness of the i-th pixel is a combination of the corresponding foreground and background brightness:

$$I_i = \alpha_i F_i + (1-\alpha_i)\,B_i$$

where $\alpha_i$ is the foreground opacity of the pixel.
  • The cost function obtained from the local smoothness of the foreground brightness F and the background brightness B is, in the closed-form matting notation used below (with local linear coefficients a, b and windows W_j),

$$J(\alpha, a, b) = \sum_{j}\Big(\sum_{i\in W_j}\big(\alpha_i - a_j I_i - b_j\big)^2 + \epsilon\, a_j^2\Big).$$
  • From this cost function the foreground brightness F and background brightness B can be eliminated to yield a quadratic cost function in α alone, and the global optimum of this quadratic cost can be obtained by solving sparse linear equations. This embodiment therefore only needs to compute α directly, without estimating the foreground brightness F or background brightness B; little user input is required, which reduces the amount of computation to a certain extent and finally yields high-quality mattes. The closed-form formulation also allows the resulting mattes to be understood through the eigenvectors of the sparse matrix and the properties of the solution scheme.
  • The foreground brightness F and background brightness B are approximately constant over a small window around each pixel, so F and B are assumed to be locally smooth in this embodiment. W_j is a small window around pixel j, and this embodiment applies a regularization to a in the cost function.
  • A window of 3×3 pixels is used to implement the above operation: a window is placed around each pixel so that the windows W_j in the cost function overlap, ensuring that the information of adjacent pixels overlaps and that a high-quality alpha matte is finally obtained. Of course, the pixel window is not limited or fixed in this embodiment and can be chosen according to the actual situation.
  • The cost function is quadratic in α, a, and b, so for an image with N pixels there are 3N unknowns in total; to obtain a quadratic cost function with only N unknowns, namely the alpha values of the pixels, this embodiment eliminates a and b in the following manner.
  • Through the deep-learning-based region localization, only the position of the face element is located; s denotes the set of brush (scribble) pixels and s_i the value specified by the brush, which serve as constraints for the extraction of α.
  • A 3×3 window is used to define the matting Laplacian matrix. When the distributions of the foreground brightness F and background brightness B in the current scene are not very complicated, a wider window can be used. To keep the computation time of a wider window down, this embodiment exploits the linear coefficients of the α matte channel of image I: the coefficients obtained with a wider window at a finer resolution are similar to those obtained with a smaller window on a coarse image, so the linear coefficients are computed on the coarse image, interpolated, and applied to the finer-resolution image. The resulting alpha matte channel is similar to the one obtained by directly solving the matting system on the finer image with a wider window; in this way the alpha values are obtained and high-quality mattes result. The final solve is sketched below.
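A minimal sketch of the final α solve, assuming SciPy and a precomputed matting Laplacian L (its 3×3-window construction is omitted); the constrained sparse linear system below is the standard closed-form-matting formulation, with an assumed constraint weight lam:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def solve_alpha(L: sp.csr_matrix, brush_mask: np.ndarray,
                brush_vals: np.ndarray, lam: float = 100.0) -> np.ndarray:
    """Globally optimal alpha matte from the quadratic cost.

    Solves (L + lam * D_s) alpha = lam * s, where D_s is diagonal with 1
    on brushed pixels (the set s in the text) and s holds the brushed
    alpha values; lam is an assumed constraint weight.
    """
    d = brush_mask.astype(np.float64).ravel()   # 1 on brush pixels, else 0
    s = brush_vals.astype(np.float64).ravel()   # values pointed to by the brush
    A = (L + lam * sp.diags(d)).tocsc()
    alpha = spla.spsolve(A, lam * d * s)        # sparse linear system solve
    return np.clip(alpha, 0.0, 1.0)
```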
  • The present invention uses three segmentation models, each trained and run for prediction with different parameters, so many predicted segmentation maps are obtained; the recognition result of each of the three intelligent recognition models is the probability that each pixel belongs to a face.
  • An artificial-intelligence-based method determines the weight coefficients of the three models' recognition results: the historical recognition accuracy of the three recognition models is learned and trained on, the weight of each recognition model is obtained through this intelligent training, and the per-pixel face probability is finally obtained by weighted average; a pixel is accepted when this probability exceeds a certain threshold, such as 80%.
  • Referring to Fig. 6, step S3 specifically includes:
  • S301: Construct a feature fusion residual network; select, as training data, the ECG recordings of multiple testers wearing ECG acquisition equipment together with the face contour images obtained via step S2 from video shot at the same time, and train the feature fusion residual network on these pairs to obtain the ECG detection model;
  • S302: Input the face contour images from a video segment obtained in step S2 into the ECG detection model, and output the ECG curve.
  • The feature fusion residual network (FFRN) is built from RSDB building modules: a skip connection links the local feature fusion layers of two adjacent building modules, and the feature fusion result of the former module serves as the input of the latter. The locally fused features are then stacked, and residual learning integrates the feature information to form the basic network architecture.
  • Both EDSR and WDSR use upsampling (pixel shuffle) at the end of the network; this reduces computation without loss of model capacity and greatly increases running speed, and the new pixel-shuffle upsampling adopted by WDSR has little effect on network accuracy. However, zooming an image cannot add information, so image quality inevitably decreases and the feature information is affected as well. The medical image correction task here predicts dense pixels and is very sensitive to the amount of feature information, so the FFRN network abandons upsampling: the image size remains unchanged throughout the network for end-to-end learning.
  • WDSR-B increases the number of convolution kernels before the ReLU activation layer and reduces the number after it. Whereas WDSR-A uses 3×3 convolution kernels before and after the activation layer, WDSR-B uses 1×1 convolution kernels around the ReLU activation and further expands the number of channels before the activation to obtain a wider feature map, as shown in Fig. 9. WDSR-B trains a deep neural network with this residual block (RB) as the network building module; a sketch of such a block follows.
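A minimal sketch of a WDSR-B style wide-activation residual block, assuming PyTorch; the expansion factor and channel count are illustrative, not taken from the patent:

```python
import torch.nn as nn

class WDSRBBlock(nn.Module):
    """1x1 conv widens channels before ReLU, 1x1 conv shrinks them after,
    then a 3x3 conv; a residual connection wraps the whole body."""

    def __init__(self, channels: int = 64, expand: int = 4):
        super().__init__()
        wide = channels * expand
        self.body = nn.Sequential(
            nn.Conv2d(channels, wide, kernel_size=1),   # widen pre-activation
            nn.ReLU(inplace=True),
            nn.Conv2d(wide, channels, kernel_size=1),   # shrink post-activation
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)   # residual learning
```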
  • The heart rate of the subject can then be calculated by counting the number of R waves appearing per minute in the detected ECG curve, as sketched below.
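A minimal sketch of step S4, assuming SciPy; the moving-average detrend and the peak-picking parameters stand in for the patent's unspecified baseline-drift removal and R-wave enhancement steps:

```python
import numpy as np
from scipy.signal import find_peaks

def heart_rate_from_ecg(ecg: np.ndarray, fs: float) -> float:
    """Heart rate (bpm) from an ECG curve by counting R waves.

    ecg: 1-D samples of the ECG curve; fs: sampling rate in Hz.
    """
    win = max(1, int(0.5 * fs))                 # 0.5 s moving-average window
    baseline = np.convolve(ecg, np.ones(win) / win, mode="same")
    detrended = ecg - baseline                  # crude baseline-drift removal
    peaks, _ = find_peaks(detrended,
                          distance=int(0.4 * fs),            # <= 150 bpm
                          prominence=0.5 * detrended.std())  # R-wave salience
    minutes = len(ecg) / fs / 60.0
    return len(peaks) / minutes                 # R waves per minute
```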
  • Referring to Fig. 10, step S5 specifically includes:
  • S501: Build a deep learning network; select multiple groups of paired heart rate and body temperature data from different experimenters under the same conditions, and train the deep learning network to obtain a heart rate to body temperature conversion model;
  • S502: Input the measured heart rate value of the subject into the conversion model, and output the subject's body temperature value.
  • By training the deep learning network on multiple sets of paired heart rate and body temperature data, the heart rate to body temperature conversion relationship is obtained; the subject's heart rate is then fed into the model and the corresponding body temperature value is output, realizing the temperature measurement.
  • Alternatively, the body temperature can be estimated from an existing empirical correspondence between heart rate and body temperature, i.e., the body temperature value is estimated directly from the heart rate value (the concrete relation appears only as a formula image in the original).
  • In summary, the method provided by the embodiments of the present invention has the following advantages: the ECG curve is obtained with deep learning, without locating key facial parts, simply by feeding the face contour image into the trained model, so the whole measurement process is simple and convenient; the heart rate value can further be used to compute the subject's body temperature; and the method as a whole not only greatly improves measurement accuracy but is also more complete in function and better meets practical heart rate and body temperature measurement needs.

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Veterinary Medicine (AREA)
  • Public Health (AREA)
  • Animal Behavior & Ethology (AREA)
  • Surgery (AREA)
  • Molecular Biology (AREA)
  • Medical Informatics (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Pathology (AREA)
  • Biomedical Technology (AREA)
  • Cardiology (AREA)
  • Artificial Intelligence (AREA)
  • Physiology (AREA)
  • Evolutionary Computation (AREA)
  • Signal Processing (AREA)
  • Psychiatry (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Fuzzy Systems (AREA)
  • Pulmonology (AREA)
  • Image Analysis (AREA)

Abstract

A camera-based non-contact heart rate and body temperature measurement method. By color-correcting the original video images and eliminating interference caused by lighting, the method minimizes the impact of light intensity on the measurement results. An electrocardiogram curve is obtained using a deep learning method: there is no need to locate key parts of the face, and the curve is obtained simply by inputting the face contour into a trained model, so the whole measurement process is simple and convenient. Once the heart rate is measured, the subject's body temperature can further be calculated from the heart rate value. The method as a whole not only greatly improves measurement accuracy but is also more complete in function and better meets practical heart rate and body temperature measurement requirements.

Description

A camera-based non-contact heart rate and body temperature measurement method

Technical field

The invention relates to the technical field of non-contact vital sign monitoring and image processing, and more specifically to a camera-based non-contact heart rate and body temperature measurement method.

Background art

At present, as the incidence of cardiovascular and cerebrovascular diseases continues to rise, people's health awareness has gradually increased, and so has their awareness of monitoring vital sign parameters such as heart rate and body temperature. Heart rate is one of the important physiological parameters of human metabolism and functional activity. The most accurate traditional heart rate measurement method is electrocardiography, but it requires electrodes to be attached to the subject's skin; this is relatively complicated and inconvenient, and because it requires direct skin contact its use is limited in scenarios such as measuring the heart rate and body temperature of infants, or of athletes during exercise.

For this reason, image PPG (photoplethysmography) technology emerged. Photoplethysmography uses photoelectric means to non-invasively detect changes in blood volume in living tissue: it measures the intensity of light reflected after absorption by the tissue, traces the blood volume pulse (BVP) signal, and computes the heart rate from it. Fu Mingzhe et al. first proposed a non-contact heart rate detection method using an ordinary web camera; the method applies independent component analysis (ICA) to separate the three averaged color traces into three base source signals and estimates the heart rate from the power spectrum of the second source signal. The above methods all require a cooperative subject and sufficient light; under weak light it is difficult to extract a clean BVP signal, and the residual noise severely affects the detection result. Owing to these harsh measurement conditions and the resulting measurement errors, the above methods have not been widely adopted.

Therefore, how to provide a practical, highly accurate, stable and reliable non-contact heart rate and body temperature measurement method is an urgent problem for those skilled in the art.
Summary of the invention

In view of this, the present invention provides a camera-based non-contact heart rate and body temperature measurement method. The measurement of heart rate and body temperature is only weakly affected by lighting, places low demands on the measurement conditions, and yields more accurate results, solving the problems of harsh measurement conditions and large measurement errors that affect existing non-contact heart rate measurement methods.

In order to achieve the above objectives, the present invention adopts the following technical solutions:

A camera-based non-contact heart rate and body temperature measurement method, the method comprising:

S1: Under ordinary visible light, collect video images of the subject's facial region with a camera, and perform color correction on the collected video images;

S2: Perform face recognition on each color-corrected video frame, and crop the face contour image from the recognized face region;

S3: Apply deep learning to the face contour images cropped from a continuous video segment, and derive the ECG curve;

S4: Remove baseline drift from the obtained ECG curve, enhance the R waves, and compute the subject's heart rate from the number of R waves appearing per minute;

S5: Calculate the subject's body temperature from the relationship between the normal human heart rate baseline and the obtained heart rate value.

The beneficial effect of the present invention is that color-correcting the original video images removes the interference caused by lighting and minimizes the influence of light intensity on the measurement results. The method obtains the ECG curve with deep learning: there is no need to locate key facial parts; the face contour image is simply fed into the trained model and the ECG curve is produced, so the whole measurement process is simple and convenient. On the basis of the measured heart rate, the heart rate value can further be used to compute the subject's body temperature. The method as a whole not only greatly improves measurement accuracy but is also more complete in function and better meets practical heart rate and body temperature measurement needs.
Further, in step S1, the color correction of the collected video images specifically includes:

S101: Establish an achromatic model, assuming that the average image is achromatic;

S102: Obtain the RGB value of each video frame, and substitute the RGB value of each frame into the achromatic model to perform color correction.

Further, the achromatic model scales each color component by a factor k (the exact formulas appear only as images PCTCN2020103087-appb-000001 to -000003 in the original), where k is the scale factor and $V = 2^N - 1$, $0 < N < 225$.

To avoid the influence of illumination changes in the measurement environment on the measurement results, the method of the present invention removes the effect of illumination variation by converting the RGB value of every pixel in the image.
Further, step S2 specifically includes:

S201: Separately construct a SegNet semantic segmentation model, a U-net semantic segmentation model, and a semantic segmentation model coupling Faster-RCNN with digital matting;

S202: Use the three constructed models to perform face recognition and semantic segmentation on each color-corrected video frame, obtaining three sets of recognition results;

S203: Take a weighted average of the three sets of recognition results to obtain the final face contour image.

The beneficial effect of this technical solution is that the face contour image is obtained by a weighted average of three segmentation models; compared with obtaining the face contour directly by edge detection, the contour obtained by the present invention is closer to the actual face shape.
Further, step S3 specifically includes:

S301: Construct a feature fusion residual network; select, as training data, the ECG recordings of multiple testers wearing ECG acquisition equipment together with the face contour images obtained via step S2 from video shot at the same time, and train the feature fusion residual network on these pairs to obtain the ECG detection model;

S302: Input the face contour images from a video segment obtained in step S2 into the ECG detection model, and output the ECG curve.

The benefit of this further scheme is that the feature fusion residual network is trained on multiple groups of paired data to obtain an ECG detection model whose input is a continuous sequence of face contour images and whose output is an ECG curve; during acquisition of the ECG curve, no key parts need to be extracted from the face contour, and the ECG curve is obtained directly from the face contour images.
Further, step S5 specifically includes:

S501: Construct a deep learning network; select multiple groups of paired heart rate and body temperature data from different experimenters under the same conditions, and train the deep learning network to obtain a heart rate to body temperature conversion model;

S502: Input the measured heart rate value of the subject into the conversion model, and output the subject's body temperature value.

By training the deep learning network on multiple sets of paired heart rate and body temperature data, the heart rate to body temperature conversion relationship is obtained; the subject's heart rate is then fed into the model and the corresponding body temperature value is output, realizing the temperature measurement.
Description of the drawings

In order to explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from the provided drawings without creative work.

Fig. 1 is a schematic flowchart of the camera-based non-contact heart rate and body temperature measurement method provided by the present invention;

Fig. 2 is a schematic flowchart of the process of performing color correction on the collected video images in an embodiment of the present invention;

Fig. 3 is a schematic flowchart of the process of acquiring the face contour image in an embodiment of the present invention;

Fig. 4 is a schematic diagram of the SegNet network structure in an embodiment of the present invention;

Fig. 5 is a schematic diagram of the U-Net network structure in an embodiment of the present invention;

Fig. 6 is a schematic flowchart of the process of obtaining the ECG curve in an embodiment of the present invention;

Fig. 7 is a schematic diagram of the feature fusion residual network structure in an embodiment of the present invention;

Fig. 8 is a schematic diagram of the EDSR and WDSR network structures in an embodiment of the present invention;

Fig. 9 is a schematic diagram of the convolution kernel sizes used in RSDB and in WDSR in an embodiment of the present invention;

Fig. 10 is a schematic flowchart of the process of calculating the subject's body temperature from the relationship between the normal human heart rate baseline and the obtained heart rate value in an embodiment of the present invention.
Detailed description

The technical solutions in the embodiments of the present invention will be described clearly and completely below in conjunction with the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work fall within the protection scope of the present invention.
Referring to Fig. 1, an embodiment of the present invention discloses a camera-based non-contact heart rate and body temperature measurement method, which includes:

S1: Under ordinary visible light, collect video images of the subject's facial region with a camera, and perform color correction on the collected video images;

S2: Perform face recognition on each color-corrected video frame, and crop the face contour image from the recognized face region;

S3: Apply deep learning to the face contour images cropped from a continuous video segment, and derive the ECG curve;

S4: Remove baseline drift from the obtained ECG curve, enhance the R waves, and compute the subject's heart rate from the number of R waves appearing per minute;

S5: Calculate the subject's body temperature from the relationship between the normal human heart rate baseline and the obtained heart rate value.

In a specific embodiment, referring to Fig. 2, the color correction of the collected video images in step S1 specifically includes:

S101: Establish an achromatic model, assuming that the average image is achromatic;

S102: Obtain the RGB value of each video frame, and substitute the RGB value of each frame into the achromatic model to perform color correction.
In a specific embodiment, the achromatic model scales each color component by a factor k (the exact formulas appear only as images PCTCN2020103087-appb-000004 to -000006 in the original), where k is the scale factor and $V = 2^N - 1$, $0 < N < 225$. To avoid the influence of illumination changes in the measurement environment on the measurement results, the method of this embodiment removes the effect of illumination variation by converting the RGB value of every pixel in the image.
In a specific embodiment, referring to Fig. 3, step S2 specifically includes:

S201: Separately construct a SegNet semantic segmentation model, a U-net semantic segmentation model, and a semantic segmentation model coupling Faster-RCNN with digital matting;

S202: Use the three constructed models to perform face recognition and semantic segmentation on each color-corrected video frame, obtaining three sets of recognition results;

S203: Take a weighted average of the three sets of recognition results to obtain the final face contour image.

The three segmentation models are described below.

(1) SegNet semantic segmentation model
SegNet is an open-source deep network for image semantic segmentation proposed at Cambridge, based on the caffe framework. SegNet is a semantic segmentation network derived from FCN by modifying the VGG-16 network. The network structure is clear and easy to understand, training is fast, and there are few pitfalls. SegNet has an encoder-decoder structure; when SegNet performs semantic segmentation, a CRF module is usually appended for post-processing, to further refine the edge segmentation results.

The novelty of SegNet lies in the way the decoder upsamples its lower-resolution input feature maps. Specifically, the decoder performs non-linear upsampling using the pooling indices computed in the max pooling step of the corresponding encoder, which removes the need to learn upsampling. The upsampled feature maps are sparse, so trainable convolution kernels are then applied to generate dense feature maps. SegNet thus uses unpooling in the decoder to upsample feature maps and preserves the integrity of high-frequency detail in the segmentation. The encoder uses no fully connected layers (it convolves throughout, like FCN), so it is a lightweight network with few parameters. The indices of every max pooling layer in the encoder are stored and later used in the decoder to unpool the corresponding feature maps; this helps preserve high-frequency information, but neighboring information is ignored when unpooling low-resolution feature maps. The SegNet network structure is shown in Fig. 4.

The SegNet semantic segmentation model comprises convolutional layers, batch normalization layers, activation layers, pooling layers, upsampling layers, and a Softmax layer. The convolutional and activation layers are the same as in a patch-based CNN classification model; the pooling and upsampling layers compensate for information loss, and the Softmax function is used for classification.

The Batch Normalization (BN) operation, through transformation and reconstruction, speeds up model convergence, greatly accelerates training, and improves the generalization ability of the network, suppressing overfitting. Applied before the activation function, it standardizes the output of the previous layer so that each output dimension has zero mean and unit variance.

The essence of pooling is sampling: it compresses the input feature map, shrinking it and simplifying the network's computational complexity, and it adapts better to small pixel offsets, making the network more robust. A common pooling operation is max pooling, which takes the maximum value within each region.

Upsampling is the inverse of pooling: using the index positions recorded in the pooling layer, the feature map data are put back into the positions they occupied during pooling, and the remaining positions are filled with the value 0.
(2) U-net semantic segmentation model

The U-Net network structure is very simple: the first half performs feature extraction and the second half performs upsampling, so it too can be called an encoder-decoder structure. Because the overall structure of the network resembles the capital letter U, it is named U-Net. U-Net differs from other common segmentation networks in one important way: it uses a completely different feature fusion method, concatenation. U-Net concatenates features along the channel dimension to form thicker features, whereas the pointwise addition used in FCN fusion does not form thicker features.

By its structure, U-Net can combine low-level and high-level information. Low-level (deep) information is the low-resolution information obtained after multiple downsampling steps; it provides contextual semantic information about the segmentation target within the whole image and can be understood as features reflecting the relationship between the target and its environment. Such features help determine the object's category (which is why classification problems usually need only low-resolution, deep information and involve no multi-scale fusion). High-level (shallow) information is the high-resolution information passed directly from the encoder to the decoder at the same level via the concatenate operation; it provides finer features for segmentation, such as gradients. U-Net has many advantages. Its most distinctive feature is that it can train a good model even on a small data set, which shortens the training-sample labeling process for this task; moreover, U-Net also trains very fast.
(5) The U-Net network structure is shown in Figure 5. As the figure shows, the original U-Net contains eighteen 3×3 convolutional layers, one 1×1 convolutional layer, four 2×2 downsampling layers and four 2×2 upsampling layers, with ReLU as the activation function. Pooling generally loses the high-frequency components of an image, produces blurred image blocks and discards positional information. To recover the structural features of the original image, U-Net uses four skip connections to join low-level and high-level feature maps. U-Net is in fact a fully convolutional neural network whose input and output are both images, omitting fully connected layers: the shallower layers solve pixel localization, while the deeper layers solve pixel classification.
(6) Following the standard convolutional neural network framework, the input is transformed layer by layer; the last layer of the structure is a prediction map of the same size as the original image, in which each pixel is an integer value representing a category. Compared with the original U-Net, the network structure adopted in this embodiment has more convolutional layers, applies batch normalization before every convolutional and deconvolutional layer, uses max pooling, and uses ELU as the activation function. The consecutive operation "batch normalization + convolution/deconvolution + ELU activation" in the network is called one "hyper-convolution". The whole network is in fact composed of a series of hyper-convolutions, pooling, concatenation and a final pixel-level classification.
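A minimal PyTorch sketch of one such "hyper-convolution" unit as described; the class name and channel arguments are illustrative assumptions, while the kernel sizes follow the configuration given in (7) below:

```python
import torch.nn as nn

class HyperConv(nn.Module):
    """One 'batch normalization + (de)convolution + ELU' unit."""
    def __init__(self, in_ch, out_ch, transposed=False):
        super().__init__()
        self.bn = nn.BatchNorm2d(in_ch)   # BN before the (de)convolution
        if transposed:                    # deconvolution branch: 2x2, stride 2
            self.conv = nn.ConvTranspose2d(in_ch, out_ch, 2, stride=2)
        else:                             # convolution branch: 3x3, unit stride, zero padding
            self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1)
        self.act = nn.ELU()

    def forward(self, x):
        return self.act(self.conv(self.bn(x)))
```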
(7) In the convolution operations the filters are 3×3×64 with unit stride and zero padding; in the deconvolution operations the filters are 2×2×64 with stride 2 and zero padding, so the output is twice the input size; in the pooling operations the filters are 2×2 with stride 2. All filter weights are initialized with random values drawn from a truncated Gaussian distribution with zero mean and variance 0.1; all biases are initialized to 0.1. It is worth noting that in the original U-Net the filter depth increases layer by layer from 64 to 1024, whereas the network disclosed in this embodiment sets the filter depth uniformly to 64: with the original U-Net filter depths the network does not converge easily and segmentation accuracy is low. The improved network of this embodiment has the following advantages (see the initialization sketch after this list):
① Both the number of categories in the dataset and the number of features to be recognized are small, and information lost in the pooling operations can be recovered through "deconvolution" and "skip connections".
② Using a uniform number of filters reduces time and space complexity.
③ Using a deeper network helps improve segmentation accuracy.
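A sketch of the stated initialization scheme, assuming the variance of 0.1 refers to the spread of the truncated normal (PyTorch's `trunc_normal_` takes a standard deviation, so `std=sqrt(0.1)` is used here):

```python
import math
import torch.nn as nn
from torch.nn.init import trunc_normal_, constant_

def init_weights(module):
    """Truncated-Gaussian weights (zero mean, variance 0.1) and 0.1 biases."""
    if isinstance(module, (nn.Conv2d, nn.ConvTranspose2d)):
        trunc_normal_(module.weight, mean=0.0, std=math.sqrt(0.1))
        if module.bias is not None:
            constant_(module.bias, 0.1)

# usage: model.apply(init_weights) walks every submodule once
```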
(3) Segmentation model based on coupling Faster RCNN with interactive digital matting
The model is constructed as follows:
First, face images are acquired; the face bounding-box annotations, images and annotation files are then divided proportionally into a training set and a test set; next, the processed image set is fed into a convolutional neural network for training. During feature extraction by the feature-extraction module, the feature-extraction network is based on the ZF network and a region proposal network is used to generate region proposal boxes; the Faster-RCNN network is used as the detection framework.
Methods of the RCNN family can be divided into four steps: candidate-region generation, feature extraction, proposal-region classification, and coordinate regression. RCNN generates candidate regions with Selective Search, extracts features with a convolutional network, classifies the extracted features with an SVM, and refines the locations with a regression network. Fast RCNN merges feature extraction, the SVM and the regression network into a single convolutional neural network, greatly improving running speed; however, it still performs convolutional feature extraction for every candidate region, causing a large amount of repeated computation. Faster RCNN also generates the candidate regions with a convolutional network and merges the feature-extraction part used for proposal generation with the feature-extraction network of the classification part. In addition, Faster RCNN uses RoI pooling to map the positions of the generated candidate regions onto the last feature layer, avoiding a large amount of repeated computation. In terms of network structure, Faster RCNN can be viewed as the combination of an RPN network and a Fast RCNN network.
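The RoI-pooling step above can be sketched with `torchvision.ops.roi_pool`, which crops each proposal box out of the shared feature map and pools it to a fixed size; the feature-map shape, boxes and scale below are illustrative assumptions:

```python
import torch
from torchvision.ops import roi_pool

features = torch.randn(1, 64, 50, 50)            # shared conv feature map
# proposals as (batch_index, x1, y1, x2, y2) in input-image coordinates
boxes = torch.tensor([[0., 40., 40., 200., 200.],
                      [0., 10., 60., 120., 180.]])
# spatial_scale maps image coordinates onto the 50x50 map (e.g. 1/16 stride)
crops = roi_pool(features, boxes, output_size=(7, 7), spatial_scale=1.0 / 16)
print(crops.shape)   # torch.Size([2, 64, 7, 7]): one fixed-size crop per proposal
```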
During detection with the Faster-RCNN network, the loss over an image is measured in this project by the loss function:

$$L(\{p_i\},\{t_i\}) = \frac{1}{N_{cls}}\sum_i L_{cls}(p_i, p_i^*) + \lambda\,\frac{1}{N_{reg}}\sum_i p_i^*\,L_{reg}(t_i, t_i^*)$$

where $i$ is the index of a proposal box and $p_i$ is the probability that the proposal box contains typical face elements; $p_i^*$ is computed from the manually marked label: it is 1 if the manual label contains typical face elements and 0 otherwise; $t_i$ is the four-dimensional vector representing the coordinates of the proposal box, and $t_i^*$ is the four-dimensional vector representing the coordinates of the manually labeled face elements (i.e., the rectangular-box coordinates). The classification loss function is defined as:

$$L_{cls}(p_i, p_i^*) = -\log\left[p_i^*\,p_i + (1 - p_i^*)(1 - p_i)\right]$$

The proposal-box regression loss function $L_{reg}$ is defined as:

$$L_{reg}(t_i, t_i^*) = R(t_i - t_i^*)$$

where $R$ is the robust loss function $\mathrm{smooth}_{L1}$, defined as:

$$\mathrm{smooth}_{L1}(x) = \begin{cases} 0.5x^2, & |x| < 1 \\ |x| - 0.5, & \text{otherwise} \end{cases}$$
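A minimal sketch of the smooth-L1 term used in the regression loss above (PyTorch also provides it directly as `nn.SmoothL1Loss`):

```python
import torch

def smooth_l1(x: torch.Tensor) -> torch.Tensor:
    """0.5*x^2 where |x| < 1, |x| - 0.5 elsewhere (elementwise)."""
    absx = x.abs()
    return torch.where(absx < 1, 0.5 * x ** 2, absx - 0.5)

d = torch.tensor([-2.0, -0.5, 0.0, 0.5, 2.0])   # t_i - t_i* residuals
print(smooth_l1(d))   # tensor([1.5000, 0.1250, 0.0000, 0.1250, 1.5000])
```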
To obtain a high-quality face matte, a cost function is introduced based on the smooth variation of the foreground and background brightness; it is shown below how eliminating the foreground and background brightness yields a quadratic cost function. The principle is as follows:
Assume the obtained face picture is an image I composed of foreground brightness F and background brightness B, and apply the image matting algorithm to image I, i.e., take image I as input. The brightness of the i-th pixel is then a combination of the corresponding foreground and background brightness:

$$I_i = \alpha_i F_i + (1 - \alpha_i) B_i \tag{1-4}$$

where $\alpha_i$ is the foreground opacity of the pixel.
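A tiny NumPy sketch of this compositing equation, showing how a pixel's observed brightness interpolates between its background and foreground values as α goes from 0 to 1 (the brightness values are illustrative):

```python
import numpy as np

F, B = 0.9, 0.2                      # foreground / background brightness
alpha = np.linspace(0.0, 1.0, 5)     # pixel foreground opacity
I = alpha * F + (1 - alpha) * B      # observed brightness per equation (1-4)
print(I)   # [0.2   0.375 0.55  0.725 0.9  ]
```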
To finally obtain a good matte, this embodiment uses a closed-form scheme to extract the matte from the face image. Specifically, local smoothness of the foreground brightness F and background brightness B yields a cost function from which, as the resulting expression shows, F and B can be eliminated to produce a quadratic cost function in α; the global optimum of this quadratic cost function is obtained by solving a sparse system of linear equations. This embodiment only needs to compute α directly, without estimating the foreground brightness F and background brightness B; at the same time little user input is required, which reduces the amount of computation to a certain extent and finally yields a high-quality matte. A closed-form formula is also used to examine the eigenvectors of the sparse matrix, to understand and predict the behavior of the scheme.
Since a closed-form matting scheme is first derived for a grayscale image, and matting is a severely under-constrained problem, assumptions must be made on the foreground brightness F, the background brightness B and/or α.
Specifically, assume the foreground brightness F and background brightness B are approximately constant over a small window around each pixel, i.e., set F and B to be locally smooth. In this embodiment, local smoothness of F and B does not mean that the input image I is locally smooth: a discontinuity in α implies a discontinuity in I. Equation (1-4), $I_i = \alpha_i F_i + (1 - \alpha_i) B_i$, can therefore be rewritten to express α as a linear function of the image I:

$$\alpha_i \approx a I_i + b, \quad \forall i \in w$$

where

$$a = \frac{1}{F - B}, \qquad b = -\frac{B}{F - B}$$

and w is a small image window.
Here α, a and b must be solved for; this project solves for them by minimizing the cost function:

$$J(\alpha, a, b) = \sum_{j \in I}\left(\sum_{i \in w_j}\left(\alpha_i - a_j I_i - b_j\right)^2 + \epsilon\, a_j^2\right) \tag{1-6}$$

where $w_j$ is a small window around pixel j. In addition, to ensure numerical stability of the values obtained through the cost function, this embodiment applies a regularization term on a in the cost function (the $\epsilon\, a_j^2$ term above).
Preferably, in this embodiment the above operation is implemented with a 3×3-pixel window, placing one window around each pixel so that the windows $w_j$ in the cost function overlap; this guarantees information overlap between neighboring pixels and thus a high-quality alpha matte in the end. Of course, the pixel window used is neither limited nor fixed by this embodiment and can be chosen according to the actual situation. Since the cost function (1-6) is quadratic in α, a and b, an image with N pixels gives 3N unknowns in total; to obtain a quadratic cost function containing only the N unknowns, namely the alpha values of the pixels, this embodiment eliminates a and b in the following manner.
In this embodiment, region localization based on deep learning can only locate the positions of the face elements; from the deep-learning localization process described above, the background brightness B (where α = 0) and the foreground brightness F (where α = 1) are known. One can therefore solve the following equation:
$$\alpha = \arg\min_{\alpha}\ \alpha^T L \alpha, \quad \text{s.t. } \alpha_i = s_i$$
Here s is the set of brush pixels and $s_i$ is the value indicated by the brush; this realizes the extraction of α. Specifically, a 3×3 window is used to define the Laplacian matting matrix L. In other embodiments, when the distributions of the foreground brightness F and background brightness B are not very complex, a wider window may be used; at the same time, to keep the computation time of a wider window low, this embodiment exploits the linearity of the coefficients of the alpha matte of image I:

$$\alpha_i \approx a_j^T I_i + b_j, \quad \forall i \in w_j$$

That is, the coefficients obtained with a wider window at a finer resolution are similar to those obtained from a smaller window on a coarse image. The linear coefficients of the coarse image are therefore computed, then interpolated and applied to the finer-resolution image; the alpha matte obtained in this way is similar to the one obtained by solving the matting system directly on the finer image with a wider window. This yields the α values and a high-quality picture.
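A sketch of the constrained solve above, assuming the matting Laplacian `L` has already been built from 3×3 windows (its construction is omitted here); the brush constraints are folded in as a diagonal penalty, giving the standard sparse linear system whose solution is the global optimum of the quadratic cost:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

def solve_alpha(L, brush_mask, brush_values, lam=100.0):
    """Minimize a^T L a subject (softly) to a_i = s_i on brush pixels:
    solve (L + lam*D) a = lam * D * s, with D diagonal over brushed pixels."""
    D = sp.diags(brush_mask.astype(float))   # 1 on brushed pixels, else 0
    rhs = lam * brush_mask * brush_values    # lam * s on brushed pixels
    alpha = spsolve((L + lam * D).tocsc(), rhs)
    return np.clip(alpha, 0.0, 1.0)          # alpha is an opacity in [0, 1]
```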
The present invention uses three segmentation models, each also trained and predicted with different parameters, which yields many predicted segmentation maps; the recognition result of each of the three intelligent recognition models is thus, for every pixel, the probability that the pixel belongs to a face. Next, weight coefficients for the recognition results of the three models are determined by an artificial-intelligence method: the historical recognition accuracy of the three recognition models is learned and trained, the weight of each recognition model is obtained through this training, and finally the probability that each pixel belongs to a face is obtained by weighted averaging. When this probability exceeds a certain threshold (e.g., 80%), the pixel is judged to belong to the face image, yielding an accurate face contour image.
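A minimal sketch of this weighted fusion step; the accuracy-derived weights and map sizes are illustrative assumptions, while the 0.8 threshold is the example value mentioned above:

```python
import numpy as np

def fuse_masks(prob_maps, accuracies, threshold=0.8):
    """Weighted average of per-model face-probability maps; weights are
    proportional to each model's historical recognition accuracy."""
    w = np.asarray(accuracies, dtype=float)
    w /= w.sum()                                    # normalize to sum to 1
    fused = np.tensordot(w, np.stack(prob_maps), axes=1)
    return fused > threshold                        # boolean face mask

# three HxW probability maps from SegNet, U-Net and Faster-RCNN + matting
maps = [np.random.rand(4, 4) for _ in range(3)]
mask = fuse_masks(maps, accuracies=[0.92, 0.95, 0.90])
```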
In a specific embodiment, referring to Figure 6, step S3 specifically comprises:
S301: construct a feature fusion residual network; select, as the training set, the ECG images obtained from multiple testers wearing ECG acquisition devices together with the face contour images obtained by processing the video images captured at the same moments through step S2, and train the feature fusion residual network to obtain an ECG detection model;
S302: input the face contour images from a segment of video images obtained in step S2 into the ECG detection model, and output the ECG curve.
The structure of the feature fusion residual network mentioned in this embodiment is described in detail below.
The feature fusion residual network (FFRN) is obtained by integrating the super-resolution networks EDSR and WDSR and is suitable for sparse CT image reconstruction; the FFRN architecture is shown in Figure 7. EDSR and WDSR each brought great progress in their fields and provide important ideas for image reconstruction. Both use residual blocks (RB); WDSR further improves the residual block, reducing the network's parameters while increasing accuracy. However, neither makes full use of the feature information inside the RB. We therefore propose the RSDB as the shallow building module of the network. The local feature-fusion layer comes after the two convolutional layers of the building module. The RSDB connects the local feature-fusion layers of two building modules in a skip fashion, the fusion result of the former module serving as the input of the latter. The locally fused features are then stacked, and residual learning is used to integrate the feature information, forming the basic architecture of the network.
As can be seen from Figure 8, both EDSR and WDSR use upsampling (pixel shuffle) at the end of the network; this reduces computation without loss of model capacity and greatly increases running speed, and the new pixel-shuffle upsampling adopted by WDSR has little effect on network accuracy. However, scaling an image adds no information, so image quality inevitably degrades and the feature information is affected. The medical-image correction task predicts dense pixels and is very sensitive to the amount of feature information, so the FFRN network abandons the upsampling method and performs end-to-end learning with the image size kept unchanged throughout the network.
When convolution is used to extract image features, the kernel size determines the receptive field of the convolution and also affects the number of model parameters. To reduce computational overhead and parameters, WDSR-B increases the number of kernels before the ReLU activation layer and reduces the number after it. WDSR-A uses 3×3 kernels before and after the activation layer, while WDSR-B uses 1×1 kernels around the ReLU activation layer to further widen the channels before activation and obtain wider feature maps, as shown in Figure 9. When WDSR-B is used to train a deep neural network with the RB as building module, accuracy does not improve significantly once the network reaches a certain depth, and its CT-artifact-removal performance is even worse than WDSR-A's. Therefore, the proposed RSDBs all use small 3×3 kernels, which enlarges the convolutional receptive field while avoiding large kernels that extract too many meaningless features. Splitting a 3×3 kernel into 3×1 and 1×3 kernels has the same effect as the 3×3 kernel and speeds up the operation. The last layer of the network is a fully connected layer coupled to the output ECG curve.
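A sketch of the 3×3 → 3×1 + 1×3 factorization mentioned above; the channel count of 64 is an illustrative assumption:

```python
import torch
import torch.nn as nn

# factorized replacement for a single 3x3 convolution
factorized = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=(3, 1), padding=(1, 0)),  # vertical 3x1
    nn.Conv2d(64, 64, kernel_size=(1, 3), padding=(0, 1)),  # horizontal 1x3
)

x = torch.randn(1, 64, 32, 32)
print(factorized(x).shape)   # same 32x32 spatial size as a padded 3x3 conv
```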
In this embodiment, since the R wave is the most prominent of all the information bands of the ECG signal, the heart rate of the subject can be calculated by counting the number of R waves appearing per minute.
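A minimal sketch of this R-wave counting step using `scipy.signal.find_peaks`; the sampling rate, prominence heuristic and refractory gap are illustrative assumptions, not values from the text:

```python
import numpy as np
from scipy.signal import find_peaks

def heart_rate_bpm(ecg, fs):
    """Count R peaks in an ECG trace and convert to beats per minute."""
    # R waves are the tallest deflections; a prominence threshold keeps
    # the smaller P and T waves from being counted
    peaks, _ = find_peaks(ecg, prominence=0.3 * np.ptp(ecg),
                          distance=int(0.3 * fs))  # refractory gap >= 0.3 s
    duration_min = len(ecg) / fs / 60.0
    return len(peaks) / duration_min

# example usage: a 30 s trace sampled at 250 Hz would be passed as (ecg, 250)
```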
In a specific embodiment, referring to Figure 10, step S5 specifically comprises:
S501: construct a deep learning network; select multiple groups of corresponding heart-rate and body-temperature data from different subjects under the same conditions, and train the deep learning network to obtain a heart-rate-to-body-temperature conversion model;
S502: input the obtained heart rate value of the subject into the heart-rate-to-body-temperature conversion model, and output the body temperature value of the subject.
By constructing a deep learning network and training it on multiple groups of corresponding heart-rate and body-temperature data, the heart-rate-to-body-temperature conversion relationship is obtained; the heart rate of the subject is then fed into the model as the input value and the corresponding body temperature value is output, realizing the measurement of body temperature.
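A minimal sketch of such a conversion model as a small regression network; the architecture, layer widths and optimizer settings are illustrative assumptions, since the text does not fix them:

```python
import torch
import torch.nn as nn

# heart rate in, body temperature out: a tiny regression MLP
model = nn.Sequential(nn.Linear(1, 16), nn.ReLU(),
                      nn.Linear(16, 16), nn.ReLU(),
                      nn.Linear(16, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def train_step(hr, temp):
    """One gradient step on paired (N, 1) heart-rate / temperature tensors."""
    opt.zero_grad()
    loss = loss_fn(model(hr), temp)
    loss.backward()
    opt.step()
    return loss.item()
```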
In some embodiments, the body temperature can also be estimated from the existing correspondence between heart rate and body temperature, as follows:
1) subtract the normal heart-rate baseline from the obtained heart rate value of the subject to obtain the heart-rate difference;
2) calculate the body-temperature difference from the obtained heart-rate difference and the conversion relationship between heart rate and body temperature;
3) add the obtained body-temperature difference to the normal body-temperature baseline to obtain the body temperature value of the subject.
Since the heart rate increases by about 10 beats/min for every 1 °C rise in body temperature, and the resting heart rate of a normal person is generally between 60 and 90 beats/min, an approximate conversion relationship between body temperature and heart rate can be obtained, and the body temperature can thus be estimated from the heart rate value, as sketched below.
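A sketch of the three-step estimate using the stated 10 beats/min per 1 °C relationship; the 75 beats/min and 36.5 °C baselines are illustrative midpoints, not values fixed by the text:

```python
def estimate_temperature(hr_bpm, hr_baseline=75.0, temp_baseline=36.5):
    """Steps 1)-3): heart-rate difference -> temperature difference -> sum."""
    hr_diff = hr_bpm - hr_baseline     # step 1: difference from the baseline
    temp_diff = hr_diff / 10.0         # step 2: 10 bpm per 1 degree Celsius
    return temp_baseline + temp_diff   # step 3: add to the normal temperature

print(estimate_temperature(95.0))   # 95 bpm -> about 38.5 C
```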
In summary, compared with the prior art, the method provided by the embodiments of the present invention has the following advantages:
1. Color correction of the original video images removes the interference caused by lighting;
2. The method obtains the ECG curve by deep learning, without locating key parts of the face: the face contour image is simply fed into the constructed model to obtain the ECG curve, making the whole measurement process simple and convenient;
3. On the basis of the measured heart rate, the heart rate value can further be used to calculate the body temperature of the subject; the whole method not only greatly improves measurement accuracy but is also more complete in function, and better satisfies practical heart-rate and body-temperature measurement needs.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another. Since the device disclosed in an embodiment corresponds to the method disclosed in that embodiment, its description is relatively simple, and the relevant points can be found in the description of the method.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be obvious to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (6)

  1. A camera-based non-contact heart rate and body temperature measurement method, characterized in that it comprises:
    S1: under ordinary visible-light conditions, collecting video images of the facial area of a subject through a camera, and performing color correction on the collected video images;
    S2: performing face recognition on each color-corrected frame of the video images, and cropping a face contour image from the recognized face area;
    S3: performing deep learning on the face contour images cropped from a segment of continuous video images, and deriving the ECG curve;
    S4: performing baseline-drift removal and R-wave enhancement on the obtained ECG curve, and calculating the heart rate value of the subject from the number of R waves appearing per minute;
    S5: calculating the body temperature value of the subject according to the relationship between the normal human heart-rate baseline and the obtained heart rate value.
  2. The camera-based non-contact heart rate and body temperature measurement method according to claim 1, wherein in step S1, performing color correction on the collected video images specifically comprises:
    S101: establishing an achromatic model under the assumption that the average image is achromatic;
    S102: obtaining the RGB values of each frame of the video images, and substituting the RGB values of each frame into the achromatic model for color correction.
  3. The camera-based non-contact heart rate and body temperature measurement method according to claim 2, wherein the achromatic model is:
    $$\bar{C} = k\,C, \quad C \in \{R, G, B\}$$

    where $\bar{C}$ is the corrected color component and k is a proportionality coefficient taking the value:

    $$k = \frac{V}{2\,\bar{I}}$$

    where $\bar{I}$ is the mean brightness of the frame, and $V = 2^N - 1$, $0 < N < 225$.
  4. The camera-based non-contact heart rate and body temperature measurement method according to claim 1, wherein step S2 specifically comprises:
    S201: separately constructing a SegNet semantic segmentation model, a U-net semantic segmentation model, and a semantic segmentation model coupling Faster-RCNN with digital matting;
    S202: performing face recognition and semantic segmentation on each color-corrected frame of the video images using, respectively, the constructed SegNet semantic segmentation model, U-net semantic segmentation model, and semantic segmentation model coupling Faster-RCNN with digital matting, to obtain three sets of recognition results;
    S203: performing a weighted average of the three sets of recognition results to obtain the final face contour image.
  5. The camera-based non-contact heart rate and body temperature measurement method according to claim 1, wherein step S3 specifically comprises:
    S301: constructing a feature fusion residual network; selecting, as the training set, the ECG images obtained from multiple testers wearing ECG acquisition devices together with the face contour images obtained by processing the video images captured at the same moments through step S2, and training the feature fusion residual network to obtain an ECG detection model;
    S302: inputting the face contour images from a segment of video images obtained in step S2 into the ECG detection model, and outputting the ECG curve.
  6. The camera-based non-contact heart rate and body temperature measurement method according to claim 1, wherein step S5 specifically comprises:
    S501: constructing a deep learning network; selecting multiple groups of corresponding heart-rate and body-temperature data from different subjects under the same conditions, and training the deep learning network to obtain a heart-rate-to-body-temperature conversion model;
    S502: inputting the obtained heart rate value of the subject into the heart-rate-to-body-temperature conversion model, and outputting the body temperature value of the subject.
PCT/CN2020/103087 2020-03-19 2020-07-20 Camera-based non-contact heart rate and body temperature measurement method WO2021184620A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010197862.3 2020-03-19
CN202010197862.3A CN111407245B (en) 2020-03-19 2020-03-19 Non-contact heart rate and body temperature measuring method based on camera

Publications (1)

Publication Number Publication Date
WO2021184620A1 true WO2021184620A1 (en) 2021-09-23

Family

ID=71485210

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/103087 WO2021184620A1 (en) 2020-03-19 2020-07-20 Camera-based non-contact heart rate and body temperature measurement method

Country Status (2)

Country Link
CN (1) CN111407245B (en)
WO (1) WO2021184620A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111407245B (en) * 2020-03-19 2021-11-02 南京昊眼晶睛智能科技有限公司 Non-contact heart rate and body temperature measuring method based on camera
CN112001122B (en) * 2020-08-26 2023-09-26 合肥工业大学 Non-contact physiological signal measurement method based on end-to-end generation countermeasure network
CN112381011B (en) * 2020-11-18 2023-08-22 中国科学院自动化研究所 Non-contact heart rate measurement method, system and device based on face image
CN113496482B (en) * 2021-05-21 2022-10-04 郑州大学 Toxic driving test paper image segmentation model, positioning segmentation method and portable device
CN113538350B (en) * 2021-06-29 2022-10-04 河北深保投资发展有限公司 Method for identifying depth of foundation pit based on multiple cameras
CN113449653B (en) * 2021-06-30 2022-11-01 广东电网有限责任公司 Heart rate detection method, system, terminal device and storage medium

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8360986B2 (en) * 2006-06-30 2013-01-29 University Of Louisville Research Foundation, Inc. Non-contact and passive measurement of arterial pulse through thermal IR imaging, and analysis of thermal IR imagery
US20130245465A1 (en) * 2012-03-16 2013-09-19 Fujitsu Limited Sleep depth determining apparatus and method
CN105310667A (en) * 2015-11-09 2016-02-10 北京体育大学 Body core temperature monitoring method, motion early warning method and early warning system
KR20170056232A (en) * 2015-11-13 2017-05-23 금오공과대학교 산학협력단 None-contact measurement method of vital signals and device using the same
CN107358220A (en) * 2017-07-31 2017-11-17 江西中医药大学 A kind of human heart rate and the contactless measurement of breathing
CN107802245A (en) * 2017-09-26 2018-03-16 深圳市赛亿科技开发有限公司 A kind of monitoring of pulse robot and its monitoring method
CN109247923A (en) * 2018-11-15 2019-01-22 中国科学院自动化研究所 Contactless pulse real-time estimation method and equipment based on video
CN109846469A (en) * 2019-04-16 2019-06-07 合肥工业大学 A kind of contactless method for measuring heart rate based on convolutional neural networks
CN110236508A (en) * 2019-06-12 2019-09-17 云南东巴文健康管理有限公司 A kind of non-invasive blood pressure continuous monitoring method
CN110276271A (en) * 2019-05-30 2019-09-24 福建工程学院 Merge the non-contact heart rate estimation technique of IPPG and depth information anti-noise jamming
CN110384491A (en) * 2019-08-21 2019-10-29 河南科技大学 A kind of heart rate detection method based on common camera
CN111407245A (en) * 2020-03-19 2020-07-14 南京昊眼晶睛智能科技有限公司 Non-contact heart rate and body temperature measuring method based on camera

Family Cites Families (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7801591B1 (en) * 2000-05-30 2010-09-21 Vladimir Shusterman Digital healthcare information management
CN102902967B (en) * 2012-10-16 2015-03-11 第三眼(天津)生物识别科技有限公司 Method for positioning iris and pupil based on eye structure classification
CN105125181B (en) * 2015-09-23 2018-03-23 广东小天才科技有限公司 A kind of method and device for measuring user's body temperature
US10354362B2 (en) * 2016-09-08 2019-07-16 Carnegie Mellon University Methods and software for detecting objects in images using a multiscale fast region-based convolutional neural network
CN106447184B (en) * 2016-09-21 2019-04-05 中国人民解放军国防科学技术大学 Unmanned plane operator's state evaluating method based on multisensor measurement and neural network learning
CN106580294B (en) * 2016-12-30 2020-09-04 上海交通大学 Physiological signal remote monitoring system based on multi-mode imaging technology and application
WO2018125580A1 (en) * 2016-12-30 2018-07-05 Konica Minolta Laboratory U.S.A., Inc. Gland segmentation with deeply-supervised multi-level deconvolution networks
CN106845395A (en) * 2017-01-19 2017-06-13 北京飞搜科技有限公司 A kind of method that In vivo detection is carried out based on recognition of face
WO2018146558A2 (en) * 2017-02-07 2018-08-16 Mindmaze Holding Sa Systems, methods and apparatuses for stereo vision and tracking
CN107770490A (en) * 2017-09-30 2018-03-06 广东博媒广告传播有限公司 A kind of LED advertisements identification monitoring system
CN107692997B (en) * 2017-11-08 2020-04-21 清华大学 Heart rate detection method and device
CN108665496B (en) * 2018-03-21 2021-01-26 浙江大学 End-to-end semantic instant positioning and mapping method based on deep learning
CN108596248B (en) * 2018-04-23 2021-11-02 上海海洋大学 Remote sensing image classification method based on improved deep convolutional neural network
CN108538038A (en) * 2018-05-31 2018-09-14 京东方科技集团股份有限公司 fire alarm method and device
CN108921163A (en) * 2018-06-08 2018-11-30 南京大学 A kind of packaging coding detection method based on deep learning
CN109190449A (en) * 2018-07-09 2019-01-11 北京达佳互联信息技术有限公司 Age recognition methods, device, electronic equipment and storage medium
CN109166130B (en) * 2018-08-06 2021-06-22 北京市商汤科技开发有限公司 Image processing method and image processing device
CN109044297A (en) * 2018-09-11 2018-12-21 管桂云 Personal Mininurse's health monitoring system
CN109829892A (en) * 2019-01-03 2019-05-31 众安信息技术服务有限公司 A kind of training method of prediction model, prediction technique and device using the model

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114305355A (en) * 2022-01-05 2022-04-12 北京科技大学 Respiration and heartbeat detection method, system and device based on millimeter wave radar
CN114305355B (en) * 2022-01-05 2023-08-22 北京科技大学 Breathing heartbeat detection method, system and device based on millimeter wave radar
CN115049918A (en) * 2022-06-14 2022-09-13 中国科学院沈阳自动化研究所 Method and device for rapidly detecting image target of underwater robot
CN114758363A (en) * 2022-06-16 2022-07-15 四川金信石信息技术有限公司 Insulating glove wearing detection method and system based on deep learning
CN114758363B (en) * 2022-06-16 2022-08-19 四川金信石信息技术有限公司 Insulating glove wearing detection method and system based on deep learning
CN115375626A (en) * 2022-07-25 2022-11-22 浙江大学 Medical image segmentation method, system, medium, and apparatus based on physical resolution
CN115375626B (en) * 2022-07-25 2023-06-06 浙江大学 Medical image segmentation method, system, medium and device based on physical resolution
CN116594061A (en) * 2023-07-18 2023-08-15 吉林大学 Seismic data denoising method based on multi-scale U-shaped attention network
CN116594061B (en) * 2023-07-18 2023-09-22 吉林大学 Seismic data denoising method based on multi-scale U-shaped attention network
CN116889388A (en) * 2023-09-11 2023-10-17 长春理工大学 Intelligent detection system and method based on rPPG technology
CN116889388B (en) * 2023-09-11 2023-11-17 长春理工大学 Intelligent detection system and method based on rPPG technology

Also Published As

Publication number Publication date
CN111407245B (en) 2021-11-02
CN111407245A (en) 2020-07-14

Similar Documents

Publication Publication Date Title
WO2021184620A1 (en) Camera-based non-contact heart rate and body temperature measurement method
CN109846469B (en) Non-contact heart rate measurement method based on convolutional neural network
CN110706826B (en) Non-contact real-time multi-person heart rate and blood pressure measuring method based on video image
US11227161B1 (en) Physiological signal prediction method
CN105761234A (en) Structure sparse representation-based remote sensing image fusion method
CN113449727A (en) Camouflage target detection and identification method based on deep neural network
CN112819910A (en) Hyperspectral image reconstruction method based on double-ghost attention machine mechanism network
CN114283158A (en) Retinal blood vessel image segmentation method and device and computer equipment
CN112884788B (en) Cup optic disk segmentation method and imaging method based on rich context network
CN112043260B (en) Electrocardiogram classification method based on local mode transformation
Li et al. Non-contact ppg signal and heart rate estimation with multi-hierarchical convolutional network
CN111797901A (en) Retinal artery and vein classification method and device based on topological structure estimation
CN108937905B (en) Non-contact heart rate detection method based on signal fitting
Zheng et al. Heart rate prediction from facial video with masks using eye location and corrected by convolutional neural networks
CN116343284A (en) Attention mechanism-based multi-feature outdoor environment emotion recognition method
CN115375711A (en) Image segmentation method of global context attention network based on multi-scale fusion
Damer et al. Cascaded generation of high-quality color visible face images from thermal captures
CN112163508A (en) Character recognition method and system based on real scene and OCR terminal
CN117274759A (en) Infrared and visible light image fusion system based on distillation-fusion-semantic joint driving
CN112200099A (en) Video-based dynamic heart rate detection method
Wang et al. Transphys: Transformer-based unsupervised contrastive learning for remote heart rate measurement
CN111754503B (en) Enteroscope mirror-withdrawing overspeed duty ratio monitoring method based on two-channel convolutional neural network
CN116994310B (en) Remote heart rate detection method based on rPPG signal
CN112488165A (en) Infrared pedestrian identification method and system based on deep learning model
CN113940635B (en) Skin lesion segmentation and feature extraction method based on depth residual pyramid

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20925853

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20925853

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 10.11.2022)

122 Ep: pct application non-entry in european phase

Ref document number: 20925853

Country of ref document: EP

Kind code of ref document: A1