CN112017155A - Method, device and system for measuring health sign data and storage medium - Google Patents


Info

Publication number
CN112017155A
CN112017155A (application CN202010668073.3A)
Authority
CN
China
Prior art keywords
region
rppg
interest
value
image frame
Prior art date
Legal status
Granted
Application number
CN202010668073.3A
Other languages
Chinese (zh)
Other versions
CN112017155B (English)
Inventor
缪其恒
卢星星
陈淑君
苏志杰
许炜
Current Assignee
Zhejiang Dahua Automobile Technology Co ltd
Original Assignee
Zhejiang Dahua Automobile Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Zhejiang Dahua Automobile Technology Co ltd filed Critical Zhejiang Dahua Automobile Technology Co ltd
Priority to CN202010668073.3A priority Critical patent/CN112017155B/en
Publication of CN112017155A publication Critical patent/CN112017155A/en
Application granted granted Critical
Publication of CN112017155B publication Critical patent/CN112017155B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06T7/0012: Biomedical image inspection
    • G06V10/25: Determination of region of interest [ROI] or a volume of interest [VOI]
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; body parts, e.g. hands
    • G06V40/161: Human faces: detection; localisation; normalisation
    • G06V40/168: Human faces: feature extraction; face representation
    • G06V40/14: Vascular patterns
    • G06V40/15: Biometric patterns based on physiological signals, e.g. heartbeat, blood flow
    • G06T2207/10048: Infrared image
    • G06T2207/20081: Training; learning
    • G06T2207/20084: Artificial neural networks [ANN]
    • G06T2207/30101: Blood vessel; artery; vein; vascular
    • G06T2207/30201: Face
    • G06V2201/03: Recognition of patterns in medical or anatomical images


Abstract

The application relates to a method, an apparatus, a system, an electronic apparatus and a storage medium for measuring health sign data. The method for measuring health sign data comprises the following steps: acquiring an infrared image frame sequence of a detected subject; extracting a region of interest from each infrared image frame of the infrared image frame sequence to obtain a first region-of-interest image frame sequence; processing the first region-of-interest image frame sequence with an rPPG value prediction network to obtain an rPPG predicted value of the detected subject; and determining the health sign data of the detected subject according to the rPPG predicted value. The application solves the problems in the related art that illumination changes and the confined cabin space strongly degrade robustness and make measurement of health sign data inaccurate, and achieves accurate measurement of health sign data from infrared images of the detected subject.

Description

Method, device and system for measuring health sign data and storage medium
Technical Field
The present application relates to the field of computer vision technology, and in particular, to a method, an apparatus, a system, an electronic apparatus, and a storage medium for measuring health sign data.
Background
Intelligentization is one of the important trends in the development of today's automotive industry. The intelligent cockpit system is applied to road scene visual analysis, cabin scene visual analysis, and natural language processing. As its functions have improved, the intelligent cockpit system has in recent years gradually been applied in the field of intelligent driving to analyze, warn about, and record driver behaviors (such as undesirable driving behaviors like fatigue and distraction).
Current intelligent cockpit systems cannot measure a driver's health sign data (respiration, heart rate, and the like). Detection methods for health sign data in other fields mainly include respiration-rate and heart-rate detection based on RGB video images, and respiration detection based on the Doppler effect. However, as the vehicle travels, the illumination in the cabin varies greatly, and during night driving the cabin may be too dark; the RGB-video-based method therefore has low detection accuracy for health sign data in the cockpit scene. In addition, relative vibration exists between the driver and the vehicle while driving, with a vibration frequency close to the respiration and heartbeat frequencies, and the driver can hardly keep still in the cockpit. These factors introduce a large amount of interference when the Doppler effect is used to detect the driver's respiration or heartbeat frequency, and some of this interference cannot be separated from the respiration or heartbeat frequency by spectrum analysis, so Doppler-based detection accuracy for health sign data is also low in the cockpit scene.
At present, no effective solution is provided for the problem of low detection accuracy of health sign data in a cockpit scene in the related art.
Disclosure of Invention
The embodiment of the application provides a method, a device, a system, an electronic device and a storage medium for measuring health sign data, so as to at least solve the problem of low detection accuracy rate of the health sign data in a cockpit scene in the related art.
In a first aspect, an embodiment of the present application provides a method for measuring health sign data, including:
acquiring an infrared image frame sequence of a detected object;
extracting an interested region in each infrared image frame of the infrared image frame sequence to obtain a first interested region image frame sequence;
processing the first region of interest image frame sequence by utilizing an rPPG value prediction network to obtain an rPPG prediction value of the detected object, wherein the rPPG value prediction network is an artificial neural network obtained by training according to a second region of interest image frame sequence and an actually measured rPPG value corresponding to the second region of interest image frame sequence;
and determining the health sign data of the detected subject according to the rPPG predicted value.
In some of these embodiments, extracting a region of interest in each ir image frame of the sequence of ir image frames comprises:
detecting feature data in a feature map of the infrared image frame, wherein the feature data comprises key point information and region information;
generating a plurality of region-of-interest sub-grids and a target region according to the key point information and the region information, respectively;
determining the validity of the plurality of region-of-interest sub-grids based on the target region and generating a validity code;
and arranging the validity-coded region-of-interest sub-grids in a preset arrangement order to generate the region of interest.
In some of these embodiments, the feature data further include target information, and detecting the feature data in the feature map of the infrared image frame includes:
inputting the infrared image frame into a facial analysis neural network to obtain the feature map;
detecting the target information and the region information in the feature map, wherein the target information comprises a valid target;
and detecting the key point information from the feature map corresponding to the valid target.
In some of these embodiments, the validity code includes a valid code, and determining the validity of the plurality of region-of-interest sub-grids based on the target region and generating the validity code includes:
detecting a visible-region sub-grid in the plurality of region-of-interest sub-grids, wherein the visible-region sub-grid is the intersection of each region-of-interest sub-grid and the target region;
calculating the percentage of the visible-region sub-grid within the region-of-interest sub-grid, and judging whether the percentage is greater than a preset threshold;
and in the case that the percentage is greater than the preset threshold, determining the region-of-interest sub-grid as a valid region-of-interest sub-grid and generating a valid code.
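The validity check described in this embodiment can be sketched as follows. The function name `validity_codes`, the `(y0, y1, x0, x1)` sub-grid layout, and the 0/1 coding are illustrative assumptions, since the patent does not fix a concrete data layout; each sub-grid is scored by the fraction of its pixels that fall inside the detected target region mask:

```python
import numpy as np

def validity_codes(subgrids, target_mask, threshold=0.5):
    """For each region-of-interest sub-grid (y0, y1, x0, x1), measure how much
    of it intersects the target region mask; emit 1 (valid) if the visible
    fraction exceeds the threshold, else 0."""
    codes = []
    for (y0, y1, x0, x1) in subgrids:
        cell = target_mask[y0:y1, x0:x1]  # visible part = intersection with the target region
        visible_ratio = float(cell.mean()) if cell.size else 0.0
        codes.append(1 if visible_ratio > threshold else 0)
    return codes
```

With a mask whose top half marks the target region, a sub-grid lying in the top half is coded valid and one lying in the bottom half invalid.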
In some of these embodiments, after determining the region-of-interest sub-grid as a valid region-of-interest sub-grid and generating a valid code, the method further comprises:
calculating the correlation between a historical feature map and the feature map of a preset region-of-interest sub-grid, wherein the historical feature map is the feature map of the historical valid region-of-interest sub-grid corresponding to the valid region-of-interest sub-grid in the previous frame of the infrared image frame, and at least one preset region-of-interest sub-grid is arranged around the valid region-of-interest sub-grid;
and judging whether the correlation is greater than a preset threshold, and in the case that the correlation is greater than the preset threshold, updating the preset region-of-interest sub-grid to be the valid region-of-interest sub-grid.
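A minimal sketch of this correlation test for re-locating a valid sub-grid in the next frame. The Pearson correlation measure and the function names are assumptions, as the patent does not specify which correlation is used:

```python
import numpy as np

def patch_correlation(a, b):
    """Pearson correlation between two equally sized feature-map patches."""
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def track_subgrid(history_patch, candidate_patches, threshold=0.8):
    """Compare the previous frame's valid sub-grid feature patch against the
    preset neighbouring candidates; return the index of the best-matching
    candidate if its correlation exceeds the threshold, else None."""
    scores = [patch_correlation(history_patch, p) for p in candidate_patches]
    best = int(np.argmax(scores))
    return best if scores[best] > threshold else None
```

A candidate identical to the historical patch correlates at 1.0 and is selected; a flat (constant) candidate carries no structure and is rejected.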
In some of these embodiments, after generating the region of interest, the method includes:
cascading a preset number of frames of the regions of interest to obtain a third region-of-interest image frame sequence;
summing the validity codes of each region-of-interest sub-grid in the third region-of-interest image frame sequence to obtain a total validity code value of each region-of-interest sub-grid;
judging whether the total validity code value is greater than a preset threshold, and selecting a masked region of interest according to the judgment result;
and masking the region-of-interest sub-grid corresponding to the masked region of interest in the third region-of-interest image frame sequence to generate the first region-of-interest image frame sequence.
In some embodiments, judging whether the total validity code value is greater than a preset threshold and selecting the masked region of interest according to the judgment result includes:
in the case that the total validity code value is judged to be greater than the preset threshold, determining the region-of-interest sub-grid corresponding to the total validity code value as a valid region-of-interest sub-grid;
and in the case that the total validity code value is not greater than the preset threshold, determining the region-of-interest sub-grid corresponding to the total validity code value as the masked region of interest.
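The window-level selection above amounts to summing each sub-grid's per-frame validity codes and thresholding the totals. A sketch, with array shapes and the function name assumed for illustration:

```python
import numpy as np

def subgrid_mask(codes_per_frame, threshold):
    """codes_per_frame: (num_frames, num_subgrids) array of 0/1 validity codes
    over the cascaded window. Sub-grids whose total does not exceed the
    threshold become masked regions of interest (0); the rest remain valid (1)."""
    totals = np.asarray(codes_per_frame).sum(axis=0)
    return (totals > threshold).astype(int)
```

For a three-frame window with threshold 2, a sub-grid valid in all three frames survives, while one valid in only one frame is masked.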
In some embodiments, the health sign data include a heart rate and a respiration rate, and determining the health sign data of the detected subject according to the rPPG predicted value includes:
extracting rPPG signal frequency domain information from the rPPG prediction value, wherein the rPPG signal frequency domain information comprises a first frequency domain interval and a second frequency domain interval,
and respectively selecting the rPPG frequency with the maximum peak value from the first frequency domain interval and the second frequency domain interval, determining the rPPG frequency selected from the first frequency domain interval as the heart rate, and determining the rPPG frequency selected from the second frequency domain interval as the respiration rate.
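This peak-picking step can be sketched with an FFT. The band limits below (0.8 to 3.0 Hz for heart rate, 0.1 to 0.5 Hz for respiration) are common physiological ranges assumed for illustration, not values stated in the patent:

```python
import numpy as np

def rates_from_rppg(rppg, fs, heart_band=(0.8, 3.0), resp_band=(0.1, 0.5)):
    """Pick the dominant spectral peak of the rPPG trace inside each frequency
    interval and convert it to beats/breaths per minute."""
    spectrum = np.abs(np.fft.rfft(rppg - np.mean(rppg)))
    freqs = np.fft.rfftfreq(len(rppg), d=1.0 / fs)

    def peak_freq(lo, hi):
        idx = np.where((freqs >= lo) & (freqs <= hi))[0]
        return freqs[idx[np.argmax(spectrum[idx])]]

    return peak_freq(*heart_band) * 60.0, peak_freq(*resp_band) * 60.0
```

For a 30 fps trace containing 1.2 Hz and 0.3 Hz components this returns roughly 72 bpm and 18 breaths per minute.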
In some of these embodiments, the health characteristic data includes blood oxygen saturation concentration, the rPPG prediction value includes an infrared rPPG prediction value and a visible rPPG prediction value, and determining the health sign data of the detected subject according to the rPPG prediction value includes:
the blood oxygen saturation concentration is calculated by a formula of the following ratio-of-ratios form:

SpO2 = A − B · (σ_R / μ_R) / (σ_1R / μ_1R)

wherein SpO2 is the blood oxygen saturation concentration, A and B are calibration constants, σ_R is the standard deviation of the infrared rPPG predicted value within a preset measurement period, σ_1R is the standard deviation of the visible-light rPPG predicted value within the preset measurement period, μ_R is the mean of the infrared rPPG predicted value within the preset measurement period, and μ_1R is the mean of the visible-light rPPG predicted value within the preset measurement period.
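A sketch of a ratio-of-ratios SpO2 estimate built from the four statistics defined above. The calibration constants A and B are hypothetical illustrative values (real devices calibrate them against a reference oximeter), and the exact formula in the patent's equation image may differ:

```python
import numpy as np

# Hypothetical calibration constants, for illustration only.
A, B = 110.0, 25.0

def spo2(ir_rppg, vis_rppg):
    """Ratio of the pulsatile-to-baseline component (std / mean) of the
    infrared trace to that of the visible-light trace, mapped linearly
    to a saturation percentage."""
    ratio = (np.std(ir_rppg) / np.mean(ir_rppg)) / (np.std(vis_rppg) / np.mean(vis_rppg))
    return A - B * ratio
```

When both traces have the same pulsatility, the ratio is 1 and the estimate reduces to A minus B.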
In a second aspect, an embodiment of the present application provides a health sign data measurement apparatus, including:
the acquisition module is used for acquiring an infrared image frame sequence of the detected object;
the extraction module is used for extracting a region of interest from each infrared image frame of the infrared image frame sequence to obtain a first region-of-interest image frame sequence;
the prediction module is used for processing the first region-of-interest image frame sequence with an rPPG value prediction network to obtain an rPPG predicted value of the detected subject, wherein the rPPG value prediction network is an artificial neural network trained on a second region-of-interest image frame sequence and actually measured rPPG values corresponding to the second region-of-interest image frame sequence;
and the processing module is used for determining the health sign data of the detected object according to the rPPG predicted value.
In a third aspect, an embodiment of the present application provides an infrared vision system for a vehicle cabin, including an image pickup apparatus, a transmission apparatus, and an electronic device; the camera shooting equipment is connected with the electronic device through the transmission equipment;
the camera shooting equipment is used for acquiring an infrared image in the cockpit;
the transmission equipment is used for transmitting the infrared image to the electronic device;
the electronic device is configured to perform the method for measuring health sign data according to the first aspect.
In a fourth aspect, an embodiment of the present application provides an electronic apparatus, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the method for measuring health sign data according to the first aspect.
In a fifth aspect, the present application provides a storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method for measuring health sign data according to the first aspect.
Compared with the related art, the method, apparatus, system, electronic apparatus and storage medium for measuring health sign data provided by the embodiments of the present application acquire an infrared image frame sequence of a detected subject; extract a region of interest from each infrared image frame of the sequence to obtain a first region-of-interest image frame sequence; process the first region-of-interest image frame sequence with an rPPG value prediction network to obtain an rPPG predicted value of the detected subject; and determine the health sign data of the detected subject according to the rPPG predicted value. This solves the problem in the related art of low detection accuracy of health sign data in the cockpit scene, and achieves accurate measurement of health sign data from infrared images of the detected subject.
The details of one or more embodiments of the application are set forth in the accompanying drawings and the description below to provide a more thorough understanding of the application.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
fig. 1 is a block diagram of a hardware structure of a terminal of a method for measuring health sign data according to an embodiment of the present invention;
fig. 2 is a flow chart of a method of measuring health sign data according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a network architecture of a facial analysis neural network according to an embodiment of the present application;
fig. 4 is a schematic network architecture diagram of an rPPG value prediction network according to an embodiment of the present application;
fig. 5 is a flow chart of a method of measuring health sign data according to a preferred embodiment of the present application;
fig. 6 is a block diagram of a health sign data measurement apparatus according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be described and illustrated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments provided in the present application without any inventive step are within the scope of protection of the present application.
It is obvious that the drawings in the following description are only examples or embodiments of the present application, and that it is also possible for a person skilled in the art to apply the present application to other similar contexts on the basis of these drawings without inventive effort. Moreover, it should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the specification. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of ordinary skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments without conflict.
Unless defined otherwise, technical or scientific terms used herein have the ordinary meaning understood by those of ordinary skill in the art to which this application belongs. References to "a", "an", "the" and similar words in this application do not limit number and may refer to the singular or the plural. The terms "including", "comprising", "having" and any variations thereof in this application are intended to cover non-exclusive inclusion; for example, a process, method, system, article or apparatus comprising a list of steps or modules (elements) is not limited to the listed steps or elements, but may include other steps or elements not expressly listed or inherent to such process, method, article or apparatus. References to "connected", "coupled" and the like in this application are not limited to physical or mechanical connections, and may include electrical connections, whether direct or indirect. The term "plurality" herein means two or more. "And/or" describes an association relationship of associated objects, meaning that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates an "or" relationship between the associated objects before and after it. The terms "first", "second", "third" and the like herein merely distinguish similar objects and do not denote a particular ordering.
The various techniques described in this application may be used in various health feature data detection, facial feature recognition, and infrared visual target tracking systems and devices.
The method provided by the embodiment can be executed in a terminal, a computer or a similar operation device. Taking the operation on the terminal as an example, fig. 1 is a hardware structure block diagram of the terminal of the method for measuring health sign data according to the embodiment of the present invention. As shown in fig. 1, the terminal 10 may include one or more (only one shown in fig. 1) processors 102 (the processor 102 may include, but is not limited to, a processing device such as a microprocessor MCU or a programmable logic device FPGA) and a memory 104 for storing data, and optionally may also include a transmission device 106 for communication functions and an input-output device 108. It will be understood by those skilled in the art that the structure shown in fig. 1 is only an illustration and is not intended to limit the structure of the terminal. For example, the terminal 10 may also include more or fewer components than shown in FIG. 1, or have a different configuration than shown in FIG. 1.
The memory 104 can be used to store a computer program, for example, a software program and a module of application software, such as a computer program corresponding to the measurement method of the health sign data in the embodiment of the present invention, and the processor 102 executes various functional applications and data processing by running the computer program stored in the memory 104, so as to implement the method described above. The memory 104 may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the terminal 10 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used to receive or transmit data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the terminal 10. In one example, the transmission device 106 includes a Network adapter (NIC) that can be connected to other Network devices through a base station to communicate with the internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module, which is used to communicate with the internet in a wireless manner.
The embodiment provides a method for measuring health sign data. Fig. 2 is a flowchart of a method for measuring health sign data according to an embodiment of the present application, and as shown in fig. 2, the flowchart includes the following steps:
step S201, an infrared image frame sequence of the detected object is acquired.
In this embodiment, an infrared camera is used to collect infrared images in the cockpit. During collection, the resolution and collection frequency of the infrared images are set by modifying the factory configuration parameters of the camera sensor. After collection, the infrared images need to be preprocessed; the preprocessing operations include adaptive adjustment of exposure, gain and white-balance parameters, as well as 3D noise reduction and digital wide-dynamic-range parameter adjustment, and can be realized by tuning and fixing the algorithm parameters of the camera's ISP module.
Step S202, extracting a region of interest from each infrared image frame of the infrared image frame sequence to obtain a first region of interest image frame sequence.
In this embodiment, a region of interest is extracted from each infrared image frame of the sequence of infrared image frames, and region information, key point information, and target information are detected through a facial analysis neural network, then a region of interest is generated based on the region information, the key point information, and the target information, and a first sequence of region of interest image frames is obtained by performing channel reconstruction on the region of interest, where the first sequence of region of interest image frames includes a preset number of frames of single-frame region of interest image frames.
Step S203, processing the first interested area image frame sequence by utilizing an rPPG value prediction network to obtain an rPPG predicted value of the detected object, wherein the rPPG value prediction network is an artificial neural network obtained by training according to the second interested area image frame sequence and an actually measured rPPG value corresponding to the second interested area image frame sequence.
In this embodiment, the second region-of-interest image frame sequence is obtained by extracting regions of interest from a plurality of infrared image frame samples through the facial analysis neural network and reconstructing channels, and belongs to the collected training set samples. The actually measured rPPG values serve as the rPPG time-series training labels of the rPPG value prediction network; they are the true rPPG values of the detected subject, synchronously collected by a contact-type PPG signal measurement device while the infrared image frame samples were acquired. The weight parameters of the rPPG value prediction network are updated against the measured rPPG values so as to train an artificial neural network that can accurately predict rPPG values.
And step S204, determining the health sign data of the detected object according to the rPPG predicted value.
In this embodiment, after the rPPG predicted value is obtained, the frequency-domain information of the rPPG signal is obtained from it by Fast Fourier Transform (FFT), and the health sign data are then extracted from that frequency-domain information; the health sign data include heart rate, heart rate variability, respiration rate, blood pressure, and blood oxygen saturation concentration.
Through steps S201 to S204, the infrared image frame sequence of the detected object is acquired; a region of interest is extracted from each infrared image frame to obtain the first region-of-interest image frame sequence; the first region-of-interest image frame sequence is processed by the rPPG value prediction network to obtain the rPPG predicted value of the detected object; and the health sign data of the detected object is determined from the rPPG predicted value. By combining effective region extraction with region-of-interest tracking, the inaccurate sign data measurement caused by large illumination changes in the cabin scene and by a non-stationary observed person can be addressed, and facial movement and/or occlusion of the driver in the vehicle-mounted scene can be handled effectively. Meanwhile, since a temporal convolutional neural network is adopted to predict the rPPG value, the method applies to a wide range of scenes and measures health characteristic data with high accuracy. The method thus solves the problem of low detection accuracy of health sign data in cockpit scenes in the related art, and accurately measures health characteristic data from infrared images of the detected object.
Fig. 3 is a schematic diagram of a network architecture of a facial analysis neural network according to an embodiment of the present application. In this embodiment, a multitask facial analysis neural network is used to perform visual analysis on the preprocessed infrared scene image, which mainly includes facial region detection, facial skin region segmentation, and facial key point detection.
As shown in fig. 3, after the infrared image of the cabin scene is preprocessed and input into the facial analysis neural network, the network performs a multi-scale scene feature description (the shared feature description shown in fig. 3) on the preprocessed image and, after offline training, produces a feature map. The multi-scale scene feature description mainly involves convolution, activation, pooling, upsampling, and channel concatenation, and consists of multiple layers of BN-ReLU-Conv operations; each such layer is conventional and is not described further here.
After the facial analysis neural network is trained offline to obtain the characteristic map, the following steps are carried out:
in step S21, a face target is detected (detection target information).
The face detection branch of the facial analysis neural network performs face target detection based on the feature map. The branch is a convolutional neural network branch comprising at least an input layer, a convolutional layer (conv), an activation layer (ReLU), a pooling layer (pool), and an output layer (softmax); before the feature map is fed into the branch, it is normalized by a BN (batch normalization) layer. These components are all conventional. In this embodiment, the face detection branch detects face targets as follows: potential face regions are classified and position-regressed using a prior target candidate region strategy; the network output after classification and position regression is then processed by non-maximum suppression, and the predefined face target class and its position are output through the softmax layer of the branch. The face target classes are: background, valid face, and abnormal face, where an abnormal face is an image face region at an extreme angle or with severe occlusion. The position of a face target is its coordinates in the image coordinate system: center abscissa (x), center ordinate (y), width (w), and height (h).
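The non-maximum suppression step above can be sketched as follows; a minimal numpy sketch, where the IoU threshold and box format are illustrative assumptions rather than values from the patent.

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression over [x1, y1, x2, y2] boxes.

    Keeps the highest-scoring box and discards any remaining box whose
    IoU with a kept box exceeds ``iou_thresh`` (an illustrative value).
    """
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # Intersection of box i with every remaining box.
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)
        order = rest[iou <= iou_thresh]
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 10, 10], [20, 20, 30, 30]], float)
scores = np.array([0.9, 0.8, 0.7])
kept = nms(boxes, scores)
```

Here the second box overlaps the first heavily and is suppressed, while the distant third box survives.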
In step S22, facial skin region segmentation (detection region information).
The skin region segmentation branch of the facial analysis neural network performs skin region segmentation based on the feature map. Specifically, features at specific scales are selected and deconvolved (via a deconvolution layer), then feature cascading (via a feature cascade layer, which may be implemented as a fully connected layer, FC) and pixel-level classification (via a classification layer) are performed at a preset scale; after upsampling, morphological dilation-erosion, and confidence filtering, a binary facial skin region mask is output through the softmax layer of this convolutional branch. The mask labels background as 0 and skin as 1.
In step S23, facial keypoint detection (detection keypoint information).
The key point detection branch of the facial analysis neural network detects facial key point information based on the feature map. Specifically, ROI pooling is performed on valid face regions according to the feature map and the face target detection result, the pooled features are fed into the key point detection branch, and the confidence and image coordinates (x: image abscissa, y: image ordinate) of each facial key point are output through a softmax layer. The detected key points cover the eyebrows, eyes, nose, mouth, and facial contour.
It should be noted that the feature maps used by the face detection branch, the skin region segmentation branch, and the key point detection branch are selected feature maps that have undergone different convolution operations, and the number of convolution operations is set as required.
Before facial visual analysis is performed, the facial analysis neural network is obtained through offline multi-task training, which includes: acquiring cabin scene data (on the order of 100,000 samples) covering different viewing angles, vehicle types, and illumination conditions with the analysis (infrared) camera; manually annotating the data to build training sets for face region detection, facial skin region segmentation, and facial key point detection; augmenting the training sets online through random geometric and color transformations; randomly initializing the neural network model parameters; and optimizing the following loss function L with mini-batch stochastic gradient descent (SGD):
L = k1·L_face + k2·L_skin + k3·L_landmark

(The per-task expressions for L_face, L_skin, and L_landmark were rendered as equation images in the source; they combine the cross-entropy and smooth-L1 terms defined below.)
where L_face is the face region loss function, L_skin the skin region segmentation loss function, and L_landmark the facial key point loss function; k1, k2, k3 are the weights of the respective loss terms; L_cross-entropy is the cross-entropy loss and L1_smooth the smooth-L1 loss; log is the logarithm; loc is an indicator function; N is the sample count of the corresponding task; [u, v] are the pixel coordinates of each point in the image coordinate system; [W, H] are the width and height of the input image; g is the ground-truth label of the corresponding coordinate; cls is the predicted value of the corresponding coordinate; gl is the center point coordinate of the corresponding region; and the subscripts face, skin, and landmark denote the face region, key points, and skin region mask, respectively.
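The weighted multi-task loss L = k1·L_face + k2·L_skin + k3·L_landmark can be sketched in numpy. This is a hedged illustration: the exact per-task expressions were equation images in the source, so the composition below (cross-entropy for classification/segmentation, smooth-L1 for coordinate regression) and the default weights are assumptions.

```python
import numpy as np

def cross_entropy(probs, labels):
    # Mean negative log-likelihood of the true class; probs is (N, C).
    return float(-np.mean(np.log(probs[np.arange(len(labels)), labels])))

def smooth_l1(pred, target):
    # Standard smooth-L1: quadratic below 1, linear above.
    d = np.abs(pred - target)
    return float(np.mean(np.where(d < 1.0, 0.5 * d ** 2, d - 0.5)))

def multitask_loss(face_probs, face_labels, box_pred, box_gt,
                   skin_probs, skin_labels, lmk_pred, lmk_gt,
                   k1=1.0, k2=1.0, k3=1.0):
    """L = k1*L_face + k2*L_skin + k3*L_landmark.

    Task weights and the exact per-task loss forms are placeholders,
    not the patent's (image-rendered) expressions.
    """
    l_face = cross_entropy(face_probs, face_labels) + smooth_l1(box_pred, box_gt)
    l_skin = cross_entropy(skin_probs, skin_labels)
    l_lmk = smooth_l1(lmk_pred, lmk_gt)
    return k1 * l_face + k2 * l_skin + k3 * l_lmk
```

With perfect predictions every term vanishes; the smooth-L1 term switches from quadratic to linear once the coordinate error exceeds 1 pixel, which dampens the effect of annotation outliers.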
After the off-line training of the multi-task network is completed to obtain the weight parameters of the facial analysis neural network, pruning (channel clipping and thinning) and quantization (8-bit or 16-bit floating point and fixed point data types) are performed, so that the facial analysis neural network is constructed and deployed on a corresponding embedded platform (in this embodiment, a server).
Fig. 4 is a schematic network architecture diagram of an rPPG value prediction network according to an embodiment of the present application. The rPPG value prediction network is pre-trained before being deployed to the server. As shown in fig. 4, it includes at least a 3D convolutional layer, a pooling layer, and an activation layer. During pre-training, the network is trained on roughly 100,000 second region-of-interest image frame sequences produced by the facial analysis neural network (or a comparable network), and its weight parameters are updated against the rPPG time-series training labels. Each label is the measured rPPG value of the detected object, acquired synchronously by a contact PPG signal measurement device while the infrared image frame samples were collected; both the infrared image frames and the rPPG labels are sampled at 30 Hz. During training, the following loss function is minimized by batch gradient descent to update the weight parameters of the rPPG value prediction network:
L = (1/N) · Σ_{i=1}^{N} (y_i − g_i)²

(reconstructed as a frame-wise mean squared error from the symbol definitions below; the source rendered the expression as an equation image)
where N is the number of region-of-interest frames in the input second region-of-interest image frame sequence, y is the predicted rPPG value at a given moment in the sequence, and g is the true rPPG value at that moment.
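Given the symbol definitions above (y predicted, g measured, N frames), the training loss can be sketched as a per-frame error; the mean-squared form is an assumption, since the patent's exact expression was an equation image.

```python
import numpy as np

def rppg_loss(y_pred, g_true):
    """Mean squared error over the N frames of the ROI sequence.

    Assumed frame-wise MSE between predicted and measured rPPG values;
    the patent's exact loss expression is not recoverable from the text.
    """
    y_pred = np.asarray(y_pred, float)
    g_true = np.asarray(g_true, float)
    return float(np.mean((y_pred - g_true) ** 2))
```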
In the embodiment, an infrared scene image acquisition and preprocessing method suitable for vehicle-mounted application is adopted, so that the response to the dynamic change of the illumination of the vehicle-mounted cabin is timely, and the image effect is good under low illumination. Meanwhile, the non-contact measurement of the health characteristics of the rider is realized by analyzing the facial image characteristics of the rider in the infrared scene.
In some embodiments, extracting the region of interest in each infrared image frame of the sequence of infrared image frames comprises:
in step S31, feature data is detected in a feature map of the infrared image frame, where the feature data includes key point information and region information.
Step S32, a plurality of interesting region sub-grids and target regions are respectively generated according to the key point information and the region information.
In this embodiment, a plurality of region-of-interest sub-grids are generated based on the key point information detected by the facial visual analysis. For example, if the detected key point information includes the eyebrows, eyes, nose, and mouth, the corresponding region-of-interest sub-grids are an eyebrow sub-grid, an eye sub-grid, a nose sub-grid, and a mouth sub-grid. The background region and the skin region are then separated according to the region information.
Step S33, determining validity of the plurality of region of interest sub-meshes based on the target region, and generating a validity code.
In this embodiment, the visible part of each sub-grid is determined by the percentage of the intersection of the sub-grid with the skin region relative to the whole sub-grid (a sub-grid located entirely in the background region is judged not visible), and the validity of the sub-grid is determined from this percentage.
Step S34, arranging the plurality of region-of-interest sub-grids, together with their validity codes, in a preset arrangement order to generate the region of interest.
In this embodiment, the sub-grids are rearranged in a predefined order and, together with their corresponding validity codes, generate the single-frame health sign operation region Mij (the region of interest of each infrared image frame).
Through the steps from S31 to S34, the detection of the effective sub-grids in the infrared image is realized, the interesting region of the single-frame infrared image is generated through the effective sub-grids, and the tracking of the interesting region is completed; meanwhile, the errors of the facial interesting regions caused by facial movement and facial region shielding can be compensated, so that the accuracy of health characteristic data measurement is improved.
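Steps S31 to S34 can be sketched as follows: one sub-grid is cut around each key point, the sub-grids are stacked in a predefined order, and each carries its validity code. The sub-grid size, the ordering, and all names below are illustrative assumptions.

```python
import numpy as np

# Predefined arrangement order of the sub-grids; illustrative only.
SUBGRID_ORDER = ["eyebrow", "eye", "nose", "mouth"]

def make_roi(frame, keypoints, validity, size=16):
    """Crop one ``size`` x ``size`` sub-grid per key point and stack
    them in the predefined order, paired with their validity codes."""
    h, w = frame.shape
    grids, codes = [], []
    for name in SUBGRID_ORDER:
        x, y = keypoints[name]
        half = size // 2
        x0 = int(np.clip(x - half, 0, w - size))
        y0 = int(np.clip(y - half, 0, h - size))
        grids.append(frame[y0:y0 + size, x0:x0 + size])
        codes.append(validity.get(name, 0))
    return np.stack(grids), np.array(codes)

frame = np.zeros((120, 160))
kps = {"eyebrow": (40, 30), "eye": (40, 45), "nose": (80, 60), "mouth": (80, 90)}
roi, codes = make_roi(frame, kps, {"eyebrow": 1, "eye": 1, "nose": 1, "mouth": 0})
```

The stacked array plus its code vector plays the role of the single-frame operation region Mij described above.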
In some embodiments, the feature data further includes target information, and detecting the feature data in the feature map of the infrared image frame includes the following steps:
and step S41, inputting the infrared image frame into a facial analysis neural network to obtain a characteristic map.
In step S42, target information and region information are detected in the feature map, wherein the target information includes valid targets.
In this embodiment, the targets include: background, valid face and abnormal face.
And step S43, key point information is detected from the feature map corresponding to the effective target.
In this embodiment, the key points include at least one of: eyebrows, eyes, nose, mouth, and facial contours.
Through the steps S41 to S43, the key point information is detected by the feature map.
In some of these embodiments, the validity code includes an effective code; determining the validity of the plurality of region-of-interest sub-grids based on the target region and generating the validity code includes the following steps:
step S33-1, detecting a visible region sub-grid among the plurality of region of interest sub-grids, wherein the visible region sub-grid is an intersection of each region of interest sub-grid and the target region.
And step S33-2, counting the percentage of the visible area sub-grid in the interesting area sub-grid, and judging whether the percentage is larger than a preset threshold value.
And step S33-3, determining the region of interest sub-grid as an effective region of interest sub-grid and generating an effective code under the condition that the percentage is judged to be larger than the preset threshold value.
Through the above steps S33-1 to S33-3, the verification of the validity of the sub-grids is realized.
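Steps S33-1 to S33-3 can be sketched as an overlap test between each sub-grid and the skin (target) region; the 60% threshold below is a placeholder, not a value from the patent.

```python
import numpy as np

def subgrid_validity(subgrid_mask, skin_mask, threshold=0.6):
    """Return 1 when the fraction of the sub-grid covered by skin
    pixels exceeds ``threshold`` (a placeholder value), else 0."""
    visible = np.logical_and(subgrid_mask, skin_mask).sum()
    total = subgrid_mask.sum()
    if total == 0:
        return 0
    return int(visible / total > threshold)

skin = np.zeros((10, 10), bool)
skin[:, :8] = True                 # left 80% of the image is skin
grid = np.zeros((10, 10), bool)
grid[2:6, 2:6] = True              # sub-grid fully inside the skin region
valid = subgrid_validity(grid, skin)
```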
In some embodiments, after determining the region of interest submesh as an effective region of interest submesh and generating the effective encoding, the following steps are further performed:
step S51, calculating the correlation between the historical characteristic map and the characteristic map of the preset region of interest sub-grid; the historical characteristic map is a characteristic map of a historical effective interesting region sub-grid corresponding to the effective interesting region sub-grid in the previous frame of infrared image frame, and at least one preset interesting region sub-grid is arranged around the effective interesting region sub-grid.
And step S52, judging whether the correlation is greater than a preset threshold value, and updating the preset region of interest sub-grid into an effective region of interest sub-grid under the condition that the correlation is greater than the preset threshold value.
In the present embodiment, through steps S51 to S52, the effective region of interest sub-grid is optimized, and the region of interest sub-grid with the largest correlation is selected to determine the optimal region of interest of the infrared image frame.
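The correlation check of steps S51 to S52 can be sketched with a normalized cross-correlation between the historical feature map and each candidate sub-grid around the previous position; the function names and the threshold are illustrative assumptions.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation between two equally sized patches."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a ** 2).sum() * (b ** 2).sum())
    return float((a * b).sum() / denom) if denom else 0.0

def retrack(history_patch, candidates, threshold=0.8):
    """Pick the candidate patch most correlated with the historical
    feature map; return its index, or -1 if none passes the threshold."""
    scores = [ncc(history_patch, c) for c in candidates]
    best = int(np.argmax(scores))
    return best if scores[best] > threshold else -1

rng = np.random.default_rng(0)
hist = rng.normal(size=(8, 8))
cands = [rng.normal(size=(8, 8)),            # unrelated patch
         hist + 0.01 * rng.normal(size=(8, 8))]  # nearly identical patch
idx = retrack(hist, cands)
```

The nearly identical candidate wins, mirroring how the sub-grid with the largest correlation becomes the new effective sub-grid.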
In some of these embodiments, after the region of interest is generated, the following steps are also performed:
step S61, cascading the interested areas with preset frame numbers to obtain a third interested area image frame sequence.
In this embodiment, the preset frame number defaults to 120 frames, that is, the regions of interest of the 120 infrared image frames are concatenated, so as to generate a candidate region of interest image frame sequence, which needs to be further determined to be able to be used for obtaining the rPPG predicted value.
Step S62, summing the validity codes of each region of interest sub-grid in the third region of interest image frame sequence to obtain a total value of the validity codes of each region of interest sub-grid.
In this embodiment, the validity code total value determines, for each type of region-of-interest sub-grid, in how many frames of the third region-of-interest image frame sequence that sub-grid meets the preset requirement.
And step S63, judging whether the total value of the validity codes is greater than a preset threshold value, and selecting the interested shielding area according to the judgment result.
In this embodiment, when the total value of the validity codes does not meet the requirement, the number of valid frames for that type of region-of-interest sub-grid in the third region-of-interest image frame sequence is insufficient. That sub-grid is therefore filtered out of the sequence: time-series-invalid sub-grids are removed by the preset threshold, and the image information of the corresponding region is masked.
Step S64 is to mask a region of interest sub-mesh corresponding to the masked region of interest in the third sequence of region of interest image frames to generate a first sequence of region of interest image frames.
In this embodiment, the generated first region of interest image frame sequence is a region of interest image frame sequence that satisfies the measurement rPPG prediction value.
Through the steps S61 to S64, face channel reconstruction is realized, and the errors in the facial region of interest caused by facial movement and facial region occlusion can be compensated, thereby improving the accuracy of the health characteristic data measurement.
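Steps S61 to S64 can be sketched as follows: cascade the per-frame regions of interest, sum each sub-grid's validity codes over the sequence, and zero out any sub-grid type whose total falls below a threshold. The sequence length, array layout, and threshold are illustrative assumptions (the text gives 120 frames as the default).

```python
import numpy as np

def mask_invalid_subgrids(seq, codes, min_valid=90):
    """seq: (T, G, H, W) cascaded ROI frames; codes: (T, G) validity codes.

    Sub-grid types valid in fewer than ``min_valid`` of the T frames
    (a placeholder threshold) are zeroed across the whole sequence.
    """
    totals = codes.sum(axis=0)            # validity code total per sub-grid
    keep = totals >= min_valid            # boolean mask per sub-grid type
    out = seq.copy()
    out[:, ~keep] = 0.0                   # mask time-series-invalid sub-grids
    return out, keep

T, G = 120, 4
seq = np.ones((T, G, 2, 2))
codes = np.ones((T, G), int)
codes[:, 3] = 0                           # one sub-grid occluded in every frame
masked, keep = mask_invalid_subgrids(seq, codes)
```

The masked sequence corresponds to the first region-of-interest image frame sequence that is then fed to the rPPG value prediction network.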
In some embodiments, the health characteristic data includes heart rate and respiratory rate, and determining the health sign data of the detected subject according to the rPPG prediction value includes the following steps:
step S71, extracting rPPG signal frequency domain information from the rPPG predicted value, wherein the rPPG signal frequency domain information comprises a first frequency domain interval and a second frequency domain interval.
Step S72, selecting rPPG frequencies with the maximum peak from the first frequency domain interval and the second frequency domain interval, respectively, determining the rPPG frequency selected from the first frequency domain interval as a heart rate, and determining the rPPG frequency selected from the second frequency domain interval as a respiratory rate.
In this embodiment, when the heart rate is measured, the rate of change of the heart rate is also measured. Specifically, the maximum peak f_1,max of the heart rate (HR) signal in a preset frequency interval (default 0.5 Hz-3 Hz) is determined from the rPPG frequency domain information, and the heart rate variability (HRV) index is expressed by the standard deviation of the heart rate signal over one minute:

HR = 60 · f_1,max

HRV = sqrt( (1/M) · Σ_{i=1}^{M} (HR_i − mean(HR))² )

(the standard deviation of the M heart rate samples collected over one minute; reconstructed from the text, as the source rendered the expression as an equation image)

In this embodiment, when the respiration rate (RR) is measured, the maximum peak f_2,max of the respiration rate signal in a preset frequency interval (default 0.1 Hz-1 Hz) is obtained from the rPPG frequency domain information. Since this interval overlaps the heart rate band, the frequency corresponding to the heart rate is removed first and the maximum of the remainder is taken as the respiration rate signal:

RR = 60 · f_2,max
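The frequency-domain extraction of steps S71 and S72 can be sketched with an FFT and per-band peak picking; the band limits come from the defaults above (0.5-3 Hz for heart rate, 0.1-1 Hz for respiration), while the helper names and the test signal are illustrative.

```python
import numpy as np

def rates_from_rppg(signal, fs=30.0):
    """Peak-picking in the rPPG spectrum: heart rate from the 0.5-3 Hz
    band, respiration rate from the 0.1-1 Hz band, each scaled by 60.
    (When the bands overlap, the heart-rate frequency is removed before
    picking the respiration peak; that refinement is omitted here.)"""
    spec = np.abs(np.fft.rfft(signal - np.mean(signal)))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)

    def band_peak(lo, hi):
        band = (freqs >= lo) & (freqs <= hi)
        return float(freqs[band][np.argmax(spec[band])])

    heart_rate = 60.0 * band_peak(0.5, 3.0)    # HR = 60 * f_1,max
    resp_rate = 60.0 * band_peak(0.1, 1.0)     # RR = 60 * f_2,max
    return heart_rate, resp_rate

t = np.arange(3600) / 30.0                     # 120 s of samples at 30 Hz
sig = np.sin(2 * np.pi * 1.2 * t) + 0.5 * np.sin(2 * np.pi * 0.25 * t)
hr, rr = rates_from_rppg(sig)                  # 1.2 Hz -> 72 bpm, 0.25 Hz -> 15 rpm
```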
in some embodiments, the health characteristic data includes blood oxygen saturation concentration, the rPPG prediction value includes an infrared rPPG prediction value and a visible rPPG prediction value, and determining the health sign data of the detected subject according to the rPPG prediction value includes:
the blood oxygen saturation concentration is calculated by the following formula:
Figure BDA0002581239240000141
wherein, SPO2Is the blood oxygen saturation concentration, sigmaRFor the standard deviation, sigma, of the predicted value of the infrared rPPG in a preset measurement period1RThe standard deviation of the prediction value of the visible light rPPG in a preset measurement period,Ris the average value of the infrared rPPG predicted value in a preset measurement period,1Rthe average value of the prediction value of the visible light rPPG in a preset measurement period is shown.
In this embodiment, the blood oxygen saturation concentration is an optional measurement item. The infrared image frames must be acquired by an RGB-IR camera with a dual-pass filter sensor, and when predicting through the rPPG value prediction network, both the rPPG predicted value of the infrared band and the rPPG predicted value of the visible light band must be measured.
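A ratio-of-ratios SpO2 estimate over the two rPPG channels can be sketched as below. This is a hedged sketch: the calibration constants A and B are illustrative placeholders, since the patent's exact formula appeared only as an equation image.

```python
import numpy as np

def spo2(ir_rppg, vis_rppg, A=110.0, B=25.0):
    """Ratio-of-ratios SpO2 estimate over a measurement window.

    R = (std_ir/mean_ir) / (std_vis/mean_vis); SpO2 = A - B*R.
    A and B are illustrative calibration constants, not values from
    the patent.
    """
    ir = np.asarray(ir_rppg, float)
    vis = np.asarray(vis_rppg, float)
    r = (ir.std() / ir.mean()) / (vis.std() / vis.mean())
    return A - B * r
```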
Fig. 5 is a flowchart of a method for measuring health sign data according to a preferred embodiment of the present application. As shown in fig. 5, the process includes:
step S501, determining whether the device is in an initialization state, if not, executing step S502, otherwise, executing step S503.
In this embodiment, the initialization includes image resolution and image acquisition frequency settings, and is accomplished by modifying camera sensor production configuration parameters.
Step S502 initializes algorithm parameters, and then step S503 is executed.
In step S503, image preprocessing is performed, and then step S504 is performed.
In this embodiment, the image preprocessing includes adaptive adjustment of exposure parameters, gain parameters and white balance parameters of the image, 3D noise reduction of the image, and digital wide dynamic parameter adjustment.
Step S504, the preprocessed image is subjected to facial skin region segmentation, facial key point detection and facial occlusion and maximum angle determination, and then step S505 is executed.
In step S505, the image face channel is reconstructed to generate a first region-of-interest image frame sequence, and then step S506 is performed.
Step S506, rPPG predicted value measurement, followed by step S507 or step S508 or step S509.
In step S507, the heart rate and change rate signals are extracted, and then step S510 is executed.
In step S508, a respiration rate signal is extracted, and then step S510 is performed.
In step S509, the blood oxygen saturation concentration signal is extracted, and then step S510 is executed.
And step S510, signal post-processing.
In this embodiment, after the measured values of the health sign signals are measured in steps S507, S508 and S509, signal post-processing operations such as abnormal signal filtering, time-series bandpass filtering (moving average filtering, cut-off frequency configuration) and the like are performed, and after the signal post-processing, a stable health sign measurement signal is output to a subsequent module.
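The post-processing stage (abnormal-signal filtering followed by a moving-average smoother) can be sketched as below; the plausibility bounds and window size are illustrative assumptions, not configured values from the patent.

```python
import numpy as np

def postprocess(measurements, lo=30.0, hi=220.0, window=5):
    """Drop physiologically implausible samples, then smooth with a
    moving average. Bounds and window size are placeholder values."""
    x = np.asarray(measurements, float)
    x = x[(x >= lo) & (x <= hi)]                       # abnormal-signal filtering
    if len(x) < window:
        return x
    return np.convolve(x, np.ones(window), mode="valid") / window  # moving average

smoothed = postprocess([72, 73, 500, 74, 75, 71, 72])  # 500 is rejected as abnormal
```

The smoothed series is what would be passed on as the stable health sign measurement signal.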
In an embodiment, compensation for facial motion and facial region occlusion caused facial region-of-interest errors is achieved by facial region tracking at step S504 and image face channel reconstruction at step S505.
It should be noted that the steps illustrated in the above-described flow diagrams or in the flow diagrams of the figures may be performed in a computer system, such as a set of computer-executable instructions, and that, although a logical order is illustrated in the flow diagrams, in some cases, the steps illustrated or described may be performed in an order different than here.
The embodiment also provides a device for measuring health sign data, which is used to implement the above embodiments and preferred embodiments, and the description of the device is omitted. As used hereinafter, the terms "module," "unit," "subunit," and the like may implement a combination of software and/or hardware for a predetermined function. Although the means described in the embodiments below are preferably implemented in software, an implementation in hardware, or a combination of software and hardware is also possible and contemplated.
Fig. 6 is a block diagram of a health sign data measurement apparatus according to an embodiment of the present application, and as shown in fig. 6, the apparatus includes:
and the acquisition module 61 is used for acquiring the infrared image frame sequence of the detected object.
And the extracting module 62 is coupled to the obtaining module 61 and configured to extract a region of interest in each infrared image frame of the sequence of infrared image frames, so as to obtain a first sequence of region of interest image frames.
And the prediction module 63 is coupled to the extraction module 62 and configured to process the first region of interest image frame sequence by using an rPPG value prediction network to obtain an rPPG predicted value of the detected object, where the rPPG value prediction network is an artificial neural network trained according to the second region of interest image frame sequence and an actually measured rPPG value corresponding to the second region of interest image frame sequence.
And the processing module 64 is coupled with the prediction module 63 and is used for determining the health sign data of the detected subject according to the rPPG prediction value.
In some embodiments, the extraction module 62 is configured to detect feature data in a feature map of the infrared image frame, where the feature data includes key point information and region information; generate a plurality of region-of-interest sub-grids and a target region according to the key point information and the region information, respectively; determine the validity of the plurality of region-of-interest sub-grids based on the target region and generate validity codes; and arrange the plurality of region-of-interest sub-grids, together with their validity codes, in a preset arrangement order to generate the region of interest.
In some embodiments, the extracting module 62 is configured to input the infrared image frames into a facial analysis neural network to obtain a feature map; detecting target information and area information in the feature map, wherein the target information comprises effective targets; and detecting key point information from the feature map corresponding to the effective target.
In some of these embodiments, the extraction module 62 is configured to detect a visible region sub-grid among a plurality of region of interest sub-grids, where the visible region sub-grid is an intersection of each region of interest sub-grid and the target region; calculating the percentage of the sub-grids of the visible region in the sub-grids of the region of interest, and judging whether the percentage is greater than a preset threshold value; and under the condition that the percentage is judged to be larger than the preset threshold value, determining the region of interest sub-grid as an effective region of interest sub-grid, and generating an effective code.
In some embodiments, the extracting module 62 is configured to calculate a correlation between the historical feature map and the feature map of the preset region of interest submesh after determining that the region of interest submesh is the valid region of interest submesh and generating the valid code; the historical characteristic map is a characteristic map of a historical effective interesting region sub-grid corresponding to the effective interesting region sub-grid in the previous frame of infrared image frame, and at least one preset interesting region sub-grid is arranged around the effective interesting region sub-grid; and judging whether the correlation is greater than a preset threshold value or not, and updating the preset region of interest sub-grid into an effective region of interest sub-grid under the condition that the correlation is greater than the preset threshold value.
In some embodiments, the extraction module 62 is configured to cascade the regions of interest with preset frames after the regions of interest are generated, so as to obtain a third sequence of region-of-interest image frames; summing the validity codes of each region of interest sub-grid in the third region of interest image frame sequence to obtain a total validity code value of each region of interest sub-grid; judging whether the total value of the validity codes is greater than a preset threshold value or not, and selecting an interested shielding area according to a judgment result; a region of interest sub-grid corresponding to the masked region of interest is masked in the third sequence of region of interest image frames, generating a first sequence of region of interest image frames.
In some embodiments, the extracting module 62 is configured to determine, when it is determined that the total validity code value is greater than the preset threshold, that the region-of-interest sub-grid corresponding to the total validity code value is an effective region-of-interest sub-grid; and under the condition that the total validity coding value is not larger than the preset threshold value, determining the sub-grid of the region of interest corresponding to the total validity coding value as the interested shielding region.
In some embodiments, the processing module 64 is configured to extract rPPG signal frequency domain information in the rPPG prediction value, where the rPPG signal frequency domain information includes a first frequency domain interval and a second frequency domain interval; and respectively selecting the rPPG frequency with the maximum peak value from the first frequency domain interval and the second frequency domain interval, determining the rPPG frequency selected from the first frequency domain interval as the heart rate, and determining the rPPG frequency selected from the second frequency domain interval as the respiratory rate.
In some of these embodiments, the health characteristic data includes the blood oxygen saturation concentration, the rPPG predicted value includes an infrared rPPG predicted value and a visible light rPPG predicted value, and the processing module 64 is configured to calculate the blood oxygen saturation concentration from the two rPPG channels:

SPO2 = A − B · (σ_R / μ_R) / (σ_1R / μ_1R)

where SPO2 is the blood oxygen saturation concentration, σ_R is the standard deviation of the infrared rPPG predicted value over a preset measurement period, σ_1R is the standard deviation of the visible light rPPG predicted value over the period, μ_R is the mean of the infrared rPPG predicted value over the period, and μ_1R is the mean of the visible light rPPG predicted value over the period. (The source rendered the expression as an equation image; the ratio-of-ratios form with calibration constants A and B is an assumed reconstruction.)
The above modules may be functional modules or program modules, and may be implemented by software or hardware. For a module implemented by hardware, the modules may be located in the same processor; or the modules can be respectively positioned in different processors in any combination.
This embodiment also provides a vehicle-mounted cabin infrared vision system, which comprises camera equipment, transmission equipment and an electronic device, the camera equipment being connected to the electronic device through the transmission equipment;
the camera equipment is used for acquiring an infrared image in the cockpit;
the transmission equipment is used for transmitting the infrared image to the electronic device;
the electronic device is adapted to perform the steps of any of the method embodiments described above.
The present embodiment also provides an electronic device comprising a memory having a computer program stored therein and a processor configured to execute the computer program to perform the steps of any of the above method embodiments.
Optionally, the electronic apparatus may further include a transmission device and an input/output device, wherein the transmission device is connected to the processor, and the input/output device is connected to the processor.
Optionally, in this embodiment, the processor may be configured to execute the following steps by a computer program:
S1, acquiring an infrared image frame sequence of the detected subject.
S2, extracting the region of interest in each infrared image frame of the infrared image frame sequence to obtain a first region of interest image frame sequence.
S3, processing the first region of interest image frame sequence by using an rPPG value prediction network to obtain an rPPG prediction value of the detected subject, wherein the rPPG value prediction network is an artificial neural network trained on a second region of interest image frame sequence and an actually measured rPPG value corresponding to the second region of interest image frame sequence.
S4, determining the health sign data of the detected subject according to the rPPG prediction value.
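The control flow of steps S1–S4 can be sketched as follows. The four callables stand in for the camera interface, the region-of-interest extraction, the trained rPPG value prediction network and the sign computation; they are placeholders, not APIs from the patent.

```python
def measure_health_signs(acquire_frames, extract_roi, predict_rppg,
                         signs_from_rppg):
    frames = acquire_frames()                      # S1: infrared image frame sequence
    roi_frames = [extract_roi(f) for f in frames]  # S2: first ROI image frame sequence
    rppg = predict_rppg(roi_frames)                # S3: rPPG value prediction network
    return signs_from_rppg(rppg)                   # S4: health sign data
```

With trivial stubs plugged in, the function simply threads data through the four stages in order.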
It should be noted that, for specific examples in this embodiment, reference may be made to examples described in the foregoing embodiments and optional implementations, and details of this embodiment are not described herein again.
In addition, in combination with the method for measuring health sign data in the foregoing embodiments, the embodiments of the present application may provide a storage medium for implementation. The storage medium stores a computer program; when the computer program is executed by a processor, it implements the method for measuring health sign data in any of the above embodiments.
The technical features of the embodiments described above may be combined arbitrarily. For the sake of brevity, not all possible combinations of these technical features are described; however, as long as a combination contains no contradiction, it should be considered to be within the scope of this specification.
The above-mentioned embodiments express only several implementations of the present application, and although their description is relatively specific and detailed, it should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, and these all fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (13)

1. A method for measuring health sign data, comprising:
acquiring an infrared image frame sequence of a detected object;
extracting a region of interest in each infrared image frame of the infrared image frame sequence to obtain a first region of interest image frame sequence;
processing the first region of interest image frame sequence by utilizing an rPPG value prediction network to obtain an rPPG prediction value of the detected object, wherein the rPPG value prediction network is an artificial neural network obtained by training according to a second region of interest image frame sequence and an actually measured rPPG value corresponding to the second region of interest image frame sequence;
and determining the health sign data of the detected subject according to the rPPG predicted value.
2. The method for measuring health sign data according to claim 1, wherein extracting a region of interest in each infrared image frame of the sequence of infrared image frames comprises:
detecting feature data in a feature map of the infrared image frame, wherein the feature data comprises key point information and region information;
respectively generating a plurality of interesting region sub-grids and a target region according to the key point information and the region information;
determining validity of the plurality of region of interest sub-meshes based on the target region and generating a validity code;
and arranging the plurality of region of interest sub-grids encoded with the validity codes in a preset arrangement order to generate the region of interest.
3. The method for measuring health sign data according to claim 2, wherein the feature data further includes target information, and the detecting the feature data in the feature map of the infrared image frame includes:
inputting the infrared image frame into a facial analysis neural network to obtain the characteristic map;
detecting the target information and the region information in the feature map, wherein the target information comprises a valid target;
and detecting the key point information from the feature map corresponding to the effective target.
4. The method for measuring health sign data according to claim 2, wherein the validity code comprises an effective code, and determining the validity of the plurality of region of interest sub-grids based on the target region and generating the validity code comprises:
detecting a visible region sub-grid in the plurality of region of interest sub-grids, wherein the visible region sub-grid is an intersection of each of the region of interest sub-grids and the target region;
calculating the percentage of the visible region sub-grids in the region of interest sub-grids, and judging whether the percentage is greater than a preset threshold value;
and under the condition that the percentage is larger than a preset threshold value, determining the region-of-interest sub-grid as an effective region-of-interest sub-grid, and generating an effective code.
5. The method for measuring health sign data according to claim 4, wherein after determining the region of interest submesh as an effective region of interest submesh and generating an effective code, the method further comprises:
calculating the correlation between the historical characteristic map and the characteristic map of the preset region of interest sub-grid; the historical characteristic map is a characteristic map of a historical effective region of interest sub-grid corresponding to the effective region of interest sub-grid in the previous frame of the infrared image frame, and at least one preset region of interest sub-grid is arranged around the effective region of interest sub-grid;
and judging whether the correlation is greater than a preset threshold value or not, and updating the preset region of interest sub-grid into the effective region of interest sub-grid under the condition that the correlation is greater than the preset threshold value.
6. Method for measurement of health signs data according to claim 2, characterized in that after the generation of the region of interest, the method comprises:
concatenating the regions of interest of a preset number of frames to obtain a third region of interest image frame sequence;
summing the validity codes of each region of interest sub-grid in the third region of interest image frame sequence to obtain a total validity code value of each region of interest sub-grid;
judging whether the total validity code value is greater than a preset threshold value, and selecting a masked region of interest according to the judgment result;
masking the region of interest sub-mesh corresponding to the masked region of interest in the third sequence of region of interest image frames to generate a first sequence of region of interest image frames.
7. The method for measuring health sign data according to claim 6, wherein determining whether the total value of validity codes is greater than a preset threshold, and selecting a masked region of interest according to the determination result comprises:
under the condition that the total validity coding value is judged to be larger than a preset threshold value, determining the region-of-interest sub-grid corresponding to the total validity coding value as a valid region-of-interest sub-grid;
and under the condition that the total validity coding value is not larger than a preset threshold value, determining the region-of-interest sub-grid corresponding to the total validity coding value as the masked region of interest.
8. The method for measuring health sign data according to claim 1, wherein the health sign data includes a heart rate and a respiratory rate, and determining the health sign data of the detected subject according to the rPPG prediction value includes:
extracting rPPG signal frequency domain information from the rPPG prediction value, wherein the rPPG signal frequency domain information comprises a first frequency domain interval and a second frequency domain interval,
and respectively selecting the rPPG frequency with the maximum peak value from the first frequency domain interval and the second frequency domain interval, determining the rPPG frequency selected from the first frequency domain interval as the heart rate, and determining the rPPG frequency selected from the second frequency domain interval as the respiration rate.
9. The method for measuring health sign data according to claim 1, wherein the health sign data includes blood oxygen saturation concentration, the rPPG prediction value includes an infrared rPPG prediction value and a visible rPPG prediction value, and determining the health sign data of the detected subject according to the rPPG prediction value includes:
the blood oxygen saturation concentration is calculated by the following formula:
(formula image FDA0002581239230000031: SPO2 expressed in terms of σR, μR, σIR and μIR)
wherein SPO2 is the blood oxygen saturation concentration, σR is the standard deviation of the infrared rPPG prediction value over a preset measurement period, σIR is the standard deviation of the visible-light rPPG prediction value over the preset measurement period, μR is the mean of the infrared rPPG prediction value over the preset measurement period, and μIR is the mean of the visible-light rPPG prediction value over the preset measurement period.
10. A device for measuring health sign data, comprising:
the acquisition module is used for acquiring an infrared image frame sequence of the detected object;
the extraction module is used for extracting a region of interest in each infrared image frame of the infrared image frame sequence to obtain a first region of interest image frame sequence;
the prediction module is used for processing the first region of interest image frame sequence by using an rPPG value prediction network to obtain an rPPG prediction value of the detected object, wherein the rPPG value prediction network is an artificial neural network trained on a second region of interest image frame sequence and an actually measured rPPG value corresponding to the second region of interest image frame sequence;
and the processing module is used for determining the health sign data of the detected object according to the rPPG predicted value.
11. A vehicle-mounted cabin infrared vision system, characterized by comprising camera equipment, transmission equipment and an electronic device; the camera equipment is connected to the electronic device through the transmission equipment;
the camera equipment is used for acquiring an infrared image in the cockpit;
the transmission equipment is used for transmitting the infrared image to the electronic device;
the electronic device is configured to perform a method of measuring health sign data according to any one of claims 1 to 9.
12. An electronic device comprising a memory and a processor, wherein the memory stores a computer program, and the processor is configured to execute the computer program to implement the method for measuring health sign data according to any one of claims 1 to 9.
13. A storage medium on which a computer program is stored, which program, when being executed by a processor, carries out a method of measuring health sign data according to any one of claims 1 to 9.
CN202010668073.3A 2020-07-13 2020-07-13 Method, device, system and storage medium for measuring health sign data Active CN112017155B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010668073.3A CN112017155B (en) 2020-07-13 2020-07-13 Method, device, system and storage medium for measuring health sign data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010668073.3A CN112017155B (en) 2020-07-13 2020-07-13 Method, device, system and storage medium for measuring health sign data

Publications (2)

Publication Number Publication Date
CN112017155A true CN112017155A (en) 2020-12-01
CN112017155B CN112017155B (en) 2023-12-26

Family

ID=73499750

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010668073.3A Active CN112017155B (en) 2020-07-13 2020-07-13 Method, device, system and storage medium for measuring health sign data

Country Status (1)

Country Link
CN (1) CN112017155B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112200162A (en) * 2020-12-03 2021-01-08 中国科学院自动化研究所 Non-contact heart rate measuring method, system and device based on end-to-end network
CN113537210A (en) * 2021-05-31 2021-10-22 浙江大华技术股份有限公司 Temperature detection method, device, system, computer equipment and storage medium
WO2023193711A1 (en) * 2022-04-07 2023-10-12 Faceheart Corporation Contactless physiological measurement device and method


Patent Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20120095647A (en) * 2011-02-21 2012-08-29 부경대학교 산학협력단 Apparatus and method for home healthcare monitoring
US20140180132A1 (en) * 2012-12-21 2014-06-26 Koninklijke Philips Electronics N.V. System and method for extracting physiological information from remotely detected electromagnetic radiation
CN104871212A (en) * 2012-12-21 2015-08-26 皇家飞利浦有限公司 System and method for extracting physiological information from remotely detected electromagnetic radiation
US20150313484A1 (en) * 2014-01-06 2015-11-05 Scanadu Incorporated Portable device with multiple integrated sensors for vital signs scanning
US20170354334A1 (en) * 2014-12-16 2017-12-14 Oxford University Innovation Limited Method and apparatus for measuring and displaying a haemodynamic parameter
US20160206216A1 (en) * 2015-01-19 2016-07-21 Koninklijke Philips N.V. Device, system and method for skin detection
US20190029543A1 (en) * 2016-01-21 2019-01-31 Oxehealth Limited Method and apparatus for estimating heart rate
CN108604376A (en) * 2016-02-08 2018-09-28 皇家飞利浦有限公司 Equipment, system and method for pulsation detection
CN109791692A (en) * 2016-08-22 2019-05-21 科伊奥斯医药股份有限公司 Computer aided detection is carried out using the multiple images of the different perspectives from area-of-interest to improve accuracy in detection
CN108073864A (en) * 2016-11-15 2018-05-25 北京市商汤科技开发有限公司 Target object detection method, apparatus and system and neural network structure
EP3664704A1 (en) * 2017-08-08 2020-06-17 Koninklijke Philips N.V. Device, system and method for determining a physiological parameter of a subject
CN108764034A (en) * 2018-04-18 2018-11-06 浙江零跑科技有限公司 A kind of driving behavior method for early warning of diverting attention based on driver's cabin near infrared camera
CN110522420A (en) * 2018-11-15 2019-12-03 广州小鹏汽车科技有限公司 Method and apparatus for measuring the physiologic information of living body in the vehicles
CN109871808A (en) * 2019-02-21 2019-06-11 天津惊帆科技有限公司 Atrial fibrillation model training and detecting method and device
CN110236515A (en) * 2019-07-19 2019-09-17 合肥工业大学 A kind of contactless heart rate detection method based on near-infrared video
CN111259719A (en) * 2019-10-28 2020-06-09 浙江零跑科技有限公司 Cab scene analysis method based on multi-view infrared vision system
CN111134650A (en) * 2019-12-26 2020-05-12 上海眼控科技股份有限公司 Heart rate information acquisition method and device, computer equipment and storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
MARTINEZ N, ET AL.: "Non-contact photoplethysmogram and instantaneous heart rate estimation from infrared face video", 2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, pages 2020 - 2024 *
LI XIAOYUAN; WU PENG; LIU YUN; SI HONGYU; WANG ZHENLONG: "Heart rate parameter extraction based on face video", Optics and Precision Engineering, no. 03, pages 548 - 557 *
LIANG ZHIMIN; CHEN QI; XIAO SHUMING; MA JIE; ZHEN QINGKAI: "Research on non-contact detection of heart rate using thermal imaging technology", China Sport Science and Technology, no. 01, pages 136 - 145 *


Also Published As

Publication number Publication date
CN112017155B (en) 2023-12-26

Similar Documents

Publication Publication Date Title
CN112017155B (en) Method, device, system and storage medium for measuring health sign data
CN109993093B (en) Road rage monitoring method, system, equipment and medium based on facial and respiratory characteristics
CN106778695B (en) Multi-person rapid heart rate detection method based on video
Yuan et al. Fast hyperspectral anomaly detection via high-order 2-D crossing filter
CN112446270A (en) Training method of pedestrian re-identification network, and pedestrian re-identification method and device
CN111505632B (en) Ultra-wideband radar action attitude identification method based on power spectrum and Doppler characteristics
CN108932479A (en) A kind of human body anomaly detection method
CN111797804A (en) Channel state information human activity recognition method and system based on deep learning
CN112381011A (en) Non-contact heart rate measurement method, system and device based on face image
CN112465905A (en) Characteristic brain region positioning method of magnetic resonance imaging data based on deep learning
CN113656462B (en) Wisdom sports park data analysis system based on thing networking
CN111079764A (en) Low-illumination license plate image recognition method and device based on deep learning
CN111178331A (en) Radar image recognition system, method, apparatus, and computer-readable storage medium
CN115024706A (en) Non-contact heart rate measurement method integrating ConvLSTM and CBAM attention mechanism
US20230184924A1 (en) Device for characterising the actimetry of a subject in real time
CN111340758A (en) Novel efficient iris image quality evaluation method based on deep neural network
CN113326781B (en) Non-contact anxiety recognition method and device based on face video
Tao et al. Multi-feature fusion prediction of fatigue driving based on improved optical flow algorithm
CN116563768B (en) Intelligent detection method and system for microplastic pollutants
CN114943924B (en) Pain assessment method, system, equipment and medium based on facial expression video
CN112716468A (en) Non-contact heart rate measuring method and device based on three-dimensional convolution network
CN111063438B (en) Sleep quality evaluation system and method based on infrared image sequence
CN106991413A (en) A kind of unmanned plane
CN111914798B (en) Human body behavior identification method based on skeletal joint point data
CN115581435A (en) Sleep monitoring method and device based on multiple sensors

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 310051 Room 301, building 3, no.2930, South Ring Road, Puyan street, Binjiang District, Hangzhou City, Zhejiang Province

Applicant after: Zhejiang huaruijie Technology Co.,Ltd.

Address before: 310051 Room 301, building 3, no.2930, South Ring Road, Puyan street, Binjiang District, Hangzhou City, Zhejiang Province

Applicant before: Zhejiang Dahua Automobile Technology Co.,Ltd.

CB02 Change of applicant information
GR01 Patent grant
GR01 Patent grant