CN115311714A - Method, device, equipment and system for detecting physiological state and storage medium


Info

Publication number
CN115311714A
Authority
CN
China
Prior art keywords: determining, mouth, region, detected, regions
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210947837.1A
Other languages
Chinese (zh)
Inventor
高原
孙其功
杨慧
马堃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xi'an Shangtang Intelligent Technology Co ltd
Original Assignee
Xi'an Shangtang Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xi'an Shangtang Intelligent Technology Co ltd
Priority to CN202210947837.1A
Publication of CN115311714A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 - Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 - Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 - Detection; Localisation; Normalisation
    • G06V40/165 - Detection; Localisation; Normalisation using facial parts and geometric relationships
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 - Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 - Movements or behaviour, e.g. gesture recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Geometry (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of the present disclosure provide a method, an apparatus, a device, a system and a storage medium for detecting a physiological state. The method includes: acquiring at least two frames of thermal images of an object to be detected; extracting a mouth region of the object to be detected from each of the at least two frames of thermal images to obtain a group of mouth regions of the object to be detected; and performing milk regurgitation detection on the group of mouth regions and determining a detection result.

Description

Method, device, equipment, system and storage medium for detecting physiological state
Technical Field
The present disclosure relates to, but is not limited to, the field of computer vision technologies, and in particular to a method, an apparatus, a device, a system and a storage medium for detecting a physiological state.
Background
Detection devices for determining whether an infant regurgitates milk are increasingly popular. Such devices are typically integrated into clothing or wearable items and perform contact-based detection via their own sensors. This contact-based approach requires the user to wear the device on a body part, which is neither convenient nor quick. Moreover, the device may fall off or be damaged as the user moves (e.g., rolls over), which degrades the accuracy of the detection result and may prevent an abnormal state of the user from being identified in time.
Disclosure of Invention
In view of this, the embodiments of the present disclosure provide at least a method, an apparatus, a device, a system and a storage medium for detecting a physiological state.
The technical solutions of the embodiments of the present disclosure are realized as follows:
In one aspect, an embodiment of the present disclosure provides a method for detecting a physiological state, including: acquiring at least two frames of thermal images of an object to be detected; extracting a mouth region of the object to be detected from each of the at least two frames of thermal images to obtain a group of mouth regions of the object to be detected; and performing milk regurgitation detection on the group of mouth regions and determining a detection result.
In another aspect, an embodiment of the present disclosure provides an apparatus for detecting a physiological state, including: an acquisition module configured to acquire at least two frames of thermal images of an object to be detected; a first extraction module configured to extract a mouth region of the object to be detected from each of the at least two frames of thermal images to obtain a group of mouth regions of the object to be detected; and a first determination module configured to perform milk regurgitation detection on the group of mouth regions and determine a detection result.
In still another aspect, an embodiment of the present disclosure provides a computer device, including a memory and a processor, where the memory stores a computer program executable on the processor, and the processor implements some or all of the steps of the above method when executing the computer program.
In another aspect, an embodiment of the present disclosure provides a system for detecting a physiological state, including: a thermal imager configured to capture at least two frames of thermal images of an object to be detected; and a computer device configured to acquire the at least two frames of thermal images of the object to be detected, extract a mouth region of the object to be detected from each of the at least two frames of thermal images to obtain a group of mouth regions of the object to be detected, and perform milk regurgitation detection on the group of mouth regions and determine a detection result.
In yet another aspect, the present disclosure provides a computer-readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements part or all of the steps of the above method.
In yet another aspect, the disclosed embodiments provide a computer program including computer-readable code which, when run on a computer device, causes a processor in the computer device to execute some or all of the steps of the above method.
In yet another aspect, the disclosed embodiments provide a computer program product comprising a non-transitory computer readable storage medium storing a computer program, which when read and executed by a computer, implements some or all of the steps of the above method.
In the embodiments of the present disclosure, at least two frames of thermal images of the object to be detected are first acquired so that subsequent milk regurgitation detection can be performed on them. Compared with acquiring color images of the object to be detected, thermal images are less affected by environmental factors such as the brightness of the environment in which the object is located, so the mouth region of the object to be detected can be extracted from the thermal images simply and accurately to obtain a group of mouth regions. Milk regurgitation detection can then be performed on the group of mouth regions, and the detection result can be obtained accurately and quickly, so that the physiological state of the object to be detected is known in time and the risk to the object to be detected is reduced.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the technical aspects of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure.
Fig. 1 is a schematic flow chart illustrating an implementation of a method for detecting a physiological status according to an embodiment of the present disclosure;
fig. 2 is a schematic flow chart illustrating an implementation of a method for detecting a physiological state according to an embodiment of the present disclosure;
fig. 3 is a schematic flow chart illustrating an implementation of a method for detecting a physiological status according to an embodiment of the present disclosure;
fig. 4 is a schematic flow chart illustrating an implementation of a method for detecting a physiological state according to an embodiment of the present disclosure;
fig. 5A is a schematic diagram illustrating a physiological status detection system according to an embodiment of the present disclosure;
fig. 5B is a schematic diagram of a building architecture of various hardware devices according to an embodiment of the present disclosure;
fig. 6A is a schematic view of a forehead region obtained from a thermal image according to an embodiment of the present disclosure;
fig. 6B is a schematic view of a sub-nasal region captured from a thermal image provided by an embodiment of the present disclosure;
fig. 6C is a schematic diagram of a pre-filtered respiration signal according to an embodiment of the disclosure;
FIG. 6D is a schematic diagram of a filtered respiration signal according to an embodiment of the disclosure;
fig. 6E is a schematic diagram of a face region for determining a current heart rate according to an embodiment of the present disclosure;
fig. 6F is a schematic diagram of a heartbeat signal in a time domain according to an embodiment of the disclosure;
fig. 6G is a schematic diagram of a heartbeat signal in a frequency domain according to an embodiment of the present disclosure;
fig. 7 is a schematic structural diagram illustrating a physiological status detecting device according to an embodiment of the present disclosure;
fig. 8 is a hardware entity diagram of a computer device according to an embodiment of the present disclosure.
Detailed Description
To make the objectives, technical solutions and advantages of the present disclosure clearer, the technical solutions of the present disclosure are further elaborated below with reference to the drawings and embodiments. The described embodiments should not be construed as limiting the present disclosure, and all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the protection scope of the present disclosure.
In the following description, reference is made to "some embodiments", which describe a subset of all possible embodiments. "Some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other where there is no conflict. The terms "first/second/third" merely distinguish similar objects and do not denote a particular ordering of objects; where permissible, "first/second/third" may be interchanged in a particular order or sequence so that the embodiments of the disclosure described herein can be practiced in an order other than that shown or described herein.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs. The terminology used herein is for the purpose of describing the disclosure only and is not intended to be limiting of the disclosure.
Embodiments of the present disclosure provide a method of detecting a physiological state, which may be performed by a processor of a computer device. The computer device refers to a device with physiological state detection capability, such as a server, a notebook computer, a tablet computer, a desktop computer, a smart television, a set-top box, a mobile device (e.g., a mobile phone, a portable video player, a personal digital assistant, a dedicated messaging device, and a portable game device). Fig. 1 is a schematic flow chart of an implementation of a method for detecting a physiological state according to an embodiment of the present disclosure, as shown in fig. 1, the method includes the following steps S101 to S103:
step S101, at least two frames of thermal images of an object to be detected are obtained.
Here, the object to be detected may be understood as an object on which physiological state detection can be performed, and may include persons, animals, etc., such as infants or patients lacking self-care ability, drivers of vehicles, and pet cats or dogs. The thermal image may be an image recording the heat radiated outward by the object to be detected, also referred to as an infrared thermal image. In implementation, a device such as a thermal imager can be used to shoot the object to be detected to obtain a video stream containing the object; frames are then extracted from the video stream to obtain multiple frames of thermal images. The multiple frames of thermal images of the object to be detected can be acquired continuously or discontinuously. Alternatively, a device such as a thermal imager can acquire a single-frame thermal image of the object to be detected multiple times at a preset time interval to obtain the multiple frames of thermal images.
The thermal imager can acquire thermal images of the object to be detected in a non-contact manner using infrared thermal imaging technology, which may operate in a passive mode, an active mode, etc. Passive infrared thermal imaging collects the infrared radiation emitted by the object to be detected; active infrared thermal imaging actively illuminates the object with an infrared light source and collects the infrared radiation it reflects. For example: a thermal imager for passive infrared thermal imaging may be arranged directly above a crib, so that it collects thermal images containing the infant while the influence of infrared radiation on the infant is reduced.
In some embodiments, step S101 includes: receiving a thermal-image video containing the object to be detected uploaded by a user, and acquiring multiple frames of thermal images from the video. For example: frames are extracted from the thermal-image video at a preset frame interval to obtain the multiple frames of thermal images.
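As a non-authoritative sketch of this frame-extraction step (assuming OpenCV is available; the function name and the interval of 10 frames are illustrative choices, not prescribed by the disclosure):

    import cv2

    def extract_thermal_frames(video_path, frame_interval=10):
        # Keep every `frame_interval`-th frame of a thermal video.
        frames = []
        cap = cv2.VideoCapture(video_path)
        index = 0
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            if index % frame_interval == 0:
                frames.append(frame)
            index += 1
        cap.release()
        return frames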
Step S102, extracting the mouth region of the object to be detected from each thermal image of the at least two frames of thermal images to obtain a group of mouth regions of the object to be detected.
Here, a face region of the object to be detected may be extracted from the thermal image, and then a mouth region may be extracted from the face region, where the face region may refer to a region including a face of the object to be detected in the thermal image, and the mouth region may refer to a region including a mouth in the face region. Because the mouth region is small relative to the size of the thermal image, the efficiency and accuracy of the determination of the mouth region may be improved as compared to extracting the small mouth region directly from the thermal image.
The computer device can perform face detection on the thermal image using a trained face detection model to obtain a face region, and then extract the mouth region from the face region based on preset associated attributes of the mouth relative to the face region. The face detection model may be a preset machine learning model, such as a neural network model, for performing face detection. The associated attributes of the mouth with respect to the face region may include: the mouth is located at the lower-middle part of the face, and the ratio of the mouth size to the face size falls within a fixed range. A region of a preset size can therefore be extracted from the lower-middle part of the face region as the mouth region.
In implementation, a target detection model can be used to detect the face region to obtain a target region. The target detection model may be a trained Convolutional Neural Network (CNN) for obtaining a target region (e.g., a mouth region, an eye region, etc.) from the face region. The face detection model and the target detection model may be integrated into a complete detection model, of which the target detection model is a part. That is, the complete detection model can be used to detect the thermal image and directly obtain the mouth region located in the face region.
In some embodiments, step S102 includes: performing face key point detection on each thermal image to obtain face key point information; and extracting the mouth region from the thermal image according to the face key point information. For example, the face key point information includes the type and coordinates of each face key point; the key points belonging to the mouth may be screened from all face key points based on their type, and the mouth region is then determined based on the coordinates of the mouth key points.
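For illustration only, a minimal sketch of cropping the mouth region from a detected face box using fixed face-relative proportions (the proportion values below are assumptions, not values fixed by the disclosure):

    def crop_mouth_region(thermal_image, face_box):
        # thermal_image is a numpy array; face_box is (x, y, w, h) in pixels.
        x, y, w, h = face_box
        # Assume the mouth occupies the lower-middle part of the face.
        mx, my = x + int(0.25 * w), y + int(0.65 * h)
        mw, mh = int(0.5 * w), int(0.25 * h)
        return thermal_image[my:my + mh, mx:mx + mw]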
Step S103, performing milk regurgitation detection on the group of mouth regions and determining a detection result.
Here, the detection result may indicate the presence or absence of milk regurgitation, or may include a regurgitation probability characterizing the presence or absence of regurgitation. For example: if the regurgitation probability is greater than 50%, regurgitation is determined to be present; if it is less than or equal to 50%, regurgitation is determined to be absent. The mouth regions in a group are associated in temporal order. For example: the first frame of thermal images containing the first mouth region is acquired at a first time, the second frame containing the second mouth region at a second time, and the third frame containing the third mouth region at a third time.
In practice, pixel value differences between the mouth regions may be determined based on the pixel values of pixels in the mouth regions; change values of attributes such as mouth temperature between different acquisition times are then determined based on the pixel value differences and a preset correspondence between pixel values and temperature; and the regurgitation probability of the object to be detected is determined based on the change value of at least one attribute and a preset correspondence between change values and regurgitation probabilities. The attributes of the mouth may further include shape, opening size, and the like; the change values of these other attributes are determined in a manner similar to the temperature change value, which is not limited herein.
When the object to be detected regurgitates milk, there is a temperature difference between the milk and the skin surface, and the mouth shape also changes. Step S103 may therefore include: determining a temperature change value corresponding to the pixel value difference, based on the pixel value differences between the group of mouth regions and the correspondence between pixel values and temperature; meanwhile, recognizing the positions of the upper and lower lips based on the pixel value differences between the group of mouth regions, obtaining the change in the distance between the upper and lower lips between different acquisition times, and taking this distance change as the shape change value of the mouth; and determining that the object to be detected is regurgitating when the shape change value and the temperature change value meet preset conditions. For example: the preset conditions are determined to be met when the mouth of the object to be detected changes from a closed state to an open state and the temperature of the mouth region rises.
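A minimal sketch of this check, assuming a linear pixel-to-temperature calibration (chosen here to reproduce the 125 -> 35 degrees Celsius example used later in the text; real calibrations are device-specific) and a lip-gap change computed elsewhere:

    def pixel_to_temperature(pixel_value):
        # Assumed linear calibration: 0.12 * 125 + 20 = 35 degrees Celsius.
        return 0.12 * pixel_value + 20.0

    def temperature_change(first_mouth, last_mouth):
        # Temperature change between two mouth crops (numpy arrays),
        # inferred from their mean pixel values.
        return (pixel_to_temperature(float(last_mouth.mean()))
                - pixel_to_temperature(float(first_mouth.mean())))

    def meets_condition(temp_change, lip_gap_change):
        # Example condition from the text: the mouth opens (gap grows)
        # while the mouth-region temperature rises.
        return lip_gap_change > 0 and temp_change > 0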
In the embodiments of the present disclosure, at least two frames of thermal images of the object to be detected are first acquired so that subsequent milk regurgitation detection can be performed on them. Compared with acquiring color images of the object to be detected, thermal images are less affected by environmental factors such as the brightness of the environment in which the object is located, so the mouth region of the object to be detected can be extracted from the thermal images simply and accurately to obtain a group of mouth regions. Milk regurgitation detection can then be performed on the group of mouth regions, and the detection result can be obtained accurately and quickly, so that the physiological state of the object to be detected is known in time and the risk to the object to be detected is reduced.
An embodiment of the present disclosure provides a method for detecting a physiological state, as shown in fig. 2, the method includes the following steps S201 to S205:
steps S201 to S202 correspond to steps S101 to S102, respectively, and specific embodiments of steps S101 to S102 may be referred to in implementation.
In step S203, the pixel characteristics of each mouth region are determined.
Here, the pixel feature may be a feature corresponding to the pixel points; it may include at least one of texture features, color features, gradient features or spatial relationships, and may be represented as a feature matrix. Texture features characterize surface properties of objects in the thermal image, color features characterize their colors, gradient features characterize their shapes and structures, and spatial relationship features characterize the spatial relations (e.g., overlap, containment, etc.) between multiple extracted objects in the thermal image. In implementation, a feature extraction model may be used to extract features of the mouth region to obtain its pixel features. The feature extraction model may be a trained neural network for performing pixel feature extraction.
In some embodiments, step S203 includes: determining the pixel features of the mouth region based on the pixel value of each pixel in the mouth region. For example: the size of the mouth region is taken as the dimension of its pixel features, and a matrix of that dimension is initialized to characterize the pixel features of the mouth region. The pixel value of each pixel in the mouth region is taken as the element value at the corresponding position in the matrix, yielding the pixel features of the mouth region.
Step S204, determining at least one of temperature change information, shape change information and a milk foam recognition result characterized by the group of mouth regions, based on feature differences between the pixel features of the mouth regions.
Here, the computer device may determine a sub-temperature value for each mouth region in the group based on its pixel features, determine the temperature difference between the sub-temperature value of the mouth region in the last frame of thermal image and that in the first frame, and take this temperature difference as the temperature change information characterized by the group of mouth regions. Specifically: the pixel mean of a mouth region may be determined from its pixel features, and the corresponding sub-temperature value determined from the pixel mean and the preset correspondence between pixel values and temperature. For example: if the sub-temperature value for the mouth region of the last frame of thermal image is 155 and that of the first frame is 125, the temperature change between them may be determined as 30 and used as the temperature change information characterizing the group of mouth regions.
In some embodiments, step S204 includes: determining whether the shape of the mouth has changed, i.e., the shape change information of the mouth region, based on the feature differences between pixel features of the mouth regions of consecutive frames. Specifically: using a preset feature distance (e.g., Euclidean distance, Manhattan distance or Chebyshev distance) between different pixel features as the feature difference, the computer device may determine the feature distance between the pixel features of the mouth region in the current frame of thermal image and those in the next frame, and take this feature distance as the shape change information.
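As an illustrative sketch (not the disclosure's required implementation), the raw pixel-value matrix can serve as the pixel feature and the Euclidean norm of the matrix difference as the feature distance:

    import numpy as np

    def pixel_features(region):
        # Simplest pixel feature named above: the raw pixel-value matrix.
        return region.astype(np.float32)

    def shape_change_info(curr_region, next_region):
        # Euclidean feature distance between same-sized mouth crops of
        # consecutive frames, used as the shape change information.
        return float(np.linalg.norm(pixel_features(curr_region) - pixel_features(next_region)))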
The object to be detected may be an infant; circular milk foam may appear in the mouth region when the infant regurgitates, and the milk foam recognition result may be either that milk foam is detected or that it is not. In some embodiments, step S204 includes: matching the feature differences between the pixel features of the mouth regions of consecutive frames against preset milk foam features to determine whether milk foam exists in the mouth region. For example: the computer device may determine the feature distance between the pixel features of the mouth region in the current frame of thermal image and those in the next frame, and compare this feature distance against the milk foam features for similarity; if the similarity is greater than a similarity threshold, milk foam is determined to be detected; if the similarity is less than or equal to the threshold, milk foam is determined not to be detected.
Step S205, determining the detection result based on at least one of the temperature change information, the shape change information and the milk foam recognition result.
Here, the computer device may determine the corresponding regurgitation probability based on one or more of the temperature change information, the shape change information and the milk foam recognition result, together with a preset correspondence between these factors and regurgitation probabilities. For example: if the temperature change falls within a preset first temperature range, the shape change falls within a preset first shape range, and milk foam is detected, the corresponding regurgitation probability is 70%; if the temperature change falls within a preset second temperature range, the shape change falls within a preset second shape range, and milk foam is detected, the corresponding regurgitation probability is 80%; and so on. In some embodiments, the computer device may instead predict the regurgitation probability from one or more of the temperature change information, the shape change information and the milk foam recognition result using a trained probability prediction model, which may be a trained neural network for predicting the regurgitation probability.
In the embodiments of the present disclosure, the temperature change information, the shape change information and the milk foam recognition result can be determined simply and accurately from the feature differences between the pixel features of the mouth regions, and the detection result can then be determined more accurately based on one or more of them. Meanwhile, because the thermal imager collects thermal images of the user and regurgitation detection is performed on those images, detection can run around the clock without being affected by illumination conditions.
In some embodiments, the step S204 may include the following steps S211 to S214:
step S211, determining a sub-temperature value of each mouth region based on the pixel characteristics of the mouth region.
Here, the sub-temperature value of a mouth region may be understood as the temperature characterized by the mouth region of the current frame. The computer device may determine the average pixel value of the mouth region based on the pixel values of its pixels, and determine the corresponding sub-temperature value based on the average pixel value and the correspondence between pixel values and temperature. For example: if the average pixel value of the mouth region is determined to be 125, the sub-temperature value characterized by the mouth region of the current frame of thermal image may be determined to be 35 degrees Celsius based on the correspondence between pixel values and temperature.
Step S212, grouping the group of mouth regions to obtain a first group of mouth regions and a second group of mouth regions.
For example: a group of 10 mouth regions can be divided equally into two groups according to the acquisition times of the thermal images, the first group being the 5 mouth regions with the earlier acquisition times and the second group being the 5 mouth regions with the later acquisition times.
Step S213, determining a first temperature value and a second temperature value of a set of mouth regions respectively based on the sub-temperature values of the first set of mouth regions and the sub-temperature values of the second set of mouth regions.
For example: determining a first temperature mean value corresponding to the first group of mouth areas based on the sub-temperature values of the first group of mouth areas, and determining the first temperature mean value as a first temperature value; and determining a second temperature mean value corresponding to the second group of mouth areas based on the sub-temperature values of the second group of mouth areas, and determining the second temperature mean value as a second temperature value.
Step S214, determining the temperature variation information based on a temperature difference value between the first temperature value and the second temperature value.
For example: if the first temperature value is determined to be 36 degrees Celsius and the second temperature value is determined to be 42 degrees Celsius, the temperature change information is determined to be 6 degrees Celsius.
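A short sketch of steps S212 to S214 (the half-and-half grouping follows the example above; the input is the chronologically ordered per-frame sub-temperature values):

    import numpy as np

    def grouped_temperature_change(sub_temperatures):
        half = len(sub_temperatures) // 2
        first = np.mean(sub_temperatures[:half])   # earlier acquisitions
        second = np.mean(sub_temperatures[half:])  # later acquisitions
        return float(second - first)

    # Reproduces the example above: 36 -> 42 degrees Celsius gives 6.
    assert grouped_temperature_change([36.0] * 5 + [42.0] * 5) == 6.0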
In the embodiments of the present disclosure, the temperature change information can be determined more accurately through the first and second temperature values of the group of mouth regions, reducing abnormal determinations of the temperature change information.
In some embodiments, the step S204 may include the following steps S221 to S222:
step S221, determining an opening/closing degree of each mouth region based on the pixel characteristics of the mouth region.
Here, the opening and closing degree may refer to a size of closing or opening of the mouth. In implementation, the position coordinates of the upper lip and the lower lip can be recognized based on the pixel characteristics of the mouth region; and determining the distance between the upper lip and the lower lip based on the position coordinates of the upper lip and the lower lip, and determining the distance between the upper lip and the lower lip as the opening and closing degree. If the distance between the upper and lower lips is determined to be 0 based on the pixel characteristics of the mouth region, the degree of opening and closing is 0, indicating that the mouth is closed.
Step S222 of determining a difference between opening and closing degrees of the mouth region, and determining the difference between the opening and closing degrees as the shape change information.
For example: if the degree of opening and closing of the mouth region in the first thermal image is determined to be 0.1 and the degree of opening and closing of the mouth region in the last thermal image is determined to be 0.9, the shape change information is determined to be 0.8. The computer device may also determine an average opening/closing degree of the first group of mouth regions and an average opening/closing degree of the second group of mouth regions, determine a difference between the average opening/closing degrees of the different groups of mouth regions, and determine the difference between the average opening/closing degrees of the different groups of mouth regions as the shape change information.
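A minimal sketch, assuming upper- and lower-lip coordinates have already been recognized from the pixel features:

    import numpy as np

    def opening_degree(upper_lip, lower_lip):
        # Distance between lip coordinates; 0 means the mouth is closed.
        return float(np.linalg.norm(np.subtract(upper_lip, lower_lip)))

    def shape_change(opening_degrees):
        # Difference between the last and first frames' opening degrees,
        # e.g. 0.9 - 0.1 = 0.8 as in the example above.
        return opening_degrees[-1] - opening_degrees[0]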
In the embodiment of the present disclosure, the opening and closing degree of the mouth region may be determined based on the pixel characteristics of the mouth region, so that the shape change information is accurately determined based on the opening and closing degree.
In some embodiments, the step S204 may include the following step S231:
step S231, identifying a feature difference between pixel features of the mouth region by using a milk bubble identification model, and determining the milk bubble identification result.
Here, the computer device may input the feature difference between the pixel features of the mouth region into the trained bubble recognition model for recognition, so as to obtain a recognition result, that is, whether a bubble exists. The milk foam recognition model may be a trained neural network that recognizes based on the geometry (typically circular) of the milk foam for recognizing the milk foam.
In the embodiment of the disclosure, the milk bubble identification model can be used for identifying the feature difference between the pixel features of the mouth region, and whether the milk bubbles exist can be determined quickly and stably.
In some embodiments, the step S205 may include the following step S2051:
step S2051, when the temperature change is smaller than a first temperature threshold and the shape change is larger than a first preset threshold, or when the temperature change is larger than a second temperature threshold and the identification result of the milk froth includes that the milk froth is detected, determining that the detection result is that the object to be detected is in a milk spitting state.
Here, after the infant drinks the milk having a low temperature, a milk spitting phenomenon may occur. During the milk spitting process of the baby, the temperature of the mouth part can be reduced, the shape of the mouth part can be changed, and the like. Therefore, in the case where the temperature change is smaller than the first temperature threshold and the shape change is larger than the preset threshold, it is determined that the infant is in the milk regurgitation state. Wherein the first temperature threshold may be less than the second temperature threshold. For example, the mouth region of the last frame of thermal image may change from-10 degrees celsius to-7 degrees celsius, less than a first predetermined temperature threshold, and change in shape to 0.6, greater than a predetermined threshold of 0.1, etc., to determine that the infant is in a milk regurgitation state.
When the temperature of the milk for the infant to drink is higher than the temperature of the human body and milk spitting occurs, the temperature of the mouth may also rise, and milk foam or the like may occur. Therefore, in case the temperature variation is larger than the second temperature threshold and the identification result of the milk froth comprises identifying the milk froth, it may be determined that the infant is in a milk spitting state. For example, if the temperature change between the mouth region of the last frame of thermal image and the mouth region of the first frame is 10 degrees celsius, which is greater than the second preset temperature threshold of 7 degrees celsius, and a milk foam is detected, it can be determined that the infant is in a milk spitting state.
In some embodiments, the first temperature threshold may also be equal to the second temperature threshold, e.g., the first and second temperature thresholds are zero, etc. The computer device may determine that the object to be detected is in a milk spitting state based on one or more dimensional factors of temperature change information, shape change information, and milk bubble recognition results, for example: and determining that the object to be detected is in a milk spitting state and the like under the conditions that the temperature change is larger than the temperature threshold value, the shape change is larger than the preset threshold value and milk bubbles are detected.
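The two decision branches above can be sketched as follows; the default thresholds are assumptions for illustration (only the 7-degree and 0.1 values appear in the examples above):

    def is_regurgitating(temp_change, shape_change, foam_detected,
                         t_low=-3.0, t_high=7.0, shape_threshold=0.1):
        # Branch 1: temperature drops below the first threshold while the
        # mouth shape changes more than the preset threshold (cold milk).
        cold_case = temp_change < t_low and shape_change > shape_threshold
        # Branch 2: temperature rises above the second threshold and milk
        # foam is detected (warm milk).
        warm_case = temp_change > t_high and foam_detected
        return cold_case or warm_case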
In the embodiments of the present disclosure, compared with performing regurgitation detection with a factor of a single dimension, using factors of multiple dimensions (temperature, shape, etc.) calibrates the detection result of regurgitation detection more accurately and improves the detection accuracy.
An embodiment of the present disclosure provides a method for detecting a physiological state, as shown in fig. 3, the method includes the following steps S301 to S306:
step S301 corresponds to step S101, and reference may be made to the specific implementation of step S101; step S306 corresponds to step S103, and reference may be made to the specific implementation of step S103.
Step S302, acquiring a reference image of the object to be detected.
Here, the reference image may be used to determine a target region in the thermal image, and may be a color (Red Green Blue, RGB) image, a grayscale image, a thermal imaging image, or the like. The computer device can acquire a color image of the object to be detected with a color camera; acquire a grayscale image with a grayscale camera, or derive one from the acquired color image; and acquire a thermal imaging image with a near-infrared (NIR) camera. When the reference image is a thermal imaging image, its acquisition mode may differ from that of the thermal image. For example: the thermal imaging image may be acquired based on an active infrared thermal imaging technique, while the thermal image is acquired based on a passive infrared thermal imaging technique. Acquiring the thermal imaging image actively makes the contour of the object to be detected more distinct in it, so region detection can be performed more accurately and quickly.
Step S303, detecting the reference image to obtain a candidate region.
Here, the candidate region may be understood as a mouth region in the reference image. The computer device may first detect the reference image using the face detection model to obtain a face region of the reference image. The face detection model may be understood as a preset machine learning model, such as a neural network model, for performing face detection. And then, detecting the face region of the reference image by using a target detection model to obtain a candidate region, wherein the target detection model can be a trained convolutional neural network.
Step S304, determining the mapping relation of the positions between the reference image and the thermal image.
Here, the reference image and the thermal image may be acquired by different types of devices, for example: the thermal image by a thermal imager with passive infrared thermal imaging capability, and the reference image by an NIR camera with active infrared thermal imaging capability. Both images contain the object to be detected, but because the thermal imager and the NIR camera do not shoot the object from the same angle, the position of the object in the reference image differs from its position in the thermal image. In implementation, a reference image and a thermal image with the same acquisition time can be acquired, matching point pairs between them determined, and the position mapping between the two images determined based on the coordinate information of the matching point pairs. The mapping relationship may be represented as a transformation matrix. Specifically: corners are detected in the reference image and the thermal image respectively, the corners in the two images are matched, key corners such as joint points in the two images are taken as matching point pairs, and a transformation matrix between the reference image and the thermal image is determined from multiple matching point pairs.
Step S305, extracting the mouth region in each thermal image based on the position of the candidate region in the reference image and the mapping relationship, so as to obtain a group of mouth regions.
Here, the computer device may determine the mapping region in the thermal image that corresponds to the candidate region in the reference image, and take the mapping region as the mouth region. For example, if the corner positions of the candidate region in the reference image are (1, 1), (1, 2), (2, 2) and (2, 1), then based on the transformation matrix, the corner positions of the corresponding mapping region (i.e., the mouth region) in the thermal image may be determined as (2, 2), (2, 3.5), (3.5, 3.5) and (2.5, 2), etc. A group of non-mouth regions may likewise be extracted using a reference image, which is not limited herein.
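A hedged sketch of steps S304 to S305 with OpenCV: estimate a transformation matrix from matched point pairs, then map the candidate region's corners into the thermal image (all coordinates below are placeholders, not values from the disclosure):

    import cv2
    import numpy as np

    # Matched point pairs between the reference image and the thermal image.
    ref_pts = np.float32([[10, 10], [200, 12], [198, 240], [12, 238]])
    thermal_pts = np.float32([[22, 18], [180, 20], [178, 210], [24, 208]])

    # Position mapping between the two images as a transformation matrix.
    H, _ = cv2.findHomography(ref_pts, thermal_pts, cv2.RANSAC)

    # Map the candidate (mouth) region's corners into the thermal image.
    candidate = np.float32([[[1, 1]], [[1, 2]], [[2, 2]], [[2, 1]]])
    mouth_corners = cv2.perspectiveTransform(candidate, H)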
In the embodiment of the disclosure, since the reference image may be an RGB image or a thermal imaging image, the contour of the object to be detected in the reference image is clearer, so that the candidate region may be accurately determined from the reference image first. Then, based on the position of the candidate region and the mapping relation between the reference image and the thermal image, the mouth region is accurately acquired from the thermal image, and the accuracy of the mouth region is improved.
An embodiment of the present disclosure provides a method for detecting a physiological state, as shown in fig. 4, the method includes the following steps S401 to S407:
steps S401 to S403 correspond to steps S101 to S103, respectively, and the detailed implementation of steps S101 to S103 can be referred to.
Step S404, extracting a non-mouth region of the object to be detected from each of the at least two frames of thermal images to obtain a group of non-mouth regions of the object to be detected.
Here, the non-mouth region may be understood as a region other than the mouth in the face region in the thermal image, for example, the non-mouth region may be a forehead region, an eye region, a temple region, an ear region, or the like. The mouth and non-mouth regions may be rectangular, circular, elliptical, irregular, etc. in shape.
In some embodiments, the computer device may extract multiple regions from each thermal image at the same time, such as a forehead region, a mouth region and an eye region from the thermal image of the current frame. The number of regions extracted may or may not be the same between different thermal images. For example: non-mouth regions may fail to be extracted from some of the multiple thermal images, so the number of non-mouth regions extracted differs from one thermal image to another.
In the process of extracting multiple regions from the thermal image, information such as the positions and sizes of the mouth region and the non-mouth regions in the thermal image can be determined at the same time. The computer device may then determine the type of target (e.g., mouth, eye) in a region according to the position of the extracted region within the face region; for example, if a first region is located at the upper-middle part of the face region, it is determined to be a forehead region, and if a second region is located at the lower-middle part of the face region, it is determined to be a mouth region. Since multiple regions can be extracted from each thermal image, regions with the same position in different thermal images can be grouped together. For example, a first region and a second region are extracted from the first frame of thermal images, and a third region and a fourth region from the second frame. If the first region is located at the upper-middle part of the face region in the first frame, the second region at the lower-middle part in the first frame, the third region at the upper-middle part in the second frame, and the fourth region at the lower-middle part in the second frame, then the first and third regions may be taken as a first group, and the second and fourth regions as a second group. That is, after extracting the multiple regions, the forehead regions may form one group and the mouth regions another.
Step S405, determining the physiological parameters of the object to be detected based on a group of the non-mouth regions.
Here, the physiological parameter may be understood as a parameter characterizing the physiological state (e.g., normal or abnormal) of the object to be detected, and may include body temperature, respiratory rate, heart rate, blood pressure, or the like. The temperature of different body parts of the object to be detected can be determined from the color or grayscale of different areas of the thermal image, and the physiological parameters those body parts characterize are determined based on their temperature or temperature changes. Specifically: the temperature corresponding to the forehead region is determined as the body temperature of the object to be detected; and since exhaled air is warmer than the skin, the respiratory rate of the object to be detected can be determined from temperature changes in the under-nose region.
In some embodiments, step S405 includes: based on a group of forehead regions, the computer device may determine the body temperature of the object to be detected; based on a group of under-nose regions, the respiratory rate of the object to be detected may be determined; and so on. One or more physiological parameters may be determined, e.g., only the respiratory rate, or both the respiratory rate and the heart rate.
Step S406, based on the physiological parameters, adjusting the detection result to obtain an adjusted detection result.
Here, the computer device may adjust the detection result based on at least one physiological parameter, including: if the physiological parameter falls within a first range, increasing the regurgitation probability characterizing the detection result; if it falls within a second range, decreasing that probability. For example, if the current body temperature is determined to be higher than 38 degrees Celsius or lower than 35 degrees Celsius, the regurgitation probability characterizing the detection result is increased; if the current body temperature is determined to be between 35 and 38 degrees Celsius inclusive, the regurgitation probability is decreased.
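A minimal sketch of this adjustment; the step size `delta` is an assumed value, while the 35/38 degrees Celsius bounds come from the example above:

    def adjust_probability(prob, body_temp_c, delta=0.15):
        if body_temp_c > 38.0 or body_temp_c < 35.0:
            prob += delta   # abnormal body temperature: raise the probability
        else:
            prob -= delta   # normal body temperature: lower the probability
        return min(max(prob, 0.0), 1.0)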
Step S407, in response to the adjusted detection result meeting the preset condition, generating alarm information, and sending the alarm information to the selected processing object.
Here, the alarm information may include the current detection result, the physiological parameters, the current time and the like, and is used to notify the processing object to check the physiological state of the object to be detected. The processing object may be a caregiver of the object to be detected, such as a parent of an infant. The computer device judges whether the adjusted detection result meets the preset condition to determine whether the object to be detected is in a regurgitation state. For example, suppose the preset condition is that the regurgitation probability characterizing the detection result exceeds 0.7. If a regurgitation probability of 0.75 is determined based on the mouth regions and then adjusted to 0.9 based on the current body temperature, the object to be detected is determined to be regurgitating. When the object to be detected is determined to be regurgitating, alarm information can be generated and sent to a pre-associated processing object so that the processing object can discover the abnormal state of the object to be detected in time.
In the embodiments of the present disclosure, a physiological parameter of the object to be detected can be determined from the non-mouth regions, and the detection result of the regurgitation detection is further calibrated based on that parameter, improving the accuracy of the detection result. Meanwhile, when the adjusted detection result meets the preset condition, the object to be detected is determined to be in an abnormal state, alarm information is generated, and the alarm information is sent to a preselected processing object, so that the processing object learns the physiological state of the object to be detected in time and the risk to the object is reduced.
In some embodiments, the physiological parameter comprises a current body temperature, the non-mouth region being a forehead region; the above step S405 may include the following steps S411 to S413:
in step S411, a sub-temperature value of each forehead region is determined based on a pixel value of a pixel of each forehead region.
Here, the sub-temperature value may be understood as the temperature characterized by the forehead region of the current frame. The computer device may determine the average pixel value of the forehead region based on the pixel values of its pixels, and determine the corresponding sub-temperature value based on the preset correspondence between pixel values and temperature. For example, if the average pixel value of the forehead region is determined to be 125, the sub-temperature value characterized by the forehead region of the current frame of thermal image may be determined to be 35 degrees Celsius based on the preset correspondence between pixel values and temperature.
Step S412, determining the temperature mean of the sub-temperature values as a group of temperature values of the forehead area.
Here, the temperature value of a group of forehead regions can be understood as the mean of the temperatures characterized by the forehead regions of the multiple frames of thermal images. For example, if a group of forehead regions includes a first forehead region with a sub-temperature value of 35 degrees Celsius and a second forehead region with a sub-temperature value of 40 degrees Celsius, the temperature mean may be determined to be 37.5 degrees Celsius, and the temperature value is then 37.5 degrees Celsius.
Step S413, determining the current body temperature of the object to be detected based on the temperature value and a preset association relationship.
Here, the current body temperature may be understood as the body temperature of the object to be detected determined based on the current thermal images, and the computer device may preset a mapping relationship between the temperature value and the actual body temperature of the object to be detected. For example: a temperature value of 0 degrees Celsius may correspond to an actual body temperature of 2 degrees Celsius, and a temperature value of 30 degrees Celsius to an actual body temperature of 32 degrees Celsius. Then, if the temperature value is determined to be 37.5 degrees Celsius, the current body temperature of the object to be detected may be determined to be 39.5 degrees Celsius.
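Steps S411 to S413 can be sketched as follows, assuming the same linear pixel-to-temperature calibration as earlier (0.12 * 125 + 20 = 35 degrees Celsius) and the +2 degree offset from the mapping example above; both are illustrative, device-specific values:

    import numpy as np

    def forehead_body_temperature(forehead_regions, calibration_offset=2.0):
        # Per-frame sub-temperature values from the regions' mean pixel values.
        sub_temps = [0.12 * float(r.mean()) + 20.0 for r in forehead_regions]
        # Group temperature value, then the preset mapping to body temperature.
        return float(np.mean(sub_temps)) + calibration_offset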
In the embodiment of the disclosure, the sub-temperature value of each forehead area can be simply and accurately determined based on the pixel value of the forehead area. Then, a group of forehead area temperature values can be determined based on the sub-temperature values of each forehead area, so that the current body temperature can be obtained more accurately based on the temperature values, and the error of the current body temperature is reduced.
In some embodiments, the physiological parameter includes a current respiratory rate, and the non-mouth region is an under-nose region; the above step S405 may include the following steps S421 to S423:
in step S421, the pixel characteristics of each of the under nose regions are determined.
Here, the current respiratory rate may be the respiratory rate determined based on the current thermal images, and the under-nose region may be the region between the nose and the lips of the object to be detected, which can be used to determine the current respiratory rate. The computer device can extract features of the under-nose region using a feature extraction model to obtain its pixel features; the feature extraction model may be a trained neural network for performing pixel feature extraction. Alternatively, a matrix of the pixel values of all pixels in the under-nose region may be determined based on the pixel value of each pixel, and this pixel-value matrix taken as the pixel features.
In step S422, a pixel feature sequence of a group of the sub-nasal regions is determined based on the pixel features of each sub-nasal region.
Here, a pixel feature sequence may be understood as a feature set including a plurality of pixel features. The computer device may stack all the pixel features along one dimension to obtain the pixel feature sequence of the sub-nasal regions. For example, if each pixel feature is a 100 × 100 matrix and a group has 10 sub-nasal regions, the 10 matrices may be concatenated into a 1000 × 100 matrix that serves as the pixel feature sequence. In some embodiments, the order of the thermal images to which the sub-nasal regions belong may be determined based on the acquisition time of each thermal image, and all the pixel features may be combined in the acquisition order of the thermal images to obtain the pixel feature sequence. The acquisition time may be understood as the imaging or capture time of a thermal image, and is different for each frame of thermal image of the object to be detected.
Step S423, determining the current respiratory rate of the object to be detected based on the pixel feature sequence of a group of the sub-nasal regions.
Here, the computer device may predict the current respiratory rate from the pixel feature sequence using a respiration detection model, thereby obtaining the current respiratory rate of the object to be detected. The respiration detection model may be a trained neural network used to determine the current respiratory rate. For example, if 100 frames of thermal images of an infant are acquired and each thermal image contains a sub-nasal region, the pixel features of 100 sub-nasal regions can be determined, and thus the pixel feature sequence of the group of sub-nasal regions; the pixel feature sequence is then input into the respiration detection model to obtain a current respiratory rate of 20 breaths per minute for the infant.
In the embodiment of the disclosure, the pixel feature sequence of a group of sub-nasal regions can be determined based on the pixel features of each sub-nasal region, and the current respiratory rate can then be determined accurately and stably based on the pixel feature sequence.
In some embodiments, the step S422 may include the following steps S4221 to S4222:
step S4221, determining a temporal order between the sub-nasal regions based on the acquisition time of each of the thermal images and the affiliation between the sub-nasal regions and the thermal images.
Here, the temporal order may be the order of imaging between the sub-nasal regions. The computer device may determine the acquisition time of each thermal image during acquisition; for example, the acquisition time of the first frame of thermal image may be a first moment and the acquisition time of the second frame a second moment. During region extraction, the affiliation between each region and its thermal image can also be determined; for example, the first and second regions belong to the first frame of thermal image, the third and fourth regions belong to the second frame of thermal image, and so on.
In practice, the acquisition time of a thermal image may be taken as the acquisition time of the sub-nasal region it contains, based on the acquisition times and the affiliations, and the sub-nasal regions may then be sorted by acquisition time to obtain the temporal order between them. For example: if the first sub-nasal region belongs to the first frame of thermal image, the second sub-nasal region belongs to the second frame of thermal image, and the first frame was acquired earlier than the second, then the temporal order is that the first sub-nasal region precedes the second sub-nasal region, and so on.
Step S4222, determining the pixel feature sequence based on the pixel features and the temporal order.
For example: if the first sub-nasal region was acquired earlier and the second sub-nasal region later, the pixel features of the second sub-nasal region may be concatenated after those of the first to obtain the pixel feature sequence, so that the pixel feature sequence preserves the temporal relationship among the sub-nasal regions and facilitates the subsequent accurate determination of the physiological parameter.
In the embodiment of the disclosure, a temporally ordered pixel feature sequence can be determined accurately based on the temporal order between the sub-nasal regions, so that the current respiratory rate determined from that sequence is more accurate.
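As a sketch of steps S4221 to S4222 in Python, the snippet below sorts the sub-nasal region features by the acquisition time of the thermal image each region belongs to and concatenates them into one sequence; the `(acquisition_time, feature_matrix)` pairing is an assumed input format, not one fixed by the disclosure.

```python
import numpy as np


def build_pixel_feature_sequence(regions_with_times):
    """Sort sub-nasal region features by the acquisition time of the thermal
    image each region belongs to (step S4221), then concatenate them into one
    pixel feature sequence (step S4222).

    regions_with_times: list of (acquisition_time, feature_matrix) pairs,
    where each feature_matrix is e.g. a 100 x 100 array of pixel values.
    """
    ordered = sorted(regions_with_times, key=lambda pair: pair[0])
    features = [feature for _, feature in ordered]
    # Concatenating 10 matrices of 100 x 100 along axis 0 yields a
    # 1000 x 100 matrix, matching the stitching example in the text.
    return np.concatenate(features, axis=0)
```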
In some embodiments, the step S423 may include the following steps S4231 to S4234:
step S4231, performing dimension reduction processing on the pixel feature sequence to obtain a respiratory signal of the pixel feature sequence in a time domain.
Here, a respiration signal may be understood as a signal generated by the breathing of the object to be detected, for example a signal whose breathing amplitude changes over time. The dimension reduction processing can filter out features in the pixel feature sequence that are useless for determining the current respiratory rate, which helps improve the accuracy of the current respiratory rate. For example, the dimension reduction processing may be performed by Principal Component Analysis (PCA), Singular Value Decomposition (SVD), Multidimensional Scaling (MDS), or the like.
In implementing step S4231, the computer device may determine a respiration signal whose intensity varies with time by taking the mean pixel value of each sub-nasal region in the pixel feature sequence as the intensity of the respiration signal and the acquisition time of the thermal image to which that sub-nasal region belongs as its independent variable. Taking matrix-form pixel features as an example: the pixel feature sequence includes 10 matrices corresponding to pixel features, the element values of each matrix are the pixel values of the pixels in the corresponding sub-nasal region, the mean of each pixel feature (that is, the average pixel value of all pixels in the sub-nasal region) is determined from the element values of its matrix, the mean is normalized, and the normalized mean is used as the intensity (also called amplitude) at the corresponding time in the respiration signal. For example, the respiration signal may have a time interval of 10 milliseconds, with corresponding amplitudes (i.e., the processed means) of (-0.1, -0.2, 0, 0.15, ...) in turn.
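A minimal Python sketch of this construction follows; the zero-mean, unit-variance normalization is one plausible reading, since the text only states that the means are normalized.

```python
import numpy as np


def respiration_signal(feature_matrices, acquisition_times):
    """Step S4231 (sketch): collapse each sub-nasal feature matrix to its
    mean pixel value, normalize the means, and treat each normalized mean
    as the amplitude of the respiration signal at that acquisition time."""
    means = np.array([np.mean(m) for m in feature_matrices], dtype=float)
    amplitudes = (means - means.mean()) / (means.std() + 1e-8)  # assumed normalization
    return np.asarray(acquisition_times, dtype=float), amplitudes
```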
Step S4232, converting the respiratory signal in the time domain into a respiratory signal in the frequency domain.
Here, the respiration signal may be a continuous signal or a discrete signal. A respiration signal in the frequency domain may be understood as a signal whose breathing intensity changes with frequency; for example, a first frequency corresponds to a first intensity, a second frequency to a second intensity, and so on. In implementation, the computer device may convert the respiration signal in the time domain into the respiration signal in the frequency domain, which helps determine the current respiratory rate simply and accurately. For example, a Fourier transform may be used to transform the respiration signal in the time domain.
In other embodiments, before step S4232, the method may further include: performing filtering processing (such as band-pass filtering) on the respiration signal in the time domain, and converting the filtered time-domain respiration signal into the frequency-domain respiration signal, which helps improve the accuracy of the current respiratory rate.
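The optional band-pass filtering and the conversion of step S4232 might look as follows in Python, keeping frequencies in the text's frame-based unit (cycles per 100 frames); the second-order Butterworth filter and the 10-40 pass band are assumptions based on the human breathing range quoted below.

```python
import numpy as np
from scipy.signal import butter, filtfilt


def to_frequency_domain(amplitudes, band_per_100_frames=(10.0, 40.0)):
    """Band-pass filter the time-domain respiration signal, then move it to
    the frequency domain with an FFT. Returns (frequencies in cycles per
    100 frames, spectral magnitudes)."""
    low = band_per_100_frames[0] / 100.0   # cycles per frame
    high = band_per_100_frames[1] / 100.0
    # Nyquist frequency is 0.5 cycles/frame for a once-per-frame signal.
    b, a = butter(2, [low / 0.5, high / 0.5], btype="band")
    filtered = filtfilt(b, a, amplitudes)
    spectrum = np.abs(np.fft.rfft(filtered))
    freqs_per_100 = np.fft.rfftfreq(len(filtered), d=1.0) * 100.0
    return freqs_per_100, spectrum
```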
In step S4233, a reference respiratory rate is determined based on the peak located within a first frequency range in the frequency-domain respiration signal.
Here, the reference respiratory rate may be understood as the number of breaths determined over the number of frames of the acquired thermal images, i.e., the unit of the reference respiratory rate is breaths per a given number of frames. For example, with 100 frames of thermal images and 20 breaths determined, the reference respiratory rate may be 20 per 100 frames.
Here, the computer device may determine the reference respiratory rate based on the distribution of the respiration signal over the frequency domain, including: determining, as the reference respiratory rate, the frequency corresponding to the peak within a preset first frequency range in the frequency-domain respiration signal. For example, when the object to be detected is a human body, since the human respiratory rate is generally 10 to 30 breaths per minute (roughly 10 to 40 per 100 frames after conversion), this range is taken as the first frequency range; the peak within the 10-to-40-per-100-frames range of the frequency-domain respiration signal is located, and the frequency corresponding to the peak, e.g., 20 per 100 frames, is taken as the reference respiratory rate.
In other embodiments, the computer device may determine, from the frequency-domain respiration signal, the frequencies whose intensity is greater than a preset intensity threshold, determine the mean of those frequencies, and take the frequency mean as the reference respiratory rate. For example, if within the range of 20 to 40 per 100 frames of the frequency-domain respiration signal the corresponding intensities are greater than a preset intensity threshold (e.g., 10), the mean of that frequency range, 30 per 100 frames, may be used as the reference respiratory rate.
And step S4234, converting the reference respiratory rate to obtain the current respiratory rate.
Here, the current respiratory rate may be understood as a time-based respiratory rate, i.e., the unit of the current respiratory rate is breaths per minute. The computer device may perform unit conversion on the reference respiratory rate based on the number of thermal images used and the frame rate at which the thermal images were acquired, obtaining the current respiratory rate. For example: if 60 frames of thermal images were acquired, the determined reference respiratory rate is 30 per 60 frames, and the preset acquisition frame rate is 40 frames per minute, then the current respiratory rate may be determined to be 20 breaths per minute.
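Steps S4233 and S4234 then reduce to a peak search inside the first frequency range followed by the unit conversion just described; a Python sketch (the comment reuses the text's worked numbers):

```python
import numpy as np


def current_respiratory_rate(freqs_per_100, spectrum, frame_rate_per_min,
                             first_range=(10.0, 40.0)):
    """Step S4233: take the spectral peak inside the first frequency range
    as the reference respiratory rate (breaths per 100 frames).
    Step S4234: convert to breaths per minute using the frame rate."""
    in_band = (freqs_per_100 >= first_range[0]) & (freqs_per_100 <= first_range[1])
    banded = np.where(in_band, spectrum, -np.inf)
    reference_per_100 = freqs_per_100[int(np.argmax(banded))]
    # breaths/frame * frames/minute = breaths/minute; e.g. a reference rate
    # of 50 per 100 frames (= 30 per 60 frames) at 40 frames/minute gives
    # the text's 20 breaths per minute.
    return reference_per_100 / 100.0 * frame_rate_per_min
```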
In the embodiment of the disclosure, performing dimension reduction on the pixel feature sequence to obtain the time-domain respiration signal can filter out useless information and improve the accuracy of the subsequently determined current respiratory rate. Converting the time-domain respiration signal into the frequency-domain respiration signal then allows the reference respiratory rate to be obtained simply and accurately. Meanwhile, determining the reference respiratory rate based on the peak within the first frequency range further reduces interference from other frequency ranges, improving the accuracy of the current respiratory rate.
In some embodiments, the physiological parameter comprises a current heart rate, and the non-mouth region is a face region; the step S405 may include the following steps S431 to S432:
and step S431, performing feature extraction on a group of the face regions according to the acquisition times of the face regions, to obtain the temporal features and spatial features of the group of face regions.
Here, the current heart rate may be understood as the heart rate determined from the current thermal images. In one approach, the computer device may determine the pixel mean of each face region directly from its pixel features, and further determine the temperature value corresponding to that pixel mean based on the preset correspondence between pixel values and temperatures; the differences between the temperature values of the face regions may then be determined as the temperature change information of the face regions, and the current heart rate corresponding to that temperature change information may be determined based on the temperature change information of the face regions and a preset correspondence between temperature change and heart rate.
The temporal features may be used to characterize how the face regions change over time, and the spatial features may be used to characterize how the face regions change in space. The computer device may take the acquisition time of the thermal image to which a face region belongs as the acquisition time of that face region. For example: if the acquisition time of the first frame of thermal image is a first moment, then the acquisition time of the face region in the first frame is the first moment; if the acquisition time of the second frame of thermal image is a second moment, then the acquisition time of the face region in the second frame is the second moment.
When step S431 is implemented, a group of face regions may be sorted by acquisition time to obtain a temporally ordered group of face regions; feature extraction is then performed on this temporally ordered group to obtain its temporal features and spatial features. For example: a trained three-dimensional convolutional neural network may be used to extract the temporal features and spatial features of the temporally ordered group of face regions. In addition, the three-dimensional convolutional neural network can also extract information from different color channels (such as the red, green, and blue channels) of the face regions, improving the accuracy of the subsequently determined current heart rate.
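As an illustration only (the disclosure does not specify the network architecture), a PyTorch sketch of such a three-dimensional convolutional feature extractor is given below; the point is that 3-D kernels convolve over the frame axis and the two spatial axes together, so the extracted features are jointly temporal and spatial.

```python
import torch
import torch.nn as nn


class SpatioTemporalExtractor(nn.Module):
    """Illustrative 3-D CNN; layer sizes are assumptions, not the disclosed
    network. Input shape: (batch, channels, frames, height, width)."""

    def __init__(self, in_channels=3):
        super().__init__()
        self.features = nn.Sequential(
            # kernel (3, 5, 5): 3 frames deep in time, 5 x 5 in space.
            nn.Conv3d(in_channels, 16, kernel_size=(3, 5, 5), padding=(1, 2, 2)),
            nn.ReLU(inplace=True),
            nn.MaxPool3d(kernel_size=(1, 2, 2)),  # downsample space, keep time
            nn.Conv3d(16, 32, kernel_size=(3, 3, 3), padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, face_stack):
        # e.g. a group of 60 face regions resized to 64 x 64 with RGB
        # channels: face_stack of shape (1, 3, 60, 64, 64).
        return self.features(face_stack)
```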
Step S432, determining the current heart rate of the object to be detected based on the temporal feature and the spatial feature.
Here, the computer device may predict the current heart rate of the object to be detected from the temporal features and the spatial features using a heart rate detection model; the heart rate detection model may be a trained neural network for predicting the current heart rate. For example: if 100 frames of thermal images of an infant are acquired and each thermal image contains a face region, the temporal features and spatial features corresponding to the 100 face regions can be determined and input into the heart rate detection model, obtaining a current heart rate of 80 beats per minute for the infant.
According to the embodiment of the disclosure, the temporal features and spatial features of a temporally ordered group of face regions can be obtained accurately from the acquisition times of the face regions, and the current heart rate can then be determined simply and accurately based on the temporal features and the spatial features.
In some embodiments, the step S432 may include the following steps S4321 to S4324:
step S4321, based on the temporal features and the spatial features, determining a group of heartbeat signals of the face region in a time domain.
Here, the heartbeat signal may be understood as a signal generated by the heartbeat of the object to be detected; it may be a continuous signal or a discrete signal, for example a signal whose heartbeat amplitude changes over time, such as a photoplethysmography (PPG) signal. In implementation, the time-domain heartbeat signal can be predicted from the temporal features and the spatial features by using a trained Long Short-Term Memory network (LSTM).
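A hedged PyTorch sketch of such an LSTM head follows, mapping one feature vector per frame to one amplitude per frame (a discrete, PPG-like time-domain heartbeat signal); the dimensions are assumptions, not the disclosed model.

```python
import torch
import torch.nn as nn


class HeartbeatSignalHead(nn.Module):
    """Illustrative LSTM head: per-frame features in, per-frame heartbeat
    amplitudes out (a discrete time-domain heartbeat signal)."""

    def __init__(self, feature_dim, hidden_dim=64):
        super().__init__()
        self.lstm = nn.LSTM(feature_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)

    def forward(self, per_frame_features):
        # per_frame_features: (batch, frames, feature_dim)
        hidden, _ = self.lstm(per_frame_features)
        return self.head(hidden).squeeze(-1)  # (batch, frames) amplitudes
```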
Step S4322, converting the heartbeat signal in the time domain into a heartbeat signal in the frequency domain.
Here, the computer device may convert the heartbeat signal in the time domain to obtain the heartbeat signal in the frequency domain. For example: the heartbeat signal in the time domain can be converted by adopting a Fourier transform mode.
Before step S4322, the method may further include: performing filtering processing (such as high-pass or low-pass filtering) on the time-domain heartbeat signal, and converting the filtered time-domain heartbeat signal to obtain the frequency-domain heartbeat signal, which helps improve the accuracy of the heartbeat signal.
Step S4323, determining a peak value in a second frequency range in the heartbeat signal in the frequency domain as a reference heart rate.
Here, the heartbeat signal in the frequency domain may be understood as a signal whose heartbeat intensity changes with frequency; for example, a first frequency corresponds to a first intensity, a second frequency to a second intensity, and so on. The computer device may determine, as the reference heart rate, the frequency corresponding to the peak within a preset second frequency range of the frequency-domain heartbeat signal. The reference heart rate may be understood as the number of heartbeats determined over the number of frames of the acquired thermal images, i.e., its unit may be beats per a given number of frames; for example, with 100 frames of thermal images and 80 heartbeats determined, the reference heart rate may be 80 per 100 frames. When the object to be detected is a human body, since the human heart rate is generally 45 to 240 beats per minute (bpm) (about 0.75 to 4 Hz after conversion), 0.75 to 4 Hz is taken as the second frequency range; the peak of the frequency-domain heartbeat signal within 0.75 to 4 Hz is located, and the frequency corresponding to the peak, e.g., 2.1 Hz, is determined as the reference heart rate.
Step S4324, converting the reference heart rate to obtain the current heart rate.
Here, the current heart rate may be understood as a time-based heart rate, i.e., the unit of the current heart rate may be beats per minute. The computer device may perform unit conversion on the reference heart rate based on the number of frames of thermal images used and the frame rate at which the thermal images were acquired, obtaining the current heart rate. For example, with 60 frames of thermal images, a reference heart rate of 30 per 60 frames, and a preset acquisition frame rate of 160 frames per minute, the current heart rate may be determined to be 80 beats per minute.
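When the reference heart rate is expressed in hertz, as in the 0.75-4 Hz example above, the unit conversion reduces to multiplying by 60 (2.1 Hz corresponds to 126 beats per minute); a short Python sketch of steps S4323 to S4324 under that assumption:

```python
import numpy as np


def current_heart_rate_bpm(freqs_hz, spectrum, second_range=(0.75, 4.0)):
    """Step S4323: the peak of the frequency-domain heartbeat signal within
    the second frequency range is the reference heart rate (in Hz here).
    Step S4324: multiply by 60 seconds/minute to get beats per minute."""
    in_band = (freqs_hz >= second_range[0]) & (freqs_hz <= second_range[1])
    peak_hz = freqs_hz[int(np.argmax(np.where(in_band, spectrum, -np.inf)))]
    return peak_hz * 60.0
```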
In the embodiment of the disclosure, the time-domain heartbeat signal can be determined based on the temporal features and the spatial features, and the reference heart rate can then be determined accurately based on the frequency-domain heartbeat signal. Meanwhile, determining the reference heart rate based on the peak within the second frequency range further reduces interference from other frequency ranges, improving the accuracy of the current heart rate.
The application of the physiological state detection system provided by the embodiment of the present disclosure in an actual scene is described below, taking as an example a scene in which the object to be detected is an infant, the physiological parameters include the current body temperature, the current respiratory rate, and the current heart rate, and milk regurgitation of the infant is detected.
Milk regurgitation is common in newborn infants; an infant can easily choke on regurgitated milk and, in severe cases, suffocate. In the related art, whether an infant regurgitates milk can be detected with a detection device hung on the infant's ear or mounted on a bib. However, such a device must be worn, which is inconvenient, and contact wearing affects the infant's comfort. Moreover, the infant's movements may cause the device to fall off or even be damaged, further affecting the milk regurgitation detection.
As shown in fig. 5A, the detection system 500 may include an acquisition component 501 (thermal imager 5011, NIR camera 5012, color camera 5013), a processing component 502, a detection component 503 (body temperature detection unit 5031, respiratory rate detection unit 5032, heart rate detection unit 5033, milk regurgitation detection unit 5034), and an alarm component 504.
First, at least two frames of thermal images of the object to be detected can be acquired by the acquisition component 501. The acquisition component 501 may include a thermal imager 5011 that acquires images based on passive infrared thermal imaging technology, an NIR camera 5012 that acquires images based on active infrared thermal imaging technology, a color camera 5013 that acquires color images, and the like. The NIR camera 5012 and the color camera 5013 may be used to acquire a reference image, so that a target region in a thermal image acquired by the thermal imager 5011 can be accurately determined based on the reference image. As shown in fig. 5B, the acquisition component 501 may be located directly above the crib 511 to acquire multiple frames of thermal images of the infant 512. The thermal images can then be transmitted to an associated computer device 513 for subsequent processing; the computer device 513 may include the processing component 502, the detection component 503, the alarm component 504, and the like.
The processing component 502 can then detect the multiple frames of thermal images to obtain multiple face regions, and the detection component 503 may determine the milk regurgitation detection result and at least one physiological parameter of the infant based on the face regions.
Next, the detection component 503 may include a body temperature detection unit 5031, a respiratory rate detection unit 5032, a heart rate detection unit 5033, a milk regurgitation detection unit 5034, and the like. The body temperature detection unit 5031 may obtain a group of forehead regions from the plurality of face regions and determine the current body temperature of the infant based on the group of forehead regions; as shown in fig. 6A, a forehead region 601 may be acquired from the thermal image. The respiratory rate detection unit 5032 may obtain a group of sub-nasal regions from the plurality of face regions and determine the current respiratory rate of the infant based on the group of sub-nasal regions; as shown in fig. 6B, a sub-nasal region 602 may be acquired from the thermal image. In determining the current respiratory rate, the pixel feature sequence of the group of sub-nasal regions may be determined based on the pixel features of each sub-nasal region, and the time-domain respiration signal may then be determined from that sequence. The time-domain respiration signal may be filtered (for example, band-pass filtered), and the filtered signal converted into the frequency-domain respiration signal, so that the current respiratory rate can be determined based on the frequency-domain respiration signal. As shown in fig. 6C, the amplitude of the respiration signal may vary with time; as shown in fig. 6D, the filtered respiration signal is smoother, which helps improve the accuracy of the subsequently determined current respiratory rate.
The heart rate detection unit 5033 may take the plurality of face regions as the target regions and determine the current heart rate of the infant based on a group of face regions. As shown in fig. 6E, face regions from consecutive frames may be used, such as the face regions in 60 consecutive frames of thermal images. In determining the current heart rate, the time-domain heartbeat signals of the group of face regions may be determined based on the pixel features of each face region; the filtered heartbeat signal may then be converted into the frequency-domain heartbeat signal, and the current heart rate determined based on the frequency-domain heartbeat signal and the peak within the second frequency range. As shown in fig. 6F, the amplitude of the heartbeat signal may vary with time. Fig. 6G is a schematic diagram of a frequency-domain heartbeat signal according to an embodiment of the present disclosure; as shown in fig. 6G, the second frequency range may be bounded by a first range threshold 603 and a second range threshold 604, e.g., the first range threshold 603 may be 0.9, the second range threshold 604 may be 2.5, and the determined peak 605 may correspond to a frequency of 2.1.
The milk regurgitation detection unit 5034 may obtain a group of mouth regions from the plurality of face regions and determine the pixel features of each mouth region, so that the temperature change information, shape change information, and milk bubble recognition result characterized by the group of mouth regions can be determined based on the feature differences between the pixel features of the mouth regions, and the milk regurgitation detection result can then be determined.
Finally, the processing component 502 adjusts the milk regurgitation detection result based on the physiological parameters to obtain an adjusted detection result, and when the adjusted detection result satisfies the preset condition, it may be determined that the infant is in a milk regurgitation state. After the infant is determined to be in a milk regurgitation state, the alarm component 504 can generate alarm information and notify a processing object (such as the infant's parents) in real time, for example via local area network transmission, reminding the processing object to handle the situation in time and reducing dangers such as suffocation.
In the above embodiment, thermal images of the object to be detected are acquired in a non-contact detection mode, the milk regurgitation detection result and at least one physiological parameter of the object to be detected are detected, and the detection result is calibrated based on the physiological parameter. During milk regurgitation detection, the object to be detected does not need to wear any detection device such as a sensor, and its movement is not affected. Meanwhile, the thermal images of the object to be detected are acquired by a thermal imager based on passive infrared thermal imaging technology, so that infrared radiation can be reduced. And because face recognition on the acquired thermal images is relatively difficult, the privacy of the object to be detected can be protected, reducing the possibility of privacy disclosure.
Based on the foregoing embodiments, the disclosed embodiments provide a device for detecting a physiological state, where the device includes units and modules included in the units, and may be implemented by a processor in a computer device; of course, the implementation can also be realized through a specific logic circuit; in implementation, the Processor may be a Central Processing Unit (CPU), a Microprocessor Unit (MPU), a Digital Signal Processor (DSP), a Field Programmable Gate Array (FPGA), or the like.
Fig. 7 is a schematic structural diagram of a physiological status detecting device according to an embodiment of the present disclosure, and as shown in fig. 7, a physiological status detecting device 700 includes: an obtaining module 710, a first extracting module 720 and a first determining module 730, wherein:
an acquiring module 710, configured to acquire at least two frames of thermal images of an object to be detected; a first extracting module 720, configured to extract a mouth region of the object to be detected in each of the at least two frames of thermal images, so as to obtain a group of mouth regions of the object to be detected; a first determining module 730, configured to perform milk spitting detection on a group of the mouth regions, and determine a detection result.
In some embodiments, the first determining module is further configured to: determining the pixel features of each mouth region; determining at least one of temperature change information, shape change information, and a milk bubble recognition result characterized by a group of the mouth regions based on the feature differences between the pixel features of the mouth regions; determining the detection result based on at least one of the temperature change information, the shape change information, and the milk bubble recognition result.
In some embodiments, the first determining module is further configured to: determining a sub-temperature value for each of the mouth regions based on the pixel features of the mouth region; grouping a group of the mouth regions to obtain a first group of mouth regions and a second group of mouth regions; determining a first temperature value and a second temperature value of the group of mouth regions based on the sub-temperature values of the first group and the second group of mouth regions, respectively; determining the temperature change information based on the temperature difference between the first temperature value and the second temperature value.
In some embodiments, the first determining module is further configured to: determining the opening and closing degree of each mouth region based on the pixel characteristics of the mouth region; determining a difference between opening and closing degrees of the mouth region, and determining the difference between the opening and closing degrees as the shape change information.
In some embodiments, the first determining module is further configured to: and identifying the characteristic difference between the pixel characteristics of the mouth area by using a milk bubble identification model, and determining the milk bubble identification result.
In some embodiments, the first determining module is further configured to: determining that the detection result is that the object to be detected is in a milk spitting state when the temperature change is smaller than a first temperature threshold and the shape change is larger than a first preset threshold, or when the temperature change is larger than a second temperature threshold and the milk bubble recognition result indicates that milk bubbles are detected; wherein the first temperature threshold is less than the second temperature threshold.
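The decision rule above can be restated as a short Python sketch; the numeric thresholds are placeholders (the disclosure only requires that the first temperature threshold be less than the second):

```python
def is_milk_spitting(temp_change, shape_change, foam_detected,
                     first_temp_threshold=0.5, second_temp_threshold=2.0,
                     first_preset_threshold=0.3):
    """Placeholder thresholds; only first_temp_threshold <
    second_temp_threshold is required by the text."""
    small_temp_large_shape = (temp_change < first_temp_threshold
                              and shape_change > first_preset_threshold)
    large_temp_with_foam = (temp_change > second_temp_threshold
                            and foam_detected)
    return small_temp_large_shape or large_temp_with_foam
```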
In some embodiments, the first extraction module is further configured to: acquiring a reference image of the object to be detected; detecting the reference image to obtain a candidate region; determining a mapping relationship of positions between the reference image and the thermal image; and extracting the mouth region in each thermal image based on the position of the candidate region in the reference image and the mapping relation to obtain a group of mouth regions.
In some embodiments, the apparatus further comprises: a second extraction module, configured to extract a non-mouth region of the object to be detected from each of the at least two frames of thermal images, so as to obtain a group of non-mouth regions of the object to be detected; a second determining module, configured to determine a physiological parameter of the object to be detected based on a group of the non-mouth regions; wherein the physiological parameter is used for representing the physiological state of the object to be detected; an adjusting module, configured to adjust the detection result based on the physiological parameter to obtain an adjusted detection result; and a generating module, configured to generate alarm information in response to the adjusted detection result meeting the preset condition, and send the alarm information to the selected processing object.
In some embodiments, the physiological parameter comprises a current body temperature, the non-mouth region being a forehead region; the second determining module is further configured to: determining a sub-temperature value for each of the forehead regions based on the pixel values of the pixels of each forehead region; determining the temperature mean of the sub-temperature values as the temperature value of a group of the forehead regions; determining the current body temperature of the object to be detected based on the temperature value and a preset association relationship; wherein the association relationship is used for representing the mapping relationship between the temperature value and the body temperature.
In some embodiments, the physiological parameter includes a current respiratory rate, and the non-mouth region is a sub-nasal region; the second determining module is further configured to: determining the pixel features of each of the sub-nasal regions; determining a pixel feature sequence of a group of the sub-nasal regions based on the pixel features of each sub-nasal region; determining the current respiratory rate of the object to be detected based on the pixel feature sequence of the group of sub-nasal regions.
In some embodiments, the second determining module is further configured to: determining a temporal order between the sub-nasal regions based on the acquisition time of each of the thermal images and the affiliation between the sub-nasal regions and the thermal images; determining the pixel feature sequence based on the pixel features and the temporal order.
In some embodiments, the second determining module is further configured to: performing dimension reduction processing on the pixel feature sequence to obtain a respiration signal of the pixel feature sequence in the time domain; converting the time-domain respiration signal into a frequency-domain respiration signal; determining a reference respiratory rate based on the peak located within a first frequency range in the frequency-domain respiration signal; and converting the reference respiratory rate to obtain the current respiratory rate.
In some embodiments, the physiological parameter comprises a current heart rate, the non-mouth region being a face region; the second determining module is further configured to: performing feature extraction on a group of the face regions according to the acquisition times of the face regions, to obtain the temporal features and spatial features of the group of face regions; determining the current heart rate of the object to be detected based on the temporal features and the spatial features.
In some embodiments, the second determining module is further configured to: determining the time-domain heartbeat signals of a group of the face regions based on the temporal features and the spatial features; converting the time-domain heartbeat signal into a frequency-domain heartbeat signal; determining the peak within a second frequency range in the frequency-domain heartbeat signal as a reference heart rate; and converting the reference heart rate to obtain the current heart rate.
The above description of the apparatus embodiments is similar to the description of the method embodiments above and has similar beneficial effects as the method embodiments. In some embodiments, functions of or modules included in the apparatuses provided in the embodiments of the present disclosure may be used to perform the methods described in the above method embodiments, and for technical details not disclosed in the apparatus embodiments of the present disclosure, please refer to the description of the method embodiments of the present disclosure for understanding.
It should be noted that, in the embodiment of the present disclosure, if the detection method of the physiological state is implemented in the form of a software functional module and is sold or used as a standalone product, it may also be stored in a computer readable storage medium. Based on such understanding, the technical solutions of the embodiments of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the methods described in the embodiments of the present disclosure. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read Only Memory (ROM), a magnetic disk, or an optical disk. Thus, embodiments of the present disclosure are not limited to any specific hardware, software, or firmware, or any combination thereof.
The embodiment of the present disclosure provides a computer device, which includes a memory and a processor, where the memory stores a computer program that can be executed on the processor, and the processor implements some or all of the steps in the above method when executing the program.
The disclosed embodiment provides a detection system of physiological state, including: a thermal imager, used for acquiring at least two frames of thermal images of the object to be detected; and a computer device, used for acquiring the at least two frames of thermal images of the object to be detected; extracting a mouth region of the object to be detected from each thermal image of the at least two frames of thermal images to obtain a group of mouth regions of the object to be detected; and performing milk spitting detection on a group of the mouth regions and determining a detection result.
Embodiments of the present disclosure provide a computer-readable storage medium on which a computer program is stored, the computer program implementing some or all of the steps of the above method when executed by a processor. The computer readable storage medium may be transitory or non-transitory.
The disclosed embodiments provide a computer program comprising computer readable code, where the computer readable code runs in a computer device, a processor in the computer device executes some or all of the steps for implementing the above method.
The disclosed embodiments provide a computer program product comprising a non-transitory computer readable storage medium storing a computer program, which when read and executed by a computer, performs some or all of the steps of the above method. The computer program product may be embodied in hardware, software or a combination thereof. In some embodiments, the computer program product is embodied in a computer storage medium, and in other embodiments, the computer program product is embodied in a Software product, such as a Software Development Kit (SDK), or the like.
Here, it should be noted that: the foregoing descriptions of the various embodiments tend to emphasize the differences between them; for what is the same or similar between embodiments, the embodiments may be referred to one another. The above description of the apparatus, storage medium, computer program and computer program product embodiments is similar to the description of the method embodiments above, with similar beneficial effects as the method embodiments. For technical details not disclosed in the embodiments of the disclosed apparatus, storage medium, computer program and computer program product, reference is made to the description of the embodiments of the method of the present disclosure for understanding.
It should be noted that fig. 8 is a schematic hardware entity diagram of a computer device in an embodiment of the present disclosure, and as shown in fig. 8, the hardware entity of the computer device 800 includes: a processor 801, a communication interface 802, and a memory 803, wherein:
the processor 801 generally controls the overall operation of the computer apparatus 800.
The communication interface 802 may enable the computer device to communicate with other terminals or servers via a network.
The Memory 803 is configured to store instructions and applications executable by the processor 801, and may also buffer data (e.g., image data, audio data, voice communication data, and video communication data) to be processed or already processed by the processor 801 and modules in the computer apparatus 800, and may be implemented by a FLASH Memory (FLASH) or a Random Access Memory (RAM). Data may be transferred between the processor 801, the communication interface 802, and the memory 803 via the bus 804.
It should be appreciated that reference throughout this specification to "one embodiment" or "an embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. Thus, the appearances of the phrases "in one embodiment" or "in an embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. It should be understood that, in various embodiments of the present disclosure, the sequence numbers of the above steps/processes do not mean the execution sequence, and the execution sequence of each step/process should be determined by the function and the inherent logic thereof, and should not constitute any limitation on the implementation process of the embodiments of the present disclosure. The above-mentioned serial numbers of the embodiments of the present disclosure are merely for description, and do not represent the advantages or disadvantages of the embodiments.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional like elements in the process, method, article, or apparatus that comprises the element.
In the several embodiments provided in the present disclosure, it should be understood that the disclosed apparatus and method may be implemented in other manners. The above-described device embodiments are merely illustrative, for example, the division of the unit is only one logical function division, and there may be other division ways in actual implementation, such as: multiple units or components may be combined, or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or units may be electrical, mechanical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; can be located in one place or distributed on a plurality of network units; some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, all the functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may be separately regarded as one unit, or two or more units may be integrated into one unit; the integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
Those of ordinary skill in the art will understand that: all or part of the steps for realizing the method embodiments can be completed by hardware related to program instructions, the program can be stored in a computer readable storage medium, and the program executes the steps comprising the method embodiments when executed; and the aforementioned storage medium includes: various media that can store program codes, such as a removable Memory device, a Read Only Memory (ROM), a magnetic disk, or an optical disk.
Alternatively, the integrated unit of the present disclosure may be stored in a computer-readable storage medium if it is implemented in the form of a software functional module and sold or used as a separate product. Based on such understanding, the technical solutions of the present disclosure may be substantially or partially embodied in the form of a software product stored in a storage medium, and include several instructions to enable a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the methods described in the embodiments of the present disclosure. And the aforementioned storage medium includes: a removable storage device, a ROM, a magnetic or optical disk, or other various media that can store program code.
The methods disclosed in the several method embodiments provided in this disclosure may be combined arbitrarily without conflict to arrive at new method embodiments.
If the disclosed embodiment relates to personal information, a product applying the disclosed embodiment has been explicitly informed of personal information processing rules before processing the personal information, and obtains personal autonomous consent. If the disclosed embodiment relates to sensitive personal information, the product applying the disclosed embodiment obtains individual consent before processing the sensitive personal information, and simultaneously meets the requirement of 'express consent'.
The above description covers only specific embodiments of the present disclosure, but the scope of protection of the present disclosure is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive of within the technical scope of the present disclosure shall fall within the scope of protection of the present disclosure.

Claims (18)

1. A method of detecting milk regurgitation comprising:
acquiring at least two frames of thermal images of an object to be detected;
extracting a mouth region of the object to be detected from each thermal image of the at least two frames of thermal images to obtain a group of mouth regions of the object to be detected;
and performing milk spitting detection on a group of the mouth regions, and determining a detection result.
2. The method of claim 1, wherein the performing milk spitting detection on a set of the mouth regions and determining a detection result comprises:
determining the pixel features of each mouth region;
determining at least one of temperature change information, shape change information, and a milk bubble recognition result characterized by a set of the mouth regions based on feature differences between the pixel features of the mouth regions;
determining the detection result based on at least one of the temperature change information, the shape change information, and the milk bubble recognition result.
3. The method according to claim 2, wherein the determining temperature change information characterized by a set of the mouth regions based on feature differences between the pixel features of the mouth regions comprises:
determining a sub-temperature value for each of the mouth regions based on the pixel features of the mouth region;
grouping a set of the mouth regions to obtain a first group of mouth regions and a second group of mouth regions;
determining a first temperature value and a second temperature value of the set of the mouth regions based on the sub-temperature values of the first group and the second group of mouth regions, respectively;
determining the temperature change information based on the temperature difference value between the first temperature value and the second temperature value.
4. The method according to claim 2 or 3, wherein the determining the shape change information characterized by the mouth region based on the feature difference between the pixel features of the mouth region comprises:
determining the opening and closing degree of each mouth region based on the pixel characteristics of the mouth region;
determining a difference between opening and closing degrees of the mouth region, and determining the difference between the opening and closing degrees as the shape change information.
5. The method according to any one of claims 2 to 4, wherein the determining a milk bubble recognition result characterized by a set of the mouth regions based on feature differences between the pixel features of the mouth regions comprises:
identifying the feature differences between the pixel features of the mouth regions by using a milk bubble recognition model, and determining the milk bubble recognition result.
6. The method according to any one of claims 2 to 5, wherein the determining the detection result based on at least one of the temperature change information, the shape change information, and the milk bubble recognition result comprises:
in a case where the temperature change is smaller than a first temperature threshold and the shape change is larger than a first preset threshold, or
where the temperature change is larger than a second temperature threshold and the milk bubble recognition result indicates that milk bubbles are detected, determining that the detection result is that the object to be detected is in a milk spitting state; wherein the first temperature threshold is less than the second temperature threshold.
7. The method of any one of claims 1 to 6, wherein the extracting a mouth region of the object to be detected from each of the at least two frames of thermal images to obtain a set of mouth regions of the object to be detected comprises:
acquiring a reference image of the object to be detected;
detecting the reference image to obtain a candidate region;
determining a mapping relationship of a location between the reference image and the thermal image;
and extracting the mouth region in each thermal image based on the position of the candidate region in the reference image and the mapping relation to obtain a group of mouth regions.
8. The method according to any one of claims 1 to 7, further comprising:
extracting a non-mouth region of the object to be detected from each thermal image of the at least two frames of thermal images to obtain a group of non-mouth regions of the object to be detected;
determining physiological parameters of the object to be detected based on a group of the non-mouth regions; wherein the physiological parameter is used for representing the physiological state of the object to be detected;
adjusting the detection result based on the physiological parameter to obtain an adjusted detection result;
and generating alarm information in response to the adjusted detection result meeting a preset condition, and sending the alarm information to a selected processing object.
9. The method of claim 8, wherein the physiological parameter includes a current body temperature, and the non-mouth region is a forehead region; the determining the physiological parameter of the object to be detected based on a set of the non-mouth regions comprises:
determining a sub-temperature value for each of the forehead regions based on the pixel values of the pixels of each forehead region;
determining the temperature mean of the sub-temperature values as the temperature value of a set of the forehead regions;
determining the current body temperature of the object to be detected based on the temperature value and a preset association relationship; wherein the association relationship is used for representing the mapping relationship between the temperature value and the body temperature.
10. The method of claim 8, wherein the physiological parameter includes a current respiratory rate, and the non-mouth region is a sub-nasal region; the determining the physiological parameter of the object to be detected based on a set of the non-mouth regions comprises:
determining the pixel features of each of the sub-nasal regions;
determining a pixel feature sequence of a set of the sub-nasal regions based on the pixel features of each of the sub-nasal regions;
determining the current respiratory rate of the object to be detected based on the pixel feature sequence of the set of the sub-nasal regions.
11. The method of claim 10, wherein the determining a pixel feature sequence of a set of the sub-nasal regions based on the pixel features of each of the sub-nasal regions comprises:
determining a temporal order between the sub-nasal regions based on the acquisition time of each of the thermal images and the affiliation between the sub-nasal regions and the thermal images;
determining the pixel feature sequence based on the pixel features and the temporal order.
12. The method according to claim 10 or 11, wherein the determining the current respiratory rate of the object to be detected based on the pixel feature sequence of a set of the sub-nasal regions comprises:
performing dimension reduction processing on the pixel feature sequence to obtain a respiration signal of the pixel feature sequence in the time domain;
converting the respiration signal in the time domain into a respiration signal in the frequency domain;
determining a reference respiratory rate based on the peak located within a first frequency range in the respiration signal in the frequency domain;
and converting the reference respiratory rate to obtain the current respiratory rate.
13. The method of claim 8, wherein the physiological parameter includes a current heart rate, and the non-mouth region is a face region; the determining the physiological parameter of the object to be detected based on a set of the non-mouth regions comprises:
performing feature extraction on a set of the face regions according to the acquisition times of the face regions, to obtain the temporal features and spatial features of the set of face regions;
determining the current heart rate of the object to be detected based on the temporal features and the spatial features.
14. The method according to claim 13, wherein the determining the current heart rate of the object to be detected based on the temporal features and the spatial features comprises:
determining the time-domain heartbeat signals of a set of the face regions based on the temporal features and the spatial features;
converting the heartbeat signal in the time domain into a heartbeat signal in the frequency domain;
determining the peak within a second frequency range in the heartbeat signal in the frequency domain as a reference heart rate;
and converting the reference heart rate to obtain the current heart rate.
15. A device for detecting a physiological state, comprising:
an acquisition module configured to acquire at least two frames of thermal images of an object to be detected;
a first extraction module configured to extract a mouth region of the object to be detected from each thermal image of the at least two frames of thermal images, to obtain a set of mouth regions of the object to be detected;
and a first determining module configured to perform milk spitting detection on the set of mouth regions and determine a detection result.
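A skeleton (Python) showing how the three modules of claim 15 could be wired together; the camera interface and the two callables are hypothetical placeholders, not the patent's implementation:

    class PhysiologicalStateDetector:
        def __init__(self, camera, extract_mouth_region, detect_milk_spitting):
            self.camera = camera                              # acquisition module
            self.extract_mouth_region = extract_mouth_region  # first extraction module
            self.detect_milk_spitting = detect_milk_spitting  # first determining module

        def run(self, num_frames: int = 2):
            frames = [self.camera.read() for _ in range(num_frames)]    # >= 2 thermal frames
            mouth_regions = [self.extract_mouth_region(f) for f in frames]
            return self.detect_milk_spitting(mouth_regions)             # detection result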
16. A computer device comprising a memory and a processor, the memory storing a computer program executable on the processor, wherein the processor, when executing the program, implements the steps of the method of any one of claims 1 to 14.
17. A system for detecting a physiological state, comprising:
a thermal imager configured to capture at least two frames of thermal images of an object to be detected;
and a computer device configured to acquire the at least two frames of thermal images; extract a mouth region of the object to be detected from each thermal image of the at least two frames of thermal images, to obtain a set of mouth regions of the object to be detected; and perform milk spitting detection on the set of mouth regions and determine a detection result.
18. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 14.
CN202210947837.1A 2022-08-05 2022-08-05 Method, device, equipment and system for detecting physiological state and storage medium Pending CN115311714A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210947837.1A CN115311714A (en) 2022-08-05 2022-08-05 Method, device, equipment and system for detecting physiological state and storage medium

Publications (1)

Publication Number Publication Date
CN115311714A (en) 2022-11-08

Family

ID=83860709

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210947837.1A Pending CN115311714A (en) 2022-08-05 2022-08-05 Method, device, equipment and system for detecting physiological state and storage medium

Country Status (1)

Country Link
CN (1) CN115311714A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117315787A (en) * 2023-10-25 2023-12-29 武汉星巡智能科技有限公司 Infant milk-spitting real-time identification method, device and equipment based on machine vision
CN117315787B (en) * 2023-10-25 2024-06-11 武汉星巡智能科技有限公司 Infant milk-spitting real-time identification method, device and equipment based on machine vision

Similar Documents

Publication Publication Date Title
Chaichulee et al. Multi-task convolutional neural network for patient detection and skin segmentation in continuous non-contact vital sign monitoring
US10376153B2 (en) Head mounted system to collect facial expressions
US20220036055A1 (en) Person identification systems and methods
US10045737B2 (en) Clip-on device with inward-facing cameras
US11986273B2 (en) Detecting alcohol intoxication from video images
US9697599B2 (en) Determining a respiratory pattern from a video of a subject
US9504426B2 (en) Using an adaptive band-pass filter to compensate for motion induced artifacts in a physiological signal extracted from video
US20160253820A1 (en) Device and method for obtaining a vital signal of a subject
WO2018069791A1 (en) Detecting physiological responses using thermal and visible-light head-mounted cameras
US20140236036A1 (en) Device for obtaining respiratory information of a subject
US10216981B2 (en) Eyeglasses that measure facial skin color changes
US10076250B2 (en) Detecting physiological responses based on multispectral data from head-mounted cameras
KR101738278B1 (en) Emotion recognition method based on image
US10076270B2 (en) Detecting physiological responses while accounting for touching the face
Chen et al. RealSense = real heart rate: Illumination invariant heart rate estimation from videos
US10130261B2 (en) Detecting physiological responses while taking into account consumption of confounding substances
Nie et al. SPIDERS: Low-cost wireless glasses for continuous in-situ bio-signal acquisition and emotion recognition
CN111919236A (en) Monitoring of physiological parameters
Nie et al. SPIDERS+: A light-weight, wireless, and low-cost glasses-based wearable platform for emotion sensing and bio-signal acquisition
US10151636B2 (en) Eyeglasses having inward-facing and outward-facing thermal cameras
CN115311714A (en) Method, device, equipment and system for detecting physiological state and storage medium
CN114999646A (en) Newborn exercise development assessment system, method, device and storage medium
US20230082016A1 (en) Mask for non-contact respiratory monitoring
KR101877873B1 (en) System and method for fear mentality analysis
KR102381204B1 (en) Apparatus and method for monitoring breathing using thermal image

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination