CN117475505A - Sleeping gesture recognition method based on dark quilt environment - Google Patents

- Publication number: CN117475505A (application CN202310695790.9A)
- Authority: CN (China)
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06V40/20 — Recognition of biometric, human-related or animal-related patterns in image or video data; movements or behaviour, e.g. gesture recognition
- G06N3/084 — Computing arrangements based on biological models; neural networks; learning methods; backpropagation, e.g. using gradient descent
- G06V10/225 — Image preprocessing by selection of a specific region containing or referencing a pattern, based on a marking or identifier characterising the area
- G06V10/774 — Processing image or video features in feature spaces; generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
- G06V10/82 — Image or video recognition or understanding using pattern recognition or machine learning using neural networks
- G16H20/30 — ICT specially adapted for therapies or health-improving plans relating to physical therapies or activities, e.g. physiotherapy, acupressure or exercising
Abstract
A sleeping posture recognition method for a dark, quilt-covered environment. Infrared and depth images of a person sleeping under a quilt in a dark environment are acquired with an infrared camera and a depth sensor, and human-body features are then recognized in the infrared and depth images: the pixel coordinates of the center point of each feature's minimum rectangular frame are extracted together with the depth value at those coordinates, these feature data are mapped to sleeping postures, and a practically applicable sleeping-posture recognition model is trained with a BP neural network.
Description
Technical Field
The invention relates to recognizing the sleeping posture of a quilt-covered human body in a dark environment. Sleeping-posture images are acquired with an infrared camera and a depth sensor, and human-body features are then recognized in the infrared and depth images: the pixel coordinates of the center point of each feature's minimum rectangular frame are extracted together with the depth value at those coordinates, these feature data are mapped to sleeping postures, and a practically applicable sleeping-posture recognition model is trained with a BP neural network.
Background
In current research on sleeping postures, Kun Zhou et al. propose a deep-learning framework that infers three-dimensional human posture from a single image; the framework trains a skeleton-based posture recognition model by combining acquired 2D and 3D information, but it does not address the case in which parts of the body are occluded. See: Zhou K, Cai J, Li Y, et al. Adversarial 3D human pose estimation via multimodal depth supervision [J]. arXiv preprint arXiv:1809.07921, 2018.
Shuangjun Liu et al. propose Under the Cover Imaging via Thermal Diffusion (UCI): an infrared thermal camera captures images of the covered body posture, and a model is then trained with a neural network, reaching a final recognition rate of 98%. Although the recognition rate is high, the method cannot be widely popularized for households at present, because infrared thermal cameras are relatively expensive for an ordinary household. See: Liu S, Ostadabbas S. Seeing under the cover: a physics guided learning approach for in-bed pose estimation [C]. Medical Image Computing and Computer Assisted Intervention - MICCAI 2019: 22nd International Conference, Shenzhen, China, October 13-17, 2019, Proceedings, Part I. Springer International Publishing, 2019: 236-245.
Yu Yin et al. propose a pyramid-based multimodal fusion method: the two image modalities are fused, a neural network extracts a coarse body posture and body features, and these two outputs are fed into an SMPL model to encode a 3D mesh of the human body. The method operates on images from an infrared thermal camera together with depth images from a depth sensor, and completes posture recognition from this combined image information. See: Yin Y, Robinson J P, Fu Y. Multimodal in-bed pose and shape estimation under the blankets [C]. Proceedings of the 30th ACM International Conference on Multimedia. 2022: 2411-2419.
Li Chenguang uses the Kinect v2 camera developed by Microsoft and its built-in skeleton-tracking technology to collect human joint information. Each group of joint data is calibrated during preprocessing, the angles at the selected joints and the distances between joints are chosen as features, and these features are converted into the representation required by the constructed posture model; a model-matching network based on few-shot learning finally yields a posture recognition model with a high recognition rate. However, the experimental design does not consider the case in which important joints of the body are occluded. See: Li Chenguang. Human body posture recognition research based on Kinect v2 [D]. Qinhuangdao: Yanshan University, 2021.
Yang Mingjian et al. use a single-person pose-estimation algorithm and a multi-person pose-detection algorithm to recognize sleeping postures: multiple body joints are located, the relations among the joints are mapped to sleeping postures, and models for the left lateral, right lateral, supine and prone positions are designed, with a final recognition accuracy of 92.5%. Occlusion is considered and handled well in this work, but an environmental problem is ignored: people usually sleep at night in a dark environment, in which an RGB image carries no useful information. See: Yang Mingjian, Li Jinglin, Guo Ruikun, Tang Xiao. OpenPose-based human sleep pose recognition implementation and research [J]. Physical Experiments, 2019, 39(08): 45-49.
She Yinqiu uses a computer-vision method to recognize sleeping postures with the following flow: (1) collect images of the person during sleep with a camera, (2) preprocess the images, (3) segment the samples with an image-segmentation technique, and (4) classify the samples with a neural network into four sleeping postures. The final recognition rate is only 73%, which is low. See: She Yinqiu. Study of a human sleep position recognition system based on computer vision [D]. Ma'anshan: Anhui University of Technology, 2013.
Duan Bowen classifies 9 sleeping postures from an analysis of human pressure values: pressure values are first collected with pressure sensors during sleep, each group of values is then preprocessed and calibrated into a data set usable for classification, and the data are finally classified and tested with 5 different supervised-learning algorithms. The best recognition result is 88.33%, which is higher than the model of She Yinqiu but still lower than that of Yang Mingjian et al. See: Duan Bowen. Sleeping posture recognition and get-up intention prediction method based on an intelligent bed [D]. Shenyang: Shenyang University of Technology, 2021.
Taken together, posture recognition currently follows two main approaches: one based on computer-vision images and one based on pressure values. Among the vision-based studies, the work using an infrared thermal camera achieves the highest accuracy, but such cameras are expensive, and during real-time recognition residual heat easily remains at the previous position, which reduces the accuracy of recognizing the sleeping posture in real time. Among the pressure-sensor studies, the recognition rate still needs improvement, and the information acquired is limited by the sensor layout: more sensors mean higher cost, while fewer sensors mean less data and weaker evidence. The present method therefore collects sleep-image data in a dark, covered environment and uses the head-face and torso features extracted from these data to complete the mapping to sleeping postures.
Disclosure of Invention
To address these problems, the invention provides a method that combines three-dimensional head-face features with three-dimensional torso features, based on infrared and depth images, to complete the mapping to 4 sleeping postures, thereby solving the problem of recognizing different sleeping postures in a dark environment while the human body is covered.
To achieve the above purpose, the invention adopts the following technical scheme, comprising the steps of:
step one: designing an experimental scene for synchronously acquiring an infrared image and a depth image;
step two: after the infrared image and depth image are synchronously acquired, extract the head, left eyebrow, right eyebrow, left eye, right eye, left ear, right ear, nose, mouth and torso on the infrared image and store the pixel coordinates of the center point of each feature's rectangular frame; input the stored files into a target-recognition algorithm, which outputs a model trained on these 10 features; the trained model then automatically extracts and stores the center-point pixel coordinates of the minimum rectangular frame of each of the 10 features in infrared images acquired online;
step three: index the extracted center-point pixel coordinates of each feature's rectangular frame into the corresponding depth image and store the depth value after the corresponding center-point pixel coordinates, turning two-dimensional data into three-dimensional data, so that each detected group of sleeping-posture images finally yields 10 groups of three-dimensional data; a detected feature is marked 1, while an undetected feature is automatically marked 0 and its center-point pixel coordinates are zero-filled; the completed group of data is stored and then manually labeled with its sleeping-posture type;
step four: normalize the data of step three and train a 4-class sleeping-posture recognition model;
step five: divide the data set produced in step four into a training set and a test set, train the sleeping-posture recognition model, then test the trained model to verify its reliability.
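The per-image record described in steps two and three can be sketched as follows: each of the 10 features contributes a detection flag plus its (x, y, depth) triple, zero-filled when the feature is missed, giving 40 values per group. The feature names, function name, and the flag-before-coordinates ordering are illustrative assumptions, not taken from the patent text.

```python
# Sketch of one training record: detection flag + normalized (x, y, d) per
# feature, zeros for undetected features, 10 features x 4 values = 40 inputs.

FEATURES = ["head", "left_eyebrow", "right_eyebrow", "left_eye", "right_eye",
            "left_ear", "right_ear", "nose", "mouth", "torso"]

def build_record(detections):
    """detections: feature name -> normalized (x, y, d); absent if undetected."""
    record = []
    for name in FEATURES:
        if name in detections:
            x, y, d = detections[name]
            record += [1.0, x, y, d]          # detected: flag 1 plus coordinates
        else:
            record += [0.0, 0.0, 0.0, 0.0]    # undetected: flag and coords zeroed
    return record

# hypothetical detections for two of the 10 features
record = build_record({"head": (0.50, 0.24, 0.92), "nose": (0.52, 0.30, 0.95)})
```

The 40-element output matches the 40 input nodes of the BP network described later.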
As a preferred scheme, the experimental scene of step one is as follows: the camera captures the upper-body area of the human body, excluding the legs; the vertical distance from the depth-sensor camera to the bed surface is 1.1 to 1.3 meters; and the camera is located directly above the region of the bed surface between 1/3 and 1/2 of the bed length, measured from the head of the bed.
As another preferred scheme, step two of the invention is as follows: after the infrared image and depth image are synchronously acquired, the head, left eyebrow, right eyebrow, left eye, right eye, left ear, right ear, nose, mouth and torso are extracted on the infrared image with the LabelImg labeling tool, and the center-point pixel coordinates of each feature's rectangular frame are stored in "txt" file format; the "txt" files are input into the YOLOv5 target-recognition algorithm, which outputs a model trained on the 10 features; the trained model then automatically extracts the center-point pixel coordinates of the minimum rectangular frame of each of the 10 features in infrared images acquired online and stores them in a "txt" file.
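Recovering center-point pixel coordinates from the stored "txt" files can be sketched as below. YOLOv5-style label lines have the form "class x_center y_center width height" with coordinates normalized to [0, 1], so converting back to pixels needs the image size; the function name, example values, and exact file layout are assumptions for illustration.

```python
# Parse YOLOv5-style label lines into center-point pixel coordinates.

def centers_from_labels(lines, width, height):
    """Map class id -> center-point pixel coordinates (x, y)."""
    centers = {}
    for line in lines:
        cls, xc, yc, _w, _h = line.split()   # width/height of the box unused here
        centers[int(cls)] = (round(float(xc) * width), round(float(yc) * height))
    return centers

# Hypothetical lines for two of the 10 feature classes;
# 512 x 424 is the Kinect v2 depth-image resolution.
labels = ["0 0.5 0.25 0.10 0.10", "7 0.52 0.30 0.05 0.04"]
centers = centers_from_labels(labels, 512, 424)
```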
As another preferred scheme, step three of the invention is as follows: the extracted center-point pixel coordinates of each feature's rectangular frame are indexed into the corresponding depth image, and the depth value is stored after the corresponding center-point pixel coordinates, turning two-dimensional data into three-dimensional data; each detected group of sleeping-posture images finally yields 10 groups of three-dimensional data; a detected feature is marked 1, while an undetected feature is automatically marked 0 and its center-point pixel coordinates are automatically zero-filled; the completed group of data is stored in a "txt" file, and the four sleeping-posture classes are represented by the numbers 0, 1, 2 and 3, so that the data of one group of images is completely extracted, giving the complete raw data of that group of images.
As another preferred scheme, a group of sleeping-posture images in step three consists of an infrared image and its corresponding depth image.
Next, the invention normalizes the 1409 groups of data from step three and trains the 4-class sleeping-posture recognition model with a BP neural network. The normalization proceeds as follows: the center-point pixel coordinates of each rectangular frame obtained in step two are taken, the depth value of the center point is obtained from the depth image by indexing, and the normalization is computed with the formulas below, where (x_min, y_min) and (x_max, y_max) are the pixel coordinates of the upper-left and lower-right corners of the rectangular frame, (x_center, y_center) are the pixel coordinates of its center point, (x_norm, y_norm, d_norm) are the normalized three-dimensional coordinates, width and height are the width and height of the image in pixels, and d and D are the vertical distance from the center point to the camera and from the camera to the bed surface, respectively.

Equation 1: x_center = (x_min + x_max) / 2

Equation 2: y_center = (y_min + y_max) / 2

Equation 3: x_norm = x_center / width

Equation 4: y_norm = y_center / height

Equation 5: d_norm = d / D
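The normalization step can be sketched in code. The formula bodies here follow the symbol definitions given in the text (bounding-box center divided by image size, depth divided by the camera-to-bed distance); to the extent the printed formulas are not reproduced in this copy, they are a reconstruction, and the variable names mirror the symbols in the text.

```python
# Normalize a bounding-box center point and its depth value, assuming:
# center = box midpoint, x/y scaled by image size, depth scaled by D.

def normalize(x_min, y_min, x_max, y_max, d, width, height, D):
    x_center = (x_min + x_max) / 2          # center x in pixels
    y_center = (y_min + y_max) / 2          # center y in pixels
    return (x_center / width,               # normalized x
            y_center / height,              # normalized y
            d / D)                          # normalized depth

# hypothetical box (100,50)-(200,150), depth 600 mm, camera height 1200 mm
coords = normalize(100, 50, 200, 150, 600, 512, 424, 1200)
```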
In addition, the structure of the BP neural network is: an input layer of 40 nodes, an output layer of 4 nodes, and 3 hidden layers of 64, 32 and 64 nodes respectively.
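A forward pass with this topology (40 → 64 → 32 → 64 → 4) can be sketched as follows. The patent does not specify activations, weights, or the training loop, so the sigmoid hidden layers, softmax output, and random placeholder weights are assumptions for illustration only.

```python
import numpy as np

# Forward pass of a 40 -> 64 -> 32 -> 64 -> 4 fully connected network.

rng = np.random.default_rng(0)
sizes = [40, 64, 32, 64, 4]
layers = [(rng.normal(0.0, 0.1, (m, n)), np.zeros(n))
          for m, n in zip(sizes[:-1], sizes[1:])]

def forward(x):
    for i, (W, b) in enumerate(layers):
        x = x @ W + b
        if i < len(layers) - 1:
            x = 1.0 / (1.0 + np.exp(-x))    # sigmoid on the 3 hidden layers
    e = np.exp(x - x.max())                 # softmax over the 4 posture classes
    return e / e.sum()

probs = forward(np.zeros(40))               # zero feature vector as a smoke test
```

Training such a network by backpropagation (gradient descent on a cross-entropy loss) is what "BP" refers to; only the inference shape is shown here.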
The beneficial effects of the invention are as follows.

While a person sleeps covered at night, the sleeping posture can be detected in real time at low cost and with high accuracy; the recognition rate reaches 99% (as shown in figure 7). The method can be used for snore relief: the sleeping posture of a snorer is detected and the current snoring posture is changed, improving the snoring. It can also be used in the treatment of bedsore patients: the patient's sleeping posture is detected, a warning is issued when one posture has been held too long, and the posture is then changed, greatly reducing the occurrence of pressure sores.
Drawings
The invention is further described below with reference to the drawings and the detailed description. The scope of the present invention is not limited to the following description.
Fig. 1 is a structural diagram of a sleeping posture detection experiment platform of the invention.
Fig. 2 is an image of the labeling process of the present invention.
FIG. 3 is a diagram of the extracted three-dimensional feature data together with the annotated raw data.
Fig. 4 is a data diagram of the present invention after normalization processing is performed on the extracted three-dimensional data.
Fig. 5 is a normalized example diagram.
Fig. 6 is a structural diagram of a BP neural network according to the present invention.
FIG. 7 is a table of recognition results according to the present invention.
The specific embodiment is as follows:
as shown, the present invention includes the steps of:
step one: designing an experimental scene for synchronously acquiring an infrared image and a depth image;
Scene description: Fig. 1 shows the experimental scene. The comfort range of the depth sensor is 0.8 to 2.5 meters, and the image to be acquired is the upper-body area excluding the legs. Repeated experiments show that the depth-data error is smallest and the data most stable at a distance of 1.1 to 1.3 meters, so the vertical distance from the camera to the bed surface is finally set to 1.1 to 1.3 meters, with the camera directly above the region between 1/3 and 1/2 of the bed length, measured from the head of the bed. The infrared image is two-dimensional and unaffected by distance; only the distance-dependent accuracy of the depth sensor needs to be considered. The depth sensor and the infrared camera are integrated in one device, the Kinect 2.0 somatosensory camera.
Step two: after the infrared image and depth image are synchronously acquired, the head, left eyebrow, right eyebrow, left eye, right eye, left ear, right ear, nose, mouth and torso are extracted on the infrared image with the LabelImg labeling tool, and the center-point pixel coordinates of the rectangular frames of these 10 features are stored in "txt" file format; the "txt" files are input into the YOLOv5 target-recognition algorithm, which outputs a model trained on the 10 features; the trained model then automatically extracts the center-point pixel coordinates of the minimum rectangular frame of each of the 10 features in infrared images acquired online and stores them in a "txt" file;
Step three: the extracted center-point pixel coordinates of each feature's rectangular frame are indexed into the corresponding depth image, and the depth value is stored after the corresponding center-point pixel coordinates, turning two-dimensional data into three-dimensional data; each detected group of sleeping-posture images (an infrared image and its corresponding depth image) finally yields 10 groups of three-dimensional data; a detected feature is marked 1, while an undetected feature is automatically marked 0 and its center-point pixel coordinates are automatically zero-filled; the completed group of data is stored in a "txt" file; the study defines four sleeping-posture classes, represented by the numbers 0, 1, 2 and 3, which completes the data extraction of one group of images; fig. 3 shows the complete raw data extracted from one group of images;
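Indexing a center pixel into the depth frame, as step three describes, can be shown in miniature; the synthetic depth frame, the feature values, and the variable names here are illustrative only.

```python
import numpy as np

# Turn 2-D center coordinates into (x, y, d) by indexing a depth frame.

depth = np.full((424, 512), 1150, dtype=np.uint16)   # fake depth frame, mm
centers = {"head": (256, 100), "nose": (266, 127)}   # feature -> (x, y) pixels

three_d = {name: (x, y, int(depth[y, x]))            # row index is y, column is x
           for name, (x, y) in centers.items()}
```

Note the row/column order: NumPy arrays are indexed as `[row, column]`, i.e. `[y, x]`, an easy place to introduce a bug.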
Step four: the 1409 groups of data from step three are normalized so that a 4-class sleeping-posture recognition model can be trained with a BP neural network; fig. 4 shows the normalized data.
Fig. 5 serves as a normalization example. Taking the center point of the rectangular frame in fig. 5 as an example: the center-point pixel coordinates are first obtained as in step two, the depth value of the center point is then obtained from the depth image by indexing, and the normalization is computed with the formulas below, where (x_min, y_min) and (x_max, y_max) are the pixel coordinates of the upper-left and lower-right corners of the rectangular frame, (x_center, y_center) are the pixel coordinates of its center point, (x_norm, y_norm, d_norm) are the normalized three-dimensional coordinates, width and height are the width and height of the image in pixels, and d and D are the vertical distance from the center point to the camera and from the camera to the bed surface, respectively.

Equation 1: x_center = (x_min + x_max) / 2

Equation 2: y_center = (y_min + y_max) / 2

Equation 3: x_norm = x_center / width

Equation 4: y_norm = y_center / height

Equation 5: d_norm = d / D
Step five: the BP neural network is designed with the structure shown in fig. 6: the input layer has 40 nodes (one group of data contains 40 values), the output layer has 4 nodes (the four sleeping postures to be classified), and there are 3 hidden layers of 64, 32 and 64 nodes respectively. The data set produced in step four is then divided into a training set and a test set, the sleeping-posture recognition model is trained, and the trained model is tested to verify its reliability.
Sleeping-posture images of the sleeper are acquired in real time by the camera of the detection platform in fig. 1; the 10 features of fig. 2 are recognized by the feature model trained on the PC, yielding the three-dimensional data of fig. 3 at the model output; after normalization, the data are input into the trained neural-network recognition model, which outputs the corresponding sleeping posture, namely output 0: supine position; output 1: left lateral position; output 2: right lateral position; output 3: prone position.
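Decoding the classifier output follows directly from the mapping just stated; the dictionary and function wrapper below are an illustrative convenience, while the index-to-posture assignment (0-3) is the one given in the text.

```python
# Map the network's output class index to the posture names from the text.

POSTURES = {0: "supine", 1: "left lateral", 2: "right lateral", 3: "prone"}

def decode(class_index):
    return POSTURES[class_index]
```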
Claims (7)
1. A sleeping posture recognition method based on a dark, quilt-covered environment, characterized by comprising the following steps:
step one: designing an experimental scene for synchronously acquiring an infrared image and a depth image;
step two: after the infrared image and depth image are synchronously acquired, extracting the head, left eyebrow, right eyebrow, left eye, right eye, left ear, right ear, nose, mouth and torso on the infrared image and storing the pixel coordinates of the center point of each feature's rectangular frame; inputting the stored files into a target-recognition algorithm, which outputs a model trained on these 10 features; the trained model then automatically extracts and stores the center-point pixel coordinates of the minimum rectangular frame of each of the 10 features in infrared images acquired online;
step three: indexing the extracted center-point pixel coordinates of each feature's rectangular frame into the corresponding depth image and storing the depth value after the corresponding center-point pixel coordinates, turning two-dimensional data into three-dimensional data, so that each detected group of sleeping-posture images finally yields 10 groups of three-dimensional data; a detected feature is marked 1, while an undetected feature is automatically marked 0 and its center-point pixel coordinates are zero-filled; the completed group of data is stored and then manually labeled with its sleeping-posture type;
step four: normalizing the data of step three and training a 4-class sleeping-posture recognition model;
step five: dividing the data set produced in step four into a training set and a test set, training the sleeping-posture recognition model, then testing the trained model to verify its reliability.
2. The sleeping posture recognition method based on the dark, quilt-covered environment according to claim 1, wherein the experimental scene of step one is as follows: the camera captures the upper-body area of the human body, excluding the legs; the vertical distance from the depth-sensor camera to the bed surface is 1.1 to 1.3 meters; and the camera is located directly above the region of the bed surface between 1/3 and 1/2 of the bed length, measured from the head of the bed.
3. The sleeping posture recognition method based on the dark quilt environment according to claim 1, wherein step two specifically comprises: after the infrared image and the depth image are synchronously acquired, the head, left eyebrow, right eyebrow, left eye, right eye, left ear, right ear, nose, mouth and body trunk are annotated on the infrared image with the LabelImg labelling tool; the pixel coordinates of the centre point of each feature's rectangular frame are stored in '.txt' file format; the '.txt' files are input into the YOLOv5 target recognition algorithm, which outputs a model trained on the 10 features; the trained model then automatically extracts the centre-point pixel coordinates of the smallest rectangular frame corresponding to each of the 10 features in the infrared images acquired on line and stores them in a '.txt' file.
4. The sleeping posture recognition method based on the dark quilt environment according to claim 1, wherein step three specifically comprises: the extracted centre-point pixel coordinates of each feature's rectangular frame are indexed into the corresponding depth image, and the depth value read there is stored after the corresponding centre-point pixel coordinates, turning the two-dimensional data into three-dimensional data, so that each detected sleeping-posture image finally yields 10 groups of three-dimensional data; a detected feature is marked as 1, while an undetected feature is automatically marked as 0 and its centre-point pixel coordinates are automatically filled with 0; the completed group of data is stored in a '.txt' file, and the four sleeping-posture types are then represented by the numbers 0, 1, 2 and 3 respectively, at which point the data of one group of images have been completely extracted, giving the complete raw data for that group of images.
5. The sleeping posture recognition method based on the dark quilt environment according to claim 1, wherein the sleeping posture images in step three are the infrared images together with the depth images corresponding to them.
6. The sleeping posture recognition method based on the dark quilt environment according to claim 1, wherein step four specifically comprises: performing normalization on the 1409 groups of data from step three and training the four-class sleeping-posture recognition model with a BP neural network, the normalization proceeding as follows: the centre-point pixel coordinates of the rectangular frames obtained in step two are used to index the depth image and read the depth value at the centre point, after which the normalized values are computed by the following formulas:
Equation 1: x_center = (x_min + x_max) / 2
Equation 2: y_center = (y_min + y_max) / 2
Equation 3: x_norm = x_center / width
Equation 4: y_norm = y_center / height
Equation 5: z_norm = d / D

where (x_min, y_min) and (x_max, y_max) are the pixel coordinates of the upper-left and lower-right corners of a rectangular frame, (x_center, y_center) are the pixel coordinates of its centre point, (x_norm, y_norm, z_norm) are the normalized three-dimensional coordinates, width and height are the width and height of the image in pixels, and d and D denote the vertical distance from the centre point to the camera and from the camera to the bed surface, respectively.
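The normalization of claim 6 can be sketched in code; the corner-to-centre averaging and the ratio formulas are inferred from the symbol definitions given with the equations (an assumption, since the published formulas did not survive extraction):

```python
def normalise_feature(x_min, y_min, x_max, y_max, width, height, d, D):
    """Normalised 3-D coordinates of one feature's bounding-box centre.

    width/height: image size in pixels; d: depth of the centre point;
    D: camera-to-bed-surface distance (so z is in [0, 1] under the quilt)."""
    x_c = (x_min + x_max) / 2      # centre x (Equation 1)
    y_c = (y_min + y_max) / 2      # centre y (Equation 2)
    return (x_c / width,           # x normalised by image width  (Equation 3)
            y_c / height,          # y normalised by image height (Equation 4)
            d / D)                 # depth normalised by bed distance (Equation 5)

# Hypothetical example: box on a 640x480 frame, depth 900 mm, camera 1200 mm up.
print(normalise_feature(300, 200, 340, 236, 640, 480, 900, 1200))
```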
7. The sleeping posture recognition method based on the dark quilt environment according to claim 6, wherein the BP neural network has the following structure: the input layer has 40 nodes, the output layer has 4 nodes, and there are 3 hidden layers with 64, 32 and 64 nodes respectively.
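The 40-64-32-64-4 topology of claim 7 can be sketched as an untrained forward pass. The sigmoid hidden activations and softmax output are assumptions (the claim specifies only the layer sizes), and the weights here are random placeholders, not the patent's trained model:

```python
import numpy as np

rng = np.random.default_rng(0)
sizes = [40, 64, 32, 64, 4]  # input, three hidden layers, output (claim 7)

# Randomly initialised parameters for an illustrative, untrained network.
weights = [rng.standard_normal((m, n)) * 0.1 for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

def forward(x):
    """One forward pass: sigmoid hidden layers, softmax over 4 postures."""
    for W, b in zip(weights[:-1], biases[:-1]):
        x = 1.0 / (1.0 + np.exp(-(x @ W + b)))   # sigmoid activation (assumed)
    logits = x @ weights[-1] + biases[-1]
    e = np.exp(logits - logits.max())            # numerically stable softmax
    return e / e.sum()

probs = forward(rng.standard_normal(40))         # a random 40-dim record
print(probs.shape, round(float(probs.sum()), 6)) # (4,) 1.0
```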
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310695790.9A CN117475505A (en) | 2023-06-13 | 2023-06-13 | Sleeping gesture recognition method based on dark quilt environment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117475505A true CN117475505A (en) | 2024-01-30 |
Family
ID=89630002
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310695790.9A Pending CN117475505A (en) | 2023-06-13 | 2023-06-13 | Sleeping gesture recognition method based on dark quilt environment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117475505A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||