CN112766185A - Head posture monitoring method, device and system based on deep learning - Google Patents

Head posture monitoring method, device and system based on deep learning

Info

Publication number
CN112766185A
CN112766185A (application CN202110090638.9A)
Authority
CN
China
Prior art keywords
face
neural network
head
image
bed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110090638.9A
Other languages
Chinese (zh)
Other versions
CN112766185B (en)
Inventor
金梅
李翔宇
张立国
李圆圆
李义辉
马子荐
杨曼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yanshan University
Original Assignee
Yanshan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yanshan University
Priority to CN202110090638.9A
Publication of CN112766185A
Application granted
Publication of CN112766185B
Legal status: Active
Anticipated expiration

Classifications

    • G06V 40/10: Recognition of human or animal bodies in image or video data, e.g. vehicle occupants or pedestrians; body parts, e.g. hands
    • G06N 3/048: Neural network activation functions
    • G06N 3/08: Neural network learning methods
    • G06V 10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
    • G06V 20/597: Recognising the driver's state or behaviour, e.g. attention or drowsiness
    • G06V 40/168: Human faces: feature extraction; face representation
    • G06V 40/171: Human faces: local features and components; facial parts; occluding parts, e.g. glasses; geometrical relationships
    • G06V 40/172: Human faces: classification, e.g. identification

Abstract

The invention provides a method, a device and a system for monitoring head posture based on deep learning. The method comprises the following steps: S1, collecting historical image data; S2, training the neural networks; S3, acquiring a real-time image; S4, passing the preprocessed real-time image into a first neural network to obtain the face bounding box that needs to be monitored; S5, determining the angle of the face image with a second neural network; S6, returning to step S3. The system comprises a data acquisition module, an image processing module and an alarm module. The device comprises a bed, a camera, a camera mounting bracket, a computer and an alarm. The method adopts a first neural network and a second neural network: the first neural network frame-selects the faces appearing in the image using an improved YOLOv3 algorithm, solving the problem of multiple faces appearing in the monitoring range; the second neural network uses an improved VGG16 network to extract several features from the image and fuse them, so that the head posture can be monitored in real time.

Description

Head posture monitoring method, device and system based on deep learning
Technical Field
The invention relates to artificial intelligence technology, and in particular to a method, device and system for monitoring head posture based on deep learning.
Background
At present there are many methods and applications for detecting head posture, but they mainly target the head posture of people who are sitting or standing, for example recognizing a driver's head posture; the head posture of a person lying down is not recognized, although the need exists. For example, image detection can confirm that the head is turned to one side while lying flat, the so-called pillow-removed supine posture. For a patient this posture helps keep the respiratory tract unobstructed, prevents airway obstruction caused by the tongue falling back, prevents vomit from entering the trachea and causing asphyxia, prevents headache caused by reduced intracranial pressure, and avoids traction on a wound; for an infant it can prevent asphyxia caused by regurgitated milk or prone sleeping. Traditional monitoring relies on vital signs (such as body temperature and the electrocardiogram) and raises an alarm when the data become abnormal. The problem with vital-sign monitoring is that it is late: in a wrong sleep posture, vital signs often become abnormal only after several minutes, at which point only remedial measures can be taken and nothing can be prevented. Manual supervision, for its part, wastes a great deal of manpower and material resources.
Disclosure of Invention
In order to overcome the defects of the prior art, to recognize whether the head is in the pillow-removed supine posture (lying flat without a pillow, head turned to one side), and to solve the problem of multiple faces appearing in the monitoring range, the invention provides a head posture monitoring method based on deep learning, which comprises the following steps:
s1, generating a data set for training a neural network by using the collected historical image data;
the historical image data mainly comprises a face data set, face data sets with different angles and a bed data set;
s2, training the neural network by using the data set to obtain the weight values of the first neural network and the second neural network;
s21, using 80% of the face data set, the different-angle face data sets and the bed data set for training, called the training set, and using the remaining 20% for verification, called the verification set;
s22, inputting the face data set and the bed data set in the data set of the step S21 into a first neural network for training, and obtaining a parameter model;
s23, inputting the face data sets with different angles in the data set of the step S21 into a second neural network for training, and obtaining a parameter model;
s231, the training process of the second neural network is to input the face data sets of different angles into the second neural network, and the second neural network is an improved Vgg16 network;
s232, the specific structure of the second neural network is as follows: a convolutional layer, a pooling layer, a BN layer and a ReLU activation function are stacked in sequence, and the stack is repeated four times; the convolutional layers are, in order, C1, C2, C3 and C4. The output of each ReLU activation function is passed both to the next convolutional layer and to a convolutional layer C5; because of the fourfold repetition there are four ReLU activation functions, each connected to its own C5 layer, and the four C5 layers output the first, second, third and fourth features respectively. The output of the last ReLU activation function also passes through 3 fully connected layers, giving the output feature after FC3. The first, second, third and fourth features and the output feature after FC3 are input to the feature fusion layer, the feature fusion layer feeds the processing result to the softmax, and the head posture is judged from the probability values, realizing head posture monitoring;
s233, training by using data sets of different angles to obtain weights of a second neural network, and accelerating feature collection of different angles in real-time monitoring;
s3, acquiring a real-time image needing head posture monitoring, and preprocessing the real-time image to obtain a preprocessed image;
a camera captures the real-time picture, and the acquired real-time image is preprocessed to obtain the preprocessed image, the preprocessing comprising filtering and noise reduction of the real-time image;
s4, transmitting the preprocessed image into the first neural network to determine the face bounding box to be monitored in the preprocessed image, wherein the first neural network is divided into two parts: the first part identifies all face and bed bounding boxes, and the second part screens the face bounding boxes to obtain the one to be monitored; the steps specifically comprise:
s41, rapidly determining the face and bed bounding boxes using the parameter model of the first neural network trained in step S22;
s42, determining the face bounding box of the monitored person from the face and bed bounding boxes of S41 and the following limiting conditions;
s421, firstly, setting the face frame selection area as A and the bed frame selection area as B, which must satisfy
A ⊆ B
that is, the face frame selection area lies inside the bed frame selection area; if no face area can be detected, it is considered that no one is in the bed; if a face frame selection area is detected but is not inside the bed frame selection area, there is likewise no one in the bed; if no one is in the bed, return to step S3, otherwise execute step S422;
s422, if only one face frame selection area is inside the bed frame selection area, the face image of the monitored person is determined to be in that area and step S5 is executed; if several face bounding boxes are inside the bed bounding box, the interference of unrelated persons is eliminated and step S423 is executed;
in step S423, considering that the head posture is monitored during sleep and the head motion amplitude is therefore small, the face bounding boxes obtained in step S41 that satisfy
A ⊆ B
(A being a face frame selection area and B the bed frame selection area) are taken out; suppose there are k face bounding boxes; for the i-th face bounding box (i ≤ k), take 4 consecutive frames of images f_i1, f_i2, f_i3, f_i4 in the bounding box area and form the differences between adjacent frames, D_i1 = |f_i3 − f_i2| ∩ |f_i2 − f_i1| and D_i2 = |f_i4 − f_i3| ∩ |f_i3 − f_i2|; then take the union of the differences to obtain the difference image D_i = D_i1 ∪ D_i2; given a threshold T, binarize the image to obtain the binarized image R_i:
R_i(x, y) = 1 if D_i(x, y) > T, and R_i(x, y) = 0 otherwise
where 1 represents a motion point and 0 the background; calculate the percentage of 1-pixels among the total pixels in each bounding box; the face bounding box whose R_i proportion is smallest is the face bounding box of the monitored person;
s5, determining the angle of the face image by the second neural network;
and S6, returning to step S3 to continue monitoring the head posture; meanwhile, if the head stays in a non-pillow-removed posture for 250 consecutive frames, or for more than 600 frames within one minute, an alarm signal is sent to indicate that the monitored person's head is not in the pillow-removed supine posture.
Preferably, the different-angle face data sets collect image data of several persons at different head rotation angles; the angle regions are discrete, so the head postures can be represented discretely. Facing straight ahead is 0°; the head is labeled every 15° in the left-right (yaw) direction X, and the head pitch angle Y is likewise labeled every 15°, the head rotation being denoted by (X, Y). The maximum angle in each of the four directions (up, down, left, right) is 90°, so there are 13 angles in the up-down direction and 13 in the left-right direction, giving (X, Y) 169 deflection angles in total. Owing to the limitation of the body, the head can rotate left or right to −90° or 90° only when the pitch angle is 0°; at other pitch angles the head cannot reach ±90° of rotation, so 145 head postures are finally determined for the data set. Each head posture has a corresponding angle label. Since the head must lean to one side in the pillow-removed supine posture, the 145 head postures can be divided into two major classes, the pillow-removed supine posture and the non-pillow-removed posture: when 75° ≤ |X| ≤ 90° and 0° ≤ |Y| ≤ 30°, the head is regarded as being in the pillow-removed supine posture, and all other angles are regarded as non-pillow-removed postures.
Preferably, in step S22, the face data set and the bed data set in the data set of step S21 are input into the first neural network for training to obtain a parameter model, specifically comprising the following steps:
s221, predicting the coordinates (x, y) of the center point of the target bounding box, the width and height (w, h) of the bounding box, the bed, face and background classes, and the confidence using the YOLOv3 algorithm;
s222, correcting the bounding box by back-propagation while predicting the detected objects in the image: if the IoU between a predicted bounding box and the manually framed bounding box is less than 0.6, the error of the predicted target bounding box is considered too large, and the network keeps propagating the error backward to the preceding stages until the IoU between the predicted and manually framed bounding boxes is 0.6 or more;
and S223, training with the data set to obtain the weights of the first neural network, which are used to speed up the network's determination of the face and bed bounding boxes in real-time monitoring.
Preferably, in step S5, the second neural network determines an angle of the face image, and the specific steps are as follows:
s51, cutting the face bounding box area obtained in step S4 out of the preprocessed image and resizing the face area so that the face area image input to the multi-feature-fusion convolutional neural network module has a size of 224 × 224, and putting it into the second neural network to determine the head posture;
s52, feature fusion: each time the picture passes through a convolutional layer and a pooling layer, a 1 × 1 convolutional layer C5 is added before feature extraction; the features extracted are, in order: the first feature, the second feature, the third feature, the fourth feature, and the output feature after FC3;
s53, back-propagating the first, second, third and fourth features and the output feature after FC3, the total feature loss being
L = Σ_{i=1}^{n} L_i, with L_i = −Σ_k T(k) · log p_i,k
where T is the characteristic (indicator) function of the true class, n is the total number of features, L_i is the feature loss function of the i-th feature, and p_i,k is the probability that the i-th feature is predicted as the k-th class;
and S54, determining the angle of the face image and judging whether the head is in a non-pillow-removed posture.
The application also discloses a system for monitoring the head posture using the head posture monitoring method based on deep learning, which comprises: a data acquisition module, an image processing module and an alarm module;
the data acquisition module is used for executing step S3 and acquiring images of the patient bed at different angles;
the image processing module is used for executing steps S4 and S5 and processing the acquired image data; specifically, the parameter models of the two neural networks are obtained in advance, the real-time image is put into the first neural network to frame-select the face image, the detected face image is then put into the second neural network, and the head posture is determined through multi-feature fusion;
the alarm module is used for sending out an alarm signal if a non-pillow-removed posture persists for 250 consecutive frames or for more than 600 frames within one minute.
The application also discloses a device for monitoring the head posture using the head posture monitoring method based on deep learning, which comprises: a bed, a camera, a camera fixing bracket, a computer and an alarm,
the camera is mounted directly above the head of the bed by the camera fixing bracket, so that it can capture a picture directly facing the face;
the camera is connected to the computer through a network, and the computer is used for executing steps S4, S5 and S6, processing the acquired image data and judging from the pictures obtained from the camera whether the head is turned to one side;
the alarm is connected to the computer through a network and is used for giving the alarm.
Compared with the prior art, the invention has the following beneficial effects:
1. the method adopts a first neural network and a second neural network, wherein the first neural network frame-selects the faces appearing in the image using an improved YOLOv3 algorithm, solving the problem of multiple faces appearing in the monitoring range;
2. the second neural network extracts several features from the image using an improved VGG16 network and fuses them, so that the head posture can be monitored in real time and the device can give an alarm immediately when the head is not in the pillow-removed supine posture.
Drawings
FIG. 1 is a schematic structural diagram of a head posture monitoring device based on deep learning;
FIG. 2 is a schematic structural diagram of a head posture monitoring system based on deep learning;
FIG. 3 is a schematic diagram of steps of a head posture monitoring method based on deep learning;
FIG. 4 is a schematic diagram of a process for improving algorithm processing of an image by using YOLOv 3;
FIG. 5 is a schematic diagram of a second neural network architecture;
FIG. 6 is a schematic diagram of fusion of different feature layers of an image using a second neural network feature.
Reference numerals:
1. bed; 2. camera; 3. camera fixing bracket; 4. computer; 5. alarm.
Detailed Description
In order to better understand the technical solution of the present invention, the following detailed description is made with reference to the accompanying drawings and examples. In the drawings, like reference numbers can indicate functionally identical or similar elements. While the various aspects of the embodiments are presented in drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The invention discloses a head posture monitoring method based on deep learning, as shown in fig. 3. S1, generating a data set for training the neural networks from the collected historical image data; the historical image data mainly comprises a face data set, different-angle face data sets and a bed data set.
The different-angle face data sets collect image data of several persons at different head rotation angles; the angle regions are discrete, so the head postures can be represented discretely. Facing straight ahead is set to 0°; the head is labeled every 15° in the left-right (yaw) direction X, and likewise the head pitch angle Y is labeled every 15°, the range of head rotation being denoted by (X, Y). Since the maximum angle is 90° in all four directions, there are 13 angles in the up-down direction and 13 in the left-right direction, so (X, Y) has 169 deflection angles in total. Owing to the limitation of the body, only when the pitch angle is 0° can the head rotate left or right to −90° or 90°; at other pitch angles the head cannot reach ±90°, and finally 145 head postures are determined for the data set. Each head posture has a corresponding angle label. Since the head must lean to one side in the pillow-removed supine posture, the 145 head postures can be divided into two major classes, the pillow-removed supine posture and the non-pillow-removed posture: when 75° ≤ |X| ≤ 90° and 0° ≤ |Y| ≤ 30°, the head is regarded as being in the pillow-removed supine posture, and all other angles are regarded as non-pillow-removed postures.
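As an illustration only, the labeling and classification rule above can be written as a short Python sketch; the enumeration reproduces the 145-pose count, while the function names are ours rather than the patent's:

```python
# Hypothetical sketch of the angle-labeling rule described above.

def valid_poses():
    """Enumerate the discrete head poses: yaw X and pitch Y in 15-degree steps
    within [-90, 90], with a yaw of +/-90 degrees allowed only at pitch 0."""
    poses = []
    for x in range(-90, 91, 15):
        for y in range(-90, 91, 15):
            if abs(x) == 90 and y != 0:
                continue  # body limitation: full yaw is reachable only at zero pitch
            poses.append((x, y))
    return poses

def is_pillow_removed(x, y):
    """Pillow-removed supine posture: 75 <= |X| <= 90 and 0 <= |Y| <= 30."""
    return 75 <= abs(x) <= 90 and abs(y) <= 30

poses = valid_poses()
assert len(poses) == 145  # 169 grid points minus the 24 unreachable full-yaw poses
```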
Images in the different-angle face data sets that are particularly blurry, so that the face cannot be distinguished, are eliminated; the remaining data set is labeled by manually framing the face area in each image.
Similarly, images in the bed data set in which the bed cannot be distinguished are rejected; the remaining data set is labeled by manually framing the bed area in each image.
And S2, training the neural network by using the data set to obtain parameter models of the first neural network and the second neural network.
S21, 80% of the face data set, the different-angle face data sets and the bed data set are used for training, called the training set, and the remaining 20% are used for verification, called the verification set.
S22, inputting the face data set and the bed data set in the data set of the step S21 into a first neural network for training, and obtaining a parameter model.
S221, the coordinates (x, y) of the center point of the target bounding box, the width and height (w, h) of the bounding box, the bed, face and background classes, and the confidence are predicted using the YOLOv3 algorithm.
S222, the bounding box is corrected by back-propagation while the detected objects in the image are predicted: if the IoU (intersection over union) between a predicted bounding box and the manually framed bounding box is less than 0.6, the error of the predicted target bounding box is considered too large, and the network keeps propagating the error backward to the preceding stages until the IoU between the predicted and manually framed bounding boxes is 0.6 or more.
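For reference, the IoU criterion used in this step can be computed as in the following sketch; the corner-coordinate box format is our assumption:

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Per section S222: a predicted box with iou(pred, manual) < 0.6 is treated as
# too erroneous, and the error keeps being back-propagated.
```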
The existing YOLOv3 algorithm can locate and track all faces appearing in the video, but in this application only the face of the monitored person needs to be tracked. One camera covers exactly one bed, and the case of several people sleeping in one bed is not considered; however, owing to factors such as accompanying personnel, several faces may still appear in one image, so the face bounding box of the monitored person must be determined on the image.
S223, the weights of the first neural network are obtained by training with the data set and are used to speed up the network's determination of the face and bed bounding boxes during real-time monitoring.
And S23, inputting the different-angle face data sets in the data set of the step S21 into a second neural network for training, and obtaining a parameter model.
S231, the training process of the second neural network is to input the different-angle face data sets into it. The second neural network is an improved Vgg16 network: the Vgg16 network is sensitive to features of different kinds but not to objects of the same kind, so the improvement is made on the basis of the Vgg16 network.
S232, the second neural network is shown in FIG. 5 and has the following specific structure: a convolutional layer, a pooling layer, a BN layer and a ReLU activation function are stacked in sequence, and the stack is repeated four times. The output of each ReLU activation function is passed both to the next convolutional layer and to a convolutional layer C5; because of the fourfold repetition there are four ReLU activation functions, each connected to its own C5 layer, and the four C5 layers output the first, second, third and fourth features respectively. The output of the last ReLU activation function also passes through 3 fully connected layers, giving the output feature after FC3. The first, second, third and fourth features and the output feature after FC3 are input to the feature fusion layer, the feature fusion layer feeds the processing result to the softmax layer, and the head posture is judged from the probability values, thereby realizing head posture monitoring.
As shown in fig. 6, C denotes a convolutional layer, P a pooling layer, FC a fully connected layer, and S the softmax classifier; features are extracted from the image at each convolution-and-pooling stage. All convolutional layers use 3 × 3 kernels: two 3 × 3 kernels are stacked in C1 and C2, and three 3 × 3 kernels are stacked in C3 and C4. Stacked 3 × 3 kernels can replace a larger-scale kernel, reducing the number of operation parameters and increasing the operation speed. Each convolutional layer contains a BN layer (Batch Normalization) and a ReLU (Rectified Linear Unit) activation function: the BN layer is added after the stacked kernels and the ReLU activation function after the BN layer, which accelerates training and convergence and alleviates the problems of gradient explosion and gradient vanishing. All pooling layers perform max pooling with a 2 × 2 kernel and a stride of 2, and a 1 × 1 convolutional layer is added after each pooling layer, down-sampling first and then up-sampling, so that feature extraction is clearer.
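A minimal PyTorch sketch of our reading of this architecture follows; the channel widths, the 32-channel C5 branches, fusion by concatenation, and the 145-class output are assumptions that the patent does not fix:

```python
import torch
import torch.nn as nn

def conv_stage(in_ch, out_ch, n_convs):
    """One stage: stacked 3x3 convs, each followed by BN and ReLU,
    then 2x2 max pooling with stride 2."""
    layers = []
    for i in range(n_convs):
        layers += [nn.Conv2d(in_ch if i == 0 else out_ch, out_ch, 3, padding=1),
                   nn.BatchNorm2d(out_ch),
                   nn.ReLU(inplace=True)]
    layers.append(nn.MaxPool2d(2, 2))
    return nn.Sequential(*layers)

class MultiFeatureVgg(nn.Module):
    def __init__(self, num_classes=145):  # 145 discrete head poses (assumed output)
        super().__init__()
        widths, depths = [64, 128, 256, 512], [2, 2, 3, 3]  # C1..C4 per the description
        self.stages, self.c5 = nn.ModuleList(), nn.ModuleList()
        in_ch = 3
        for w, d in zip(widths, depths):
            self.stages.append(conv_stage(in_ch, w, d))
            self.c5.append(nn.Conv2d(w, 32, kernel_size=1))  # 1x1 C5 branch, width assumed
            in_ch = w
        self.squash = nn.AdaptiveAvgPool2d(1)  # collapse spatial dims before fusion (assumed)
        self.fc = nn.Sequential(               # three fully connected layers, FC1-FC3
            nn.Flatten(),
            nn.Linear(512 * 14 * 14, 4096), nn.ReLU(inplace=True),
            nn.Linear(4096, 4096), nn.ReLU(inplace=True),
            nn.Linear(4096, 256))
        self.fuse = nn.Linear(4 * 32 + 256, num_classes)  # feature fusion + classifier

    def forward(self, x):  # x: (N, 3, 224, 224) face crops
        branch_feats = []
        for stage, c5 in zip(self.stages, self.c5):
            x = stage(x)
            branch_feats.append(self.squash(c5(x)).flatten(1))  # first..fourth features
        fc_feat = self.fc(x)                                    # output feature after FC3
        fused = torch.cat(branch_feats + [fc_feat], dim=1)
        return self.fuse(fused)  # logits; softmax is applied inside the loss
```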
And S233, training the data sets at different angles to obtain weights of the second neural network, so as to accelerate feature collection at different angles in real-time monitoring.
And S3, acquiring a real-time image needing head posture monitoring, and preprocessing the real-time image to obtain a preprocessed image.
The camera captures the real-time picture, and the acquired real-time image is preprocessed to obtain the preprocessed image; the preprocessing comprises filtering and denoising the real-time image.
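The patent does not name the specific filter; as one plausible realization of this capture-and-preprocess step using OpenCV (the Gaussian blur is our choice):

```python
import cv2

cap = cv2.VideoCapture(0)  # camera mounted above the head of the bed

def preprocess(frame):
    """Filter and denoise the real-time image; a 5x5 Gaussian blur is assumed."""
    return cv2.GaussianBlur(frame, (5, 5), 0)

ok, frame = cap.read()
if ok:
    preprocessed = preprocess(frame)
```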
And S4, the preprocessed image is transmitted into the first neural network, and the face bounding box that needs to be monitored is determined in the preprocessed image. The first neural network is divided into two parts: the first part identifies all face and bed bounding boxes, and the second part screens the face bounding boxes to obtain the face bounding box to be monitored.
S41, the face and bed bounding boxes are rapidly determined using the parameter model of the first neural network trained in step S22.
S42, the face bounding box of the monitored person is determined from the face and bed bounding boxes of S41 and the following limiting conditions;
s421, since the face area of the monitored person can only be in the bed, first set the face frame selection area as A and the bed frame selection area as B, which must satisfy
A ⊆ B
that is, the face frame selection area lies inside the bed frame selection area. If no face area can be detected, it is considered that no one is in the bed. If a face frame selection area is detected but is not inside the bed frame selection area, there is likewise no one in the bed. If no one is in the bed, return to step S3; otherwise execute step S422.
S422, if only one face frame selection area is inside the bed frame selection area, the face image of the monitored person is determined to be in that area and step S5 is executed; if several face bounding boxes are inside the bed bounding box, the interference of unrelated persons must be eliminated, and step S423 is executed.
In step S423, considering that the head posture is monitored during sleep and the head motion amplitude is therefore small, the face bounding boxes obtained in step S41 that satisfy
A ⊆ B
(A being a face frame selection area and B the bed frame selection area) are taken out. Suppose there are k face bounding boxes; for the i-th face bounding box (i ≤ k), take 4 consecutive frames of images f_i1, f_i2, f_i3, f_i4 in the bounding box area and form the differences between adjacent frames, D_i1 = |f_i3 − f_i2| ∩ |f_i2 − f_i1| and D_i2 = |f_i4 − f_i3| ∩ |f_i3 − f_i2|; then take the union of the differences to obtain the difference image D_i = D_i1 ∪ D_i2. Given a threshold T, binarize the image to obtain the binarized image R_i:
R_i(x, y) = 1 if D_i(x, y) > T, and R_i(x, y) = 0 otherwise
where 1 represents a motion point and 0 the background. Calculate the percentage of 1-pixels among the total pixels in each bounding box; the face bounding box whose R_i proportion is smallest is the face bounding box of the monitored person.
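A NumPy sketch of this screening step follows; interpreting the intersection and union of difference images as pixel-wise minimum and maximum, and the threshold value of 25, are our assumptions:

```python
import numpy as np

def motion_ratio(f1, f2, f3, f4, thresh=25):
    """Four consecutive grayscale crops of one face bounding box; returns the
    fraction of pixels marked as motion points (the share of 1s in R_i)."""
    f1, f2, f3, f4 = (f.astype(np.int16) for f in (f1, f2, f3, f4))
    d1 = np.minimum(np.abs(f3 - f2), np.abs(f2 - f1))  # D_i1: intersection of diffs
    d2 = np.minimum(np.abs(f4 - f3), np.abs(f3 - f2))  # D_i2
    d = np.maximum(d1, d2)                             # D_i = D_i1 union D_i2
    r = d > thresh                                     # binarized image R_i
    return r.mean()

# The monitored person's box is the candidate with the smallest motion ratio:
# monitored = min(candidates, key=lambda c: motion_ratio(*c))
```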
And S5, determining the angle of the face image by the second neural network.
And S51, the face bounding box area obtained in step S4 is cut out of the preprocessed image, and the face area is resized (stretched or compressed) so that the face area image input to the multi-feature-fusion convolutional neural network module has a size of 224 × 224; it is then put into the second neural network.
S52, the features of the face area image can be obtained rapidly with the second neural network parameter model trained in step S23. The features are extracted in order: the first feature, the second feature, the third feature, the fourth feature, and the output feature after FC3.
S53, the first, second, third and fourth features and the output feature after FC3 are back-propagated; the total feature loss is
L = Σ_{i=1}^{n} L_i, with L_i = −Σ_k T(k) · log p_i,k
where T is the characteristic (indicator) function of the true class, n is the total number of features, L_i is the loss function of the i-th feature, and p_i,k is the probability that the i-th feature is predicted as the k-th class.
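Read as a sum of per-branch cross-entropies, with each feature branch given its own classifier head producing p_i,k (an arrangement the loss implies but the patent does not spell out), the total feature loss could be sketched as:

```python
import torch.nn.functional as F

def total_feature_loss(branch_logits, target):
    """branch_logits: five (N, num_classes) tensors, one per feature (the
    first..fourth features and the output feature after FC3).
    target: (N,) true pose labels. cross_entropy yields -log p_i,k for the
    true class k, i.e. L_i with T acting as the indicator of the true class."""
    return sum(F.cross_entropy(logits, target) for logits in branch_logits)
```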
From the first feature to the output feature after FC3, the features become increasingly abstract: the receptive field of the convolutional neural network gradually grows, and the learned features become finer and more concrete, focusing more and more on detail. Seen from the feature maps, the neural network learns mainly color and overall shape from the first feature, the contour of the figure from the second feature, and facial details such as hair, nose, eyes and mouth at different angles and in different shapes from the third and fourth features and the output feature after FC3.
And S54, determining the angle of the face image and judging whether the head is in a non-pillow-removed posture.
And S6, returning to step S3 to continue monitoring the head posture. Meanwhile, if the head stays in a non-pillow-removed posture for 250 consecutive frames, or for more than 600 frames within one minute, an alarm signal is sent to indicate that the monitored person's head is not in the pillow-removed supine posture.
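The alarm rule amounts to simple per-frame bookkeeping; the sketch below is our illustration, with the frame rate (and hence the one-minute window size) assumed:

```python
from collections import deque

class AlarmMonitor:
    """Alarm after 250 consecutive non-pillow-removed frames, or more than
    600 such frames within the last minute (25 fps assumed, so 1500 frames)."""
    def __init__(self, fps=25):
        self.consecutive = 0
        self.window = deque(maxlen=60 * fps)  # rolling one-minute window

    def update(self, pillow_removed):
        self.consecutive = 0 if pillow_removed else self.consecutive + 1
        self.window.append(0 if pillow_removed else 1)
        return self.consecutive >= 250 or sum(self.window) > 600
```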
The invention also discloses a head posture monitoring system based on deep learning, which specifically comprises a data acquisition module, an image processing module and an alarm module. The data acquisition module executes step S3 and acquires images of the sickbed at different angles; the image processing module executes steps S4 and S5 and processes the acquired image data; specifically, the parameter models of the two neural networks are obtained in advance, the real-time image is put into the first neural network to frame-select the face image, the detected face image is then put into the second neural network, and the head posture is determined through multi-feature fusion; the alarm module sends out a signal if a non-pillow-removed posture persists for 250 consecutive frames or for more than 600 frames within one minute.
The invention also discloses a head posture monitoring device based on deep learning, which specifically comprises a bed 1, a camera 2, a camera fixing bracket 3, a computer 4 and an alarm 5. The camera is mounted directly above the head of the bed by the camera fixing bracket so that it can capture a picture directly facing the face; the camera is connected to the computer through a network, and the computer executes steps S4, S5 and S6, processes the acquired image data and judges from the pictures obtained from the camera whether the head is turned to one side; the alarm is connected to the computer through a network and is used to give the alarm.
Finally, it should be noted that: the above-mentioned embodiments are only used for illustrating the technical solution of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (6)

1. A head posture monitoring method based on deep learning, characterized in that it comprises the following steps:
s1, generating a data set for training a neural network by using the collected historical image data;
the historical image data mainly comprises a face data set, face data sets with different angles and a bed data set;
s2, training the neural network by using the data set to obtain the weight values of the first neural network and the second neural network;
s21, using 80% of the face data set, the different-angle face data sets and the bed data set for training, called the training set, and using the remaining 20% for verification, called the verification set;
s22, inputting the face data set and the bed data set in the data set of the step S21 into a first neural network for training, and obtaining a parameter model;
s23, inputting the face data sets with different angles in the data set of the step S21 into a second neural network for training, and obtaining a parameter model;
s231, the training process of the second neural network is to input the face data sets of different angles into the second neural network, and the second neural network is an improved Vgg16 network;
s232, the specific structure of the second neural network is as follows: a convolutional layer, a pooling layer, a BN layer and a ReLU activation function are stacked in sequence, and the stack is repeated four times; the convolutional layers are, in order, C1, C2, C3 and C4. The output of each ReLU activation function is passed both to the next convolutional layer and to a convolutional layer C5; because of the fourfold repetition there are four ReLU activation functions, each connected to its own C5 layer, and the four C5 layers output the first, second, third and fourth features respectively. The output of the last ReLU activation function also passes through 3 fully connected layers, giving the output feature after FC3. The first, second, third and fourth features and the output feature after FC3 are input to the feature fusion layer, the feature fusion layer feeds the processing result to the softmax, and the head posture is judged from the probability values, realizing head posture monitoring;
s233, training by using data sets of different angles to obtain weights of a second neural network, and accelerating feature collection of different angles in real-time monitoring;
s3, acquiring a real-time image needing head posture monitoring, and preprocessing the real-time image to obtain a preprocessed image;
a camera captures the real-time picture, and the acquired real-time image is preprocessed to obtain the preprocessed image, the preprocessing comprising filtering and noise reduction of the real-time image;
s4, transmitting the preprocessed image into the first neural network to determine the face bounding box to be monitored in the preprocessed image, wherein the first neural network is divided into two parts: the first part identifies all face and bed bounding boxes, and the second part screens the face bounding boxes to obtain the one to be monitored; the steps specifically comprise:
s41, rapidly determining the face and bed bounding boxes using the parameter model of the first neural network trained in step S22;
s42, determining the face bounding box of the monitored person from the face and bed bounding boxes of S41 and the following limiting conditions;
s421, firstly, setting the face frame selection area as A and the bed frame selection area as B, which must satisfy
A ⊆ B
that is, the face frame selection area lies inside the bed frame selection area; if no face area can be detected, it is considered that no one is in the bed; if a face frame selection area is detected but is not inside the bed frame selection area, there is likewise no one in the bed; if no one is in the bed, return to step S3, otherwise execute step S422;
s422, if only one face frame selection area is inside the bed frame selection area, the face image of the monitored person is determined to be in that area and step S5 is executed; if several face bounding boxes are inside the bed bounding box, the interference of unrelated persons is eliminated and step S423 is executed;
in step S423, considering that the head posture is monitored during sleep and the head motion amplitude is therefore small, the face bounding boxes obtained in step S41 that satisfy A ⊆ B (A being a face frame selection area and B the bed frame selection area) are taken out; suppose there are k face bounding boxes; for the i-th face bounding box (i ≤ k), take 4 consecutive frames of images f_i1, f_i2, f_i3, f_i4 in the bounding box area and form the differences between adjacent frames, D_i1 = |f_i3 − f_i2| ∩ |f_i2 − f_i1| and D_i2 = |f_i4 − f_i3| ∩ |f_i3 − f_i2|; then take the union of the differences to obtain the difference image D_i = D_i1 ∪ D_i2; given a threshold T, binarize the image to obtain the binarized image R_i:
R_i(x, y) = 1 if D_i(x, y) > T, and R_i(x, y) = 0 otherwise
where 1 represents a motion point and 0 the background; calculate the percentage of 1-pixels among the total pixels in each bounding box; the face bounding box whose R_i proportion is smallest is the face bounding box of the monitored person;
s5, determining the angle of the face image by the second neural network;
s6, returning to step S3 to continue monitoring the head posture; meanwhile, if the head stays in a non-pillow-removed posture for 250 consecutive frames, or for more than 600 frames within one minute, an alarm signal is sent to indicate that the monitored person's head is not in the pillow-removed supine posture.
2. A deep learning based head pose monitoring method according to claim 1, wherein:
the different-angle face data sets collect image data of several persons at different head rotation angles; the angle regions are discrete, so the head postures can be represented discretely; facing straight ahead is 0°, the head is labeled every 15° in the left-right (yaw) direction X, the head pitch angle Y is likewise labeled every 15°, and the head rotation is denoted by (X, Y); the maximum angle in each of the four directions (up, down, left, right) is 90°, so there are 13 angles in the up-down direction and 13 in the left-right direction, giving (X, Y) 169 deflection angles in total; owing to the limitation of the body, the head can rotate left or right to −90° or 90° only when the pitch angle is 0°, and at other pitch angles the head cannot reach ±90°, so 145 head postures are determined; each head posture has a corresponding angle label; since the head must lean to one side in the pillow-removed supine posture, the 145 head postures can be divided into two major classes, the pillow-removed supine posture and the non-pillow-removed posture: when 75° ≤ |X| ≤ 90° and 0° ≤ |Y| ≤ 30°, the head is regarded as being in the pillow-removed supine posture, and all other angles are regarded as non-pillow-removed postures.
3. A deep learning based head pose monitoring method according to claim 1, wherein:
in step S22, the face data set and the bed data set in the data set of step S21 are input into the first neural network for training to obtain a parameter model, specifically comprising the following steps:
s221, predicting the coordinates (x, y) of the center point of the target bounding box, the width and height (w, h) of the bounding box, the bed, face and background classes, and the confidence using the YOLOv3 algorithm;
s222, correcting the bounding box by back-propagation while predicting the detected objects in the image: if the IoU between a predicted bounding box and the manually framed bounding box is less than 0.6, the error of the predicted target bounding box is considered too large, and the network keeps propagating the error backward to the preceding stages until the IoU between the predicted and manually framed bounding boxes is 0.6 or more;
and S223, training with the data set to obtain the weights of the first neural network, which are used to speed up the network's determination of the face and bed bounding boxes in real-time monitoring.
4. A deep learning based head pose monitoring method according to claim 1, wherein:
in the step S5, the second neural network determines the angle of the face image, and the specific steps are as follows:
s51, cutting the face bounding box area obtained in step S4 out of the preprocessed image and resizing the face area so that the face area image input to the multi-feature-fusion convolutional neural network module has a size of 224 × 224, and putting it into the second neural network to determine the head posture;
s52, feature fusion: each time the picture passes through a convolutional layer and a pooling layer, a 1 × 1 convolutional layer C5 is added before feature extraction; the features extracted are, in order: the first feature, the second feature, the third feature, the fourth feature, and the output feature after FC3;
s53, back-propagating the first, second, third and fourth features and the output feature after FC3, the total feature loss being
L = Σ_{i=1}^{n} L_i, with L_i = −Σ_k T(k) · log p_i,k
where T is the characteristic (indicator) function of the true class, n is the total number of features, L_i is the feature loss function of the i-th feature, and p_i,k is the probability that the i-th feature is predicted as the k-th class;
and S54, determining the angle of the face image and judging whether the head is in a non-pillow-removed posture.
5. A system for head posture monitoring using the deep learning based head posture monitoring method of claim 1, characterized in that it comprises: a data acquisition module, an image processing module and an alarm module;
the data acquisition module is used for executing step S3 and acquiring images of the patient bed at different angles;
the image processing module is used for executing steps S4 and S5 and processing the acquired image data; specifically, the parameter models of the two neural networks are obtained in advance, the real-time image is put into the first neural network to frame-select the face image, the detected face image is then put into the second neural network, and the head posture is determined through multi-feature fusion;
the alarm module is used for sending out an alarm signal if a non-pillow-removed posture persists for 250 consecutive frames or for more than 600 frames within one minute.
6. An apparatus for head posture monitoring using the deep learning based head posture monitoring method of claim 1, characterized in that it comprises: a bed, a camera, a camera fixing bracket, a computer and an alarm,
the camera is mounted directly above the head of the bed by the camera fixing bracket, so that it can capture a picture directly facing the face;
the camera is connected to the computer through a network, and the computer is used for executing steps S4, S5 and S6, processing the acquired image data and judging from the pictures obtained from the camera whether the head is turned to one side;
the alarm is connected to the computer through a network and is used for giving the alarm.
CN202110090638.9A 2021-01-22 2021-01-22 Head posture monitoring method, device and system based on deep learning Active CN112766185B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110090638.9A CN112766185B (en) 2021-01-22 2021-01-22 Head posture monitoring method, device and system based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110090638.9A CN112766185B (en) 2021-01-22 2021-01-22 Head posture monitoring method, device and system based on deep learning

Publications (2)

Publication Number Publication Date
CN112766185A true CN112766185A (en) 2021-05-07
CN112766185B CN112766185B (en) 2022-06-14

Family

ID=75706836

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110090638.9A Active CN112766185B (en) 2021-01-22 2021-01-22 Head posture monitoring method, device and system based on deep learning

Country Status (1)

Country Link
CN (1) CN112766185B (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101937605A (en) * 2010-09-08 2011-01-05 无锡中星微电子有限公司 Sleep monitoring system based on face detection
US20180137642A1 (en) * 2016-11-15 2018-05-17 Magic Leap, Inc. Deep learning system for cuboid detection
WO2019013105A1 (en) * 2017-07-14 2019-01-17 オムロン株式会社 Monitoring assistance system and control method thereof
CN107451568A (en) * 2017-08-03 2017-12-08 重庆邮电大学 Use the attitude detecting method and equipment of depth convolutional neural networks
CN108764123A (en) * 2018-05-25 2018-11-06 暨南大学 Intelligent recognition human body sleep posture method based on neural network algorithm
CN109145765A (en) * 2018-07-27 2019-01-04 华南理工大学 Method for detecting human face, device, computer equipment and storage medium
CN111046734A (en) * 2019-11-12 2020-04-21 重庆邮电大学 Multi-modal fusion sight line estimation method based on expansion convolution
CN111695522A (en) * 2020-06-15 2020-09-22 重庆邮电大学 In-plane rotation invariant face detection method and device and storage medium
CN112132058A (en) * 2020-09-25 2020-12-25 山东大学 Head posture estimation method based on multi-level image feature refining learning, implementation system and storage medium thereof

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
QIANQIAN BI et al.: "Research on Driver's Gaze Zone Estimation Based on Transfer Learning", 2020 IEEE International Conference on Information Technology, Big Data and Artificial Intelligence (ICIBA) *
TAIKI MATSUBARA et al.: "Proposal of an awaking detection system adopting Neural Network in hospital use", 2008 World Automation Congress *
MA ZHONGYU: "Research on Head Pose Estimation Methods Based on Deep Learning", China Masters' Theses Full-text Database, Information Science and Technology *
GAO XIN: "Target Searching of a Ward Inspection Robot", China Masters' Theses Full-text Database, Information Science and Technology *

Also Published As

Publication number Publication date
CN112766185B (en) 2022-06-14

Similar Documents

Publication Publication Date Title
CN110458101B (en) Criminal personnel sign monitoring method and equipment based on combination of video and equipment
CN109522853B (en) Face datection and searching method towards monitor video
US7912246B1 (en) Method and system for determining the age category of people based on facial images
CN108460356A (en) A kind of facial image automated processing system based on monitoring system
CN110147738B (en) Driver fatigue monitoring and early warning method and system
CN108596087B (en) Driving fatigue degree detection regression model based on double-network result
KR102184109B1 (en) The system and the method for recognizing driver's condition of multimodal learning
CN107766819A (en) A kind of video monitoring system and its real-time gait recognition methods
CN110705500A (en) Attention detection method and system for personnel working image based on deep learning
CN110827432B (en) Class attendance checking method and system based on face recognition
CN112541422A (en) Expression recognition method and device with robust illumination and head posture and storage medium
CN110188715A (en) A kind of video human face biopsy method of multi frame detection ballot
CN106881716A (en) Human body follower method and system based on 3D cameras robot
CN108446690A (en) A kind of human face in-vivo detection method based on various visual angles behavioral characteristics
CN114937242A (en) Sleep detection early warning method and device
CN110309693B (en) Multi-level state detection system and method
CN113392765A (en) Tumble detection method and system based on machine vision
CN113963237B (en) Model training method, mask wearing state detection method, electronic device and storage medium
Walizad et al. Driver drowsiness detection system using convolutional neural network
CN114299606A (en) Sleep detection method and device based on front-end camera
CN113627256B (en) False video inspection method and system based on blink synchronization and binocular movement detection
CN112766185B (en) Head posture monitoring method, device and system based on deep learning
JP2004303150A (en) Apparatus, method and program for face identification
CN113326781B (en) Non-contact anxiety recognition method and device based on face video
Sheikh Robust recognition of facial expressions on noise degraded facial images

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant