CN114758354A - Sitting posture detection method and device, electronic equipment, storage medium and program product

Info

Publication number: CN114758354A
Application number: CN202210303780.1A
Authority: CN (China)
Legal status: Pending
Prior art keywords: image, target object, key, coordinates, sitting posture
Other languages: Chinese (zh)
Inventors: 蔡馥励, 康宏伟, 白钰, 张经纬, 张程
Assignee: Alibaba China Co., Ltd.

Landscapes

  • Image Analysis (AREA)
Abstract

The application provides a sitting posture detection method, a sitting posture detection device, electronic equipment, a storage medium and a program product. The method comprises the following steps: acquiring a first image of a target object acquired by an image acquisition device; determining whether the first image is a key frame; performing key point detection on the first image according to a key point detection mode corresponding to the result of whether the first image is a key frame, to obtain coordinates of key points of the target object in the first image; acquiring a sitting posture detection result of the target object according to the coordinates of the key points of the target object in the first image; and if the sitting posture detection result represents that the target object has a sitting posture problem, outputting alarm information for prompting the target object to adjust the sitting posture. The present application improves the accuracy and the efficiency of sitting posture detection.

Description

Sitting posture detection method and device, electronic equipment, storage medium and program product
Technical Field
The present disclosure relates to image processing technologies, and in particular, to a sitting posture detecting method, a sitting posture detecting device, an electronic apparatus, a storage medium, and a program product.
Background
Compared with the traditional desk lamp, which is used only as a lighting tool, the intelligent desk lamp is increasingly popular because of its functional diversity. Taking an intelligent desk lamp with a sitting posture detection function as an example, a camera can be installed on the lamp to collect images including the user, and from these images it can be determined whether the user's sitting posture is standard.
At present, a common sitting posture detection method mainly uses a deep learning model to detect face key points, frame by frame, in images that include the user, and judges whether the user's sitting posture is standard according to the key point detection results. In addition, in some related art, an infrared detection device may be installed on the desk lamp to detect the sitting posture profile of the user, and the sitting posture of the user is then determined according to that profile.
However, the existing sitting posture detection methods described above suffer from low detection accuracy and low detection efficiency.
Disclosure of Invention
The application provides a sitting posture detection method, a sitting posture detection device, electronic equipment, a storage medium and a program product, so that the accuracy and the efficiency of sitting posture detection are improved.
In a first aspect, the present application provides a sitting posture detection method, the method comprising:
acquiring a first image of a target object acquired by an image acquisition device;
determining whether the first image is a key frame;
performing key point detection on the first image according to a key point detection mode corresponding to the result of whether the first image is a key frame or not to obtain coordinates of key points of the target object in the first image;
obtaining a sitting posture detection result of the target object according to the coordinates of the key points of the target object in the first image;
if the sitting posture detection result represents that the target object has a sitting posture problem, alarm information is output and used for prompting the target object to adjust the sitting posture.
Optionally, the performing, according to a key point detection manner corresponding to a result of whether the first image is a key frame, key point detection on the first image to obtain coordinates of key points of the target object in the first image includes:
if the first image is not a key frame, acquiring an optical flow change vector between the first image and the first key frame, and predicting coordinates of key points of the target object in the first image according to the optical flow change vector and the coordinates of the key points of the target object in the first key frame; the first key frame is a key frame which is acquired by the image acquisition device in a key frame cache pool and is closest to the first image acquisition time, and the optical flow change vector is used for representing the motion direction and the motion speed of the target object;
and if the first image is a key frame, performing key point detection on the first image by using a deep learning algorithm to obtain the coordinates of the key points of the target object in the first image.
Optionally, the obtaining an optical-flow variation vector between the first image and the first key frame includes:
and acquiring the optical flow change vector by adopting a sparse optical flow field algorithm.
Optionally, the performing, by using a deep learning algorithm, key point detection on the first image to obtain coordinates of key points of the target object in the first image includes:
detecting key points of the first image by adopting a deep learning algorithm;
if the coordinates of the key points of the target object are successfully detected, caching the first image and the coordinates of the key points of the target object in the first image into the key frame cache pool;
if the coordinates of the key points of the target object are not successfully detected, predicting the coordinates of the key points of the target object in the first image according to the optical flow change vector between the first image and the first key frame and the coordinates of the key points of the target object in the first key frame.
Optionally, the caching the first image and the coordinates of the key points of the target object in the first image into the key frame cache pool includes:
caching the first image and the coordinates of the key points of the target object in the first image into the key frame cache pool in a key-value pair manner; wherein a key of the key-value pair is an identifier of a key frame, and values of the key-value pair include: the first image, and the coordinates of the key points of the target object in the first image.
Optionally, the determining whether the first image is a key frame includes:
if the key frame cache pool is empty, determining that the first image is a key frame;
and if the key frame is cached in the key frame cache pool, determining whether the first image is the key frame according to the brightness change between the first image and the first key frame.
Optionally, the determining whether the first image is a key frame according to the brightness change between the first image and the first key frame includes:
acquiring an absolute value of a brightness difference between the first image and the first key frame;
if the absolute value of the brightness difference is smaller than or equal to a preset threshold value, determining that the first image is a non-key frame;
or, if the absolute value of the brightness difference is greater than the preset threshold, determining that the first image is a key frame.
Optionally, the obtaining, according to the coordinate of the key point of the target object in the first image, a sitting posture detection result of the target object includes:
acquiring a chest orientation vector of the target object in a three-dimensional coordinate system according to the coordinates of the key points of the target object in the first image; the origin of the three-dimensional coordinate system is the chest center point of the target object;
calibrating the coordinates of the key points of the target object in the first image according to the chest orientation vector of the target object to obtain calibrated coordinates of the key points of the target object; a two-dimensional coordinate system where the calibrated coordinates of the key points of the target object are located is parallel to a two-dimensional coordinate system where the image acquisition device is located;
and acquiring a sitting posture detection result of the target object by using the calibrated coordinates of the key points of the target object.
Optionally, the key points include: a left shoulder, a right shoulder, and a chest center point; and the obtaining a chest orientation vector of the target object in a three-dimensional coordinate system according to the coordinates of the key points of the target object in the first image comprises the following steps:
acquiring a shoulder vector of the target object in the first image and the coordinates of a shoulder center point according to the coordinates of the left shoulder and the coordinates of the right shoulder of the target object in the first image;
acquiring a torso vector of the target object according to the coordinates of the shoulder central point and the coordinates of the chest central point of the target object in the first image;
and obtaining a chest orientation vector of the target object according to the shoulder vector and the trunk vector of the target object in the first image.
Optionally, the calibrating the coordinates of the key points of the target object in the first image according to the chest orientation vector of the target object to obtain the calibrated coordinates of the key points of the target object includes:
acquiring a projection vector of the chest orientation vector on a vertical plane of the three-dimensional coordinate system according to the chest orientation vector of the target object;
acquiring a rotation matrix of the two-dimensional coordinate system according to the projection vector;
and calibrating the coordinates of the key points of the target object in the first image by using the rotation matrix to obtain the calibrated coordinates of the key points of the target object.
Optionally, the obtaining a sitting posture detection result of the target object by using the calibrated coordinates of the key points of the target object includes:
using the calibrated coordinates of the key points of the target object to acquire detection parameters of the target object, wherein the detection parameters include at least one of the following: head left-right inclination, upper-body left-right inclination, shoulder-jaw difference, and head-to-body ratio;
and acquiring a sitting posture detection result of the target object according to the detection parameters of the target object.
Optionally, if the sitting posture detection result indicates that the target object has a sitting posture problem, the sitting posture detection result of the target object includes: the target object has a sitting posture problem and a category to which the sitting posture problem belongs;
the alarm information further includes: the category to which the sitting posture problem belongs.
Optionally, before the acquiring the first image of the target object acquired by the image acquisition device, the method further includes:
receiving a sitting posture detection instruction;
or when the target object exists in the acquisition range according to the image acquired by the image acquisition device, starting a sitting posture detection function.
In a second aspect, the present application provides a seating posture detecting device, the device comprising:
the acquisition module is used for acquiring a first image of a target object acquired by the image acquisition device;
a processing module for determining whether the first image is a key frame; performing key point detection on the first image according to a key point detection mode corresponding to a result of whether the first image is a key frame or not to obtain coordinates of key points of the target object in the first image; obtaining a sitting posture detection result of the target object according to the coordinates of the key points of the target object in the first image;
and the output module is used for outputting alarm information when the sitting posture detection result represents that the target object has a sitting posture problem, wherein the alarm information is used for prompting the target object to adjust the sitting posture.
In a third aspect, the present application provides an electronic device, comprising: at least one processor, a memory;
the memory stores computer execution instructions;
the at least one processor executes the computer-executable instructions stored in the memory, causing the electronic device to perform the method of any one of the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium having stored thereon computer-executable instructions that, when executed by a processor, implement the method of any one of the first aspect.
In a fifth aspect, the present application provides a computer program product comprising a computer program which, when executed by a processor, performs the method of any one of the first aspects.
According to the sitting posture detection method and device, the electronic equipment, the storage medium and the program product provided by the application, the key point detection mode for the first image of the target object is determined by judging whether the first image is a key frame. The sitting posture detection result of the target object can then be obtained from the coordinates of the key points detected in the first image. In this way, the electronic equipment can adopt different key point detection modes for key frames and non-key frames, which guarantees both the accuracy and the efficiency of key point identification and improves the accuracy of judging, based on the coordinates of the key points, whether the sitting posture of the target object has problems. When the sitting posture detection result represents that the target object has a sitting posture problem, the electronic equipment can output alarm information to prompt the target object to adjust the sitting posture.
Drawings
In order to more clearly illustrate the technical solutions in the present application or the prior art, the following briefly introduces the drawings needed to be used in the description of the embodiments or the prior art, and obviously, the drawings in the following description are some embodiments of the present application, and for those skilled in the art, other drawings can be obtained according to the drawings without inventive labor.
FIG. 1 is a schematic diagram of an intelligent desk lamp;
fig. 2 is a schematic flow chart of a sitting posture detecting method provided by the present application;
fig. 3 is a schematic flowchart of a method for obtaining a sitting posture detection result of a target object according to coordinates of key points of the target object according to the present application;
FIG. 4 is a schematic diagram of a three-dimensional coordinate system in which a chest orientation vector is located, provided by the present application;
fig. 5 is a schematic view of a scene where an image capturing device captures a first image according to the present application;
FIG. 6 is a schematic flow chart of another sitting posture detecting method provided by the present application;
fig. 7 is a schematic structural diagram of a sitting posture detecting apparatus provided in the present application;
fig. 8 is a schematic structural diagram of an electronic device provided in the present application.
With the above figures, there are shown specific embodiments of the present application, which will be described in more detail below. These drawings and written description are not intended to limit the scope of the inventive concepts in any manner, but rather to illustrate the inventive concepts to those skilled in the art by reference to specific embodiments.
Detailed Description
To make the purpose, technical solutions and advantages of the present application clearer, the technical solutions in the present application will be clearly and completely described below with reference to the drawings in the present application, and it is obvious that the described embodiments are some, but not all embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The following first explains some of the terms involved in the present application:
Key frame: the frame where a key action is located in the motion change of the target object. Target objects in non-key frames usually have no motion change, or only small motion changes.
Optical flow change vector: a two-dimensional vector that includes the velocity and direction of pattern motion in a time-varying image.
Key-value pair (kv): a data item in which each key corresponds to a value.
Image coordinate system: a rectangular coordinate system established with the upper-left corner of the image as the origin and the pixel as the unit.
RGB image: refers to a three primary color (Red Green Blue, RGB) image.
The RGBD image refers to an RGB image and a Depth (Depth) map corresponding to the RGB image.
Fig. 1 is a schematic structural diagram of an intelligent desk lamp. As shown in Fig. 1, the intelligent desk lamp may comprise components such as an illuminating lamp, a camera, and a display screen. The camera may be used to capture images including the body of the user. In some embodiments, the intelligent desk lamp may perform sitting posture detection on the images using a sitting posture detection algorithm pre-stored in the intelligent desk lamp to determine whether the sitting posture of the user is standard.
It should be understood that fig. 1 is only used for illustrating a part of the structure of the intelligent desk lamp related to the present application, and the present application does not limit the shape, distribution of components, and whether other components are included.
At present, the existing sitting posture detection method mainly detects the user's sitting posture based on a deep learning algorithm. When this method is used, a pre-trained deep learning model identifies face key points, such as the left eye, the right eye, and the nose, in each single frame of image collected by the camera. If all the key points in the frame can be identified, indicating that the user is not lowering the head severely, the user's sitting posture is determined to be standard. If not all the key points can be identified from the frame, indicating that the user is lowering the head severely and key point identification fails, the user's sitting posture is determined to be non-standard.
That is, the above method can only detect whether the user is severely lowering the head. In fact, however, the inventors found through research that deviant sitting postures can also include: tilting the head left or right, tilting the upper body left or right, being too close to the desktop, being too close to the display screen of the intelligent desk lamp, and the like. For example, the above method cannot detect the case where the user's upper body tilts so severely that the head leaves the image capture range while both the shoulders and the chest remain within it. Therefore, the existing sitting posture detection method has poor accuracy. In addition, the method needs to run a deep learning model on every frame of image, and deep learning models generally have complex structures, which may result in low sitting posture detection efficiency.
In addition, some related art proposes mounting an infrared device on the intelligent desk lamp, using it to acquire an infrared image including the user, and extracting the user's sitting posture profile from the infrared image. The extracted sitting posture profiles are then compared to determine whether the user's sitting posture is standard. However, infrared imaging is often inaccurate because of interfering light, which in turn makes the detection of the user's sitting posture inaccurate. Moreover, the sitting posture profile can only be extracted after a series of image processing steps such as image enhancement and binarization, and whether the sitting posture is standard can only be judged from a plurality of sitting posture profiles. Therefore, this method also has low recognition efficiency.
In view of the low accuracy and low efficiency of the existing sitting posture detection methods, the present application provides a sitting posture detection method based on RGB images, which judges whether an image is a key frame and adopts different key point detection modes for key frames and non-key frames, so as to ensure the accuracy of key point detection when the motion change in a key frame is large, and to improve key point detection efficiency for non-key frames (i.e., when the motion change of the target object is small).
It should be understood that the execution subject of the sitting posture detection method provided by the present application may be any electronic device with a processing function. Optionally, the image capturing device for capturing the RGB image and the device for performing the sitting posture detection processing may be integrated in the same electronic device. Illustratively, the electronic device may be, for example, a smart home device such as a smart desk lamp having an image capture device, or an image capture device such as a camera having a processing function. In some embodiments, the image capturing device and the device for performing the sitting posture detection processing may also be disposed in different electronic devices, which is not limited in this application.
The technical solution of the present application will be described in detail with reference to specific examples. These several specific embodiments may be combined with each other below, and details of the same or similar concepts or processes may not be repeated in some embodiments.
Fig. 2 is a schematic flow chart of a sitting posture detecting method provided by the present application. As shown in fig. 2, the method comprises the steps of:
S101, a first image of the target object acquired by the image acquisition device is acquired.
Illustratively, the image capture device may be, for example, a camera. Taking the intelligent desk lamp shown in fig. 1 as an example, the image capturing device may be a camera mounted on the intelligent desk lamp.
The target object may be at least one user that the image capturing device can capture within a capture range. When the first image includes at least one user, the electronic device may use each user in the first image as a target object. Alternatively, the electronic device may identify the target object from the at least one user after acquiring the first image. In this implementation, it should be understood that the present application is not limited to how the electronic device identifies the target object from the at least one user.
S102, determining whether the first image is a key frame.
In some embodiments, the electronic device may, for example, perform key frame identification on each frame of the first image to determine whether the frame of image is a key frame. Optionally, it may be determined whether the first image is a key frame by referring to any one of existing key frame identification methods, which is not described herein again.
In some embodiments, the electronic device can also determine whether the first image is a key frame based on the first image and an adjacent frame image of the frame of the first image after the first image is acquired. Alternatively, the electronic device may acquire a motion variation between the first image and the target frame, and then determine whether the first image is a key frame according to the motion variation. For example, the target frame may be an image of a frame preceding the first image, or a key frame predetermined by the electronic device.
S103, performing key point detection on the first image according to a key point detection mode corresponding to the result of whether the first image is a key frame or not to obtain coordinates of key points of the target object in the first image.
Illustratively, the above-mentioned key points may include, for example, at least one of: left eye, right eye, nose, chin, left shoulder, right shoulder, chest center point, etc. The coordinates of the key points of the target object in the first image can be coordinates in a two-dimensional coordinate system, so that the sitting posture detection efficiency based on the coordinates can be improved.
For example, the electronic device may generate an identifier characterizing "the first image is a key frame" when the first image is determined to be a key frame, and an identifier characterizing "the first image is a non-key frame" when it is determined to be a non-key frame. The electronic device may store in advance a mapping relationship between these identifiers and key point detection manners. Optionally, the electronic device may determine, according to the identifier of the result indicating whether the first image is a key frame and the mapping relationship, the key point detection manner corresponding to that result.
Optionally, the key point detection manner corresponding to "the first image is a key frame" may be one with higher detection accuracy, so as to ensure the accuracy of key point detection on the target object when the motion change of the target object in the key frame is large. The key point detection manner corresponding to "the first image is a non-key frame" may be one with higher detection efficiency: the motion change of the target object in a non-key frame is small and key point detection is easy, so a high-efficiency detection manner reduces computational redundancy and improves the efficiency of key point detection on the target object.
S104, acquiring a sitting posture detection result of the target object according to the coordinates of the key points of the target object in the first image.
Optionally, the sitting posture detection result of the target object may be used to indicate that the target object does not have a sitting posture problem, or that the target object has a sitting posture problem. Further, in some embodiments, the sitting posture detection result can also be used to represent the category to which the sitting posture problem of the target object belongs. Illustratively, the category may include at least one of the following: the head of the target object tilting severely to the left (or right), the upper body of the target object tilting severely to the left (or right), the target object being too close to the desktop, the target object being too close to the display screen, and the like.
As a possible implementation manner, after acquiring coordinates of key points of the target object, the electronic device may determine, based on the coordinates of the key points, at least one parameter for determining whether there is a problem in the sitting posture of the target object, and acquire a sitting posture detection result of the target object according to the at least one parameter.
As another possible implementation manner, the electronic device may further determine, based on the coordinates of the key points of the target object, an orientation of the target object relative to the image acquisition device, and calibrate the coordinates of the key points based on the orientation, so as to improve accuracy of a sitting posture detection result of the target object determined based on the coordinates of the key points.
S105, determining whether the sitting posture detection result is used for representing that the target object has a sitting posture problem.
Optionally, the electronic device may generate an identifier for characterizing the sitting posture detection result after acquiring the sitting posture detection result of the target object. The identifiers corresponding to different sitting posture detection results are different. Then, the electronic device may determine, based on the identifier corresponding to the sitting posture detection result, whether the result represents that the target object has a sitting posture problem.
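For illustration, one possible way to encode the detection result and its identifier is sketched below in Python; the class and enum names are hypothetical and simply mirror the example categories above, which the patent does not enumerate exhaustively.

```python
import dataclasses
import enum

# Hypothetical encoding of the sitting posture detection result; the category
# values mirror the examples in the text and are not a list from the patent.
class PostureProblem(enum.Enum):
    NONE = 0
    HEAD_TILT_LEFT = 1
    HEAD_TILT_RIGHT = 2
    UPPER_BODY_TILT_LEFT = 3
    UPPER_BODY_TILT_RIGHT = 4
    TOO_CLOSE_TO_DESKTOP = 5
    TOO_CLOSE_TO_SCREEN = 6

@dataclasses.dataclass
class PostureResult:
    problem: PostureProblem

    @property
    def has_problem(self) -> bool:
        # Distinct identifiers per result, as described above.
        return self.problem is not PostureProblem.NONE
```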
If the sitting posture detection result is used for representing that the target object has a sitting posture problem, the electronic device may execute step S106 to output alarm information.
If the sitting posture detection result is used for representing that the target object does not have a sitting posture problem, optionally, the electronic device may continue to return to perform steps S101-S105 to continuously detect the sitting posture of the target object based on the next frame of the first image, and perform step S106 to output alarm information when the target object has the sitting posture problem.
S106, outputting alarm information for prompting the target object to adjust the sitting posture.
The electronic device may display the alarm information through a display device, for example, or broadcast it through a voice output device. Illustratively, taking the terminal as the intelligent desk lamp shown in fig. 1 as an example, the intelligent desk lamp may display the alarm information through the display screen, for example, the text "Please mind your sitting posture!". Alternatively, if the intelligent desk lamp further includes a voice output device, it may also broadcast the alarm information by voice, for example, the announcement "Please mind your sitting posture".
In some embodiments, the electronic device may further perform step S106 only when the sitting posture detection result indicates, for a preset number of times, that the target object has a sitting posture problem, so as to improve the accuracy of outputting the alarm information. The preset number of times may be pre-stored in the electronic device by the user.
Alternatively, the electronic device may perform sitting posture detection at a preset frequency. In this implementation, step S106 is performed when every sitting posture detection result determined within a preset duration indicates that the target object has a sitting posture problem, thereby improving the accuracy of outputting the alarm information. The preset frequency and the preset duration may be pre-stored in the electronic device by the user. Illustratively, the electronic device may perform sitting posture detection once per second, and output the alarm information when all the detection results obtained within 10 seconds indicate that the target object has a sitting posture problem.
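A minimal sketch of this sustained-detection logic in Python, assuming the once-per-second rate and 10-second window from the example above; the class name and interface are illustrative, not from the patent.

```python
import collections

# Debounce alarm output: with one detection per second, a window of 10
# results corresponds to the 10-second duration in the example above.
class AlarmDebouncer:
    def __init__(self, window_size: int = 10):
        self.results = collections.deque(maxlen=window_size)

    def update(self, has_problem: bool) -> bool:
        """Record one detection result; return True when the alarm should fire."""
        self.results.append(has_problem)
        # Alarm only when the window is full and every result indicates a problem.
        return len(self.results) == self.results.maxlen and all(self.results)
```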
Optionally, when the sitting posture detection result of the target object includes both that the target object has a sitting posture problem and the category to which the sitting posture problem belongs, the alarm information may further include that category. For example, if the category is that the target object is too close to the desktop, the intelligent desk lamp may output, through the display screen or the voice output device, "You are currently too close to the desktop; please keep a standard sitting posture."
In this embodiment, the key point detection mode for the first image of the target object is determined by judging whether the first image is a key frame. The sitting posture detection result of the target object can then be obtained from the coordinates of the key points detected in the first image. In this way, the electronic equipment can adopt different key point detection modes for key frames and non-key frames, which guarantees both the accuracy and the efficiency of key point identification and improves the accuracy of judging, based on the coordinates of the key points, whether the sitting posture of the target object has problems. When the sitting posture detection result represents that the target object has a sitting posture problem, the electronic equipment can output alarm information to prompt the target object to adjust the sitting posture.
The following exemplifies the timing when the electronic device starts sitting posture detection: as a possible implementation manner, the electronic device may obtain the first image of the target object collected by the image collecting device after receiving the sitting posture detection instruction.
It should be understood that the application is not limited to how the electronic device receives the sitting posture detection instruction. Optionally, taking the electronic device as a terminal including a touch display screen as an example, the terminal may receive a sitting posture detection instruction input by a user through the touch display screen, for example. Or, taking the case that the terminal includes a microphone as an example, the terminal may further receive, as a sitting posture detection instruction, a voice instruction of the user for instructing to start sitting posture detection through the microphone.
As another possible implementation manner, the electronic device may also automatically detect whether to start sitting posture detection, so as to improve user experience. Optionally, when it is determined that the target object exists within the acquisition range according to the image acquired by the image acquisition device, the electronic device starts a sitting posture detection function, and then executes the step S101 to start sitting posture detection on the target object. In this implementation manner, it should be understood that the application is not limited to how the electronic device determines whether the target object exists in the acquisition range according to the image acquired by the image acquisition device. Optionally, it may be detected whether a target object exists by referring to any one of existing target detection methods, which is not described herein again.
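As one concrete instance of such an existing target detection method, the sketch below uses OpenCV's built-in HOG person detector to decide whether a target object is present in the acquisition range; the detector choice and its parameters are assumptions, since the patent leaves the detection method open.

```python
import cv2

# Hypothetical presence check using OpenCV's default HOG people detector.
_hog = cv2.HOGDescriptor()
_hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

def target_object_present(image_bgr) -> bool:
    # detectMultiScale returns bounding boxes of detected persons (may be empty).
    rects, _weights = _hog.detectMultiScale(image_bgr, winStride=(8, 8))
    return len(rects) > 0
```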
As another possible implementation manner, the electronic device may further receive a sitting posture detection time period input by the user, and start sitting posture detection when the time period is reached, and stop sitting posture detection after the time period.
Alternatively, in some embodiments, after starting to perform the sitting posture detection, the electronic device may further receive a sitting posture detection termination instruction to stop performing the sitting posture detection. Or the electronic equipment can also close the sitting posture detection function when determining that the target object does not exist in the acquisition range according to the image acquired by the image acquisition device so as to save computing resources and electricity.
In this embodiment, the electronic device can start sitting posture detection on the target object after receiving the sitting posture detection instruction; through the interaction between the user and the electronic device, sitting posture detection is performed when the user needs it, which improves the sitting posture detection accuracy. Alternatively, the electronic device can automatically determine whether to start sitting posture detection, which improves the automation of sitting posture detection and thus its efficiency.
How the electronic device determines whether the first image is a key frame is described in detail below:
As a possible implementation manner, the electronic device may determine whether the first image is a key frame according to whether the key frame buffer pool is empty.
Optionally, if the key frame buffer pool is empty, it indicates that sitting posture detection has just started and no image has yet been determined to be a key frame, so the electronic device may determine that the first image is a key frame.
If the relevant key frame is cached in the key frame cache pool, the electronic device may determine whether the first image is a key frame according to a luminance change between the first image and the first key frame. The first key frame is the key frame which is acquired by the image acquisition device in the key frame buffer pool and is closest to the first image acquisition time. The key frame closest to the first image acquisition time is used as the first key frame, so that the first image is compared with the latest key frame to determine the brightness change, the error accumulation is avoided, and the accuracy of sitting posture detection is further improved.
Alternatively, the luminance change between the first image and the first key frame may refer to, for example, a luminance difference between the first image and the first key frame, or an absolute value of the luminance difference between the first image and the first key frame.
It should be understood that the present application does not limit how the electronic device acquires the luminance change between the first image and the first key frame. For example, taking the luminance change as the absolute value of the luminance difference between the two frames, the electronic device may obtain the luminance difference by subtracting the first key frame from the first image using an inter-frame difference method, and then take its absolute value to obtain the absolute value of the luminance difference between the first image and the first key frame.
Still taking the case where the luminance change between the first image and the first key frame is the absolute value of their luminance difference as an example, after obtaining this absolute value, the electronic device may determine whether the first image is a key frame in, for example, the following manner:
the electronic device may determine whether the absolute value of the luminance difference is less than or equal to a preset threshold. Optionally, the preset threshold may be pre-stored in the electronic device by the user, for example.
If the absolute value of the luminance difference is smaller than or equal to the preset threshold, the motion change of the target object in the first image relative to the target object in the first key frame is small. That is, the first image is not a frame where a key action is located in the motion change of the target object. Accordingly, the electronic device may determine that the first image is a non-key frame.
If the absolute value of the brightness difference is greater than the preset threshold, it is indicated that the motion change of the target object included in the first image relative to the target object included in the first keyframe is large. That is, the first image is the frame in which the key action is located in the motion change of the target object. Thus, the electronic device can determine that the first image is a key frame.
In this embodiment, when the key frame buffer pool is empty, the first image is directly used as a key frame, which improves the accuracy of the electronic device in determining whether the first image is a key frame. When a key frame is cached in the key frame cache pool, whether the first image is a key frame is determined according to the brightness change between the first image and the first key frame. In this way, it can be judged whether the motion change of the target object between the first image and the first key frame is large, and thus whether the first image is a key frame.
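A minimal sketch of this brightness-based decision, assuming grayscale luminance, a mean absolute inter-frame difference, and an arbitrary example threshold; the patent specifies only the absolute value of the brightness difference compared against a preset threshold.

```python
import cv2

# Key-frame decision by inter-frame difference. The mean reduction and the
# example threshold value are assumptions, not values from the patent.
def is_key_frame(first_image_bgr, first_key_frame_bgr, threshold: float = 12.0) -> bool:
    gray_cur = cv2.cvtColor(first_image_bgr, cv2.COLOR_BGR2GRAY)
    gray_key = cv2.cvtColor(first_key_frame_bgr, cv2.COLOR_BGR2GRAY)
    diff = cv2.absdiff(gray_cur, gray_key)   # per-pixel |luminance difference|
    return float(diff.mean()) > threshold    # large change means key frame
```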
The following describes how the electronic device performs key point detection on the first image when the first image is not a key frame, to obtain coordinates of key points of a target object in the first image, in detail:
as a possible implementation, the electronic device may acquire an optical-flow variation vector between the first image and the first key frame, and predict coordinates of a key point of the target object in the first image according to the optical-flow variation vector and coordinates of the key point of the target object in the first key frame.
As mentioned above, the first key frame is a key frame that is collected by the image collection device in the key frame buffer pool and is closest to the first image collection time. The optical flow variation vector is used for representing the motion direction and the motion speed of the target object.
Optionally, the present application does not limit the number of key frames cached in the key frame cache pool. Taking the case where the key frame buffer pool always buffers only the key frame closest to the first image acquisition time as an example, the electronic device may read that frame from the buffer pool and directly use it as the first key frame.
The electronic device can then obtain an optical flow change vector between the first image and the first key frame. Optionally, the electronic device may obtain the optical flow change vector by using a sparse optical flow field algorithm. With a sparse optical flow field algorithm, the electronic device only needs to compute the optical flow change vector from a small number of pixel points in the first image and the first key frame, which improves the efficiency of acquiring the optical flow change vector and thus the efficiency of sitting posture detection.
In some embodiments, the electronic device may also employ a dense optical-flow field algorithm or the like that may be used to calculate an optical-flow change vector between the two images to obtain an optical-flow change vector between the first image and the first keyframe.
After acquiring the optical-flow variation vector between the first image and the first key frame, the electronic device may acquire the key point coordinates of the target object in the first key frame, and then predict the coordinates of the key point of the target object in the first image according to the optical-flow variation vector and the key point coordinates of the target object in the first key frame. For example, the key point coordinates of the target object in the first key frame may also be stored in the key frame buffer pool. That is, the electronic device may read the key point coordinates of the target object in the first key frame from the key frame buffer pool.
Optionally, for each key point coordinate of the target object in the first key frame, the electronic device may obtain the coordinate of the key point of the target object in the first image according to the sum of the key point coordinate in the first key frame and the optical flow variation vector, for example. For example, taking the nose of the target object in the first key frame as an example, the electronic device may take the sum of the coordinates of the nose of the target object in the first key frame and the optical-flow variation vector as the coordinates of the nose of the target object in the first image.
In the present embodiment, because sitting posture detection is usually performed continuously for a period of time, there is a strong temporal and spatial correlation between adjacent frames collected by the image collecting apparatus. Determining the coordinates of the key points of the target object through the optical flow change vector between the first image and the first key frame makes maximum use of this correlation, improving the accuracy of key point coordinate prediction and, in turn, the accuracy of sitting posture detection.
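A sketch of this non-key-frame branch, using OpenCV's pyramidal Lucas-Kanade tracker as the sparse optical flow field algorithm. Tracking the key-point coordinates themselves is an implementation assumption; it is equivalent to the description above, since the predicted coordinate is the key-frame coordinate plus the per-point optical flow change vector.

```python
import cv2
import numpy as np

# Predict key-point coordinates in a non-key frame from the first key frame.
def predict_keypoints(key_frame_gray, first_image_gray, key_coords):
    # key_coords: (N, 2) array of key-point coordinates in the key frame.
    prev_pts = np.asarray(key_coords, dtype=np.float32).reshape(-1, 1, 2)
    next_pts, status, _err = cv2.calcOpticalFlowPyrLK(
        key_frame_gray, first_image_gray, prev_pts, None,
        winSize=(21, 21), maxLevel=3)
    flow = next_pts - prev_pts    # optical flow change vector per key point
    predicted = prev_pts + flow   # key-frame coordinate + flow vector
    return predicted.reshape(-1, 2), status.reshape(-1)
```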
The following describes in detail how the electronic device performs key point detection on the first image when the first image is a key frame to obtain coordinates of key points of a target object in the first image:
when the first image is a key frame, as a possible implementation manner, the electronic device may perform key point detection on the first image by using a depth learning algorithm, for example, to obtain coordinates of key points of the target object in the first image. Because the key frame is usually a corresponding image when the action change of the target object is large, the accuracy of detecting the key points in the key frame can be improved by detecting the key points of the key frame through a deep learning algorithm with high accuracy, and the accuracy of sitting posture detection based on the coordinates of the key points is further improved.
The deep learning algorithm may be a trained key point detection model pre-stored in the electronic device by the user. It should be understood that the present application does not limit how to train the above-mentioned keypoint detection model, obtain the deep learning algorithm, and an execution subject for training the above-mentioned keypoint detection model. In addition, the present application does not limit the above-described keypoint detection model. Optionally, any existing deep learning model that can be used as a key point detection model may be referred to, and details are not repeated here.
In this implementation manner, considering that key point detection on the first image using the deep learning algorithm may or may not succeed, the electronic device may determine, after performing key point detection on the first image with the deep learning algorithm, whether the coordinates of the key points of the target object were successfully detected.
In some embodiments, the electronic device may determine that the coordinates of the key points of the target object are successfully detected when the coordinates of all the key points of the target object in the first image are acquired by adopting a deep learning algorithm. Or, the electronic device may further determine that the coordinates of the key points of the target object are not successfully detected when the coordinates of all the key points of the target object in the first image are not acquired by using a deep learning algorithm.
If the coordinates of the key points of the target object are not successfully detected, the electronic device may acquire the coordinates of the key points of the target object in the first image in another key point detection manner. Alternatively, the electronic device may predict coordinates of key points of the target object in the first image, for example, based on the optical-flow change vector between the first image and the first key frame, and the coordinates of the key points of the target object in the first key frame. Optionally, the method for acquiring the coordinates of the key points of the target object by the electronic device according to the optical flow variation vector may refer to the method described in the foregoing embodiment, and details are not described here again.
In this implementation, when the key point coordinates are not successfully detected by the deep learning algorithm, the key points in the first image are predicted using the optical flow change vector, which effectively reduces the probability of missed key point detections. In this way, even if some key points of the target object fall outside the acquisition range of the image acquisition device, the coordinates of these key points can be estimated from the optical flow changes of surrounding pixels, improving the robustness of sitting posture detection.
If the coordinates of the key points of the target object are successfully detected, the electronic device may further cache the first image and the coordinates of the key points of the target object in the first image in a key frame cache pool. Because the accuracy of the deep learning algorithm is higher, the coordinates of the key points of the key frames in the key frame cache pool are guaranteed by caching the coordinates of the key points obtained based on the deep learning algorithm in the key frame cache pool, so that the error accumulation is avoided, and the accuracy of sitting posture detection is further improved.
In this implementation manner, optionally, the electronic device may cache the first image and the coordinates of the key points of the target object in the first image in a key frame cache pool in a key value pair manner. The key of the key-value pair may be an identifier of a key frame, and the value of the key-value pair may include: the first image, and coordinates of key points of the target object in the first image. Illustratively, the identification of the key frame may be, for example, a timestamp corresponding to the key frame. By the method, the electronic equipment can quickly acquire the key frame and the coordinates of the key points of the target object in the key frame according to the identification of the key frame, so that the sitting posture detection efficiency of the electronic equipment is improved.
In some embodiments, after the key point coordinates of the first image are successfully obtained through a deep learning algorithm, the electronic device may further take the first image as a newly determined key frame, and replace an original key frame in the key frame cache pool and coordinates of a key point of the original key frame with the first image and coordinates of a key point of the first image, so as to update the key frame cache pool. By the method, the storage resources of the electronic equipment are saved, and the sitting posture detection efficiency of the electronic equipment based on the key frames in the key frame cache pool is improved.
In this embodiment, when the first image is a non-key frame, the coordinates of the key points of the target object in the first image are predicted by using an optical flow field algorithm; and when the first image is a key frame, key point detection is performed on the first image by using a deep learning algorithm to obtain the coordinates of the key points of the target object in the first image. Key point detection by an optical flow field algorithm is generally more efficient than key point detection by a deep learning algorithm, so this method improves the sitting posture detection efficiency of the electronic equipment. Moreover, since the motion change of the target object in a key frame is large, the accuracy of key point detection on key frames is guaranteed by the high-accuracy deep learning algorithm; and since the motion change of the target object in a non-key frame is small, the accuracy of key point detection can be guaranteed by the optical flow field algorithm combined with the temporal correlation described above. Therefore, this method improves the efficiency of key point detection while guaranteeing its accuracy, and further improves the accuracy and efficiency of sitting posture detection.
How the electronic device obtains the sitting posture detection result of the target object according to the coordinates of the key points of the target object in the first image is described in detail below:
Fig. 3 is a schematic flowchart of a method for obtaining a sitting posture detection result of a target object according to coordinates of key points of the target object according to the present application. As shown in fig. 3, as a possible implementation manner, the foregoing step S104 may include the following steps:
s201, according to the coordinates of the key points of the target object in the first image, obtaining the chest orientation vector of the target object in a three-dimensional coordinate system.
The coordinates of the key points of the target object in the first image are coordinates in an image coordinate system. The origin of the three-dimensional coordinate system is the center point of the chest of the target object. In some embodiments, the starting point of the chest orientation vector may coincide with the origin of the three-dimensional coordinate system. For example, fig. 4 is a schematic diagram of the three-dimensional coordinate system in which the chest orientation vector is located. As shown in fig. 4, the vector C is the chest orientation vector. The origin of the three-dimensional coordinate system may be the center point of the chest of the target object.
Alternatively, the electronic device may obtain the chest orientation vector of the target object in the three-dimensional coordinate system according to the coordinates of the key point of the target object in the first image by, for example:
first, the electronic device may obtain a shoulder vector of the target object in the first image and coordinates of a shoulder center point according to the coordinates of the left shoulder and the coordinates of the right shoulder of the target object in the first image. For example, fig. 5 is a schematic view of a scene where an image capturing device provided by the present application captures a first image. As shown in fig. 5, the shoulder vector of the target object may be, for example, as shown by vector S in fig. 5.
The electronic device may, for example, subtract the left shoulder coordinates from the right shoulder coordinates of the target object to obtain the shoulder vector of the target object. The coordinates of the shoulder center point may be obtained by averaging the x-coordinates of the right and left shoulders and averaging their y-coordinates.
Then, the electronic device may obtain a torso vector of the target object based on the coordinates of the shoulder center point and the coordinates of the chest center point of the target object in the first image. Illustratively, the torso vector may be, for example, a vector T as shown in fig. 5. Alternatively, the electronic device may subtract the coordinates of the center point of the chest from the coordinates of the center point of the shoulder, for example, to obtain the torso vector of the target object.
After obtaining the shoulder vector and the torso vector of the target object, the electronic device may obtain a chest orientation vector of the target object according to the shoulder vector and the torso vector of the target object in the first image. Illustratively, the electronic device may obtain the chest orientation vector of the target object, for example, by the following formula (1):
C=T×S (1)
Where T represents the torso vector of the target object, S represents the shoulder vector of the target object, and C represents the chest orientation vector of the target object. Formula (1) thus indicates that the electronic device may take the cross product of the torso vector and the shoulder vector of the target object, and use the resulting vector, which is normal to both, as the chest orientation vector of the target object.
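A minimal NumPy sketch of this computation follows. It assumes the three key points are already available as 3-vectors; how the 2D image coordinates are lifted into the three-dimensional coordinate system is left open, as in the embodiment above.

```python
import numpy as np

def chest_orientation(left_shoulder, right_shoulder, chest_center):
    """Chest orientation vector per formula (1): C = T x S (sketch)."""
    left_shoulder = np.asarray(left_shoulder, float)
    right_shoulder = np.asarray(right_shoulder, float)
    chest_center = np.asarray(chest_center, float)

    s = right_shoulder - left_shoulder            # shoulder vector S: right minus left
    shoulder_mid = (left_shoulder + right_shoulder) / 2  # shoulder center point
    t = shoulder_mid - chest_center               # torso vector T: shoulder center minus chest center
    return np.cross(t, s)                         # C = T x S, normal to both vectors
```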
S202, calibrating the coordinates of the key points of the target object in the first image according to the chest orientation vector of the target object to obtain the coordinates of the key points of the calibrated target object.
The two-dimensional coordinate system in which the calibrated coordinates of the key points of the target object are located is parallel to the two-dimensional coordinate system in which the image acquisition device is located.
Optionally, the electronic device may calibrate the coordinates of the key points, for example, in the following manner, to obtain the calibrated coordinates of the key points of the target object:
as a possible implementation, the electronic device may transform the coordinates of the key points of the target object in the first image into a two-dimensional coordinate system parallel to the two-dimensional coordinate system in which the image acquisition apparatus is located, based on the above-mentioned chest orientation vector.
First, the electronic device may acquire the projection vector of the chest orientation vector on a vertical plane of the three-dimensional coordinate system according to the chest orientation vector of the target object. Illustratively, taking the chest orientation vector as (x_c, y_c, z_c) and the vertical plane as the yoz plane of the three-dimensional coordinate system (the plane spanned by the y-axis and the z-axis), the electronic device may set the x-axis coordinate of the chest orientation vector to zero to obtain the projection vector (0, y_c, z_c) of the chest orientation vector on the vertical plane of the three-dimensional coordinate system.
Then, the electronic device may obtain a rotation matrix of the two-dimensional coordinate system from the projection vector. Optionally, the electronic device may first calculate an included angle between the projection vector and the first axis, and obtain a rotation matrix of the two-dimensional coordinate system according to the included angle. The first axis is an axis of the three-dimensional coordinate system perpendicular to the two-dimensional coordinate system where the image capturing device is located, for example, a y-axis shown in fig. 4.
Optionally, the included angle between the projection vector and the first axis may be an included angle between the projection vector and the first axis in a positive direction, or an included angle between the projection vector and the first axis in a negative direction, which is not limited in the present application.
Taking the included angle between the projection vector and the first axis as the included angle with the positive direction of the first axis as an example, the electronic device may obtain the rotation matrix of the two-dimensional coordinate system through the following formula (2):
$$
R=\begin{bmatrix}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta\end{bmatrix}\tag{2}
$$

Wherein R represents the rotation matrix of the two-dimensional coordinate system, and θ represents the included angle between the projection vector and the positive direction of the first axis. For the projection vector (0, y_c, z_c) obtained above,

$$
\cos\theta=\frac{y_c}{\sqrt{y_c^{2}+z_c^{2}}},\qquad \sin\theta=\frac{z_c}{\sqrt{y_c^{2}+z_c^{2}}}
$$
After obtaining the rotation matrix of the two-dimensional coordinate system, the electronic device may calibrate the coordinates of the key points of the target object in the first image using the rotation matrix, to obtain the calibrated coordinates of the key points of the target object. Optionally, for the coordinates of each key point of the target object in the first image, the electronic device may multiply the coordinates of the key point by the rotation matrix, and the obtained result is used as the coordinates of the key point of the calibrated target object.
For example, assuming that the left-eye key point coordinates of the target object are (x, z), the electronic device may acquire the calibrated left-eye key point coordinates by the following formula (3).
$$
\begin{bmatrix}x'\\ z'\end{bmatrix}=R\begin{bmatrix}x\\ z\end{bmatrix}\tag{3}
$$

Wherein (x', z') are the calibrated coordinates of the left-eye key point.
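Putting formulas (2) and (3) together, a sketch of the calibration step might look as follows. It assumes, as above, that θ is measured against the positive y-axis; vectorizing the multiplication over all key points is an implementation choice, not something the text prescribes.

```python
import numpy as np

def calibrate_keypoints(chest_vec, keypoints_xz):
    """Calibration per formulas (2) and (3) (illustrative sketch).

    chest_vec    -- chest orientation vector (x_c, y_c, z_c)
    keypoints_xz -- array of shape (N, 2) of (x, z) key point coordinates
    """
    _, y_c, z_c = chest_vec
    norm = np.hypot(y_c, z_c)              # length of the projection (0, y_c, z_c)
    cos_t, sin_t = y_c / norm, z_c / norm  # theta: angle to the positive y-axis
    r = np.array([[cos_t, -sin_t],
                  [sin_t,  cos_t]])        # rotation matrix of formula (2)
    # Formula (3): left-multiply each (x, z) coordinate pair by R.
    return np.asarray(keypoints_xz, float) @ r.T
```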
S203, acquiring a sitting posture detection result of the target object by using the calibrated coordinates of the key points of the target object.
Optionally, the electronic device may obtain the sitting posture detection result of the target object by, for example, judging whether the calibrated coordinates of each key point of the target object are within a preset coordinate range. The preset coordinate ranges corresponding to different key points may differ, and the preset coordinate range corresponding to each key point may be pre-stored in the electronic device by the user. For example, when the calibrated coordinates of every key point are within the corresponding preset coordinate range, the electronic device may determine that the sitting posture detection result represents that the target object does not have a sitting posture problem; if the calibrated coordinates of any key point are not within the corresponding preset coordinate range, the electronic device may determine that the sitting posture detection result represents that the target object has a sitting posture problem.
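A sketch of this range check follows; the dictionary interface and the range format are assumptions, since the embodiment only states that per-key-point ranges are pre-stored.

```python
def sitting_posture_ok(calibrated_kp, preset_ranges):
    """Return True if every calibrated key point lies in its preset range.

    calibrated_kp -- dict: key point name -> calibrated (x, z) coordinates
    preset_ranges -- dict: key point name -> ((x_min, x_max), (z_min, z_max))
    """
    for name, (x, z) in calibrated_kp.items():
        (x_min, x_max), (z_min, z_max) = preset_ranges[name]
        if not (x_min <= x <= x_max and z_min <= z <= z_max):
            return False  # any key point outside its range -> posture problem
    return True
```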
Alternatively, the electronic device may use the calibrated coordinates of the key points of the target object to obtain detection parameters of the target object, and then obtain the sitting posture detection result of the target object according to the detection parameters. The detection parameters may include at least one of the following: the head left-right inclination angle, the upper body left-right inclination angle, the shoulder-jaw difference, and the ratio of the head to the body.
Optionally, the electronic device may determine whether the target object has a sitting posture problem based on the detection parameters. Or, if the sitting posture detection result indicates that the target object has a sitting posture problem, the electronic device may further determine the category to which the sitting posture problem of the target object belongs based on the detection parameters.
It should be understood that, the application is not limited to how the electronic device obtains the sitting posture detection result of the target object according to the detection parameters.
Illustratively, suppose the detection parameters include the head left-right inclination angle, the upper body left-right inclination angle, the shoulder-jaw difference, and the ratio of the head to the body. The electronic device may determine that the sitting posture detection result represents that the target object does not have a sitting posture problem when all of the following hold: the head left-right inclination angle of the target object is within a preset head inclination angle range, the upper body left-right inclination angle is within a preset upper body inclination angle range, the shoulder-jaw difference is greater than or equal to a first preset shoulder-jaw difference threshold, and the ratio of the head to the body is less than or equal to a preset ratio threshold.
When the head left-right inclination angle of the target object is outside the preset head inclination angle range, the electronic device may determine that the category of the sitting posture problem is that the head of the target object is inclined left or right. When the upper body left-right inclination angle is outside the preset upper body inclination angle range, the electronic device may determine that the category is that the upper body of the target object is inclined left or right. When the shoulder-jaw difference of the target object is less than or equal to a second preset shoulder-jaw difference threshold, the electronic device may determine that the category is that the target object is too close to the desktop. When the ratio of the head of the target object to the body is greater than the preset ratio threshold, the electronic device may determine that the category is that the target object is too close to the screen.
Optionally, the preset head inclination angle range, the preset upper body inclination angle range, the first preset shoulder-jaw difference threshold, the second preset shoulder-jaw difference threshold, and the preset proportion threshold may be pre-stored in the electronic device by the user. Wherein the first preset shoulder jaw difference threshold is greater than the second preset shoulder jaw difference threshold.
How the electronic device obtains the values of the detection parameters is exemplarily described below:
Taking the example that the detection parameter includes the head left-right inclination angle, the electronic device may determine the center points of the left eye and the right eye, for example, based on the coordinates of the key points of the left eye and the right eye. The head centerline of the target object is then determined from the center points of the left and right eyes, and the coordinates of the nose key points. Then, the electronic apparatus may acquire an angle at which the head center line of the target object deviates from the vertical direction rightward (or leftward) as the head left-right inclination angle.
Taking the example that the detection parameter includes the upper body left-right inclination angle, the electronic device may determine the center point of the left shoulder and the right shoulder, for example, based on the aforementioned key point coordinates of the left shoulder and the right shoulder. And then determining the center line of the upper body of the target object according to the center points of the left shoulder and the right shoulder and the coordinates of the key point of the center point of the chest. Then, the electronic apparatus may acquire an angle at which the center line of the upper body of the target object deviates from the vertical direction rightward (or leftward) as the upper body left-right inclination angle.
Taking the detection parameters including the shoulder-jaw difference as an example, the electronic device may determine the straight line on which the left shoulder and the right shoulder lie according to the coordinates of the left shoulder and the right shoulder, then obtain the distance from the chin to that line according to the coordinates of the chin key point, and take that distance as the shoulder-jaw difference of the target object. Using the shoulder-jaw difference as a detection parameter makes it possible to detect a too-close sitting posture even when the eyes or the desktop cannot be captured, which increases the variety of sitting posture detection and further improves its accuracy.
Taking the detection parameter including the proportion of the head to the body as an example, the electronic device may determine a face coordinate frame corresponding to the target object according to each key point on the face, and obtain the area of the face coordinate frame. The face coordinate frame may refer to coordinates of each vertex of a detection frame determined by face detection. Then, the electronic device may determine the area of the body of the target object according to the coordinates of the key points such as the center points of the left shoulder, the right shoulder, and the chest. Then, the electronic device may obtain a quotient of the area of the face coordinate frame and the area of the body of the target object as a ratio of the head to the body.
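Combining the four examples above, the detection parameters might be computed from the calibrated key points as sketched below. The dictionary interface, the sign convention for the shoulder-jaw difference, and taking the face box and body areas as precomputed inputs are all assumptions not fixed by the text.

```python
import numpy as np

def tilt_from_vertical(top, bottom):
    """Angle, in degrees, by which the line bottom -> top deviates from vertical."""
    dx, dz = top[0] - bottom[0], top[1] - bottom[1]
    return float(np.degrees(np.arctan2(dx, dz)))

def detection_parameters(kp, face_box_area, body_area):
    """kp maps names ('left_eye', 'nose', ...) to calibrated (x, z) coordinates."""
    pts = {k: np.asarray(v, float) for k, v in kp.items()}
    eye_mid = (pts['left_eye'] + pts['right_eye']) / 2
    shoulder_mid = (pts['left_shoulder'] + pts['right_shoulder']) / 2

    l1 = tilt_from_vertical(eye_mid, pts['nose'])        # head left-right inclination
    l2 = tilt_from_vertical(shoulder_mid, pts['chest'])  # upper body left-right inclination

    # Shoulder-jaw difference: signed distance from the chin to the shoulder line
    # (2D cross product divided by the line length; the sign convention is assumed).
    line = pts['right_shoulder'] - pts['left_shoulder']
    rel = pts['chin'] - pts['left_shoulder']
    d = float(line[0] * rel[1] - line[1] * rel[0]) / float(np.linalg.norm(line))

    r = face_box_area / body_area                        # ratio of the head to the body
    return l1, l2, d, r
```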
In some embodiments, the detecting parameters may further include: the ratio of the head to the display screen, etc., and the present application does not limit this.
In this embodiment, the electronic device acquires the chest orientation vector used to calibrate the coordinates of the key points from the coordinates of the key points of the target object themselves, so the electronic device does not need to acquire any additional parameters for the calibration. For example, the first image is not required to be an RGBD image, and no human body reconstruction such as a three-dimensional mesh is needed; instead, the coordinates of the key points are directly mapped and transformed. This reduces the amount of calculation the electronic device performs for coordinate calibration, and further improves the efficiency of sitting posture detection.
Moreover, calibrating the coordinates of the key points makes the two-dimensional coordinate system of the calibrated coordinates parallel to the two-dimensional coordinate system of the image acquisition device; that is, the human body is aligned as if it directly faced the image acquisition device. This reduces the possibility that the image acquisition angle lowers the accuracy of the key point coordinates, and further improves the accuracy of sitting posture detection. In addition, because the image acquisition device is no longer required to directly face the target object, the flexibility of using the image acquisition device, or a terminal device equipped with it, is improved.
Taking the electronic device as an intelligent desk lamp and the image acquisition device as a camera installed on the intelligent desk lamp as an example, fig. 6 is a schematic flow diagram of another sitting posture detection method provided by the application. As shown in fig. 6, the method may include the steps of:
Step 1, the intelligent desk lamp receives, through a display screen, an instruction input by the target object for starting a writing operation.
Step 2, the intelligent desk lamp initializes a key frame cache pool and acquires an RGB first image of the target object from the camera.
Step 3, judge whether the first image is a key frame.
Step 3.1, when the key frame cache pool is empty, determine that the first image is a key frame.
Step 3.2, when the key frame cache pool is not empty, difference the first image against the first key frame using the inter-frame difference method to obtain the absolute value of the luminance difference between the two frames. When the absolute value of the luminance difference is greater than a preset threshold, determine that the first image is a key frame; otherwise, determine that the first image is a non-key frame.
If the first image is a key frame, step 4.1 is performed. If the first image is not a key frame, step 4.2 is performed.
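Steps 3.1 and 3.2 can be sketched with OpenCV as follows. How the per-pixel differences are aggregated into a single absolute luminance difference (a mean here) and the threshold value itself are assumptions.

```python
import cv2
import numpy as np

LUMA_DIFF_THRESHOLD = 30.0  # preset threshold; the value here is hypothetical

def is_key_frame(frame_rgb, cache_pool):
    """Key frame decision by inter-frame difference (steps 3.1-3.2, sketch)."""
    if not cache_pool:                    # step 3.1: empty pool -> key frame
        return True
    key_frame_rgb, _ = cache_pool[-1]     # most recently cached key frame
    a = cv2.cvtColor(frame_rgb, cv2.COLOR_RGB2GRAY)
    b = cv2.cvtColor(key_frame_rgb, cv2.COLOR_RGB2GRAY)
    diff = cv2.absdiff(a, b)              # per-pixel absolute luminance difference
    return float(np.mean(diff)) > LUMA_DIFF_THRESHOLD  # step 3.2
```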
Step 4.1, if the first image is a key frame, perform key point detection using a deep-learning-based skeletal key point detection algorithm. The key points include 7 points: the left eye, the right eye, the nose, the chin, the left shoulder, the right shoulder, and the chest center point.
Step 4.1.1, if the skeletal key points are detected successfully, store the first image and the corresponding coordinates of each skeletal key point into the cache pool.
Step 4.1.2, if the detection of the skeletal key points fails, acquire the optical flow change vectors between the first image and the first key frame through a sparse optical flow field algorithm, and then obtain the skeletal key point coordinates predicted by the optical flow method based on the optical flow change vectors.
Step 4.2, if the first image is not a key frame, acquire the optical flow change vectors between the first image and the first key frame using a sparse optical flow field algorithm, and then obtain the skeletal key point coordinates predicted by the optical flow method based on the optical flow change vectors.
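A sketch of the optical flow prediction used in steps 4.1.2 and 4.2 follows, using pyramidal Lucas-Kanade as one concrete sparse optical flow field algorithm; the embodiment does not name a specific one.

```python
import cv2
import numpy as np

def predict_keypoints_by_flow(key_frame_rgb, key_coords, frame_rgb):
    """Propagate key points from the first key frame by sparse optical flow."""
    prev_gray = cv2.cvtColor(key_frame_rgb, cv2.COLOR_RGB2GRAY)
    next_gray = cv2.cvtColor(frame_rgb, cv2.COLOR_RGB2GRAY)
    prev_pts = np.asarray(key_coords, np.float32).reshape(-1, 1, 2)
    next_pts, status, _err = cv2.calcOpticalFlowPyrLK(prev_gray, next_gray, prev_pts, None)
    # next_pts - prev_pts is the optical flow change vector of each key point.
    return next_pts.reshape(-1, 2), status.ravel().astype(bool)
```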
Step 5, acquire the chest orientation vector of the target object according to the coordinates of the left shoulder key point, the right shoulder key point, and the chest center point of the target object, and calibrate the coordinates of the key points and of the face detection coordinate frame based on the chest orientation vector to obtain the calibrated coordinates of the key points.
The specific implementation manner of step 5 may refer to the method described in the foregoing embodiment, and is not described herein again.
Step 6, acquire the detection parameters of the target object using the calibrated coordinates of the key points of the target object.
Step 7, acquire the sitting posture detection result of the target object based on the detection parameters of the target object.
For example, the mapping relationship between the detection parameter of the target object and the sitting posture detection result of the target object may be as shown in table 1 below:
TABLE 1

| Sitting posture detection result | Judgment criterion on the detection parameters |
| --- | --- |
| No sitting posture problem | L1 ∈ [-15°, 15°] and L2 ∈ [-5°, 5°] and d ≥ 5 and r ≤ 0.2 |
| Severe left-right inclination of the head | L1 ≥ 15° or L1 ≤ -15° |
| Severe left-right inclination of the upper body | L2 ≥ 5° or L2 ≤ -5° |
| Too close to the desktop | d ≤ -8 |
| Too close to the screen | r ≥ 0.2 |
Where L1 denotes the left-right head inclination angle of the target object, L2 denotes the left-right upper body inclination angle of the target object, d denotes the shoulder-jaw difference of the target object, and r denotes the head-body ratio of the target object.
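A direct transcription of Table 1 into code might look as follows. Note that the table leaves the band between d = -8 and d = 5 unclassified; the sketch flags only the explicit problem rows.

```python
def classify_posture(l1, l2, d, r):
    """Map the detection parameters to the Table 1 sitting posture results."""
    problems = []
    if not -15 <= l1 <= 15:
        problems.append('severe left-right inclination of the head')
    if not -5 <= l2 <= 5:
        problems.append('severe left-right inclination of the upper body')
    if d <= -8:
        problems.append('too close to the desktop')
    if r >= 0.2:
        problems.append('too close to the screen')
    return problems or ['no sitting posture problem']
```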
Step 8, judge whether the sitting posture detection result represents that the target object has a sitting posture problem.
If yes, execute step 9 to output alarm information.
If not, continue real-time monitoring and cycle through steps 1-9 to perform sitting posture detection on the target object.
Step 9, the intelligent desk lamp outputs alarm information to prompt the target object to adjust the sitting posture.
In this embodiment, whether the current frame is a key frame is judged by the inter-frame difference method; key points of key frames are detected by the deep learning algorithm, while for non-key frames the optical flow field is calculated to estimate the key point positions, and the sitting posture category is subsequently judged with the corresponding judgment rules. In this way, the deep learning algorithm does not need to be executed on every frame of image, and missed detections on non-key frames are effectively reduced, which improves the stability of continuous-frame sitting posture detection. With this method, poor sitting postures in reading, writing, and other states can be fed back to the target object, with supervision and prompts to restore a normal sitting posture, protecting the physical and psychological health of the target object.
Fig. 7 is a schematic structural diagram of a sitting posture detecting device provided by the present application. As shown in fig. 7, the apparatus includes: an acquisition module 31, a processing module 32, and an output module 33.
The acquisition module 31 is configured to acquire a first image of the target object acquired by the image acquisition device.
A processing module 32 for determining whether the first image is a key frame; performing key point detection on the first image according to a key point detection mode corresponding to the result of whether the first image is a key frame or not to obtain coordinates of key points of the target object in the first image; and acquiring a sitting posture detection result of the target object according to the coordinates of the key points of the target object in the first image.
And the output module 33 is configured to output alarm information when the sitting posture detection result indicates that the target object has a sitting posture problem. Wherein, alarm information is used for prompting the target object to adjust the sitting posture.
Optionally, the processing module 32 is specifically configured to, when the first image is not a key frame, obtain an optical flow change vector between the first image and the first key frame, and predict coordinates of a key point of the target object in the first image according to the optical flow change vector and coordinates of the key point of the target object in the first key frame. The first key frame is a key frame which is collected by the image collecting device in the key frame buffer pool and is closest to the first image collecting time, and the optical flow change vector is used for representing the motion direction and the motion speed of the target object.
Or, the processing module 32 is specifically configured to, when the first image is a key frame, perform key point detection on the first image by using a deep learning algorithm, and acquire coordinates of key points of the target object in the first image.
Optionally, the processing module 32 is specifically configured to acquire the optical flow change vector by using a sparse optical flow field algorithm.
Optionally, the processing module 32 is specifically configured to perform keypoint detection on the first image by using a deep learning algorithm; when the coordinates of the key points of the target object are successfully detected, caching the first image and the coordinates of the key points of the target object in the first image into the key frame cache pool; predicting coordinates of key points of the target object in the first image according to an optical flow variation vector between the first image and a first key frame and the coordinates of the key points of the target object in the first key frame when the coordinates of the key points of the target object are not successfully detected.
Optionally, the processing module 32 is specifically configured to cache the first image and the coordinates of the key points of the target object in the first image in the key frame cache pool in a key value pair manner. Wherein a key of the key-value pair is an identifier of a key frame, and values of the key-value pair include: the first image, and coordinates of key points of the target object in the first image.
Optionally, the processing module 32 is specifically configured to determine that the first image is a key frame when the key frame buffer pool is empty; and when a key frame is cached in the key frame cache pool, determining whether the first image is the key frame according to the brightness change between the first image and the first key frame.
Optionally, the processing module 32 is specifically configured to obtain an absolute value of a luminance difference between the first image and the first key frame; when the absolute value of the luminance difference is less than or equal to a preset threshold, determine that the first image is a non-key frame; or, when the absolute value of the luminance difference is greater than the preset threshold, determine that the first image is a key frame.
Optionally, the coordinates of the key points are coordinates in an image coordinate system. Optionally, the processing module 32 is specifically configured to obtain a chest orientation vector of the target object in a three-dimensional coordinate system according to the coordinates of the key points of the target object in the first image; calibrating the coordinates of the key points of the target object in the first image according to the chest orientation vector of the target object to obtain the calibrated coordinates of the key points of the target object; and acquiring a sitting posture detection result of the target object by using the calibrated coordinates of the key points of the target object. Wherein the origin of the three-dimensional coordinate system is the chest center point of the target object; and the two-dimensional coordinate system where the coordinates of the key points of the calibrated target object are located is parallel to the two-dimensional coordinate system where the image acquisition device is located.
Optionally, the key points include: the center points of the left shoulder, the right shoulder and the chest. Optionally, the processing module 32 is specifically configured to obtain, according to the coordinate of the left shoulder and the coordinate of the right shoulder of the target object in the first image, a shoulder vector of the target object in the first image and a coordinate of a shoulder center point; acquiring a torso vector of the target object according to the coordinates of the shoulder central point and the coordinates of the chest central point of the target object in the first image; and obtaining a chest orientation vector of the target object according to the shoulder vector and the trunk vector of the target object in the first image.
Optionally, the processing module 32 is specifically configured to obtain, according to the chest orientation vector of the target object, a projection vector of the chest orientation vector on a vertical plane of the three-dimensional coordinate system; acquiring a rotation matrix of the two-dimensional coordinate system according to the projection vector; and calibrating the coordinates of the key points of the target object in the first image by using the rotation matrix to obtain the calibrated coordinates of the key points of the target object.
Optionally, the processing module 32 is specifically configured to obtain a detection parameter of the target object by using the calibrated coordinates of the key point of the target object; and acquiring a sitting posture detection result of the target object according to the detection parameters of the target object. Wherein the detection parameters include at least one of: the head left and right inclination, the upper body left and right inclination, the shoulder jaw difference, and the head to body ratio.
Optionally, when the sitting posture detection result represents that the target object has a sitting posture problem, the sitting posture detection result of the target object includes: the target object has a sitting posture problem, and a category to which the sitting posture problem belongs. In this implementation, the alarm information further includes: the category to which the sitting posture problem belongs.
Optionally, the apparatus may further include a receiving module 34, configured to receive a sitting posture detection instruction before the first image of the target object is acquired by the image acquiring apparatus.
Or, the processing module 32 is further configured to, before the first image of the target object acquired by the image acquisition device is acquired, start a sitting posture detection function when it is determined that the target object exists within an acquisition range according to the image acquired by the image acquisition device.
The present application provides the above sitting posture detection apparatus for executing the foregoing sitting posture detection method embodiments; its implementation principle and technical effect are similar to those of the method and are not repeated here.
Fig. 8 is a schematic structural diagram of an electronic device provided in the present application. Illustratively, the electronic device may be, for example, an intelligent desk lamp, an intelligent camera, or the like. As shown in fig. 8, the electronic device 400 may include: at least one processor 401 and a memory 402.
The memory 402 stores programs. In particular, the program may include program code comprising computer operating instructions.
Memory 402 may comprise high-speed RAM memory and may also include non-volatile memory (non-volatile memory), such as at least one disk memory.
The processor 401 is configured to execute computer-executable instructions stored in the memory 402 to implement the sitting posture detection method described in the foregoing method embodiments. The processor 401 may be a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more Integrated circuits configured to implement the embodiments of the present Application.
Taking the above electronic device as an intelligent desk lamp as an example, the intelligent desk lamp may further include an image acquisition device, which may be configured to acquire the first image of the target object. Optionally, the intelligent desk lamp may further include a display screen, a voice broadcast device, and the like.
Optionally, the electronic device 400 may further include a communication interface 403. In a specific implementation, if the communication interface 403, the memory 402, and the processor 401 are implemented independently, they may be connected to each other through a bus and communicate with each other. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended ISA (EISA) bus, or the like. The bus may be classified as an address bus, a data bus, a control bus, and so on; this does not mean there is only one bus or one type of bus.
Alternatively, in a specific implementation, if the communication interface 403, the memory 402, and the processor 401 are integrated into a chip, the communication interface 403, the memory 402, and the processor 401 may complete communication through an internal interface.
The present application also provides a computer-readable storage medium, which may include: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and in particular, the computer-readable storage medium stores program instructions, and the program instructions are used in the method in the foregoing embodiments.
The present application further provides a program product comprising execution instructions stored in a readable storage medium. The at least one processor of the electronic device may read the execution instructions from the readable storage medium, and the execution of the execution instructions by the at least one processor causes the electronic device to implement the sitting posture detection method provided by the various embodiments described above.
Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present application.

Claims (17)

1. A sitting posture detecting method, comprising:
acquiring a first image of a target object acquired by an image acquisition device;
determining whether the first image is a key frame;
performing key point detection on the first image according to a key point detection mode corresponding to the result of whether the first image is a key frame or not to obtain coordinates of key points of the target object in the first image;
obtaining a sitting posture detection result of the target object according to the coordinates of the key points of the target object in the first image;
and if the sitting posture detection result represents that the target object has a sitting posture problem, outputting alarm information, wherein the alarm information is used for prompting the target object to adjust the sitting posture.
2. The method according to claim 1, wherein the performing the keypoint detection on the first image according to the keypoint detection mode corresponding to the result of whether the first image is a keyframe to obtain the coordinates of the keypoints of the target object in the first image comprises:
if the first image is not a key frame, acquiring an optical flow change vector between the first image and the first key frame, and predicting the coordinates of the key point of the target object in the first image according to the optical flow change vector and the coordinates of the key point of the target object in the first key frame; the first key frame is a key frame which is acquired by the image acquisition device in a key frame cache pool and is closest to the first image acquisition time, and the optical flow change vector is used for representing the motion direction and the motion speed of the target object;
And if the first image is a key frame, performing key point detection on the first image by adopting a deep learning algorithm to obtain the coordinates of the key points of the target object in the first image.
3. The method of claim 2, wherein said obtaining an optical flow variation vector between said first image and a first keyframe comprises:
and acquiring the optical flow change vector by adopting a sparse optical flow field algorithm.
4. The method according to claim 2, wherein the performing the keypoint detection on the first image by using the deep learning algorithm to obtain the coordinates of the keypoints of the target object in the first image comprises:
performing key point detection on the first image by adopting a deep learning algorithm;
if the coordinates of the key points of the target object are successfully detected, caching the first image and the coordinates of the key points of the target object in the first image into the key frame cache pool;
if the coordinates of the key points of the target object are not successfully detected, predicting the coordinates of the key points of the target object in the first image according to the optical flow change vector between the first image and the first key frame and the coordinates of the key points of the target object in the first key frame.
5. The method of claim 4, wherein caching the first image and coordinates of the keypoints of the target object in the first image into the key frame cache pool comprises:
caching the first image and the coordinates of the key points of the target object in the first image into the key frame cache pool in a key value pair mode; wherein the key of the key-value pair is an identifier of a key frame, and the value of the key-value pair includes: the first image, and coordinates of keypoints of the target object in the first image.
6. The method of claim 2, wherein the determining whether the first image is a key frame comprises:
if the key frame cache pool is empty, determining that the first image is a key frame;
and if the key frame is cached in the key frame cache pool, determining whether the first image is the key frame according to the brightness change between the first image and the first key frame.
7. The method of claim 6, wherein determining whether the first image is a key frame based on a luminance change between the first image and the first key frame comprises:
Acquiring an absolute value of a luminance difference between the first image and the first key frame;
if the absolute value of the brightness difference is smaller than or equal to a preset threshold value, determining that the first image is a non-key frame;
or, if the absolute value of the brightness difference is greater than the preset threshold, determining that the first image is a key frame.
8. The method according to any one of claims 1 to 7, wherein the coordinates of the key points are coordinates in an image coordinate system, and the obtaining of the sitting posture detection result of the target object according to the coordinates of the key points of the target object in the first image comprises:
acquiring a chest orientation vector of the target object in a three-dimensional coordinate system according to the coordinates of the key points of the target object in the first image; the origin of the three-dimensional coordinate system is the chest center point of the target object;
calibrating the coordinates of the key points of the target object in the first image according to the chest orientation vector of the target object to obtain the calibrated coordinates of the key points of the target object; the two-dimensional coordinate system where the coordinates of the key points of the calibrated target object are located is parallel to the two-dimensional coordinate system where the image acquisition device is located;
And acquiring a sitting posture detection result of the target object by using the calibrated coordinates of the key points of the target object.
9. The method of claim 8, wherein the keypoints comprise: the chest orientation vector of the target object in a three-dimensional coordinate system is obtained according to the coordinates of the key points of the target object in the first image, and the chest orientation vector comprises:
acquiring a shoulder vector of the target object in the first image and a coordinate of a shoulder center point according to the coordinate of the left shoulder and the coordinate of the right shoulder of the target object in the first image;
acquiring a trunk vector of the target object according to the coordinates of the shoulder central point and the coordinates of the chest central point of the target object in the first image;
and obtaining a chest orientation vector of the target object according to the shoulder vector and the trunk vector of the target object in the first image.
10. The method of claim 9, wherein the calibrating the coordinates of the key points of the target object in the first image according to the chest orientation vector of the target object to obtain calibrated coordinates of the key points of the target object comprises:
Acquiring a projection vector of the chest orientation vector on a vertical plane of the three-dimensional coordinate system according to the chest orientation vector of the target object;
acquiring a rotation matrix of the two-dimensional coordinate system according to the projection vector;
and calibrating the coordinates of the key points of the target object in the first image by using the rotation matrix to obtain the calibrated coordinates of the key points of the target object.
11. The method of claim 8, wherein the obtaining sitting posture detection results of the target object by using the calibrated coordinates of the key points of the target object comprises:
acquiring detection parameters of the target object by using the calibrated coordinates of the key points of the target object, wherein the detection parameters comprise at least one of the following items: the head left and right inclination, the upper body left and right inclination, the shoulder jaw difference, and the head to body ratio;
and acquiring a sitting posture detection result of the target object according to the detection parameters of the target object.
12. The method of claim 11, wherein if the sitting posture detection result indicates that the target subject has a sitting posture problem, the sitting posture detection result of the target subject comprises: the target object has a sitting posture problem and a category to which the sitting posture problem belongs;
The alarm information further includes: the category to which the sitting posture problem belongs.
13. The method of any one of claims 1-7, wherein prior to acquiring the first image of the target object acquired by the image acquisition device, the method further comprises:
receiving a sitting posture detection instruction;
or when the target object exists in the acquisition range according to the image acquired by the image acquisition device, starting a sitting posture detection function.
14. A sitting posture detecting apparatus, comprising:
the acquisition module is used for acquiring a first image of a target object acquired by the image acquisition device;
a processing module for determining whether the first image is a key frame; performing key point detection on the first image according to a key point detection mode corresponding to the result of whether the first image is a key frame or not to obtain coordinates of key points of the target object in the first image; acquiring a sitting posture detection result of the target object according to the coordinates of the key points of the target object in the first image;
and the output module is used for outputting alarm information when the sitting posture detection result represents that the target object has a sitting posture problem, wherein the alarm information is used for prompting the target object to adjust the sitting posture.
15. An electronic device, comprising: at least one processor, a memory;
the memory stores computer execution instructions;
the at least one processor executing the computer-executable instructions stored by the memory causes the electronic device to perform the method of any of claims 1-13.
16. A computer-readable storage medium having computer-executable instructions stored thereon which, when executed by a processor, implement the method of any one of claims 1-13.
17. A computer program product comprising a computer program, characterized in that the computer program realizes the method of any of claims 1-13 when executed by a processor.