CN110705390A - Body posture recognition method and device based on LSTM and storage medium - Google Patents

Body posture recognition method and device based on LSTM and storage medium

Info

Publication number
CN110705390A
CN110705390A (application CN201910875154.8A)
Authority
CN
China
Prior art keywords
lstm
motion
characteristic information
body posture
action
Prior art date
Legal status
Pending
Application number
CN201910875154.8A
Other languages
Chinese (zh)
Inventor
董洪涛 (Dong Hongtao)
Current Assignee
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN201910875154.8A priority Critical patent/CN110705390A/en
Priority to PCT/CN2019/117890 priority patent/WO2021051579A1/en
Publication of CN110705390A publication Critical patent/CN110705390A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/23 Recognition of whole body movements, e.g. for sport training
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods

Abstract

The invention relates to the technical field of biometric recognition and provides an LSTM-based body posture recognition method comprising the following steps: acquiring a motion video of a subject to be recognized; extracting motion feature information, including at least skeleton key point information, from the acquired motion video through OpenPose; and recognizing the motion standardization degree corresponding to the motion feature information according to the motion feature information and a pre-trained body posture recognition model. The body posture recognition model is a target neural network model generated from preset standard motions and trained from standard motion feature information arranged in time order. The method does not require cutting the video motion into isolated features for recognition; instead, recognition is learned with a neural network model, making body posture recognition fast and accurate and improving the user experience.

Description

Body posture recognition method and device based on LSTM and storage medium
Technical Field
The invention relates to the technical field of biometric recognition, and in particular to an LSTM-based body posture recognition method and device and a computer-readable storage medium.
Background
Many activities, particularly competitive sports such as swimming, table tennis, diving, and gymnastics, impose specific requirements on every small movement during training and competition in order to achieve better results. In physical training, most adjustment and correction of movements is done by a dedicated coach during professional instruction, and athletes can hardly spot their own movement errors while exercising. In existing fitness competitions, such as aerobics, broadcast gymnastics, and dance competitions, referees score individual performances; although the referees are professionals, scoring objectivity is affected by misjudgment, missed judgment, subjective bias, and the like.
Therefore, it is desirable for a computer to be able, to some extent, to analyze and judge an exerciser's movements and body posture in place of coaches and referees.
Most existing posture scoring systems use a sensor plus an app: a sensor captures the motion pattern of the human body, and the app presents the related data. Such systems cannot be used for precise movement assessment.
Motion recognition technology involves computer vision, pattern recognition, and related fields. The purpose of human motion recognition is to accurately recognize collected human motion features, classify them in time against a complete library of human motion postures, and match and output the posture with the highest similarity.
Current motion recognition technology falls mainly into two types: one based on RGB images and one based on depth images. Both have drawbacks. RGB images contain too much extraneous information, which hinders extraction of motion posture features; in depth images, limbs easily occlude one another, which hurts recognition accuracy.
Therefore, there is a need for a motion posture recognition method that increases detection speed without losing detection accuracy.
Disclosure of Invention
The invention provides an LSTM-based body posture recognition method, an electronic device, and a computer-readable storage medium. The method obtains a trained body posture recognition model by taking a video set of standard motions arranged in time order as the training dictionary, a neural network comprising a Masking layer, LSTM layers, and a Softmax layer as the model, the human skeleton feature points in the standard-motion videos as the targets, and Connectionist Temporal Classification (CTC) as the training criterion.
In order to achieve the above object, the present invention provides a body posture recognition method based on LSTM, which comprises:
acquiring a motion video of a subject to be recognized;
extracting motion feature information, including at least skeleton key point information, from the acquired motion video of the subject to be recognized through OpenPose;
recognizing the motion standardization degree corresponding to the motion feature information according to the motion feature information and a pre-trained body posture recognition model, where the body posture recognition model is a target neural network model generated from preset standard motions and trained from standard motion feature information arranged in time order.
In one embodiment, the neural network of the body posture recognition model comprises a Masking layer, a Softmax layer, and an LSTM layer between the Masking layer and the Softmax layer.
In one embodiment, the objective function of the body posture recognition model is CTC. The body posture recognition model outputs a Loss value through CTC, and the smaller the output Loss value, the more standard the movement of the subject to be recognized.
In one embodiment, the step of extracting, through OpenPose, the motion feature information in the acquired motion video of the subject to be recognized includes:
acquiring the motion video of the subject to be recognized in units of a preset time period or a beat, and extracting image frames from the motion video using OpenPose;
determining a plurality of skeleton key feature points of the subject to be recognized from the extracted image frames;
and integrating the plurality of skeleton key feature points into the skeleton key point information of the subject to be recognized.
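As a rough illustration of the sampling step above, the sketch below only computes which frame indices belong to each beat, given an assumed frame rate and beat length. `beat_frame_indices` is a hypothetical helper, not part of OpenPose or the patent; the actual frames would be decoded from the video before OpenPose is run on each one.

```python
def beat_frame_indices(fps, beat_seconds, total_seconds):
    """Return, per beat, the frame indices falling inside that beat.

    Hypothetical helper: the patent samples the motion video in units of
    a preset time period or beat before extracting image frames.
    """
    total_frames = int(fps * total_seconds)
    frames_per_beat = int(fps * beat_seconds)
    beats = []
    for start in range(0, total_frames, frames_per_beat):
        beats.append(list(range(start, min(start + frames_per_beat, total_frames))))
    return beats

# One eight-beat of broadcast gymnastics: about 8 s at 25 fps,
# i.e. 200 frames split into 8 beats of 25 frames each.
beats = beat_frame_indices(fps=25, beat_seconds=1, total_seconds=8)
```

The assumed values (25 fps, one-second beats) match the broadcast-gymnastics example used later in the description.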
In one embodiment, the skeleton key point information is (x, y, v), where x and y are the horizontal and vertical coordinates and v is the state of the skeleton key point; the state of a key point may be visible, invisible, or not in the image.
In one embodiment, after extracting, through OpenPose, the motion feature information in the acquired motion video of the subject to be recognized, the method further includes: storing the skeleton key point information in JSON format.
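A minimal sketch of the JSON storage described above. The field names `image`, `bbox`, and `keypoints` are illustrative assumptions, not taken from the patent; each key point is stored as (x, y, v) per the embodiment above.

```python
import json

# Hypothetical record layout: one entry per picture, holding the image
# file name, the human-body bounding box, and the (x, y, v) key points,
# where v is the visibility state (1 visible, 2 invisible, 3 not in image).
record = {
    "image": "frame_0001.png",
    "bbox": [12, 30, 480, 620],          # upper-left and lower-right corners
    "keypoints": [[210, 95, 1], [215, 140, 1], [0, 0, 3]],
}

serialized = json.dumps(record)          # one item of a JSON data-set file
restored = json.loads(serialized)
```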
In one embodiment, before recognizing the motion standardization degree corresponding to the motion feature information according to the motion feature information and the pre-trained body posture recognition model, the method further includes: iteratively training the body posture recognition model using CTC in the Softmax layer until the Loss value output by CTC falls to or below a set threshold.
In addition, to achieve the above object, the present invention also provides an electronic device including a memory and a processor. An LSTM-based body posture recognition program is stored in the memory, and when it is executed by the processor the following steps are implemented: S110, acquiring a motion video of a subject to be recognized; S120, extracting motion feature information, including at least skeleton key point information, from the acquired motion video through OpenPose; S130, recognizing the motion standardization degree corresponding to the motion feature information according to the motion feature information and a pre-trained body posture recognition model, where the body posture recognition model is a target neural network model generated from preset standard motions and trained from standard motion feature information arranged in time order.
In one embodiment, the step of extracting, through OpenPose, the motion feature information in the acquired motion video of the subject to be recognized includes:
S310, acquiring the motion video of the subject to be recognized in units of a preset time period or a beat, and extracting image frames from the motion video using OpenPose; S320, determining a plurality of skeleton key feature points of the subject to be recognized from the extracted image frames; S330, integrating the plurality of skeleton key feature points into the skeleton key point information of the subject to be recognized.
Further, to achieve the above object, the present invention also provides a computer-readable storage medium storing an LSTM-based body posture recognition program which, when executed by a processor, implements the steps of the LSTM-based body posture recognition method described above.
The method takes a video set of standard motions as the training dictionary, a neural network comprising a Masking layer, LSTM layers, and a Softmax layer as the model, the human skeleton feature points in the standard-motion videos as the targets, and Connectionist Temporal Classification (CTC) as the training criterion, thereby obtaining a trained body posture recognition model. The LSTM-based body posture recognition method shortens the model training period and improves training precision, and it does not require cutting video motions into isolated features for recognition; recognition is instead learned with a neural network model. The method is fast and accurate and improves the user experience.
Drawings
FIG. 1 is a flow chart of a preferred embodiment of the LSTM-based body gesture recognition method of the present invention;
FIG. 2 is a schematic diagram of key feature points of the skeleton of the present invention;
FIG. 3 is a flowchart illustrating a method for obtaining skeleton key point information according to a preferred embodiment of the present invention;
FIG. 4 is a schematic diagram of an application environment of the LSTM-based body gesture recognition method according to an embodiment of the present invention;
the implementation, functional features and advantages of the objects of the present invention will be further described with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The invention provides a body posture recognition method based on LSTM. Referring to fig. 1, a flow chart of a preferred embodiment of the LSTM-based body gesture recognition method of the present invention is shown. The method may be performed by an apparatus, which may be implemented by software and/or hardware.
It should be noted that a Recurrent Neural Network (RNN) based on LSTM (Long Short-Term Memory) is used to build the basic framework for learning effective features and modeling the dynamics of the time domain, thereby implementing end-to-end behavior recognition and detection. The long short-term memory network is a recurrent neural network suited to processing and predicting important events with relatively long intervals and delays in a time series.
Specifically, the method is based on detecting human skeleton feature points in dynamic video. OpenPose, an open-source algorithm proposed by Carnegie Mellon University, is used to extract human posture feature points from the competition video; the extracted feature points are compared with a model trained in a neural network to obtain a difference range. The smaller the difference, the higher the score, and vice versa, thereby achieving accurate body posture recognition.
In this embodiment, the body posture recognition method based on LSTM includes: step S110-step S130.
And S110, acquiring the motion video of the subject to be identified.
In a specific embodiment, the motion video of the subject to be scored is a video of a contestant's movements in a gymnastics competition. The acquired motion video is first preprocessed and denoised.
S120, extracting motion feature information, including at least skeleton key point information, from the acquired motion video of the subject to be recognized through OpenPose.
Human body movement can be described by the movement of the key nodes of a few main skeleton joints (skeleton key points for short). Tracking a combination of several skeleton key points can therefore characterize behaviors such as dancing, walking, and running, and those behaviors can be recognized from the movement of the key points.
Of course, joint displacement information of the human body, body-surface key point displacement information, and the like may also be used as the motion feature information of the subject to be recognized.
Specifically, in one embodiment the invention optimizes recognition performance by introducing features common to the skeleton key points of a motion into the LSTM network as constraints on network parameter learning. A person's behavioral motion is often closely related to a set of skeleton key points and to the interaction of the nodes in that set.
In a specific embodiment, for the eighth set of broadcast gymnastics, key points such as the "nose", "knee", "ankle", and "hand" constitute a node set with discriminative power.
FIG. 2 is a schematic diagram of the skeleton key feature points of the present invention. Referring to fig. 2, there are multiple skeleton key feature points, specifically 18, as shown.
OpenPose is a real-time multi-person key point detection library written on the basis of OpenCV and Caffe. OpenPose performs very well; according to the characteristics of the human skeleton and of dance movements, 18 skeleton key feature points are selected for detection: the nose, neck, right shoulder, right elbow, right wrist, left shoulder, left elbow, left wrist, right hip, right knee, right ankle, left hip, left knee, left ankle, right eye, left eye, right ear, and left ear.
The subject to be recognized is a broadcast gymnastics performer, and the performer's motion video takes one eight-beat (counts 1 through 8) as a time unit. The number of frames is determined by the actual length of the video; for broadcast gymnastics, one eight-beat lasts about 8 seconds, which at a frame rate of 25 is about 200 frames. If the video is 8 seconds long, there are 200 frames with 18 × 2 = 36 values per frame, so the input contains 200 × 36 values; if a video does not supply enough values, it is padded with zeros. That is, the sequence is first converted to a fixed length, which in the present application is 200 × 36 values, and any selected video shorter than this is padded with 0s. The image frames are arranged in time order, with one eight-beat as the fixed-length unit.
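The fixed-length padding described above can be sketched as follows. `pad_sequence` is a hypothetical helper, assuming 200 frames of 18 × 2 = 36 values each, with all-zero frames appended for shorter clips as the paragraph describes.

```python
def pad_sequence(frames, target_len=200, width=36):
    """Zero-pad a per-frame feature list to a fixed-length sequence.

    Each frame holds 18 key points x 2 coordinates = 36 values; clips
    shorter than target_len frames are padded with all-zero frames.
    """
    padded = [list(f) for f in frames[:target_len]]
    while len(padded) < target_len:
        padded.append([0.0] * width)
    return padded

short_clip = [[1.0] * 36 for _ in range(150)]   # only 150 real frames
fixed = pad_sequence(short_clip)                # padded to 200 x 36
```

The zero frames are exactly what the Masking layer described later filters out before the LSTM.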
An example: the broadcast gymnastics routine (also known as the "popular broadcast gymnastics") consists of 8 sections of free-hand exercises totaling 4 minutes 45 seconds. Each section is divided into four eight-beats. Taking a lower-limb movement as an example, in the first eight-beat, on count 1 the left foot steps forward, the right foot extends back resting on the sole, the arms bend at the elbows and cross in front of the chest with the hands in fists, fist centers inward; the 18 skeleton key feature points are captured throughout this eight-beat.
Fig. 3 is a schematic diagram of a method for acquiring skeleton key point information according to the present invention. Referring to fig. 3, the method includes: S310, acquiring the motion video of the subject to be recognized in units of a preset time period or a beat, and extracting image frames from the motion video using OpenPose; S320, determining a plurality of skeleton key feature points of the subject to be recognized from the extracted image frames; S330, integrating the plurality of skeleton key feature points into the skeleton key point information of the subject to be recognized.
The preset time period or beat is the unit defining the time sequence: either a fixed time period or a beat may be used as the unit.
It should be noted that the skeleton key point information (x, y, v) contains three pieces of information: x and y are the horizontal and vertical coordinates in the image, and v represents the state of the skeleton key point, i.e., visible, invisible, or not in the image (or cannot be inferred). "Not in the image" means the key point does not lie within the figure shown in the image frame.
The skeleton key point information is stored in JSON format. Each JSON file corresponds to one data set, and each item in a JSON file stores the human-body bounding box position and the skeleton key point positions for one picture in the data set.
It should be further noted that the information stored in the data set includes the file name of the stored image, the position of the human-body bounding box, and the positions of the skeleton key points. The bounding box comprises four parameters: the first two are the coordinates of its upper-left corner and the last two the coordinates of its lower-right corner. In the skeleton key point positions, v represents the state of the key point: v = 1 is visible, v = 2 is invisible, and v = 3 is not in the image (or cannot be inferred). Only key points in the visible state v = 1 need to be returned; key points in other states are replaced with (0, 0, 0).
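A small sketch of the visibility rule above: key points whose state v is not 1 (visible) are replaced with (0, 0, 0). The helper name `filter_visible` is illustrative, not from the patent.

```python
def filter_visible(keypoints):
    """Keep only key points whose state v == 1 (visible); replace
    invisible (v == 2) and not-in-image (v == 3) points with (0, 0, 0)."""
    return [kp if kp[2] == 1 else (0, 0, 0) for kp in keypoints]

frame = [(210, 95, 1),   # visible
         (0, 0, 3),      # not in the image
         (180, 300, 2)]  # invisible (e.g. occluded limb)
cleaned = filter_visible(frame)
```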
S130, recognizing the motion standardization degree corresponding to the motion feature information according to the motion feature information and a pre-trained body posture recognition model, where the body posture recognition model is a target neural network model generated from preset standard motions and trained from standard motion feature information arranged in time order.
In a specific embodiment, the training sample data set consists of a video set of standard motions. Again taking broadcast gymnastics as an example, it comprises 8 sections of free-hand exercises: stretching, chest expansion, kicking, side movement, body turning, whole-body movement, jumping, and finishing; these involve flexion, extension, rotation, balance, jumping, and other movements. Each section can be decomposed into 4-5 motions; the training sample data set comprises 2640 videos covering 40 motions, and each video consists of 2-10 motions. The start and end times of each motion are determined, and the motion feature information within that period is arranged in time order.
Specifically, the trained classifier can recognize each frame of motion from the motion video, with the motion categories limited to those contained in the training data; again taking broadcast gymnastics as the example, the categories are limited to the 40 motion classes in the training data.
An example: a 10-second video at a frame rate of 25 has 250 frames in total. Skeleton key point detection is performed on each frame with OpenPose, outputting 18 values per frame; the values are concatenated, i.e., 250 × 18 values are fed into the network as input, yielding a classification model that determines how many beats are present. The model's input is this sequence of per-frame values, and its outputs are the label "motion X, beat Y" and a Loss value; an example label is "head movement, beat 1".
In a specific embodiment, the objective function of the body posture recognition model is CTC; the model outputs a Loss value through CTC, and the smaller the Loss value, the more standard the movement of the subject to be recognized.
The invention uses an LSTM neural network structure combined with CTC to train the posture classifier, finally obtaining an LSTM-CTC posture classification model. A video set of standard motions is taken as the training dictionary; one convolutional neural network (CNN) layer plus five LSTM layers form the model; the human skeleton feature points in the standard-motion videos are the targets; and Connectionist Temporal Classification (CTC) is the training criterion, yielding a trained CTC posture model. During network training, the neural network model and the loss function are trained by a stochastic gradient descent algorithm. The neural network comprises an LSTM model plus a CTC model, where LSTM + CTC outputs the corresponding class label and Loss value. The image is first read in and converted into a matrix; the numbers of rows and columns of the matrix are taken, and the shape is changed with a reshape function. The transformed features are fed to the LSTM.
The LSTM algorithm was proposed to address the vanishing-gradient and exploding-gradient deficiencies of RNNs, and it provides both long- and short-term memory ability. Gradients are generally updated using BPTT (Backpropagation Through Time). In an LSTM network, the neurons of an ordinary RNN are replaced with memory blocks.
The network structure of the neural network includes a Masking layer before the LSTM layer and a Softmax layer after it.
The Masking layer filters out 0s; the character to filter is specified by mask_value in the Masking layer. As described above, the padded 0s in the sequence are all filtered out. An Embedding layer also has a filtering function, but unlike the Masking layer it can only filter 0 and cannot be given another character, and it maps sequences into a space of fixed dimension; a Masking layer is therefore more suitable here than an Embedding layer. The Softmax layer is used for classification.
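To illustrate what the Masking layer does conceptually, the pure-Python sketch below derives a per-timestep mask the way a Keras `Masking(mask_value=0.)` layer would: a timestep is skipped only when every one of its features equals the mask value. This is an illustration of the behaviour, not the actual layer implementation.

```python
def compute_mask(sequence, mask_value=0.0):
    """Return one boolean per timestep: True if the downstream LSTM
    should process it, False if every feature equals mask_value
    (i.e. the frame is padding).  Mirrors the behaviour of a Keras
    Masking(mask_value=0.) layer placed before the LSTM."""
    return [any(v != mask_value for v in frame) for frame in sequence]

seq = [[0.3, 0.7],   # real frame
       [0.0, 0.0],   # padded frame: masked out
       [0.1, 0.0]]   # real frame (one zero feature is not enough to mask)
mask = compute_mask(seq)
```

Note that a frame with some zero coordinates is still processed; only all-zero padding frames are skipped, which is why padding with entire zero frames (as above) is safe.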
In a specific embodiment, the LSTM layer concatenates the frame data with a step size of 200.
The CTC Loss, i.e., the loss of one forward propagation, is the negative logarithm of the probability of the current sequence label, obtained with the following formula:
L(S) = -ln ∏_{(x,z)∈S} p(z|x) = -∑_{(x,z)∈S} ln p(z|x)
where p(z|x) denotes the probability of outputting sequence z given input x, and S is the training set. That is, the loss function is the negative logarithm of the product, over the training samples, of the probability of outputting the correct label; minimizing this loss maximizes the probability of outputting the correct labels.
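The formula above can be evaluated directly once the per-sample label probabilities p(z|x) are known. In the sketch below those probabilities are made-up inputs; a real CTC implementation computes each p(z|x) with the forward-backward algorithm over all alignments, which is omitted here.

```python
import math

def sequence_nll(probabilities):
    """Loss over a training set S: the negative log of the product of
    p(z|x) over all (x, z) pairs, equivalently -sum of ln p(z|x).

    `probabilities` are hypothetical per-sample label probabilities;
    obtaining each p(z|x) (the CTC forward-backward pass) is not shown.
    """
    return -sum(math.log(p) for p in probabilities)

# Three samples whose correct labels get probabilities 0.9, 0.8, 0.95.
loss = sequence_nll([0.9, 0.8, 0.95])
```

As the probabilities of the correct labels approach 1, the loss approaches 0, matching the patent's criterion that a smaller Loss value means a more standard motion.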
That is, before the pre-trained body posture recognition model recognizes the motion standardization degree corresponding to the motion feature information, the model is iteratively trained using CTC in the Softmax layer until the Loss value output by CTC falls to or below a set threshold. During training, the smaller the Loss value, the higher the accuracy of the model; the usual accuracy requirement is 98%.
According to the LSTM-based body posture recognition method, differences in speed, direction, angle, and the like between a contestant's movements and the standard movements can be analyzed without the trainee wearing a sensor, improving the precision of competition scoring.
The invention further provides an LSTM-based body posture recognition device comprising a skeleton key point information acquisition unit, a body posture recognition model training unit, and a body posture recognition model detection unit. The acquisition unit captures motion videos of the participants with a camera and performs skeleton key point detection on the video of the subject to be recognized through OpenPose to obtain the participants' skeleton key point information. The training unit obtains the body posture recognition model through the training step. The detection unit classifies the motion skeleton key point information with the body posture recognition model to obtain a class label and compute the Loss function, and judges the similarity between the subject's motion and the standard motion according to the Loss.
The invention provides a body posture recognition method based on LSTM, which is applied to an electronic device 4. Fig. 4 is a schematic diagram of an application environment of the LSTM-based body posture recognition method according to a preferred embodiment of the present invention.
In the present embodiment, the electronic device 4 may be a terminal device having an arithmetic function, such as a server, a smart phone, a tablet computer, a portable computer, or a desktop computer.
The electronic device 4 includes: a processor 42, a memory 41, an imaging device 43, a network interface 44, and a communication bus 45.
The memory 41 includes at least one type of readable storage medium, which may be a non-volatile storage medium such as a flash memory, a hard disk, a multimedia card, or a card-type memory. In some embodiments, the readable storage medium may be an internal storage unit of the electronic device 4, such as its hard disk. In other embodiments, it may be an external storage device of the electronic device 4, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card.
In the present embodiment, the readable storage medium of the memory 41 is generally used for storing the LSTM-based body posture recognition program 40 and the like installed in the electronic apparatus 4. The memory 41 may also be used to temporarily store data that has been output or is to be output.
The processor 42, which in some embodiments may be a Central Processing Unit (CPU), microprocessor, or other data processing chip, executes program code or processes data stored in the memory 41, for example executing the LSTM-based body posture recognition program 40.
The imaging device 43 may be part of the electronic device 4 or independent of it. In some embodiments, the electronic device 4 is a terminal device with a camera, such as a smartphone, tablet computer, or portable computer, and the imaging device 43 is that camera. In other embodiments, the electronic device 4 may be a server, with the imaging device 43 independent of it and connected over a network; for example, the imaging device 43 is installed at a specific location, such as an office or a monitored area, continuously captures real-time images of subjects entering that location, and transmits the captured images to the processor 42 through the network.
The network interface 44 may optionally include a standard wired interface and/or a wireless interface (e.g., a Wi-Fi interface), and is typically used to establish a communication link between the electronic device 4 and other electronic devices.
The communication bus 45 enables communication between these components.
Fig. 4 only shows the electronic device 4 with components 41-45, but it is to be understood that not all of the shown components are required to be implemented, and that more or fewer components may alternatively be implemented.
In one embodiment of the present invention, the electronic device 4 may further include a user interface. The user interface may include an input unit such as a keyboard, a voice input device such as a microphone or another device with voice recognition capability, and a voice output device such as a loudspeaker or earphones. Optionally, the user interface may also include a standard wired interface or a wireless interface.
The electronic device 4 may further comprise a display, which may also be referred to as a display screen or display unit. In some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an Organic Light-Emitting Diode (OLED) touch device, or the like. The display is used to present information processed in the electronic device 4 and to display a visualized user interface.
In addition, the electronic device 4 further includes a touch sensor. The area of the touch sensor available for the user's touch operations is called the touch area. The touch sensor may be a resistive touch sensor, a capacitive touch sensor, or the like, and may be of either the contact type or the proximity type. It may be a single sensor or a plurality of sensors arranged, for example, in an array.
The area of the display of the electronic device 4 may be the same as or different from the area of the touch sensor. Optionally, the display is layered with the touch sensor to form a touch display screen, on which the device detects touch operations triggered by the user.
Optionally, the electronic device 4 may further include a Radio Frequency (RF) circuit, a sensor, an audio circuit, and the like, which are not described in detail herein.
In the apparatus embodiment shown in fig. 4, the memory 41, as a type of computer storage medium, may store an operating system and the LSTM-based body posture recognition program 40; the processor 42, when executing the LSTM-based body posture recognition program 40 stored in the memory 41, implements the following steps:
S110, acquiring a motion video of a subject to be identified; S120, extracting motion feature information from the acquired motion video through OpenPose, the motion feature information including at least skeleton key point information; S130, identifying the degree of motion standardization corresponding to the motion feature information according to the motion feature information and a pre-trained body posture recognition model, where the body posture recognition model is a target neural network model generated from preset standard motions and trained on standard motion feature information arranged in chronological order.
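Step S120's data layout can be sketched as follows. This is a minimal illustration only: `detect_keypoints` is a hypothetical stand-in for a real OpenPose call, and the random values merely show the shape of the (x, y, v) skeleton key point information, not real detections:

```python
import numpy as np

NUM_KEYPOINTS = 18  # e.g., the OpenPose COCO body model outputs 18 keypoints

def detect_keypoints(frame):
    # Hypothetical stand-in for an OpenPose detection call. Per keypoint it
    # returns (x, y, v): image coordinates plus a state flag v, whose values
    # here encode visible, invisible, or not within the image.
    xy = np.random.rand(NUM_KEYPOINTS, 2)                        # fake coordinates
    v = np.random.randint(0, 3, size=(NUM_KEYPOINTS, 1)).astype(float)
    return np.hstack([xy, v])

def video_to_features(frames):
    # Stack per-frame keypoints in chronological order into a sequence of
    # shape (T, NUM_KEYPOINTS * 3), the form an LSTM model can consume.
    return np.stack([detect_keypoints(f).ravel() for f in frames])

frames = [None] * 30                 # stand-in for 30 decoded video frames
features = video_to_features(frames)
print(features.shape)                # (30, 54)
```

In a real system, `frames` would come from decoding the acquired motion video, and `features` would be fed to the body posture recognition model of step S130.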
In the electronic device provided in the above embodiment, the video set of standard motions serves as the training dictionary, an LSTM neural network comprising a Masking layer and a Softmax layer serves as the model, the human skeleton feature points in the standard-motion videos serve as the targets, and Connectionist Temporal Classification (CTC) serves as the training criterion, yielding a trained body posture recognition model for accurate posture recognition.
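The Masking → LSTM → Softmax data flow described above can be illustrated with a plain-NumPy forward pass. The weights here are random and the CTC training criterion is omitted; this is only a sketch of how padded timesteps are masked out and how the final state is mapped to class probabilities, not the patented implementation:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def lstm_step(x, h, c, W, U, b):
    # One standard LSTM cell step: gates i, f, o and candidate g, each size H.
    z = x @ W + h @ U + b
    H = h.shape[-1]
    i, f, o, g = z[:, :H], z[:, H:2*H], z[:, 2*H:3*H], z[:, 3*H:]
    i, f, o = 1/(1+np.exp(-i)), 1/(1+np.exp(-f)), 1/(1+np.exp(-o))
    c = f * c + i * np.tanh(g)
    h = o * np.tanh(c)
    return h, c

def forward(seq, mask, W, U, b, Wout, bout):
    # Masking layer: where mask == 0 (padded timesteps), the state is carried
    # over unchanged, so variable-length keypoint sequences can be batched.
    B, T, D = seq.shape
    H = U.shape[0]
    h, c = np.zeros((B, H)), np.zeros((B, H))
    for t in range(T):
        h_new, c_new = lstm_step(seq[:, t], h, c, W, U, b)
        m = mask[:, t:t+1]
        h = m * h_new + (1 - m) * h
        c = m * c_new + (1 - m) * c
    # Softmax layer: map the final LSTM state to action-class probabilities.
    return softmax(h @ Wout + bout)

rng = np.random.default_rng(0)
B, T, D, H, C = 2, 5, 54, 16, 4   # batch, timesteps, features, hidden, classes
probs = forward(rng.normal(size=(B, T, D)),
                np.array([[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]], dtype=float),
                rng.normal(size=(D, 4*H)), rng.normal(size=(H, 4*H)),
                np.zeros(4*H), rng.normal(size=(H, C)), np.zeros(C))
print(probs.shape)  # (2, 4)
```

In practice one would build this with a deep learning framework and train it with a CTC loss rather than hand-rolling the cell.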
In other embodiments, the LSTM-based body posture recognition program 40 may also be divided into one or more modules, which are stored in the memory 41 and executed by the processor 42 to implement the present invention. A module, as referred to herein, is a series of computer program instruction segments capable of performing a specified function.
The LSTM-based body posture recognition program 40 may be divided into a skeleton key point information acquisition subprogram, a body posture recognition model training subprogram, and a body posture recognition model detection subprogram. The skeleton key point information acquisition subprogram acquires the motion video of the subject to be identified using the imaging device, performs skeleton key point detection on the motion video through OpenPose, and obtains the skeleton key point information of the subject to be identified. The body posture recognition model training subprogram obtains the body posture recognition model through the training step. The body posture recognition model detection subprogram classifies the motion skeleton key point information using the body posture recognition model to obtain a classification label and compute the Loss function Loss, and judges the similarity between the subject's motion and the standard motion according to that Loss.
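The detection subprogram's final judgment, turning the Loss into a similarity verdict, might be organized as below. The exponential mapping from Loss to a [0, 1] standardization score and the 0.5 threshold are illustrative assumptions; the patent only specifies that a smaller Loss means a more standard motion:

```python
import numpy as np

def normalization_score(loss, scale=1.0):
    # Map a non-negative CTC Loss to a score in (0, 1]: smaller Loss means
    # the subject's motion conforms more closely to the standard action.
    # The exponential decay is an assumed mapping, not given by the source.
    return float(np.exp(-loss / scale))

def judge(loss, threshold=0.5):
    # Compare the derived score with a preset threshold to classify the
    # subject's motion as standard or non-standard.
    return "standard" if normalization_score(loss) >= threshold else "non-standard"

print(judge(0.1))  # low Loss  -> "standard"
print(judge(5.0))  # high Loss -> "non-standard"
```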
Furthermore, an embodiment of the present invention also provides a computer-readable storage medium storing an LSTM-based body posture recognition program which, when executed by a processor, implements the following operations:
S110, acquiring a motion video of a subject to be identified; S120, extracting motion feature information from the acquired motion video through OpenPose, the motion feature information including at least skeleton key point information; S130, identifying the degree of motion standardization corresponding to the motion feature information according to the motion feature information and a pre-trained body posture recognition model, where the body posture recognition model is a target neural network model generated from preset standard motions and trained on standard motion feature information arranged in chronological order.
The specific implementation of the computer-readable storage medium of the present invention is substantially the same as that of the LSTM-based body posture recognition method and the electronic device, and is not repeated here.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, apparatus, article, or method that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, apparatus, article, or method. Without further limitation, an element introduced by the phrase "comprising a …" does not exclude the presence of other like elements in the process, apparatus, article, or method that comprises it.
The serial numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments. Through the above description of the embodiments, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or by hardware alone, though the former is in many cases the preferable implementation. Based on this understanding, the technical solution of the present invention, or the part of it that contributes over the prior art, can be embodied in the form of a software product stored in a storage medium (e.g., ROM/RAM, magnetic disk, or optical disk) as described above, including instructions that enable a terminal device (e.g., a mobile phone, computer, server, or network device) to execute the methods of the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by the contents of the present specification and drawings, or used directly or indirectly in other related fields, are included in the scope of the present invention.

Claims (10)

1. An LSTM-based body posture recognition method applied to an electronic device is characterized by comprising the following steps:
acquiring a motion video of a subject to be identified;
extracting motion characteristic information from the acquired motion video of the subject to be identified through OpenPose; the motion characteristic information comprises at least skeleton key point information;
identifying the degree of motion standardization corresponding to the motion characteristic information according to the motion characteristic information and a pre-trained body posture recognition model; the body posture recognition model is a target neural network model generated from a preset standard action; and the target neural network model is generated by training on standard action characteristic information arranged in chronological order.
2. The LSTM-based body posture recognition method of claim 1, wherein the neural network of the body posture recognition model comprises a Masking layer, a Softmax layer, and an LSTM layer between the Masking layer and the Softmax layer.
3. The LSTM-based body posture recognition method of claim 2, wherein the objective function of the body posture recognition model is CTC; the body posture recognition model outputs a Loss value through the CTC, and the smaller the output Loss value, the higher the motion standardization of the subject to be identified corresponding to that Loss value.
4. The LSTM-based body posture recognition method according to claim 1, wherein the step of extracting the motion characteristic information from the acquired motion video of the subject to be identified through OpenPose comprises:
acquiring the motion video of the subject to be identified in time units of a preset period or a beat, and extracting image frames from the motion video using OpenPose;
determining a plurality of skeleton key feature points of the subject to be identified from the extracted image frames;
and integrating the plurality of skeleton key feature points into the skeleton key point information of the subject to be identified.
5. The LSTM-based body posture recognition method of claim 4, wherein the skeleton key point information is (x, y, v), where x and y are the abscissa and ordinate of the skeleton key point, and v is its state information;
the state of a key point may be visible, invisible, or not within the image.
6. The LSTM-based body posture recognition method according to claim 5, further comprising, after extracting the motion characteristic information from the acquired motion video of the subject to be identified through OpenPose: storing the skeleton key point information in JSON format.
7. The LSTM-based body posture recognition method of claim 3, further comprising, before identifying the degree of motion standardization corresponding to the motion characteristic information according to the motion characteristic information and the pre-trained body posture recognition model:
performing iterative training on the body posture recognition model using the CTC at the Softmax layer until the Loss value output by the CTC is less than or equal to a set threshold value.
8. An electronic device comprising a memory, a processor, and an imaging device, wherein an LSTM-based body pose recognition program is stored in the memory, and wherein the LSTM-based body pose recognition program, when executed by the processor, implements the steps of:
acquiring a motion video of a subject to be identified;
extracting motion characteristic information from the acquired motion video of the subject to be identified through OpenPose; the motion characteristic information comprises at least skeleton key point information;
identifying the degree of motion standardization corresponding to the motion characteristic information according to the motion characteristic information and a pre-trained body posture recognition model; the body posture recognition model is a target neural network model generated from a preset standard action; and the target neural network model is generated by training on standard action characteristic information arranged in chronological order.
9. The electronic device of claim 8,
the step of extracting the motion characteristic information from the acquired motion video of the subject to be identified through OpenPose comprises:
acquiring the motion video of the subject to be identified in time units of a preset period or a beat, and extracting image frames from the motion video using OpenPose;
determining a plurality of skeleton key feature points of the subject to be identified from the extracted image frames;
and integrating the plurality of skeleton key feature points into the skeleton key point information of the subject to be identified.
10. A computer-readable storage medium storing an LSTM-based body posture recognition program which, when executed by a processor, performs the steps of the LSTM-based body posture recognition method according to any one of claims 1 to 7.
CN201910875154.8A 2019-09-17 2019-09-17 Body posture recognition method and device based on LSTM and storage medium Pending CN110705390A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201910875154.8A CN110705390A (en) 2019-09-17 2019-09-17 Body posture recognition method and device based on LSTM and storage medium
PCT/CN2019/117890 WO2021051579A1 (en) 2019-09-17 2019-11-13 Body pose recognition method, system, and apparatus, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910875154.8A CN110705390A (en) 2019-09-17 2019-09-17 Body posture recognition method and device based on LSTM and storage medium

Publications (1)

Publication Number Publication Date
CN110705390A true CN110705390A (en) 2020-01-17

Family

ID=69196078

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910875154.8A Pending CN110705390A (en) 2019-09-17 2019-09-17 Body posture recognition method and device based on LSTM and storage medium

Country Status (2)

Country Link
CN (1) CN110705390A (en)
WO (1) WO2021051579A1 (en)

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111346358A (en) * 2020-03-11 2020-06-30 嘉兴技师学院 Swimming training evaluation system and method based on convolutional neural network
CN111507192A (en) * 2020-03-19 2020-08-07 北京捷通华声科技股份有限公司 Appearance instrument monitoring method and device
CN111590560A (en) * 2020-04-24 2020-08-28 郭子睿 Method for remotely operating manipulator through camera
CN111680562A (en) * 2020-05-09 2020-09-18 北京中广上洋科技股份有限公司 Human body posture identification method and device based on skeleton key points, storage medium and terminal
CN111695457A (en) * 2020-05-28 2020-09-22 浙江工商大学 Human body posture estimation method based on weak supervision mechanism
CN111915460A (en) * 2020-05-07 2020-11-10 同济大学 AI vision-based intelligent scoring system for experimental examination
CN112232194A (en) * 2020-10-15 2021-01-15 广州云从凯风科技有限公司 Single-target human body key point detection method, system, equipment and medium
CN112529934A (en) * 2020-12-02 2021-03-19 北京航空航天大学杭州创新研究院 Multi-target tracking method and device, electronic equipment and storage medium
CN112990878A (en) * 2021-03-30 2021-06-18 北京大智汇领教育科技有限公司 Real-time correcting system and analyzing method for classroom teaching behaviors of teacher
CN113065504A (en) * 2021-04-15 2021-07-02 希亚思(上海)信息技术有限公司 Behavior identification method and device
CN113229810A (en) * 2021-06-22 2021-08-10 西安超越申泰信息科技有限公司 Human behavior recognition method and system and computer readable storage medium
CN114246582A (en) * 2021-12-20 2022-03-29 杭州慧光健康科技有限公司 System and method for detecting bedridden people based on long-term and short-term memory neural network
CN114821639A (en) * 2022-04-11 2022-07-29 西安电子科技大学广州研究院 Method and device for estimating and understanding human body posture in special scene
CN114882443A (en) * 2022-05-31 2022-08-09 江苏濠汉信息技术有限公司 Edge computing system applied to cable accessory construction
US11443558B2 (en) * 2019-11-12 2022-09-13 Omron Corporation Hand-eye, body part motion recognition and chronologically aligned display of recognized body parts
US20220358310A1 (en) * 2021-05-06 2022-11-10 Kuo-Yi Lin Professional dance evaluation method for implementing human pose estimation based on deep transfer learning

Families Citing this family (11)

Publication number Priority date Publication date Assignee Title
CN113052097A (en) * 2021-03-31 2021-06-29 开放智能机器(上海)有限公司 Human body sitting posture real-time monitoring system and monitoring method
CN115188062B (en) * 2021-04-06 2024-02-27 广州视源电子科技股份有限公司 User running posture analysis method and device, running machine and storage medium
CN113191319B (en) * 2021-05-21 2022-07-19 河南理工大学 Human body posture intelligent recognition method and computer equipment
CN113239848B (en) * 2021-05-27 2024-02-02 数智引力(厦门)运动科技有限公司 Motion perception method, system, terminal equipment and storage medium
CN113723233B (en) * 2021-08-17 2024-03-26 之江实验室 Student learning participation assessment method based on hierarchical time sequence multi-example learning
CN113743319B (en) * 2021-09-07 2023-12-26 三星电子(中国)研发中心 Self-supervision type intelligent fitness scheme generation method and device
CN113989707A (en) * 2021-10-27 2022-01-28 福州大学 Public place queuing abnormal behavior detection method based on OpenPose and OpenCV
CN114596530B (en) * 2022-03-23 2022-11-18 中国航空油料有限责任公司浙江分公司 Airplane refueling intelligent management method and device based on non-contact optical AI
CN114870385A (en) * 2022-05-11 2022-08-09 安徽理工大学 Established long jump testing method based on optimized OpenPose model
CN115097946B (en) * 2022-08-15 2023-04-18 汉华智能科技(佛山)有限公司 Remote worship method, system and storage medium based on Internet of things
CN117218728A (en) * 2023-11-09 2023-12-12 深圳市微克科技有限公司 Body posture recognition method, system and medium of intelligent wearable device

Citations (7)

Publication number Priority date Publication date Assignee Title
US10304208B1 (en) * 2018-02-12 2019-05-28 Avodah Labs, Inc. Automated gesture identification using neural networks
WO2019114696A1 (en) * 2017-12-13 2019-06-20 腾讯科技(深圳)有限公司 Augmented reality processing method, object recognition method, and related apparatus
CN110070029A (en) * 2019-04-17 2019-07-30 北京易达图灵科技有限公司 A kind of gait recognition method and device
CN110119703A (en) * 2019-05-07 2019-08-13 福州大学 The human motion recognition method of attention mechanism and space-time diagram convolutional neural networks is merged under a kind of security protection scene
WO2019157344A1 (en) * 2018-02-12 2019-08-15 Avodah Labs, Inc. Real-time gesture recognition method and apparatus
CN110135249A (en) * 2019-04-04 2019-08-16 华南理工大学 Human bodys' response method based on time attention mechanism and LSTM
CN110222551A (en) * 2018-03-02 2019-09-10 杭州海康威视数字技术股份有限公司 Method, apparatus, electronic equipment and the storage medium of identification maneuver classification

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
TWI506461B (en) * 2013-07-16 2015-11-01 Univ Nat Taiwan Science Tech Method and system for human action recognition
CN108875708A (en) * 2018-07-18 2018-11-23 广东工业大学 Behavior analysis method, device, equipment, system and storage medium based on video
CN109886123B (en) * 2019-01-23 2023-08-29 平安科技(深圳)有限公司 Method and terminal for identifying human body actions


Cited By (22)

Publication number Priority date Publication date Assignee Title
US11443558B2 (en) * 2019-11-12 2022-09-13 Omron Corporation Hand-eye, body part motion recognition and chronologically aligned display of recognized body parts
CN111346358A (en) * 2020-03-11 2020-06-30 嘉兴技师学院 Swimming training evaluation system and method based on convolutional neural network
CN111346358B (en) * 2020-03-11 2024-04-09 嘉兴技师学院 Swimming training evaluation system and method based on convolutional neural network
CN111507192A (en) * 2020-03-19 2020-08-07 北京捷通华声科技股份有限公司 Appearance instrument monitoring method and device
CN111590560A (en) * 2020-04-24 2020-08-28 郭子睿 Method for remotely operating manipulator through camera
CN111915460A (en) * 2020-05-07 2020-11-10 同济大学 AI vision-based intelligent scoring system for experimental examination
CN111915460B (en) * 2020-05-07 2022-05-13 同济大学 AI vision-based intelligent scoring system for experimental examination
CN111680562A (en) * 2020-05-09 2020-09-18 北京中广上洋科技股份有限公司 Human body posture identification method and device based on skeleton key points, storage medium and terminal
CN111695457B (en) * 2020-05-28 2023-05-09 浙江工商大学 Human body posture estimation method based on weak supervision mechanism
CN111695457A (en) * 2020-05-28 2020-09-22 浙江工商大学 Human body posture estimation method based on weak supervision mechanism
CN112232194A (en) * 2020-10-15 2021-01-15 广州云从凯风科技有限公司 Single-target human body key point detection method, system, equipment and medium
CN112529934A (en) * 2020-12-02 2021-03-19 北京航空航天大学杭州创新研究院 Multi-target tracking method and device, electronic equipment and storage medium
CN112529934B (en) * 2020-12-02 2023-12-19 北京航空航天大学杭州创新研究院 Multi-target tracking method, device, electronic equipment and storage medium
CN112990878A (en) * 2021-03-30 2021-06-18 北京大智汇领教育科技有限公司 Real-time correcting system and analyzing method for classroom teaching behaviors of teacher
CN113065504A (en) * 2021-04-15 2021-07-02 希亚思(上海)信息技术有限公司 Behavior identification method and device
US11823496B2 (en) * 2021-05-06 2023-11-21 Kuo-Yi Lin Professional dance evaluation method for implementing human pose estimation based on deep transfer learning
US20220358310A1 (en) * 2021-05-06 2022-11-10 Kuo-Yi Lin Professional dance evaluation method for implementing human pose estimation based on deep transfer learning
CN113229810A (en) * 2021-06-22 2021-08-10 西安超越申泰信息科技有限公司 Human behavior recognition method and system and computer readable storage medium
CN114246582A (en) * 2021-12-20 2022-03-29 杭州慧光健康科技有限公司 System and method for detecting bedridden people based on long-term and short-term memory neural network
CN114821639B (en) * 2022-04-11 2023-04-18 西安电子科技大学广州研究院 Method and device for estimating and understanding human body posture in special scene
CN114821639A (en) * 2022-04-11 2022-07-29 西安电子科技大学广州研究院 Method and device for estimating and understanding human body posture in special scene
CN114882443A (en) * 2022-05-31 2022-08-09 江苏濠汉信息技术有限公司 Edge computing system applied to cable accessory construction

Also Published As

Publication number Publication date
WO2021051579A1 (en) 2021-03-25

Similar Documents

Publication Publication Date Title
CN110705390A (en) Body posture recognition method and device based on LSTM and storage medium
CN108256433B (en) Motion attitude assessment method and system
Patrona et al. Motion analysis: Action detection, recognition and evaluation based on motion capture data
Kitsikidis et al. Dance analysis using multiple kinect sensors
CN111414839B (en) Emotion recognition method and device based on gesture
Thoutam et al. Yoga pose estimation and feedback generation using deep learning
CN107909060A (en) Gymnasium body-building action identification method and device based on deep learning
US9183431B2 (en) Apparatus and method for providing activity recognition based application service
CN109960962B (en) Image recognition method and device, electronic equipment and readable storage medium
CN110633004B (en) Interaction method, device and system based on human body posture estimation
CN107273857B (en) Motion action recognition method and device and electronic equipment
KR100907704B1 (en) Golfer's posture correction system using artificial caddy and golfer's posture correction method using it
Kamnardsiria et al. Knowledge-based system framework for training long jump athletes using action recognition
Malawski et al. Recognition of action dynamics in fencing using multimodal cues
WO2023108842A1 (en) Motion evaluation method and system based on fitness teaching training
Muhamada et al. Review on recent computer vision methods for human action recognition
JP2021135995A (en) Avatar facial expression generating system and avatar facial expression generating method
CN111833439A (en) Artificial intelligence-based ammunition throwing analysis and mobile simulation training method
Kishore et al. Spatial Joint features for 3D human skeletal action recognition system using spatial graph kernels
Shahjalal et al. An approach to automate the scorecard in cricket with computer vision and machine learning
Jiang et al. Deep learning algorithm based wearable device for basketball stance recognition in basketball
CN115188062B (en) User running posture analysis method and device, running machine and storage medium
CN114550071A (en) Method, device and medium for automatically identifying and capturing track and field video action key frames
CN116266415A (en) Action evaluation method, system and device based on body building teaching training and medium
CN113869127A (en) Human behavior detection method, monitoring device, electronic device, and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination