CN115880774A - Body-building action recognition method and device based on human body posture estimation and related equipment - Google Patents


Info

Publication number
CN115880774A
CN115880774A
Authority
CN
China
Prior art keywords
action
human body
data
recognition
key point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211531164.8A
Other languages
Chinese (zh)
Inventor
余绍黔
谭孝文
杨华灵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan University of Technology
Original Assignee
Hunan University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan University of Technology filed Critical Hunan University of Technology
Priority to CN202211531164.8A priority Critical patent/CN115880774A/en
Publication of CN115880774A publication Critical patent/CN115880774A/en
Pending legal-status Critical Current

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00: Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30: Computing systems specially adapted for manufacturing

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a fitness action recognition method, a fitness action recognition device and related equipment based on human body posture estimation. The method comprises the following steps: obtaining a video stream and extracting human body key point data from it, then inputting the human body key point data into a trained fitness action classification network for action recognition to obtain an action classification result, wherein the trained fitness action classification network is a stacked network combining a multilayer perceptron (MLP) and a long short-term memory (LSTM) network; acquiring a preset standard fitness action dictionary and, for each set of recognition action data, calculating the similarity between the recognition action data and the standard action data of the corresponding category in the standard fitness action dictionary to obtain a similarity value; and evaluating fitness action quality based on the similarity value corresponding to each group of recognition action data to obtain a quality evaluation result. The method improves the accuracy of recognizing whether fitness actions are performed correctly.

Description

Body-building action recognition method and device based on human body posture estimation and related equipment
Technical Field
The invention relates to the field of data processing, in particular to a body-building action recognition method and device based on human body posture estimation and related equipment.
Background
With the improvement of living standards, people's health awareness has grown stronger, and more and more people use their leisure time to exercise at home. Home fitness requires only simple sports equipment, is not limited by time or place, and has a low participation threshold, so an increasing number of people take part in activities such as "cloud fitness" and live-streamed workouts through portable devices such as mobile phones and tablet computers.
Although the low threshold of home online fitness helps people maintain physical and mental health, incorrect fitness actions can cause muscle strains, joint dislocations and other sports injuries, some of which are irreversible. Moreover, home online fitness lacks the guidance of a professional coach, so exercisers may perform actions incorrectly or non-standardly, or complete them poorly, resulting in ineffective exercise. In existing approaches, fitness activities are mainly carried out through devices such as smartphones, tablet computers or smart TVs, which cannot monitor whether fitness actions are standard, so the above risks may arise to some extent. Therefore, how to ensure the correctness of fitness actions and the safety of exercisers has become an urgent problem to be solved.
Disclosure of Invention
The embodiments of the invention provide a fitness action recognition method and device based on human body posture estimation and related equipment, so as to improve the accuracy of recognizing whether fitness actions are standard.
In order to solve the above technical problem, an embodiment of the present application provides a body-building action recognition method based on human body posture estimation, including:
acquiring a video stream, and extracting human body key point data based on the video stream;
inputting the human body key point data into a trained fitness action classification network for action recognition to obtain an action classification result, wherein the action classification result comprises at least one group of recognition action data and a category corresponding to the recognition action data, and the trained fitness action classification network is a stacked network combining a multilayer perceptron (MLP) and a long short-term memory (LSTM) network;
acquiring a preset standard fitness action dictionary, and calculating the similarity of the recognition action data and standard action data of corresponding categories in the standard fitness action dictionary aiming at each recognition action data to obtain a similarity value;
and performing fitness action quality evaluation based on the similarity value corresponding to each group of identification actions to obtain a quality evaluation result.
Optionally, the extracting human key point data based on the video stream includes:
sequentially carrying out human body detection on the video frames in the video stream by adopting a target detection algorithm;
aiming at each video frame, if a human body is detected, adopting a human body posture tracking algorithm to carry out posture estimation on the detected human body to obtain preset regions and preset number of human body key point coordinates, and carrying out serialization processing on the preset number of human body key point coordinates to obtain an initial coordinate sequence;
and carrying out centering processing on the initial coordinate sequence to obtain a target coordinate sequence, and taking the target coordinate sequence as the key point data of the human body.
Optionally, the preset region and the preset number of coordinates of the key points of the human body include two coordinates of a hip region, and the centering processing on the coordinate sequence to obtain the target coordinate sequence includes:
calculating the mean value of two coordinates of the hip region in the initial coordinate sequence to be used as the central coordinate of the human body;
for each human body key point coordinate in the initial coordinate sequence, subtracting the coordinate value of the central coordinate of the human body from the coordinate value of the human body key point coordinate to obtain a corrected coordinate value;
and taking the sequence constructed by the corrected coordinate values as the target coordinate sequence.
Optionally, the calculating the similarity between the recognized motion data and the standard motion data of the corresponding category in the standard fitness motion dictionary to obtain a similarity value includes:
the similarity value cos (θ) is calculated using the following formula:
Figure BDA0003976077500000031
wherein, the first and the second end of the pipe are connected with each other,
Figure BDA0003976077500000032
coordinates of the ith human body key point of the standard motion data, device for selecting or keeping>
Figure BDA0003976077500000033
Coordinates of the ith personal key point of the identified motion data.
Optionally, the performing quality evaluation on the fitness actions based on the similarity value corresponding to each group of identification actions, and obtaining a quality evaluation result includes:
normalizing the similarity value with a preset formula to obtain a normalized score (the formula is given only as an image in the original document), wherein score is the normalized score;
determining a quality assessment result based on the normalized score.
Optionally, after the human body key point data is input into a trained fitness motion classification network for motion recognition to obtain a motion classification result, the fitness motion recognition method based on human body posture estimation further includes:
acquiring an action attention area corresponding to the category in the action classification result, and taking the action attention area as a target area, wherein a mapping relation is preset between the action attention area and the category;
aiming at each video frame corresponding to the category, calculating the joint angle of the target area according to the human body key point data in the video frame;
and generating a visual fluctuation curve of the target area according to the joint angle of each target area, wherein the visual fluctuation curve is used for displaying the fitness completion degree.
Optionally, the category is a squat action category, and the calculating the joint angle of the target area according to the human body key point data in the video frame includes:
identifying a knee joint area in a video frame as a target area;
taking the key points of the human body in the target area range as joint points, and calculating the joint angle by adopting the following formula:
$$\mathrm{Angle}=\arccos\left(\frac{\overrightarrow{P_2P_1}\cdot\overrightarrow{P_2P_3}}{\left|\overrightarrow{P_2P_1}\right|\,\left|\overrightarrow{P_2P_3}\right|}\right)$$
wherein Angle is the joint angle, $\overrightarrow{P_2P_1}$ represents the vector from joint point $P_2$ to joint point $P_1$, and $\overrightarrow{P_2P_3}$ represents the vector from joint point $P_2$ to joint point $P_3$.
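As an illustration, the joint-angle computation described above can be sketched as follows (a minimal sketch; returning the angle in degrees is an assumption, since the embodiment does not fix the unit):

```python
import math

def joint_angle(p1, p2, p3):
    """Angle at joint point p2 between the vectors p2->p1 and p2->p3, in degrees."""
    v1 = (p1[0] - p2[0], p1[1] - p2[1])
    v2 = (p3[0] - p2[0], p3[1] - p2[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = math.hypot(*v1) * math.hypot(*v2)
    # clamp to [-1, 1] to guard against floating-point drift before arccos
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))
```

For the squat category, p1, p2 and p3 would be the hip, knee and ankle key points, giving the knee angle tracked over the video frames.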
In order to solve the above technical problem, an embodiment of the present application further provides a body-building action recognition device based on human body posture estimation, including:
the data extraction module is used for acquiring video streams and extracting human body key point data based on the video streams;
the recognition and classification module is used for inputting the human body key point data into a trained fitness action classification network for action recognition to obtain an action classification result, wherein the action classification result comprises at least one group of recognition action data and a category corresponding to the recognition action data, and the trained fitness action classification network is a stacked network combining a multilayer perceptron (MLP) and a long short-term memory (LSTM) network;
the similarity calculation module is used for acquiring a preset standard fitness action dictionary, and for each recognition action data, performing similarity calculation on the recognition action data and standard action data of a corresponding category in the standard fitness action dictionary to obtain a similarity value;
and the quality evaluation module is used for carrying out fitness action quality evaluation on the basis of the similarity value corresponding to each group of identification action data to obtain a quality evaluation result.
Optionally, the data extraction module includes:
the target detection unit is used for sequentially carrying out human body detection on the video frames in the video stream by adopting a target detection algorithm;
the initial sequence generation unit is used for carrying out posture estimation on the detected human body by adopting a human body posture tracking algorithm if the human body is detected aiming at each video frame to obtain the coordinates of the human body key points in a preset area and a preset number, and carrying out serialization processing on the coordinates of the human body key points in the preset number to obtain an initial coordinate sequence;
and the target sequence determining unit is used for carrying out centering processing on the initial coordinate sequence to obtain a target coordinate sequence, and taking the target coordinate sequence as the human body key point data.
Optionally, the target sequence determination unit includes:
a central coordinate calculating subunit, configured to calculate an average value of two coordinates of the hip region in the initial coordinate sequence, as a central coordinate of the human body;
the coordinate value correction subunit is configured to subtract, for each human body key point coordinate in the initial coordinate sequence, a coordinate value of the center coordinate of the human body from a coordinate value of the human body key point coordinate to obtain a corrected coordinate value;
and the target sequence construction subunit is used for taking the sequence constructed by the corrected coordinate values as the target coordinate sequence.
Optionally, the similarity calculation module includes:
the similarity value cos (θ) is calculated using the following formula:
Figure BDA0003976077500000061
wherein, the first and the second end of the pipe are connected with each other,
Figure BDA0003976077500000062
coordinates of the ith human body key point of the standard motion data, device for selecting or keeping>
Figure BDA0003976077500000063
And the coordinates of the ith personal key point of the identified motion data.
Optionally, the quality assessment module comprises:
normalizing the similarity value with a preset formula to obtain a normalized score (the formula is given only as an image in the original document), wherein score is the normalized score;
determining a quality assessment result based on the normalized score.
Optionally, the fitness action recognition device based on human body posture estimation further includes:
a target area obtaining module, configured to obtain an action attention area corresponding to a category in the action classification result, where the action attention area is used as a target area, and a mapping relationship is preset between the action attention area and the category;
the angle calculation module is used for calculating the joint angle of the target area according to the human body key point data in the video frames aiming at each video frame corresponding to the category;
and the fluctuation visualization module is used for generating a visualization fluctuation curve of the target area according to the joint angle of each target area, and the visualization fluctuation curve is used for displaying the fitness completion degree.
Optionally, the category is a squat action category, and the angle calculation module includes:
the target area positioning subunit is used for identifying a knee joint area in the video frame as a target area;
and the joint angle calculating subunit is used for calculating the joint angle by taking the human body key points in the target area range as joint points and adopting the following formula:
$$\mathrm{Angle}=\arccos\left(\frac{\overrightarrow{P_2P_1}\cdot\overrightarrow{P_2P_3}}{\left|\overrightarrow{P_2P_1}\right|\,\left|\overrightarrow{P_2P_3}\right|}\right)$$
wherein Angle is the joint angle, $\overrightarrow{P_2P_1}$ represents the vector from joint point $P_2$ to joint point $P_1$, and $\overrightarrow{P_2P_3}$ represents the vector from joint point $P_2$ to joint point $P_3$.
In order to solve the technical problem, an embodiment of the present application further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the fitness action recognition method based on human body posture estimation when executing the computer program.
In order to solve the above technical problem, an embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored, and the computer program, when executed by a processor, implements the steps of the above fitness action recognition method based on human body posture estimation.
According to the fitness action recognition method, device, computer equipment and storage medium based on human body posture estimation provided by the embodiments of the invention, a video stream is obtained and human body key point data are extracted from it; the human body key point data are then input into a trained fitness action classification network for action recognition to obtain an action classification result, wherein the action classification result comprises at least one group of recognition action data and a category corresponding to the recognition action data, and the trained fitness action classification network is a stacked network combining a multilayer perceptron (MLP) and a long short-term memory (LSTM) network; a preset standard fitness action dictionary is acquired and, for each set of recognition action data, the similarity between the recognition action data and the standard action data of the corresponding category in the standard fitness action dictionary is calculated to obtain a similarity value; finally, fitness action quality is evaluated based on the similarity value corresponding to each group of recognition action data to obtain a quality evaluation result. In this way the real-time fitness action is quickly compared with the standard action, improving the accuracy of recognizing whether fitness actions are standard.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments of the present invention will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without inventive labor.
FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;
FIG. 2 is a flow diagram of one embodiment of a method for body-building motion recognition based on body pose estimation according to the present application;
FIG. 3 is a schematic diagram of posture recognition using the improved BlazeDark algorithm of the present application;
FIG. 4 is a schematic diagram of an embodiment of a fitness activity recognition device based on human body posture estimation according to the present application;
FIG. 5 is a schematic block diagram of one embodiment of a computer device according to the present application.
Detailed Description
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used in the description of the application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "including" and "having," and any variations thereof, in the description and claims of this application and the description of the above figures are intended to cover non-exclusive inclusions. The terms "first," "second," and the like in the description and claims of this application or in the above-described drawings are used for distinguishing between different objects and not for describing a particular order.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein may be combined with other embodiments.
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, as shown in fig. 1, a system architecture 100 may include terminal devices 101, 102, 103, a network 104 and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like.
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop portable computers, desktop computers, and the like.
The server 105 may be a server providing various services, such as a background server providing support for pages displayed on the terminal devices 101, 102, 103.
It should be noted that the body-building action recognition method based on human body posture estimation provided by the embodiment of the present application is executed by a server, and accordingly, a body-building action recognition device based on human body posture estimation is disposed in the server.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. Any number of terminal devices, networks and servers may be provided according to implementation needs, and the terminal devices 101, 102 and 103 in this embodiment may specifically correspond to an application system in actual production.
Referring to fig. 2, fig. 2 shows a method for recognizing a fitness action based on human body posture estimation according to an embodiment of the present invention, which is described by taking the application of the method to the server in fig. 1 as an example, and is detailed as follows:
s201: and acquiring a video stream, and extracting human body key point data based on the video stream.
When the video stream is a real-time video stream, video frames are extracted at a preset time interval to reduce the data volume while still recognizing fitness actions in time; the preset time interval can be set according to actual needs, for example 0.5 seconds.
The human body key point data is data corresponding to preset key points in each region of the human body, and in this embodiment, the data may be specifically coordinate data.
It should be understood that the video stream contains a plurality of video frames, and a set of human body key point data is extracted from each video frame.
In a specific optional implementation, the extracting of the human body key point data based on the video stream comprises:
sequentially detecting human bodies of video frames in the video stream by adopting a target detection algorithm;
aiming at each video frame, if a human body is detected, a human body posture tracking algorithm is adopted to carry out posture estimation on the detected human body to obtain preset regions and preset number of human body key point coordinates, and the preset number of human body key point coordinates are serialized to obtain an initial coordinate sequence;
and carrying out centering processing on the initial coordinate sequence to obtain a target coordinate sequence, and taking the target coordinate sequence as the data of the key points of the human body.
The preset regions are the human body regions from which key point coordinates are extracted, and the preset number is the number of extracted key points; both can be set according to the actual situation. Preferably, in this embodiment, the preset number is 17, and the preset regions include the nose, left eye, right eye, left ear, right ear, left shoulder, right shoulder, left elbow, right elbow, left wrist, right wrist, left hip, right hip, left knee, right knee, left ankle and right ankle.
Specifically, human body posture extraction is performed in turn on the video frames of the collected video stream. If a human body is present in a video frame, posture estimation is performed on it to obtain the coordinates of its 17 human body key points; each coordinate consists of two-dimensional x and y data, so the data obtained from each video frame containing a human body has dimensionality [1, 34]. If no human body is detected in a video frame, detection continues with the next frame. The obtained human body key point coordinates take the form [x1, y1, x2, y2, ..., x17, y17]. This operation is performed on each video frame in turn to obtain the human body key point data of every video frame containing a human body.
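The per-frame serialization described above can be sketched as follows (a minimal sketch; detect_person and estimate_pose are hypothetical stand-ins for the target detection algorithm and the human body posture tracking algorithm of the embodiment):

```python
def serialize_keypoints(keypoints):
    """Flatten 17 (x, y) key points into [x1, y1, ..., x17, y17] (dimension 34)."""
    flat = []
    for x, y in keypoints:
        flat.extend([float(x), float(y)])
    return flat

def extract_sequence(frames, detect_person, estimate_pose):
    """Collect one 34-dimensional vector per frame in which a person is detected.

    detect_person and estimate_pose are hypothetical callables standing in for
    the target-detection and pose-tracking algorithms; frames without a human
    body are skipped, matching the behaviour described in the text.
    """
    sequence = []
    for frame in frames:
        if detect_person(frame):
            keypoints = estimate_pose(frame)  # 17 (x, y) coordinate pairs
            sequence.append(serialize_keypoints(keypoints))
    return sequence
```

The resulting list of 34-dimensional vectors is the initial coordinate sequence that is later centred and fed to the classification network.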
Preferably, the human body posture tracking algorithm of this embodiment adopts an improved BlazeDark algorithm, which mainly comprises a heatmap encoding module, a feature extraction module, an upsampling module, a heatmap decoding module and an output module, as shown in fig. 3, a schematic diagram of posture recognition using the improved BlazeDark algorithm of this embodiment. The upsampling module consists of three upsampling layers and upsamples the feature map generated by the feature extraction module; skip connections bring high-resolution features from the feature extraction module into the upsampling path to restore the resolution of the picture and better represent the heatmap predicted by the network. The feature map output by the upsampling module finally has size 17 × 64 × 64; the training loss of the network is computed here and the parameters are updated by back propagation.
Further, during network training, for the pixel position corresponding to each human body key point in an input picture containing a human body, the heatmap encoding module generates a two-dimensional Gaussian distribution over the key point and its surrounding pixels, which is used to compute the network loss;
the way of generating heatmap by two-dimensional Gaussian distribution is as follows:
$$G(x,y)=\exp\left(-\frac{(x-u)^{2}+(y-v)^{2}}{2\sigma^{2}}\right)$$
where x, y denote the coordinates at which the heatmap is generated, u, v denote the real coordinates of the key point, and σ denotes a fixed spatial variance.
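The heatmap encoding step can be sketched as follows (the 64 × 64 map size matches the upsampling output of the embodiment; the default σ value is an assumption):

```python
import numpy as np

def gaussian_heatmap(u, v, size=64, sigma=2.0):
    """Two-dimensional Gaussian heatmap centred on the key point (u, v).

    Each pixel (x, y) receives exp(-((x-u)^2 + (y-v)^2) / (2 * sigma^2)),
    so the value peaks at 1.0 on the key point and decays around it.
    """
    ys, xs = np.mgrid[0:size, 0:size]
    return np.exp(-((xs - u) ** 2 + (ys - v) ** 2) / (2.0 * sigma ** 2))
```

One such map per key point (17 in total) serves as the regression target for the training loss.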
Furthermore, the feature extraction module is formed by stacking several MobileBottleneck convolution blocks and BlazeBlock convolution blocks equipped with a channel attention mechanism, and is used to extract deep semantic features from the input picture. It has relatively few parameters, its convolution operations are mainly implemented with depthwise separable convolutions, and it can run on a CPU at a relatively high speed. The input picture size is 3 × 256 × 256 and the output feature map size is 192 × 8 × 8;
further, the heatmap decoding module is specifically configured to obtain, for a heatmap result obtained by prediction, a position coordinate of a key point in a picture by using log-likelihood estimation and second-order taylor expansion, and restore the position coordinate to an original input picture, so as to obtain a prediction result of the model;
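The decoding step can be sketched numerically as follows (a sketch of DARK-style decoding under the assumption that central finite differences approximate the derivatives of the log-heatmap; the patent gives no explicit formula for this module):

```python
import numpy as np

def decode_heatmap(heatmap):
    """Sub-pixel key-point location via a second-order Taylor expansion
    of the log-heatmap around its maximum (offset = -H^{-1} * gradient)."""
    logh = np.log(np.maximum(heatmap, 1e-10))
    y, x = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    px, py = float(x), float(y)
    if 1 <= x < heatmap.shape[1] - 1 and 1 <= y < heatmap.shape[0] - 1:
        # first and second derivatives by central finite differences
        dx = 0.5 * (logh[y, x + 1] - logh[y, x - 1])
        dy = 0.5 * (logh[y + 1, x] - logh[y - 1, x])
        dxx = logh[y, x + 1] - 2.0 * logh[y, x] + logh[y, x - 1]
        dyy = logh[y + 1, x] - 2.0 * logh[y, x] + logh[y - 1, x]
        dxy = 0.25 * (logh[y + 1, x + 1] - logh[y + 1, x - 1]
                      - logh[y - 1, x + 1] + logh[y - 1, x - 1])
        det = dxx * dyy - dxy * dxy
        if det != 0.0:
            # apply the Newton-step offset -H^{-1} g to the integer maximum
            px += -(dyy * dx - dxy * dy) / det
            py += -(dxx * dy - dxy * dx) / det
    return px, py
```

For an ideal Gaussian heatmap the log is exactly quadratic, so this recovers the true sub-pixel centre; the decoded coordinate is then scaled back to the original input picture.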
further, the preset region and the preset number of the human body key point coordinates comprise two coordinates of a hip region, the coordinate sequence is subjected to centering processing, and the obtained target coordinate sequence comprises:
calculating the mean value of two coordinates of the hip area in the initial coordinate sequence as the central coordinate of the human body;
for each human body key point coordinate in the initial coordinate sequence, subtracting the coordinate value of the central coordinate of the human body from the coordinate value of the human body key point coordinate to obtain a corrected coordinate value;
and taking the sequence constructed by the corrected coordinate values as a target coordinate sequence.
It should be noted that the purpose of the centering processing is to eliminate the influence of the human body not being in the center of the video on the classification of the body-building actions in different videos, so as to improve the accuracy of the subsequent identification.
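The centering processing can be sketched as follows (the hip indices 11 and 12 are an assumption that follows the key-point enumeration given above: nose, eyes, ears, shoulders, elbows, wrists, hips, knees, ankles):

```python
def center_keypoints(coords, left_hip=11, right_hip=12):
    """Subtract the hip-centre (mean of the two hip coordinates) from
    every key point, so the body centre sits at the origin.

    coords: 17 (x, y) pairs; the zero-based hip indices 11 and 12 are an
    assumption based on the enumeration order given in the text.
    """
    cx = (coords[left_hip][0] + coords[right_hip][0]) / 2.0
    cy = (coords[left_hip][1] + coords[right_hip][1]) / 2.0
    return [(x - cx, y - cy) for x, y in coords]
```

After this step the target coordinate sequence no longer depends on where the person stands in the frame, which is exactly the invariance the centering is meant to provide.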
S202, inputting the human body key point data into a trained fitness action classification network for action recognition to obtain an action classification result, wherein the action classification result comprises at least one group of recognition action data and a category corresponding to the recognition action data, and the trained fitness action classification network is a stacked network combining a multilayer perceptron (MLP) and a long short-term memory (LSTM) network.
Further, before step S202, an initial fitness action classification network is trained with a sample set to obtain the trained fitness action classification network, where the sample set comprises a training set and a test set with a data ratio of 4:1.
In a specific optional embodiment, the sample set includes 1000 video segments of 10 types of fitness motions, the types of the fitness motions include common fitness motions such as deep squat, sit-up, push-up and the like, each type of the fitness motions has 100 video segments, each video segment includes one exerciser who repeatedly performs the fitness motions, and the duration of the video segments is tens of seconds.
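The stacked MLP/LSTM classifier can be sketched as a plain-NumPy forward pass (a toy single-layer sketch with random weights; only the 34-dimensional key-point input and the 10 action classes follow the embodiment, all other sizes and initialisations are assumptions):

```python
import numpy as np

def lstm_last_hidden(xs, Wx, Wh, b):
    """Run one LSTM layer over the sequence xs (T, D); return the final hidden state."""
    H = Wh.shape[0]
    h, c = np.zeros(H), np.zeros(H)
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    for x in xs:
        gates = x @ Wx + h @ Wh + b      # (4H,) pre-activations
        i, f, g, o = np.split(gates, 4)  # input, forget, cell, output gates
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h

def classify_sequence(seq, D=34, H=64, C=10, seed=0):
    """Toy MLP head on top of the LSTM's final hidden state; returns class probabilities."""
    rng = np.random.default_rng(seed)
    Wx = rng.normal(0.0, 0.1, (D, 4 * H))
    Wh = rng.normal(0.0, 0.1, (H, 4 * H))
    b = np.zeros(4 * H)
    W1, b1 = rng.normal(0.0, 0.1, (H, H)), np.zeros(H)
    W2, b2 = rng.normal(0.0, 0.1, (H, C)), np.zeros(C)
    h = lstm_last_hidden(np.asarray(seq), Wx, Wh, b)
    h2 = np.tanh(h @ W1 + b1)            # MLP hidden layer
    logits = h2 @ W2 + b2
    p = np.exp(logits - logits.max())    # softmax over the 10 action classes
    return p / p.sum()
```

In training, the weights would of course be learned from the labelled video segments rather than drawn at random; the sketch only shows how the per-frame 34-dimensional vectors flow through the stacked network.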
S203, a preset standard fitness action dictionary is obtained, and similarity calculation is carried out on the recognition action data and standard action data of corresponding categories in the standard fitness action dictionary according to each recognition action data to obtain a similarity value.
In this embodiment, standard human body key point coordinates are set for each category of fitness action to obtain standard action data, which are organized into a dictionary used as the preset standard fitness action dictionary; each set of standard action data in the dictionary corresponds to one category, and the standard human body key point coordinates are obtained by clustering the data in the sample set.
Optionally, the preset number is 17, and performing similarity calculation on the recognition action data and the standard action data of the corresponding category in the standard fitness action dictionary to obtain a similarity value comprises:

calculating the similarity value cos(θ) using the following formula:

$$\cos(\theta)=\frac{\sum_{i=1}^{17} x_i\,y_i}{\sqrt{\sum_{i=1}^{17} x_i^{2}}\;\sqrt{\sum_{i=1}^{17} y_i^{2}}}$$

wherein $x_i$ denotes the coordinates of the i-th human body key point of the standard action data, and $y_i$ denotes the coordinates of the i-th human body key point of the recognition action data.
S204, based on the similarity values corresponding to each group of identification motion data, performing fitness motion quality evaluation to obtain a quality evaluation result.
Specifically, the manner of determining the quality evaluation result according to the similarity value may be set according to actual needs, and the numerical ranges corresponding to the quality evaluation results of different grades may be set, which is not specifically limited herein.
Wherein the quality evaluation result includes but is not limited to: excellent, good, average, and abnormal.

Optionally, when the quality evaluation result is abnormal, early warning measures of forcibly pausing the playback and issuing a voice prompt are executed, so as to avoid bodily injury caused by non-standard actions.
Optionally, the quality evaluation of the fitness action is performed based on the similarity value corresponding to each group of identification action data, and obtaining a quality evaluation result includes:
normalizing the similarity value using the following formula to obtain a normalized score:

(normalization formula shown as an image in the original document)

wherein score is the normalized score; and

determining a quality evaluation result based on the normalized score.
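A sketch of mapping the normalized score to the grades mentioned above. The threshold values here are assumptions, since the patent leaves the numerical ranges to be set according to actual needs.

```python
def grade(score, thresholds=((0.95, "excellent"),
                             (0.85, "good"),
                             (0.70, "average"))):
    """Map a normalized similarity score to a quality grade.

    thresholds: (min_score, label) pairs in descending order; the
    specific cut-offs are illustrative, not taken from the patent.
    """
    for t, label in thresholds:
        if score >= t:
            return label
    return "abnormal"  # triggers the pause-and-voice-prompt warning
```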
In an optional implementation manner of this embodiment, after inputting the human body key point data into the trained fitness motion classification network for motion recognition to obtain a motion classification result, the fitness motion recognition method based on human body posture estimation further includes:
acquiring an action attention area corresponding to the category in the action classification result, and taking the action attention area as a target area, wherein a mapping relation is preset between the action attention area and the category;
aiming at each video frame corresponding to the category, calculating the joint angle of the target area according to the human body key point data in the video frame;
and generating a visual fluctuation curve of the target area according to the joint angle of each target area, wherein the visual fluctuation curve is used for displaying the fitness completion degree.
Optionally, the method further comprises: generating a standard fluctuation curve corresponding to the category; and issuing an early warning reminder when the difference between the generated visual fluctuation curve of the target area and the standard fluctuation curve is too large.

The degree of difference in shape can be measured through similarity, and the early warning reminder includes but is not limited to a voice prompt, pausing the playback, and the like.
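One possible way to quantify the difference between the measured joint-angle curve and the standard curve is a mean absolute difference after resampling both curves to a common length; the resampling approach and the 15-degree threshold below are assumptions, not from the patent.

```python
import numpy as np

def curve_difference(curve, standard_curve):
    """Mean absolute difference (degrees) between two joint-angle curves,
    resampled to a common length so clips of different durations compare."""
    n = min(len(curve), len(standard_curve))
    xs = np.linspace(0.0, 1.0, n)
    a = np.interp(xs, np.linspace(0.0, 1.0, len(curve)), curve)
    b = np.interp(xs, np.linspace(0.0, 1.0, len(standard_curve)), standard_curve)
    return float(np.mean(np.abs(a - b)))

def needs_warning(curve, standard_curve, threshold=15.0):
    # The 15-degree threshold is an illustrative assumption.
    return curve_difference(curve, standard_curve) > threshold
```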
Further, the category is a deep squat action category, and calculating the joint angle of the target area according to the human body key point data in the video frame comprises:
identifying a knee joint area in a video frame as a target area;
taking the key points of the human body in the target area range as joint points, and calculating the joint angle by adopting the following formula:
$$\mathrm{Angle}=\arccos\!\left(\frac{\overrightarrow{P_2P_1}\cdot\overrightarrow{P_2P_3}}{\left|\overrightarrow{P_2P_1}\right|\,\left|\overrightarrow{P_2P_3}\right|}\right)$$

wherein Angle is the joint angle, $\overrightarrow{P_2P_1}$ denotes the vector from joint point $P_2$ to joint point $P_1$, and $\overrightarrow{P_2P_3}$ denotes the vector from joint point $P_2$ to joint point $P_3$.
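The joint-angle computation can be implemented directly; this is a sketch, with `np.clip` guarding `arccos` against floating-point drift just outside [-1, 1].

```python
import numpy as np

def joint_angle(p1, p2, p3):
    """Angle in degrees at joint p2, formed by vectors p2->p1 and p2->p3."""
    v1 = np.asarray(p1, dtype=float) - np.asarray(p2, dtype=float)
    v2 = np.asarray(p3, dtype=float) - np.asarray(p2, dtype=float)
    cos_a = v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2))
    # Clip before arccos to avoid NaN from rounding error.
    return float(np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0))))
```

For a deep squat, p1, p2, p3 would be the hip, knee, and ankle key points, so the curve of this angle over the video frames traces each repetition.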
In this embodiment, a video stream is acquired and human body key point data is extracted from it; the human body key point data is then input into the trained fitness action classification network for action recognition to obtain an action classification result, wherein the action classification result comprises at least one group of recognition action data and the category corresponding to each group, and the trained fitness action classification network is a stacked neural network combining a multilayer perceptron (MLP) and a long short-term memory network (LSTM). A preset standard fitness action dictionary is acquired, and for each group of recognition action data, similarity calculation is performed between the recognition action data and the standard action data of the corresponding category in the standard fitness action dictionary to obtain a similarity value. Fitness action quality evaluation is then performed based on the similarity value corresponding to each group of recognition action data to obtain a quality evaluation result, so that real-time fitness actions are quickly compared with standard actions, improving the accuracy of recognizing whether fitness actions are standard.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
Fig. 4 is a schematic block diagram of a body posture estimation-based fitness motion recognition device in one-to-one correspondence with the body posture estimation-based fitness motion recognition method according to the above-described embodiment. As shown in fig. 4, the fitness action recognition device based on human posture estimation comprises a data extraction module 31, a recognition classification module 32, a similarity calculation module 33 and a quality evaluation module 34. The functional modules are explained in detail as follows:
the data extraction module 31 is configured to obtain a video stream, and extract human body key point data based on the video stream;
the recognition and classification module 32 is configured to input the human body key point data into a trained fitness action classification network for action recognition to obtain an action classification result, wherein the action classification result comprises at least one group of recognition action data and the category corresponding to each group of recognition action data, and the trained fitness action classification network is a stacked neural network combining a multilayer perceptron (MLP) and a long short-term memory network (LSTM);
the similarity calculation module 33 is configured to obtain a preset standard fitness action dictionary, and perform similarity calculation on the recognition action data and standard action data of a corresponding category in the standard fitness action dictionary for each recognition action data to obtain a similarity value;
and the quality evaluation module 34 is configured to perform fitness action quality evaluation based on the similarity value corresponding to each group of identification action data to obtain a quality evaluation result.
Optionally, the data extraction module 31 includes:
the target detection unit is used for sequentially carrying out human body detection on video frames in the video stream by adopting a target detection algorithm;
the initial sequence generation unit is configured to, for each video frame, if a human body is detected, perform posture estimation on the detected human body using a human body posture tracking algorithm to obtain human body key point coordinates of preset regions and in a preset number, and serialize the preset number of human body key point coordinates to obtain an initial coordinate sequence;
and the target sequence determining unit is used for carrying out centering processing on the initial coordinate sequence to obtain a target coordinate sequence, and taking the target coordinate sequence as the key point data of the human body.
Optionally, the target sequence determination unit comprises:
a central coordinate calculating subunit, configured to calculate an average value of two coordinates of the hip region in the initial coordinate sequence, as a central coordinate of the human body;
the coordinate value correction subunit is used for subtracting the coordinate value of the central coordinate of the human body from the coordinate value of the human body key point coordinate to obtain a corrected coordinate value aiming at each human body key point coordinate in the initial coordinate sequence;
and the target sequence construction subunit is used for taking the sequence constructed by the corrected coordinate values as a target coordinate sequence.
Optionally, the similarity calculation module 33 is specifically configured to calculate the similarity value cos(θ) using the following formula:

$$\cos(\theta)=\frac{\sum_{i=1}^{17} x_i\,y_i}{\sqrt{\sum_{i=1}^{17} x_i^{2}}\;\sqrt{\sum_{i=1}^{17} y_i^{2}}}$$

wherein $x_i$ denotes the coordinates of the i-th human body key point of the standard action data, and $y_i$ denotes the coordinates of the i-th human body key point of the recognition action data.
Optionally, the quality evaluation module 34 is specifically configured to normalize the similarity value using the following formula to obtain a normalized score:

(normalization formula shown as an image in the original document)

wherein score is the normalized score; and to determine the quality evaluation result based on the normalized score.
Optionally, the body-building motion recognition device based on human posture estimation further includes:
the target area acquisition module is used for acquiring an action attention area corresponding to the category in the action classification result and taking the action attention area as a target area, and a mapping relation is preset between the action attention area and the category;
the angle calculation module is used for calculating the joint angle of the target area according to the human body key point data in the video frames aiming at each video frame corresponding to the category;
and the fluctuation visualization module is used for generating a visualization fluctuation curve of the target area according to the joint angle of each target area, and the visualization fluctuation curve is used for displaying the fitness completion degree.
Optionally, the category is the deep squat action category, and the angle calculation module for calculating the joint angle of the target area according to the human body key point data in the video frame comprises:
the target area positioning subunit is used for identifying a knee joint area in the video frame as a target area;
and the joint angle calculating subunit is used for calculating the joint angle by taking the human body key points in the target area range as joint points and adopting the following formula:
$$\mathrm{Angle}=\arccos\!\left(\frac{\overrightarrow{P_2P_1}\cdot\overrightarrow{P_2P_3}}{\left|\overrightarrow{P_2P_1}\right|\,\left|\overrightarrow{P_2P_3}\right|}\right)$$

wherein Angle is the joint angle, $\overrightarrow{P_2P_1}$ denotes the vector from joint point $P_2$ to joint point $P_1$, and $\overrightarrow{P_2P_3}$ denotes the vector from joint point $P_2$ to joint point $P_3$.
For specific limitations of the fitness motion recognition device based on the human body posture estimation, reference may be made to the above limitations of the fitness motion recognition method based on the human body posture estimation, and details are not repeated here. The modules in the body-building action recognition device based on human body posture estimation can be wholly or partially realized by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In order to solve the technical problem, the embodiment of the application further provides computer equipment. Referring to fig. 5, fig. 5 is a block diagram of a basic structure of a computer device according to the present embodiment.
The computer device 4 comprises a memory 41, a processor 42, and a network interface 43, which are communicatively connected to each other via a system bus. It is noted that only the computer device 4 with the memory 41, the processor 42, and the network interface 43 is shown, but it should be understood that not all of the shown components are required to be implemented, and more or fewer components may be implemented instead. As will be understood by those skilled in the art, the computer device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, a microprocessor, an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.
The computer device can be a desktop computer, a notebook, a palm computer, a cloud server and other computing devices. The computer equipment can carry out man-machine interaction with a user through a keyboard, a mouse, a remote controller, a touch panel or voice control equipment and the like.
The memory 41 includes at least one type of readable storage medium, including a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read-Only Memory (ROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Programmable Read-Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the memory 41 may be an internal storage unit of the computer device 4, such as a hard disk or memory of the computer device 4. In other embodiments, the memory 41 may also be an external storage device of the computer device 4, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card provided on the computer device 4. Of course, the memory 41 may also include both internal and external storage devices of the computer device 4. In this embodiment, the memory 41 is generally used for storing the operating system installed in the computer device 4 and various types of application software, such as the program code for controlling electronic files. Further, the memory 41 may also be used to temporarily store various types of data that have been output or are to be output.
The processor 42 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data Processing chip in some embodiments. The processor 42 is typically used to control the overall operation of the computer device 4. In this embodiment, the processor 42 is configured to execute the program code stored in the memory 41 or process data, such as executing the program code for exercise motion recognition based on body posture estimation.
The network interface 43 may comprise a wireless network interface or a wired network interface, and the network interface 43 is generally used for establishing communication connection between the computer device 4 and other electronic devices.
The present application further provides another embodiment, namely a computer-readable storage medium storing a computer program, the computer program being executable by at least one processor to cause the at least one processor to execute the steps of the body-building action recognition method based on human body posture estimation as described above.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present application.
It should be understood that the above-described embodiments are merely exemplary of some, and not all, embodiments of the present application, and that the drawings illustrate preferred embodiments of the present application without limiting the scope of the claims appended hereto. This application is capable of embodiments in many different forms and is provided for the purpose of enabling a thorough understanding of the disclosure of the application. Although the present application has been described in detail with reference to the foregoing embodiments, it will be apparent to one skilled in the art that the present application may be practiced without modification or with equivalents of some of the features described in the foregoing embodiments. All equivalent structures made by using the contents of the specification and the drawings of the present application are directly or indirectly applied to other related technical fields and are within the protection scope of the present application.

Claims (10)

1. A body-building action recognition method based on human body posture estimation is characterized by comprising the following steps:
acquiring a video stream, and extracting human body key point data based on the video stream;
inputting the human body key point data into a trained body-building action classification network for action recognition to obtain an action classification result, wherein the action classification result comprises at least one group of recognition action data and the category corresponding to each group of recognition action data, and the trained body-building action classification network is a stacked neural network combining a multilayer perceptron (MLP) and a long short-term memory network (LSTM);
acquiring a preset standard fitness action dictionary, and calculating the similarity of the recognition action data and standard action data of corresponding categories in the standard fitness action dictionary aiming at each recognition action data to obtain a similarity value;
and performing fitness action quality evaluation based on the similarity value corresponding to each group of identification action data to obtain a quality evaluation result.
2. The body-building motion recognition method based on human posture estimation as claimed in claim 1, wherein the extracting human key point data based on the video stream comprises:
sequentially carrying out human body detection on the video frames in the video stream by adopting a target detection algorithm;
aiming at each video frame, if a human body is detected, adopting a human body posture tracking algorithm to carry out posture estimation on the detected human body to obtain preset regions and preset number of human body key point coordinates, and carrying out serialization processing on the preset number of human body key point coordinates to obtain an initial coordinate sequence;
and carrying out centering processing on the initial coordinate sequence to obtain a target coordinate sequence, and taking the target coordinate sequence as the key point data of the human body.
3. The body-building action recognition method based on human body posture estimation according to claim 2, wherein the preset regions and the preset number of human body key point coordinates comprise two coordinates of a hip region, and the centering processing of the initial coordinate sequence to obtain the target coordinate sequence comprises:
calculating the mean value of two coordinates of the hip area in the initial coordinate sequence, and taking the mean value as the central coordinate of the human body;
for each human body key point coordinate in the initial coordinate sequence, subtracting the coordinate value of the central coordinate of the human body from the coordinate value of the human body key point coordinate to obtain a corrected coordinate value;
and taking the sequence constructed by the corrected coordinate values as the target coordinate sequence.
4. A body-building motion recognition method based on human posture estimation as claimed in claim 1, wherein the preset number is 17, and the calculating of similarity between the recognition motion data and the standard motion data of the corresponding category in the standard body-building motion dictionary to obtain the similarity value comprises:
the similarity value cos(θ) is calculated using the following formula:

$$\cos(\theta)=\frac{\sum_{i=1}^{17} x_i\,y_i}{\sqrt{\sum_{i=1}^{17} x_i^{2}}\;\sqrt{\sum_{i=1}^{17} y_i^{2}}}$$

wherein $x_i$ denotes the coordinates of the i-th human body key point of the standard action data, and $y_i$ denotes the coordinates of the i-th human body key point of the recognition action data.
5. The body-building motion recognition method based on human posture estimation as claimed in claim 4, wherein the performing of the body-building motion quality evaluation based on the similarity value corresponding to each group of the recognition motion data to obtain the quality evaluation result comprises:
normalizing the similarity value using the following formula to obtain a normalized score:

(normalization formula shown as an image in the original document)
wherein score is a normalized score;
determining a quality assessment result based on the normalized score.
6. The method for recognizing body building motion based on human body posture estimation as claimed in any one of claims 1 to 5, wherein after inputting the human body key point data into the trained body building motion classification network for motion recognition to obtain a motion classification result, the method for recognizing body building motion based on human body posture estimation further comprises:
acquiring an action attention area corresponding to the category in the action classification result, and taking the action attention area as a target area, wherein a mapping relation is preset between the action attention area and the category;
aiming at each video frame corresponding to the category, calculating the joint angle of the target area according to the human body key point data in the video frame;
and generating a visual fluctuation curve of the target area according to the joint angle of each target area, wherein the visual fluctuation curve is used for displaying the fitness completion degree.
7. The body-building action recognition method based on human body posture estimation according to claim 6, wherein the category is the deep squat action category, and the calculating the joint angle of the target area according to the human body key point data in the video frame comprises:
identifying a knee joint area in a video frame as a target area;
taking the key points of the human body in the target area range as joint points, and calculating the joint angle by adopting the following formula:
$$\mathrm{Angle}=\arccos\!\left(\frac{\overrightarrow{P_2P_1}\cdot\overrightarrow{P_2P_3}}{\left|\overrightarrow{P_2P_1}\right|\,\left|\overrightarrow{P_2P_3}\right|}\right)$$

wherein Angle is the joint angle, $\overrightarrow{P_2P_1}$ denotes the vector from joint point $P_2$ to joint point $P_1$, and $\overrightarrow{P_2P_3}$ denotes the vector from joint point $P_2$ to joint point $P_3$.
8. A body-building action recognition device based on human posture estimation is characterized by comprising:
the data extraction module is used for acquiring video streams and extracting human body key point data based on the video streams;
the recognition and classification module is configured to input the human body key point data into a trained fitness action classification network for action recognition to obtain an action classification result, wherein the action classification result comprises at least one group of recognition action data and the category corresponding to each group of recognition action data, and the trained fitness action classification network is a stacked neural network combining a multilayer perceptron (MLP) and a long short-term memory network (LSTM);
the similarity calculation module is used for acquiring a preset standard fitness action dictionary, and for each recognition action data, performing similarity calculation on the recognition action data and standard action data of a corresponding category in the standard fitness action dictionary to obtain a similarity value;
and the quality evaluation module is used for carrying out fitness action quality evaluation on the basis of the similarity value corresponding to each group of identification action data to obtain a quality evaluation result.
9. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor when executing the computer program implements the body posture estimation based fitness action recognition method according to any one of claims 1 to 7.
10. A computer-readable storage medium, in which a computer program is stored, which, when being executed by a processor, implements the method for body-building motion recognition based on body posture estimation according to any one of claims 1 to 7.
CN202211531164.8A 2022-12-01 2022-12-01 Body-building action recognition method and device based on human body posture estimation and related equipment Pending CN115880774A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211531164.8A CN115880774A (en) 2022-12-01 2022-12-01 Body-building action recognition method and device based on human body posture estimation and related equipment

Publications (1)

Publication Number Publication Date
CN115880774A true CN115880774A (en) 2023-03-31

Family

ID=85765324

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211531164.8A Pending CN115880774A (en) 2022-12-01 2022-12-01 Body-building action recognition method and device based on human body posture estimation and related equipment

Country Status (1)

Country Link
CN (1) CN115880774A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110544301A (en) * 2019-09-06 2019-12-06 广东工业大学 Three-dimensional human body action reconstruction system, method and action training system
CN112990011A (en) * 2021-03-15 2021-06-18 上海工程技术大学 Body-building action recognition and evaluation method based on machine vision and deep learning
CN113197572A (en) * 2021-05-08 2021-08-03 解辉 Human body work correction system based on vision
CN113762133A (en) * 2021-09-01 2021-12-07 哈尔滨工业大学(威海) Self-weight fitness auxiliary coaching system, method and terminal based on human body posture recognition



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination