CN115880774A - Body-building action recognition method and device based on human body posture estimation and related equipment - Google Patents
- Publication number
- CN115880774A (application number CN202211531164.8A)
- Authority
- CN
- China
- Prior art keywords
- action
- human body
- data
- recognition
- key point
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Abstract
The invention discloses a body-building action recognition method and device based on human body posture estimation, and related equipment. The body-building action recognition method comprises the following steps: obtaining a video stream and extracting human body key point data from the video stream; inputting the human body key point data into a trained fitness action classification network for action recognition to obtain an action classification result, wherein the trained fitness action classification network is a stacked network of a multilayer perceptron (MLP) and a long short-term memory network (LSTM); acquiring a preset standard fitness action dictionary and, for each piece of recognized action data, calculating the similarity between the recognized action data and the standard action data of the corresponding category in the standard fitness action dictionary to obtain a similarity value; and performing fitness action quality evaluation based on the similarity value corresponding to each group of recognized action data to obtain a quality evaluation result. The method improves the accuracy of recognizing whether fitness actions are standard.
Description
Technical Field
The invention relates to the field of data processing, in particular to a body-building action recognition method and device based on human body posture estimation and related equipment.
Background
With the improvement of living standards, people's health consciousness has grown stronger, and more and more people use their leisure time to exercise at home. Home fitness requires only simple sports equipment, is not limited by time or place, and has the advantage of a low participation threshold, so more and more people take part in activities such as "cloud fitness" and live-streamed fitness classes through portable devices such as mobile phones and tablet computers.
Although the participation threshold of home online fitness is low and such exercise helps maintain physical and mental health, incorrect fitness actions can cause muscle strains, joint dislocations and other sports injuries, and may cause irreversible injury to the participant. Moreover, home online fitness lacks the guidance of a professional fitness coach, so an exerciser may perform actions incorrectly or non-standardly, or complete them poorly, leading to ineffective exercise. In existing approaches, fitness activities are mainly carried out through devices such as smartphones, tablet computers or smart TVs, which cannot monitor whether fitness actions are standard, so the above exercise risks may arise to a certain extent. Therefore, how to ensure that fitness actions are standard and that exercise is safe has become an urgent problem to be solved.
Disclosure of Invention
The embodiments of the invention provide a body-building action recognition method and device based on human body posture estimation, and related equipment, so as to improve the accuracy of recognizing whether body-building actions are standard.
In order to solve the above technical problem, an embodiment of the present application provides a body-building action recognition method based on human body posture estimation, including:
acquiring a video stream, and extracting human body key point data based on the video stream;
inputting the human body key point data into a trained body-building action classification network for action recognition to obtain an action classification result, wherein the action classification result comprises at least one group of recognized action data and a category corresponding to the recognized action data, and the trained body-building action classification network is a stacked network of a multilayer perceptron (MLP) and a long short-term memory network (LSTM);
acquiring a preset standard fitness action dictionary, and calculating the similarity of the recognition action data and standard action data of corresponding categories in the standard fitness action dictionary aiming at each recognition action data to obtain a similarity value;
and performing fitness action quality evaluation based on the similarity value corresponding to each group of recognized action data to obtain a quality evaluation result.
Optionally, the extracting human key point data based on the video stream includes:
sequentially carrying out human body detection on the video frames in the video stream by adopting a target detection algorithm;
aiming at each video frame, if a human body is detected, adopting a human body posture tracking algorithm to carry out posture estimation on the detected human body to obtain preset regions and preset number of human body key point coordinates, and carrying out serialization processing on the preset number of human body key point coordinates to obtain an initial coordinate sequence;
and carrying out centering processing on the initial coordinate sequence to obtain a target coordinate sequence, and taking the target coordinate sequence as the key point data of the human body.
Optionally, the preset regions and preset number of human body key point coordinates include two coordinates of the hip region, and the centering processing on the initial coordinate sequence to obtain the target coordinate sequence includes:
calculating the mean value of two coordinates of the hip region in the initial coordinate sequence to be used as the central coordinate of the human body;
for each human body key point coordinate in the initial coordinate sequence, subtracting the coordinate value of the central coordinate of the human body from the coordinate value of the human body key point coordinate to obtain a corrected coordinate value;
and taking the sequence constructed by the corrected coordinate values as the target coordinate sequence.
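Purely as an illustration, the centering steps above can be sketched as follows; the function name, the hip key point indices (assumed here to follow the common COCO 17-point ordering) and the array layout are assumptions, not taken from the patent:

```python
import numpy as np

# Indices of the two hip key points in a 17-point layout
# (assumed COCO ordering: 11 = left hip, 12 = right hip).
LEFT_HIP, RIGHT_HIP = 11, 12

def center_keypoints(coords: np.ndarray) -> np.ndarray:
    """Center a (17, 2) array of key point coordinates on the hip midpoint.

    The mean of the two hip coordinates serves as the human body center;
    subtracting it from every key point yields the corrected coordinates.
    """
    body_center = (coords[LEFT_HIP] + coords[RIGHT_HIP]) / 2.0
    return coords - body_center
```

After centering, the hip midpoint sits at the origin, which removes the influence of the person's position in the frame on later classification.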
Optionally, the calculating the similarity between the recognized motion data and the standard motion data of the corresponding category in the standard fitness motion dictionary to obtain a similarity value includes:
the similarity value cos (θ) is calculated using the following formula:
wherein, the first and the second end of the pipe are connected with each other,coordinates of the ith human body key point of the standard motion data, device for selecting or keeping>Coordinates of the ith personal key point of the identified motion data.
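For illustration, the similarity calculation between a standard and a recognized key point vector can be sketched as follows (the function name and the use of flattened NumPy vectors are assumptions):

```python
import numpy as np

def cosine_similarity(standard: np.ndarray, recognized: np.ndarray) -> float:
    """Cosine similarity between two flattened key point coordinate vectors."""
    s = standard.ravel().astype(float)
    r = recognized.ravel().astype(float)
    return float(np.dot(s, r) / (np.linalg.norm(s) * np.linalg.norm(r)))
```

Identical poses give a value near 1, while unrelated poses give a value near 0.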
Optionally, the performing of fitness action quality evaluation based on the similarity value corresponding to each group of recognized action data to obtain a quality evaluation result includes:
normalizing the similarity value to obtain a normalized score, denoted score;
determining a quality assessment result based on the normalized score.
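Since the normalization formula is not reproduced at this point, the following sketch shows one common choice, an affine mapping of the cosine similarity from [-1, 1] to [0, 1]; it is illustrative only and may differ from the patent's actual formula:

```python
def normalize_similarity(cos_theta: float) -> float:
    """Map a cosine similarity in [-1, 1] to a score in [0, 1].

    This affine mapping is an assumption shown for illustration;
    the patent does not specify its normalization formula here.
    """
    return (cos_theta + 1.0) / 2.0
```

The quality evaluation result can then be determined by thresholding or grading the normalized score.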
Optionally, after the human body key point data is input into a trained fitness motion classification network for motion recognition to obtain a motion classification result, the fitness motion recognition method based on human body posture estimation further includes:
acquiring an action attention area corresponding to the category in the action classification result, and taking the action attention area as a target area, wherein a mapping relation is preset between the action attention area and the category;
aiming at each video frame corresponding to the category, calculating the joint angle of the target area according to the human body key point data in the video frame;
and generating a visual fluctuation curve of the target area according to the joint angle of each target area, wherein the visual fluctuation curve is used for displaying the fitness completion degree.
Optionally, the category is a squat action category, and the calculating the joint angle of the target area according to the human body key point data in the video frame includes:
identifying a knee joint area in a video frame as a target area;
taking the human body key points in the target area as joint points, and calculating the joint angle using the following formula:

Angle = arccos( (P₂P₁ · P₂P₃) / ( |P₂P₁| · |P₂P₃| ) )

where Angle is the joint angle, P₂P₁ denotes the vector from joint point P₂ to joint point P₁, and P₂P₃ denotes the vector from joint point P₂ to joint point P₃.
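The joint angle at joint point P2 can be sketched as follows (the function and variable names are assumptions; the angle is returned in degrees for readability):

```python
import math

def joint_angle(p1, p2, p3) -> float:
    """Angle at joint p2 formed by the vectors p2->p1 and p2->p3, in degrees."""
    v1 = (p1[0] - p2[0], p1[1] - p2[1])  # vector P2P1
    v2 = (p3[0] - p2[0], p3[1] - p2[1])  # vector P2P3
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    return math.degrees(math.acos(dot / (n1 * n2)))
```

For a squat, for example, p1, p2 and p3 could be the hip, knee and ankle key points, so the returned value is the knee joint angle.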
In order to solve the above technical problem, an embodiment of the present application further provides a body-building action recognition device based on human body posture estimation, including:
the data extraction module is used for acquiring video streams and extracting human body key point data based on the video streams;
the recognition and classification module is used for inputting the human body key point data into a trained fitness action classification network for action recognition to obtain an action classification result, wherein the action classification result comprises at least one group of recognized action data and a category corresponding to the recognized action data, and the trained fitness action classification network is a stacked network of a multilayer perceptron (MLP) and a long short-term memory network (LSTM);
the similarity calculation module is used for acquiring a preset standard fitness action dictionary, and for each recognition action data, performing similarity calculation on the recognition action data and standard action data of a corresponding category in the standard fitness action dictionary to obtain a similarity value;
and the quality evaluation module is used for carrying out fitness action quality evaluation on the basis of the similarity value corresponding to each group of identification action data to obtain a quality evaluation result.
Optionally, the data extraction module includes:
the target detection unit is used for sequentially carrying out human body detection on the video frames in the video stream by adopting a target detection algorithm;
the initial sequence generation unit is used for carrying out posture estimation on the detected human body by adopting a human body posture tracking algorithm if the human body is detected aiming at each video frame to obtain the coordinates of the human body key points in a preset area and a preset number, and carrying out serialization processing on the coordinates of the human body key points in the preset number to obtain an initial coordinate sequence;
and the target sequence determining unit is used for carrying out centering processing on the initial coordinate sequence to obtain a target coordinate sequence, and taking the target coordinate sequence as the human body key point data.
Optionally, the target sequence determination unit includes:
a central coordinate calculating subunit, configured to calculate an average value of two coordinates of the hip region in the initial coordinate sequence, as a central coordinate of the human body;
the coordinate value correction subunit is configured to subtract, for each human body key point coordinate in the initial coordinate sequence, a coordinate value of the center coordinate of the human body from a coordinate value of the human body key point coordinate to obtain a corrected coordinate value;
and the target sequence construction subunit is used for taking the sequence constructed by the corrected coordinate values as the target coordinate sequence.
Optionally, the similarity calculation module is specifically configured to calculate the similarity value cos(θ) using the following formula:

cos(θ) = Σᵢ (xᵢ · yᵢ) / ( √(Σᵢ xᵢ²) · √(Σᵢ yᵢ²) )

where xᵢ denotes the coordinates of the i-th human body key point of the standard action data and yᵢ denotes the coordinates of the i-th human body key point of the recognized action data.
Optionally, the quality evaluation module is specifically configured to:

normalize the similarity value to obtain a normalized score, denoted score; and

determine a quality evaluation result based on the normalized score.
Optionally, the fitness action recognition device based on human body posture estimation further includes:
a target area obtaining module, configured to obtain an action attention area corresponding to a category in the action classification result, where the action attention area is used as a target area, and a mapping relationship is preset between the action attention area and the category;
the angle calculation module is used for calculating the joint angle of the target area according to the human body key point data in the video frames aiming at each video frame corresponding to the category;
and the fluctuation visualization module is used for generating a visualization fluctuation curve of the target area according to the joint angle of each target area, and the visualization fluctuation curve is used for displaying the fitness completion degree.
Optionally, the category is a squat action category, and the calculating of the joint angle of the target area according to the human body key point data in the video frame includes:
the target area positioning subunit is used for identifying a knee joint area in the video frame as a target area;
and the joint angle calculation subunit is used for taking the human body key points in the target area as joint points and calculating the joint angle using the following formula:

Angle = arccos( (P₂P₁ · P₂P₃) / ( |P₂P₁| · |P₂P₃| ) )

where Angle is the joint angle, P₂P₁ denotes the vector from joint point P₂ to joint point P₁, and P₂P₃ denotes the vector from joint point P₂ to joint point P₃.
In order to solve the technical problem, an embodiment of the present application further provides a computer device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of the fitness action recognition method based on human body posture estimation when executing the computer program.
In order to solve the above technical problem, an embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored, and the computer program, when executed by a processor, implements the steps of the above fitness action recognition method based on human body posture estimation.
According to the body-building action recognition method and device, computer equipment and storage medium based on human body posture estimation provided by the embodiments of the invention, a video stream is obtained and human body key point data are extracted from it; the human body key point data are then input into a trained body-building action classification network for action recognition to obtain an action classification result, wherein the action classification result comprises at least one group of recognized action data and a category corresponding to the recognized action data, and the trained body-building action classification network is a stacked network of a multilayer perceptron (MLP) and a long short-term memory network (LSTM). A preset standard fitness action dictionary is acquired and, for each piece of recognized action data, the similarity between the recognized action data and the standard action data of the corresponding category in the standard fitness action dictionary is calculated to obtain a similarity value. Fitness action quality evaluation is then performed based on the similarity value corresponding to each group of recognized action data to obtain a quality evaluation result. In this way, real-time fitness actions are quickly compared with the standard actions, and the accuracy of recognizing whether fitness actions are standard is improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments of the present invention will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without inventive labor.
FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;
FIG. 2 is a flow diagram of one embodiment of a method for body-building motion recognition based on body pose estimation according to the present application;
FIG. 3 is a schematic diagram of the present application employing a modified BlazeDark algorithm for gesture recognition;
FIG. 4 is a schematic diagram of an embodiment of a fitness activity recognition device based on human body posture estimation according to the present application;
FIG. 5 is a schematic block diagram of one embodiment of a computer device according to the present application.
Detailed Description
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used in the description of the application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "including" and "having," and any variations thereof, in the description and claims of this application and the description of the above figures are intended to cover non-exclusive inclusions. The terms "first," "second," and the like in the description and claims of this application or in the above-described drawings are used for distinguishing between different objects and not for describing a particular order.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein may be combined with other embodiments.
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, as shown in fig. 1, a system architecture 100 may include terminal devices 101, 102, 103, a network 104 and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like.
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop portable computers, desktop computers, and the like.
The server 105 may be a server providing various services, such as a background server providing support for pages displayed on the terminal devices 101, 102, 103.
It should be noted that the body-building action recognition method based on human body posture estimation provided by the embodiment of the present application is executed by a server, and accordingly, a body-building action recognition device based on human body posture estimation is disposed in the server.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. Any number of terminal devices, networks and servers may be provided according to implementation needs, and the terminal devices 101, 102 and 103 in this embodiment may specifically correspond to an application system in actual production.
Referring to fig. 2, fig. 2 shows a method for recognizing a fitness action based on human body posture estimation according to an embodiment of the present invention, which is described by taking the application of the method to the server in fig. 1 as an example, and is detailed as follows:
s201: and acquiring a video stream, and extracting human body key point data based on the video stream.
When the video stream is a real-time video stream, video frames are extracted at a preset time interval to reduce the data volume while ensuring that fitness actions are recognized in time; the preset time interval can be set according to actual needs, for example, 0.5 seconds.
The human body key point data is data corresponding to preset key points in each region of the human body, and in this embodiment, the data may be specifically coordinate data.
It should be understood that the video stream contains a plurality of video frames, and a set of human body key point data is extracted from each video frame.
In a specific optional implementation, the extracting of the human body key point data based on the video stream comprises:
sequentially detecting human bodies of video frames in the video stream by adopting a target detection algorithm;
aiming at each video frame, if a human body is detected, a human body posture tracking algorithm is adopted to carry out posture estimation on the detected human body to obtain preset regions and preset number of human body key point coordinates, and the preset number of human body key point coordinates are serialized to obtain an initial coordinate sequence;
and carrying out centering processing on the initial coordinate sequence to obtain a target coordinate sequence, and taking the target coordinate sequence as the data of the key points of the human body.
The preset regions refer to the preset human body regions from which key point coordinates are extracted, and the preset number refers to the number of key points extracted; both can be set according to actual conditions. As a preferred mode, in this embodiment, the preset number is 17, and the preset regions include the nose, left eye, right eye, left ear, right ear, left shoulder, right shoulder, left elbow, right elbow, left wrist, right wrist, left hip, right hip, left knee, right knee, left ankle and right ankle.
Specifically, human body posture extraction is performed sequentially on the video frames in the collected video stream. If a human body exists in a video frame, posture estimation is performed on it to obtain the coordinates of 17 human body key points, each coordinate comprising two-dimensional x and y data, so the data obtained from each video frame containing a human body has dimension [1, 34]. If no human body is detected in a video frame, detection continues with the next frame. The obtained human body key point coordinates take the form [x₁, y₁, x₂, y₂, …, x₁₇, y₁₇]. This operation is performed on each video frame in turn to obtain the human body key point data of every video frame containing a human body.
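The serialization of 17 (x, y) key point pairs into a length-34 coordinate sequence can be sketched as follows (the function name is an assumption):

```python
def serialize_keypoints(keypoints):
    """Flatten 17 (x, y) key point pairs into [x1, y1, ..., x17, y17]."""
    return [c for (x, y) in keypoints for c in (x, y)]
```

One such length-34 sequence is produced per video frame that contains a human body, giving per-frame data of dimension [1, 34].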
Preferably, the human body posture tracking algorithm of this embodiment adopts an improved BlazeDark algorithm, which mainly comprises a heatmap coding module, a feature extraction module, an upsampling module, a heatmap decoding module and an output module, as shown in fig. 3, a schematic diagram of posture recognition performed by the improved BlazeDark algorithm of this embodiment. The upsampling module is composed of three upsampling layers and is configured to upsample the feature map generated by the feature extraction module, using skip connections to the high-resolution features of the feature extraction module to restore the resolution of the picture and better represent the heatmap predicted by the network. The feature map output by the upsampling module finally has size 17 × 64 × 64; the training loss of the network is calculated here, and the parameters are updated by back propagation.
Further, during network training, the heatmap coding module generates, for the pixel position corresponding to each human body key point in an input picture containing a human body, a two-dimensional Gaussian distribution over the key point and its surrounding pixels, so as to calculate the loss of the network.
the way of generating heatmap by two-dimensional Gaussian distribution is as follows:
where x, y represent the coordinates from which the heatmap is generated, u, v represent the real coordinates of the keypoints, and σ represents a fixed spatial variance.
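The two-dimensional Gaussian heatmap generation can be sketched as follows (the function signature and the square heatmap size are assumptions; 64 × 64 matches the per-key-point output size mentioned for the upsampling module):

```python
import numpy as np

def gaussian_heatmap(size: int, u: float, v: float, sigma: float) -> np.ndarray:
    """Generate a size x size heatmap with a 2-D Gaussian peak at (u, v).

    (u, v) are the real key point coordinates (x, y); sigma is the fixed
    spatial variance from the formula above.
    """
    ys, xs = np.mgrid[0:size, 0:size].astype(float)
    return np.exp(-((xs - u) ** 2 + (ys - v) ** 2) / (2.0 * sigma ** 2))
```

The peak of the resulting map is 1 at the key point and decays smoothly over the surrounding pixels, which is what makes the heatmap loss differentiable.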
Furthermore, the feature extraction module is formed by stacking several MobileBottleneck convolution blocks and BlazeBlock convolution blocks with channel attention mechanisms, and is used to extract deep semantic features from the input picture. It has relatively few parameters, its convolution operations are mainly performed by depthwise separable convolutions, and it can run on a CPU at a relatively high speed. The input picture size is 3 × 256 × 256, and the output feature map size is 192 × 8 × 8.
Further, the heatmap decoding module is specifically configured to obtain, from the predicted heatmap, the position coordinates of the key points in the picture using log-likelihood estimation and a second-order Taylor expansion, and to map them back to the original input picture, thereby obtaining the prediction result of the model.
Further, the preset regions and preset number of human body key point coordinates include two coordinates of the hip region, and the centering processing on the initial coordinate sequence to obtain the target coordinate sequence comprises:
calculating the mean value of two coordinates of the hip area in the initial coordinate sequence as the central coordinate of the human body;
for each human body key point coordinate in the initial coordinate sequence, subtracting the coordinate value of the central coordinate of the human body from the coordinate value of the human body key point coordinate to obtain a corrected coordinate value;
and taking the sequence constructed by the corrected coordinate values as a target coordinate sequence.
It should be noted that the purpose of the centering processing is to eliminate the influence, on the classification of fitness actions across different videos, of the human body not being located at the center of the video, thereby improving the accuracy of subsequent recognition.
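A minimal sketch of the centering step, assuming the common 17-key-point layout in which the two hip points sit at indices 11 and 12 (the patent does not fix the indices):

```python
import numpy as np

# Hip indices follow the COCO-style 17-key-point layout; this is an assumption.
LEFT_HIP, RIGHT_HIP = 11, 12

def center_keypoints(keypoints):
    """Shift a (17, 2) array of key point coordinates so that the midpoint
    of the two hip points becomes the origin, removing the body's position
    in the frame from the coordinate sequence."""
    keypoints = np.asarray(keypoints, dtype=float)
    center = (keypoints[LEFT_HIP] + keypoints[RIGHT_HIP]) / 2.0
    return keypoints - center
```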
S202, inputting the human body key point data into a trained fitness action classification network for action recognition to obtain an action classification result, wherein the action classification result comprises at least one group of recognition action data and a category corresponding to the recognition action data, and the trained fitness action classification network is a stacked neural network composed of a multilayer perceptron MLP and a long short-term memory network LSTM.
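The patent gives no layer sizes for the MLP + LSTM stack; the sketch below is a minimal numpy illustration of the architecture, assuming 34-dimensional inputs (17 key points × 2 coordinates) per frame and 10 action classes:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, U, b):
    """One LSTM cell step; gates stacked as [input, forget, cell, output]."""
    z = x @ W + h @ U + b
    H = h.shape[-1]
    i, f = sigmoid(z[:H]), sigmoid(z[H:2 * H])
    g, o = np.tanh(z[2 * H:3 * H]), sigmoid(z[3 * H:])
    c = f * c + i * g
    return np.tanh(c) * o, c

def classify_sequence(seq, W, U, b, W1, b1, W2, b2):
    """Run an LSTM over a (T, 34) key point sequence, then an MLP head."""
    H = U.shape[0]
    h, c = np.zeros(H), np.zeros(H)
    for x in seq:                                 # consume the frame sequence
        h, c = lstm_step(x, h, c, W, U, b)
    hidden = np.maximum(h @ W1 + b1, 0.0)         # ReLU layer of the MLP
    logits = hidden @ W2 + b2
    e = np.exp(logits - logits.max())
    return e / e.sum()                            # softmax over the classes
```

The LSTM summarizes the temporal evolution of the pose, and the MLP head maps the final hidden state to a class distribution; a trained version would learn the weight matrices by back propagation.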
Further, before step S202, the initial fitness action classification network is trained with a sample set to obtain the trained fitness action classification network, where the sample set includes a training set and a test set, and the data ratio of the training set to the test set is 4:1.
In a specific optional embodiment, the sample set includes 1000 video segments covering 10 types of fitness actions, including common actions such as deep squats, sit-ups and push-ups; each type of fitness action has 100 video segments, each video segment contains one exerciser repeatedly performing the action, and each segment lasts tens of seconds.
S203, a preset standard fitness action dictionary is obtained, and for each piece of recognition action data, similarity calculation is performed between the recognition action data and the standard action data of the corresponding category in the standard fitness action dictionary to obtain a similarity value.
In this embodiment, standard human body key point coordinates are set for each type of fitness action to obtain the standard action data, and a dictionary constructed from these data serves as the preset standard fitness action dictionary, where each set of standard action data contained in the dictionary corresponds to one category, and the standard human body key point coordinates are obtained by clustering the data in the sample set.
Optionally, the preset number is 17, and the similarity calculation is performed on the recognized motion data and the standard motion data of the corresponding category in the standard fitness motion dictionary to obtain a similarity value, including:
the similarity value cos(θ) is calculated using the following formula:

cos(θ) = ( Σ_i a_i · b_i ) / ( sqrt(Σ_i |a_i|²) · sqrt(Σ_i |b_i|²) ), i = 1, …, 17

where a_i denotes the coordinates of the i-th human body key point of the standard action data, and b_i denotes the coordinates of the i-th human body key point of the recognition action data.
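The cosine-similarity comparison of a recognized pose against the standard pose can be sketched as follows, treating each (17, 2) pose as one flattened vector:

```python
import numpy as np

def pose_similarity(standard, recognized):
    """Cosine similarity between two (17, 2) key point arrays, treating
    each pose as a single flattened 34-dimensional vector."""
    a = np.asarray(standard, dtype=float).ravel()
    b = np.asarray(recognized, dtype=float).ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

Because the key points were centered on the hip midpoint in the extraction step, this comparison is insensitive to where the exerciser stands in the frame; cosine similarity is additionally insensitive to overall scale.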
S204, based on the similarity values corresponding to each group of identification motion data, performing fitness motion quality evaluation to obtain a quality evaluation result.
Specifically, the manner of determining the quality evaluation result according to the similarity value may be set according to actual needs, and the numerical ranges corresponding to the quality evaluation results of different grades may be set, which is not specifically limited herein.
Wherein the quality assessment result includes but is not limited to: excellent, good, general and abnormal.
Optionally, when the quality evaluation result is abnormal, early-warning measures of forcibly pausing the picture and giving a voice prompt are executed, so as to avoid bodily injury caused by non-standard actions.
Optionally, the quality evaluation of the fitness action is performed based on the similarity value corresponding to each group of identification action data, and obtaining a quality evaluation result includes:
the similarity value is normalized by using the following formula to obtain a normalized score:

where score is the normalized score;
a quality assessment result is determined based on the normalized score.
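Since the normalization formula and grade thresholds are not spelled out in the published text, the sketch below uses an assumed linear mapping of cos(θ) from [-1, 1] to [0, 100] and illustrative grade boundaries:

```python
def evaluate_quality(cos_theta):
    """Map a cosine similarity in [-1, 1] to a score in [0, 100] and a
    quality grade. Both the linear mapping and the grade boundaries are
    illustrative assumptions, not values given by the patent."""
    score = (cos_theta + 1.0) / 2.0 * 100.0
    if score >= 90:
        return score, "excellent"
    if score >= 75:
        return score, "good"
    if score >= 60:
        return score, "general"
    return score, "abnormal"
```

An "abnormal" result would then trigger the early-warning measures described above (pausing the picture and a voice prompt).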
In an optional implementation manner of this embodiment, after inputting the human body key point data into the trained fitness motion classification network for motion recognition to obtain a motion classification result, the fitness motion recognition method based on human body posture estimation further includes:
acquiring an action attention area corresponding to the category in the action classification result, and taking the action attention area as a target area, wherein a mapping relation is preset between the action attention area and the category;
aiming at each video frame corresponding to the category, calculating the joint angle of the target area according to the human body key point data in the video frame;
and generating a visual fluctuation curve of the target area according to the joint angle of each target area, wherein the visual fluctuation curve is used for displaying the fitness completion degree.
Optionally, the method further includes: generating a standard fluctuation curve corresponding to the category; and issuing an early warning when the difference between the generated visual fluctuation curve of the target area and the standard fluctuation curve is too large.
The degree of shape difference can be calculated through similarity, and the early-warning manners include, but are not limited to, voice reminders, pausing the picture, and the like.
Further, the category is a deep squat action category, and calculating the joint angle of the target area according to the human body key point data in the video frame comprises:
identifying a knee joint area in a video frame as a target area;
taking the human body key points within the target area as joint points, and calculating the joint angle by using the following formula:

Angle = arccos( (P2P1 · P2P3) / (|P2P1| · |P2P3|) )

where Angle is the joint angle, P2P1 denotes the vector from joint point P2 to joint point P1, and P2P3 denotes the vector from joint point P2 to joint point P3.
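The knee-angle computation can be sketched directly from the arccos formula; taking the hip, knee and ankle key points as (P1, P2, P3) is an assumption for the squat example:

```python
import numpy as np

def joint_angle(p1, p2, p3):
    """Angle in degrees at joint p2, formed by the vectors p2->p1 and
    p2->p3 (e.g. hip-knee-ankle for the knee angle during a squat)."""
    v1 = np.asarray(p1, dtype=float) - np.asarray(p2, dtype=float)
    v2 = np.asarray(p3, dtype=float) - np.asarray(p2, dtype=float)
    cos_a = v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2))
    # Clip guards against values slightly outside [-1, 1] from rounding.
    return float(np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0))))
```

Computing this angle for every video frame yields the time series from which the fluctuation curve of the target area is drawn.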
In this embodiment, a video stream is obtained and human body key point data are extracted from it; the human body key point data are then input into the trained fitness action classification network for action recognition to obtain an action classification result, where the action classification result includes at least one group of recognition action data and the category corresponding to the recognition action data, and the trained fitness action classification network is a stacked neural network composed of a multilayer perceptron MLP and a long short-term memory network LSTM; a preset standard fitness action dictionary is obtained, and for each piece of recognition action data, similarity calculation is performed between the recognition action data and the standard action data of the corresponding category in the standard fitness action dictionary to obtain a similarity value; fitness action quality evaluation is then performed based on the similarity value corresponding to each group of recognition action data to obtain a quality evaluation result. In this way, the real-time fitness action is quickly compared with the standard action, and the accuracy of recognizing whether fitness actions are standard is improved.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
Fig. 4 is a schematic block diagram of a body posture estimation-based fitness motion recognition device in one-to-one correspondence with the body posture estimation-based fitness motion recognition method according to the above-described embodiment. As shown in fig. 4, the fitness action recognition device based on human posture estimation comprises a data extraction module 31, a recognition classification module 32, a similarity calculation module 33 and a quality evaluation module 34. The functional modules are explained in detail as follows:
the data extraction module 31 is configured to obtain a video stream, and extract human body key point data based on the video stream;
the recognition and classification module 32 is configured to input the human body key point data into the trained fitness action classification network for action recognition to obtain an action classification result, where the action classification result includes at least one group of recognition action data and the category corresponding to the recognition action data, and the trained fitness action classification network is a stacked neural network composed of a multilayer perceptron MLP and a long short-term memory network LSTM;
the similarity calculation module 33 is configured to obtain a preset standard fitness action dictionary, and perform similarity calculation on the recognition action data and standard action data of a corresponding category in the standard fitness action dictionary for each recognition action data to obtain a similarity value;
and the quality evaluation module 34 is configured to perform fitness action quality evaluation based on the similarity value corresponding to each group of identification action data to obtain a quality evaluation result.
Optionally, the data extraction module 31 includes:
the target detection unit is used for sequentially carrying out human body detection on video frames in the video stream by adopting a target detection algorithm;
the initial sequence generation unit is used for carrying out posture estimation on the detected human body by adopting a human body posture tracking algorithm if the human body is detected aiming at each video frame to obtain preset region and preset number of human body key point coordinates, and carrying out serialization processing on the preset number of human body key point coordinates to obtain an initial coordinate sequence;
and the target sequence determining unit is used for carrying out centering processing on the initial coordinate sequence to obtain a target coordinate sequence, and taking the target coordinate sequence as the key point data of the human body.
Optionally, the target sequence determination unit comprises:
a central coordinate calculating subunit, configured to calculate an average value of two coordinates of the hip region in the initial coordinate sequence, as a central coordinate of the human body;
the coordinate value correction subunit is used for subtracting the coordinate value of the central coordinate of the human body from the coordinate value of the human body key point coordinate to obtain a corrected coordinate value aiming at each human body key point coordinate in the initial coordinate sequence;
and the target sequence construction subunit is used for taking the sequence constructed by the corrected coordinate values as a target coordinate sequence.
Optionally, the similarity calculation module 33 includes:
the similarity value cos(θ) is calculated using the following formula:

cos(θ) = ( Σ_i a_i · b_i ) / ( sqrt(Σ_i |a_i|²) · sqrt(Σ_i |b_i|²) ), i = 1, …, 17

where a_i denotes the coordinates of the i-th human body key point of the standard action data, and b_i denotes the coordinates of the i-th human body key point of the recognition action data.
Optionally, the quality assessment module 34 comprises:
the similarity value is normalized by using the following formula to obtain a normalized score:
wherein score is a normalized score;
a quality assessment result is determined based on the normalized score.
Optionally, the body-building motion recognition device based on human posture estimation further includes:
the target area acquisition module is used for acquiring an action attention area corresponding to the category in the action classification result and taking the action attention area as a target area, and a mapping relation is preset between the action attention area and the category;
the angle calculation module is used for calculating the joint angle of the target area according to the human body key point data in the video frames aiming at each video frame corresponding to the category;
and the fluctuation visualization module is used for generating a visualization fluctuation curve of the target area according to the joint angle of each target area, and the visualization fluctuation curve is used for displaying the fitness completion degree.
Optionally, the category is the deep squat action category, and the angle calculation module that calculates the joint angle of the target area according to the human body key point data in the video frame includes:
the target area positioning subunit is used for identifying a knee joint area in the video frame as a target area;
and the joint angle calculating subunit is used for taking the human body key points within the target area as joint points and calculating the joint angle by using the following formula:

Angle = arccos( (P2P1 · P2P3) / (|P2P1| · |P2P3|) )

where Angle is the joint angle, P2P1 denotes the vector from joint point P2 to joint point P1, and P2P3 denotes the vector from joint point P2 to joint point P3.
For specific limitations of the fitness motion recognition device based on the human body posture estimation, reference may be made to the above limitations of the fitness motion recognition method based on the human body posture estimation, and details are not repeated here. The modules in the body-building action recognition device based on human body posture estimation can be wholly or partially realized by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In order to solve the technical problem, the embodiment of the application further provides computer equipment. Referring to fig. 5, fig. 5 is a block diagram of a basic structure of a computer device according to the present embodiment.
The computer device 4 comprises a memory 41, a processor 42, and a network interface 43, which are communicatively connected to each other via a system bus. It is noted that only the computer device 4 having the memory 41, the processor 42 and the network interface 43 is shown, but it should be understood that not all of the shown components are required to be implemented, and that more or fewer components may be implemented instead. As will be understood by those skilled in the art, the computer device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.
The computer device can be a desktop computer, a notebook, a palm computer, a cloud server and other computing devices. The computer equipment can carry out man-machine interaction with a user through a keyboard, a mouse, a remote controller, a touch panel or voice control equipment and the like.
The memory 41 includes at least one type of readable storage medium, including a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the memory 41 may be an internal storage unit of the computer device 4, such as a hard disk or a memory of the computer device 4. In other embodiments, the memory 41 may also be an external storage device of the computer device 4, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash memory card (Flash Card) provided on the computer device 4. Of course, the memory 41 may also include both an internal storage unit and an external storage device of the computer device 4. In this embodiment, the memory 41 is generally used for storing the operating system installed on the computer device 4 and various types of application software, such as program codes for controlling electronic files. Further, the memory 41 may also be used to temporarily store various types of data that have been output or are to be output.
The processor 42 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data Processing chip in some embodiments. The processor 42 is typically used to control the overall operation of the computer device 4. In this embodiment, the processor 42 is configured to execute the program code stored in the memory 41 or process data, such as executing the program code for exercise motion recognition based on body posture estimation.
The network interface 43 may comprise a wireless network interface or a wired network interface, and the network interface 43 is generally used for establishing communication connection between the computer device 4 and other electronic devices.
The present application further provides another embodiment, namely a computer-readable storage medium storing a computer program executable by at least one processor, so that the at least one processor executes the steps of the body-building action recognition method based on human body posture estimation as described above.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present application.
It should be understood that the above-described embodiments are merely some, not all, of the embodiments of the present application, and that the drawings illustrate preferred embodiments without limiting the scope of the appended claims. The present application may be embodied in many different forms; these embodiments are provided so that the disclosure of the application will be thorough. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in the foregoing embodiments or substitute equivalents for some of the features therein. All equivalent structures made by using the contents of the specification and drawings of the present application, whether applied directly or indirectly in other related technical fields, fall within the protection scope of the present application.
Claims (10)
1. A body-building action recognition method based on human body posture estimation is characterized by comprising the following steps:
acquiring a video stream, and extracting human body key point data based on the video stream;
inputting the human body key point data into a trained body-building action classification network for action recognition to obtain an action classification result, wherein the action classification result comprises at least one group of recognition action data and a category corresponding to the recognition action data, and the trained body-building action classification network is a stacked neural network composed of a multilayer perceptron MLP and a long short-term memory network LSTM;
acquiring a preset standard fitness action dictionary, and calculating the similarity of the recognition action data and standard action data of corresponding categories in the standard fitness action dictionary aiming at each recognition action data to obtain a similarity value;
and performing fitness action quality evaluation based on the similarity value corresponding to each group of identification action data to obtain a quality evaluation result.
2. The body-building motion recognition method based on human posture estimation as claimed in claim 1, wherein the extracting human key point data based on the video stream comprises:
sequentially carrying out human body detection on the video frames in the video stream by adopting a target detection algorithm;
aiming at each video frame, if a human body is detected, adopting a human body posture tracking algorithm to carry out posture estimation on the detected human body to obtain preset regions and preset number of human body key point coordinates, and carrying out serialization processing on the preset number of human body key point coordinates to obtain an initial coordinate sequence;
and carrying out centering processing on the initial coordinate sequence to obtain a target coordinate sequence, and taking the target coordinate sequence as the key point data of the human body.
3. A method for recognizing exercise motions based on human body posture estimation according to claim 2, wherein the preset region and the preset number of human body key point coordinates comprise two coordinates of a hip region, and the centering of the coordinate sequence to obtain a target coordinate sequence comprises:
calculating the mean value of two coordinates of the hip area in the initial coordinate sequence, and taking the mean value as the central coordinate of the human body;
for each human body key point coordinate in the initial coordinate sequence, subtracting the coordinate value of the central coordinate of the human body from the coordinate value of the human body key point coordinate to obtain a corrected coordinate value;
and taking the sequence constructed by the corrected coordinate values as the target coordinate sequence.
4. The body-building motion recognition method based on human posture estimation according to claim 1, wherein the preset number is 17, and performing similarity calculation between the recognition motion data and the standard motion data of the corresponding category in the standard body-building motion dictionary to obtain the similarity value comprises:

calculating the similarity value cos(θ) using the following formula:

cos(θ) = ( Σ_i a_i · b_i ) / ( sqrt(Σ_i |a_i|²) · sqrt(Σ_i |b_i|²) ), i = 1, …, 17

wherein a_i denotes the coordinates of the i-th human body key point of the standard motion data, and b_i denotes the coordinates of the i-th human body key point of the recognition motion data.
5. The body-building motion recognition method based on human posture estimation as claimed in claim 4, wherein the performing of the body-building motion quality evaluation based on the similarity value corresponding to each group of the recognition motion data to obtain the quality evaluation result comprises:
normalizing the similarity value by using the following formula to obtain a normalized score:
wherein score is a normalized score;
determining a quality assessment result based on the normalized score.
6. The method for recognizing body building motion based on human body posture estimation as claimed in any one of claims 1 to 5, wherein after inputting the human body key point data into the trained body building motion classification network for motion recognition to obtain a motion classification result, the method for recognizing body building motion based on human body posture estimation further comprises:
acquiring an action attention area corresponding to the category in the action classification result, and taking the action attention area as a target area, wherein a mapping relation is preset between the action attention area and the category;
aiming at each video frame corresponding to the category, calculating the joint angle of the target area according to the human body key point data in the video frame;
and generating a visual fluctuation curve of the target area according to the joint angle of each target area, wherein the visual fluctuation curve is used for displaying the fitness completion degree.
7. The body-building motion recognition method based on human posture estimation according to claim 6, wherein the category is the deep squat action category, and the calculating the joint angle of the target area according to the human body key point data in the video frame comprises:
identifying a knee joint area in a video frame as a target area;
taking the human body key points within the target area as joint points, and calculating the joint angle by using the following formula:

Angle = arccos( (P2P1 · P2P3) / (|P2P1| · |P2P3|) )

wherein Angle is the joint angle, P2P1 denotes the vector from joint point P2 to joint point P1, and P2P3 denotes the vector from joint point P2 to joint point P3.
8. A body-building action recognition device based on human posture estimation is characterized by comprising:
the data extraction module is used for acquiring video streams and extracting human body key point data based on the video streams;
the recognition and classification module is used for inputting the human body key point data into a trained fitness action classification network for action recognition to obtain an action classification result, wherein the action classification result comprises at least one group of recognition action data and a category corresponding to the recognition action data, and the trained fitness action classification network is a stacked neural network composed of a multilayer perceptron MLP and a long short-term memory network LSTM;
the similarity calculation module is used for acquiring a preset standard fitness action dictionary, and for each recognition action data, performing similarity calculation on the recognition action data and standard action data of a corresponding category in the standard fitness action dictionary to obtain a similarity value;
and the quality evaluation module is used for carrying out fitness action quality evaluation on the basis of the similarity value corresponding to each group of identification action data to obtain a quality evaluation result.
9. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor when executing the computer program implements the body posture estimation based fitness action recognition method according to any one of claims 1 to 7.
10. A computer-readable storage medium, in which a computer program is stored, which, when being executed by a processor, implements the method for body-building motion recognition based on body posture estimation according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211531164.8A CN115880774A (en) | 2022-12-01 | 2022-12-01 | Body-building action recognition method and device based on human body posture estimation and related equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211531164.8A CN115880774A (en) | 2022-12-01 | 2022-12-01 | Body-building action recognition method and device based on human body posture estimation and related equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115880774A true CN115880774A (en) | 2023-03-31 |
Family
ID=85765324
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211531164.8A Pending CN115880774A (en) | 2022-12-01 | 2022-12-01 | Body-building action recognition method and device based on human body posture estimation and related equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115880774A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110544301A (en) * | 2019-09-06 | 2019-12-06 | 广东工业大学 | Three-dimensional human body action reconstruction system, method and action training system |
CN112990011A (en) * | 2021-03-15 | 2021-06-18 | 上海工程技术大学 | Body-building action recognition and evaluation method based on machine vision and deep learning |
CN113197572A (en) * | 2021-05-08 | 2021-08-03 | 解辉 | Human body work correction system based on vision |
CN113762133A (en) * | 2021-09-01 | 2021-12-07 | 哈尔滨工业大学(威海) | Self-weight fitness auxiliary coaching system, method and terminal based on human body posture recognition |
-
2022
- 2022-12-01 CN CN202211531164.8A patent/CN115880774A/en active Pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110544301A (en) * | 2019-09-06 | 2019-12-06 | 广东工业大学 | Three-dimensional human body action reconstruction system, method and action training system |
CN112990011A (en) * | 2021-03-15 | 2021-06-18 | 上海工程技术大学 | Body-building action recognition and evaluation method based on machine vision and deep learning |
CN113197572A (en) * | 2021-05-08 | 2021-08-03 | 解辉 | Human body work correction system based on vision |
CN113762133A (en) * | 2021-09-01 | 2021-12-07 | 哈尔滨工业大学(威海) | Self-weight fitness auxiliary coaching system, method and terminal based on human body posture recognition |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111666857B (en) | Human behavior recognition method, device and storage medium based on environment semantic understanding | |
Mortazavi et al. | Determining the single best axis for exercise repetition recognition and counting on smartwatches | |
US20150092981A1 (en) | Apparatus and method for providing activity recognition based application service | |
CN112543936B (en) | Motion structure self-attention-drawing convolution network model for motion recognition | |
CN111597975B (en) | Personnel action detection method and device and electronic equipment | |
CN112418135A (en) | Human behavior recognition method and device, computer equipment and readable storage medium | |
CN115205764B (en) | Online learning concentration monitoring method, system and medium based on machine vision | |
WO2023040449A1 (en) | Triggering of client operation instruction by using fitness action | |
CN111274932B (en) | State identification method and device based on human gait in video and storage medium | |
WO2023108842A1 (en) | Motion evaluation method and system based on fitness teaching training | |
CN111223549A (en) | Mobile end system and method for disease prevention based on posture correction | |
CN114783061A (en) | Smoking behavior detection method, device, equipment and medium | |
CN113239849B (en) | Body-building action quality assessment method, body-building action quality assessment system, terminal equipment and storage medium | |
CN113781462A (en) | Human body disability detection method, device, equipment and storage medium | |
KR102298505B1 (en) | Apparatus and method of intravenous injection performance evaluation | |
CN113112185A (en) | Teacher expressive force evaluation method and device and electronic equipment | |
CN113392741A (en) | Video clip extraction method and device, electronic equipment and storage medium | |
CN117216313A (en) | Attitude evaluation audio output method, attitude evaluation audio output device, electronic equipment and readable medium | |
CN112381118A (en) | Method and device for testing and evaluating dance test of university | |
CN115880774A (en) | Body-building action recognition method and device based on human body posture estimation and related equipment | |
CN113641856A (en) | Method and apparatus for outputting information | |
CN114694256A (en) | Real-time tennis action identification method, device, equipment and medium | |
CN112633224A (en) | Social relationship identification method and device, electronic equipment and storage medium | |
CN113537122A (en) | Motion recognition method and device, storage medium and electronic equipment | |
CN113963202A (en) | Skeleton point action recognition method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||