WO2023108842A1 - Motion evaluation method and system based on fitness teaching training - Google Patents

Motion evaluation method and system based on fitness teaching training Download PDF

Info

Publication number
WO2023108842A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
fitness
posture
standard
pose
Prior art date
Application number
PCT/CN2022/070026
Other languages
French (fr)
Chinese (zh)
Inventor
曾晓嘉
刘易
薛立君
Original Assignee
成都拟合未来科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN202111523985.2A external-priority patent/CN116262171A/en
Priority claimed from CN202111523962.1A external-priority patent/CN116266415A/en
Application filed by 成都拟合未来科技有限公司
Publication of WO2023108842A1 publication Critical patent/WO2023108842A1/en

Links

Images

Classifications

    • A: HUMAN NECESSITIES
    • A63: SPORTS; GAMES; AMUSEMENTS
    • A63B: APPARATUS FOR PHYSICAL TRAINING, GYMNASTICS, SWIMMING, CLIMBING, OR FENCING; BALL GAMES; TRAINING EQUIPMENT
    • A63B71/00Games or sports accessories not covered in groups A63B1/00 - A63B69/00
    • A63B71/06Indicating or scoring devices for games or players, or for other sports activities

Definitions

  • the present disclosure relates to the field of fitness, in particular to an action evaluation method, system, device and medium based on fitness teaching and training.
  • Myoelectric detection identifies human movements from the bio-electromyographic signals generated during exercise, but it requires the user to wear sensors; it is therefore mostly used for scientific research in specific scenarios and does not meet the needs of everyday fitness.
  • The airbag-based method is similar to electromyographic detection: the user must likewise wear sensors during exercise so that motion information can be collected.
  • The purpose of the present disclosure is to better capture the user's training situation and posture while the user trains along with a fitness video, and to judge whether the user's posture is up to standard, thereby ensuring the fitness effect of the fitness video.
  • an action evaluation method based on fitness teaching and training including:
  • The standard posture of the present disclosure appears at a certain time point of the fitness video, but when recognizing the user posture, the present disclosure acquires the user posture in an interval around that time point, i.e. a time period, and compares the user posture in each frame image of that period with the standard posture. This is because the user trains by following the video: a novice taking the course for the first time lags behind it, while a user already familiar with the course may perform actions ahead of time. The disclosure therefore expands the time point at which the standard posture appears into a time period, so that the comparison is both accurate and efficient.
  • The present disclosure compares several user postures with the standard posture and judges whether they are of the same type, specifically by comparison through a Siamese neural network model, which includes:
  • The model of the present disclosure is trained on large samples, such as posture samples of raising the hands or legs, so that the model learns how to extract posture features, that is, the mapping from 32 dimensions to 100 dimensions.
  • A human body posture has a total of 16 bone points with two-dimensional coordinates; each bone point has x and y components, so a posture can be abstracted into a 32-dimensional bone point vector, namely [x1, y1, x2, y2, x3, y3, ..., x16, y16].
  • the 32-dimensional bone point vector will be mapped into a higher-dimensional vector.
  • The output vectors in this disclosure are 100-dimensional, that is, the output vector V1 of the standard pose and the output vector V2 of the user pose are both 100-dimensional, namely [a1, a2, a3, ..., a100].
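The flattening from 16 two-dimensional key points to a 32-dimensional vector described above can be sketched as follows (illustrative Python only; the function name and point ordering are assumptions, not fixed by the disclosure):

```python
# Sketch: flatten 16 (x, y) skeletal key points into the 32-dimensional
# pose vector [x1, y1, x2, y2, ..., x16, y16].
def pose_to_vector(keypoints):
    """keypoints: list of 16 (x, y) tuples -> flat 32-dim list of floats."""
    assert len(keypoints) == 16, "a pose must have exactly 16 key points"
    vector = []
    for x, y in keypoints:
        vector.extend([float(x), float(y)])
    return vector

# Example: a dummy pose with all points on the diagonal.
pose = [(i, i) for i in range(16)]
v = pose_to_vector(pose)
print(len(v))  # 32
```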
  • The standard pose and the user pose are each mapped by the trained model into a 100-dimensional vector, namely V1 and V2, and the Euclidean distance between V1 and V2 is then calculated.
  • This disclosure uses a deep neural network that accepts a 32-dimensional vector, i.e. a human body pose, passes it through a series of intermediate layers (such as nonlinear activation and fully connected layers), and finally outputs a 100-dimensional vector.
  • This 100-dimensional vector is a highly abstract feature; if the two poses are very similar, the Euclidean distance between the two 100-dimensional vectors output by the network is very small, otherwise it is very large.
  • Our neural network has a total of 4 layers.
  • the number of nodes in each layer from input to output is 32->64->128->100, that is, a 32-dimensional vector is input and a 100-dimensional vector is output.
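A minimal sketch of the 32 -> 64 -> 128 -> 100 fully connected network described above, in plain NumPy. The weights here are random and untrained, purely to show the shapes; in the disclosure the weights come from training the Siamese network on pose pairs, and all helper names are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_embedder():
    # Layer sizes from the disclosure: 32 -> 64 -> 128 -> 100.
    sizes = [32, 64, 128, 100]
    return [(rng.standard_normal((a, b)) * 0.1, np.zeros(b))
            for a, b in zip(sizes, sizes[1:])]

def embed(params, x):
    h = np.asarray(x, dtype=float)
    for i, (w, b) in enumerate(params):
        h = h @ w + b
        if i < len(params) - 1:   # nonlinear activation on hidden layers only
            h = np.maximum(h, 0.0)
    return h                      # 100-dimensional embedding

params = make_embedder()          # Siamese: the SAME weights embed both poses
v1 = embed(params, rng.standard_normal(32))   # standard pose vector
v2 = embed(params, rng.standard_normal(32))   # user pose vector
distance = np.linalg.norm(v1 - v2)            # Euclidean distance V1 <-> V2
print(v1.shape, v2.shape)                     # (100,) (100,)
```

The key Siamese property is that both poses pass through the same `params`, so the learned distance is comparable across inputs.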
  • The user pose and standard pose are input into the standard pose recognition model, which outputs the similarity score between them; the user pose with the highest similarity score is taken as the scoring result, and the scoring result is used to judge whether the user pose and the standard pose are of the same type.
  • The Euclidean distance threshold T is obtained based on the standard gesture recognition model and is used to judge whether the user gesture and the standard gesture are of the same type: if the Euclidean distance output by the model is less than or equal to the threshold T, the user gesture and the standard gesture are of the same type; if the Euclidean distance is greater than the threshold T, the user gesture and the standard gesture are of different types.
  • After the Siamese network model is trained, a threshold T for the Euclidean distance is found on our test set: if the Euclidean distance between two poses exceeds T, they are considered not to be of the same type; otherwise, they are considered to be of the same type.
  • For each candidate threshold T, a point of the ROC curve can be plotted; sweeping T draws the full ROC curve.
  • The area under the ROC curve, called the AUC, is a value from 0 to 1; the larger the AUC, the better the model performance.
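The AUC mentioned above can be computed without plotting anything, using its standard rank interpretation: the probability that a randomly chosen same-type pair receives a smaller Euclidean distance than a randomly chosen different-type pair. This is general background, not text from the patent, and the function name is an assumption:

```python
# AUC via pairwise comparison: for distances, SMALLER is better for
# same-type ("positive") pairs, so a positive "wins" when p < n.
def auc_from_distances(pos_distances, neg_distances):
    """pos = same-type pair distances, neg = different-type pair distances."""
    wins = 0.0
    for p in pos_distances:
        for n in neg_distances:
            if p < n:
                wins += 1.0
            elif p == n:
                wins += 0.5   # ties count half
    return wins / (len(pos_distances) * len(neg_distances))

# Perfectly separated distances give AUC = 1.0.
print(auc_from_distances([0.1, 0.2], [1.0, 2.0]))  # 1.0
```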
  • Using T-best, a critical score is set according to actual business needs, for example 40 points, meaning that at this score the model considers the two postures to be exactly at the boundary between similar and dissimilar.
  • The mapping relationship is as follows: when the actual distance t is in the interval [0, T-best], the similarity score s lies in [100, 40]; when the actual distance t is in (T-best, infinity), the similarity score s lies in (40, 0).
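The piecewise mapping above can be sketched as a simple function. The concrete values of `T_BEST` and the cut-off `T_MAX` below are assumptions for illustration; the disclosure only fixes the score endpoints 100, 40, and 0:

```python
# Piecewise-linear distance -> score mapping:
#   t in [0, T_BEST]  maps linearly onto scores [100, 40]
#   t in (T_BEST, oo) maps onto (40, 0], clipped at 0 beyond T_MAX (assumed)
T_BEST = 1.5   # assumed optimal Euclidean distance threshold T-best
T_MAX = 6.0    # assumed distance at which the score reaches 0

def distance_to_score(t):
    if t <= T_BEST:
        return 100.0 - (t / T_BEST) * 60.0          # 100 at t=0, 40 at t=T_BEST
    return max(0.0, 40.0 * (1.0 - (t - T_BEST) / (T_MAX - T_BEST)))

print(distance_to_score(0.0))      # 100.0
print(distance_to_score(T_BEST))   # 40.0
```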
  • Both the standard pose and the user pose include 16 bone key points, and each bone key point corresponds to a two-dimensional position coordinate.
  • The 16 bone key points include the top of the head, the bottom of the head, the neck, the right shoulder, the right elbow, the right hand, the left shoulder, the left elbow, the left hand, the right hip, the right knee, the right foot, the left hip, the left knee, the left foot, and the patella.
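For reference, the 16 key points above can be held in a name-to-index table so the 32-dimensional vector can be addressed by joint name. The ordering here is an assumption; the disclosure does not fix a numbering:

```python
# Assumed indexing of the 16 skeletal key points listed in the disclosure.
KEYPOINTS = [
    "head_top", "head_bottom", "neck",
    "right_shoulder", "right_elbow", "right_hand",
    "left_shoulder", "left_elbow", "left_hand",
    "right_hip", "right_knee", "right_foot",
    "left_hip", "left_knee", "left_foot",
    "patella",
]
INDEX = {name: i for i, name in enumerate(KEYPOINTS)}
# Joint k occupies slots (2k, 2k+1) of the 32-dim vector [x1, y1, ...].
print(len(KEYPOINTS))   # 16
print(INDEX["neck"])    # 2
```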
  • the present disclosure also provides a fitness training method based on a fitness device, the fitness training method is based on the above-mentioned exercise evaluation method based on fitness teaching training, and the fitness training method includes:
  • the first user posture is acquired according to the first fitness video, the first user posture is scored, and the scoring result is fed back to the user.
  • The fitness video is played three times. The first playback mainly serves as a demonstration so that the user can understand the training content; in this stage it is only judged whether the user performed an action, without scoring that action or judging whether it matches the demonstrated one.
  • The second fitness video can be a slow-motion playback of the first fitness video, or a decomposition of the actions of the first fitness video.
  • It is monitored whether the user follows along with the second fitness video, and the user's actions are scored to determine whether the user performed the same action as in the video, and hence whether the user followed along.
  • The video played the third time is the same as the one played the first time, but during the third playback the user's actions are recognized and, at the same time, scored to judge whether they are up to standard, thereby improving the user's fitness effect.
  • The videos played the first, second, and third times all have the same content, except that the second playback is a slow-motion or action-decomposition version of the first.
  • The first user posture is obtained and it is judged whether the user is performing fitness training, specifically including:
  • When acquiring the first fitness video, it is played directly on the fitness device, which can be a smart fitness mirror or any other fitness device capable of playing videos;
  • While playing the first fitness video, the fitness device recognizes its target fitness area; since the user is exercising within this area, the first user posture can be obtained by performing feature extraction on it;
  • After the first playback, the user has a preliminary familiarity with the actions in the fitness video. The second user posture is then obtained according to the second fitness video and scored, and the scoring result is used to judge whether the user follows the second fitness video for fitness training, including:
  • The similarity score of the second user posture relative to the second standard posture is obtained, and the scoring result is used to judge whether the user follows the second fitness video for fitness training.
  • identifying the target fitness area of the second fitness video, performing feature extraction on the target fitness area, and obtaining the second user posture specifically including:
  • the user's actions should be scored to determine whether the user follows the second fitness video.
  • The second standard posture is preset, and the first time period during which it appears in the second fitness video is determined; each frame image within that period is then obtained for comparison against the second standard posture.
  • The corresponding frame images are compared with the several second user poses to obtain, for each second user pose, a similarity score based on its corresponding frame image.
  • the specific scoring process includes:
  • If the scoring result is greater than or equal to the scoring threshold, the second standard posture and the second user posture are of the same type, and the user is performing fitness training;
  • If the scoring result is less than the scoring threshold, the second standard posture and the second user posture are not of the same type, and the user has not performed fitness training.
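The two-branch decision above can be sketched as a small function. The threshold value 40 follows the critical score used elsewhere in the disclosure; the function and parameter names are assumptions:

```python
# Decision rule: the best similarity score over the frames in the time
# period is compared against a score threshold to decide whether the user
# posture matches the standard posture (i.e. the user is following along).
SCORE_THRESHOLD = 40.0   # critical score from the disclosure's example

def is_following(frame_scores, threshold=SCORE_THRESHOLD):
    """frame_scores: similarity scores of the user poses in the time period."""
    best = max(frame_scores, default=0.0)
    return best >= threshold   # True -> same type, user performs training

print(is_following([12.0, 55.5, 31.0]))  # True
print(is_following([10.0, 20.0]))        # False
```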
  • Obtaining the first fitness video for fitness, that is, playing the first fitness video on the fitness device, with the first standard posture preset according to the first fitness video;
  • identifying the target fitness area of the first fitness video, performing feature extraction on the target fitness area, and obtaining the first user posture specifically including:
  • When the video is played for the second time, each frame image in the first time period is compared with the corresponding second user gesture; when the video is played for the third time, the first user posture in the second time period is compared with the first standard posture, so that it can be better confirmed whether the actions the user follows meet the standard, improving the efficiency of fitness teaching and training.
  • the present disclosure also provides an action evaluation system based on fitness teaching and training, including:
  • the obtaining module is used to obtain the fitness video, and obtain the time period corresponding to the preset standard posture in the fitness video, and obtain the frame images of the continuous moments corresponding to the fitness video in the time period;
  • an identification module, configured to identify several user gestures corresponding to the frame images at consecutive moments in the above time period;
  • a comparison module, used to compare several user postures with the standard posture to obtain a comparison result;
  • the judging module is used to judge whether the user posture and the standard posture are of the same type according to the comparison result.
  • The comparison module compares several user postures with the standard posture, specifically including training a Siamese neural network model to obtain a trained standard posture recognition model;
  • the judging module judges whether the user posture and the standard posture are of the same type according to the comparison result, specifically including inputting the user posture and the standard posture into the standard posture recognition model, and judging whether the user posture and the standard posture are of the same type.
  • the judging module inputs the user posture and the standard posture into the standard posture recognition model, and judges whether the user posture and the standard posture are of the same type, specifically including:
  • the judgment module obtains the Euclidean distance threshold T based on the standard posture recognition model, and the threshold T is used to judge whether the user posture and the standard posture are of the same type, wherein, if the Euclidean distance output by the standard posture recognition model is less than or equal to the threshold T, Then the user pose is of the same type as the standard pose, and if the Euclidean distance output by the standard pose recognition model is greater than the threshold T, then the user pose and the standard pose are of a different type.
  • Both the standard posture and the user posture include 16 skeletal key points, each of which corresponds to a two-dimensional position coordinate; the 16 skeletal key points include the top of the head, the bottom of the head, the neck, the right shoulder, the right elbow, the right hand, the left shoulder, the left elbow, the left hand, the right hip, the right knee, the right foot, the left hip, the left knee, the left foot, and the patella.
  • The judging module inputs the user posture and the standard posture into the standard posture recognition model, outputs the similarity score between them, obtains the user posture with the highest similarity score as the scoring result, and judges based on the scoring result whether the user posture and the standard posture are of the same type.
  • the disclosure also provides a fitness training system based on the fitness device, the fitness training system is based on the above-mentioned exercise evaluation system based on fitness teaching and training, and the fitness training system executes:
  • the first user posture is acquired according to the first fitness video, the first user posture is scored, and the scoring result is fed back to the user.
  • the present disclosure also provides another fitness training system based on a fitness device.
  • the fitness training system includes:
  • the obtaining module is used to obtain the fitness video and process the fitness video
  • the identification module is used to identify the target fitness area of the fitness video, and extracts the features of the target fitness area to obtain the user's posture;
  • a comparison module is used to compare the user's posture with a preset standard posture to obtain a comparison result
  • the judging module is used to judge whether the user performs fitness training or whether the action meets the standard according to the comparison result.
  • The present disclosure also provides an electronic device, including a memory, a processor, and a computer program stored in the memory and operable on the processor; when the processor executes the computer program, the above action evaluation method based on fitness teaching and training is realized.
  • The present disclosure also provides an electronic device, including a memory, a processor, and a computer program stored in the memory and operable on the processor; when the processor executes the computer program, the above fitness device-based fitness training method is realized.
  • This disclosure also provides a storage medium; the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the above action evaluation method based on fitness teaching and training is implemented.
  • The present disclosure also provides a storage medium; the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the above fitness device-based fitness training method is implemented.
  • the preset standard posture can be a fixed-point movement, which can more accurately grasp the user's fitness training situation when evaluating the user's movement.
  • The model is constructed through a convolutional neural network, and recognition uses the 16 skeletal key points of the human body and their corresponding two-dimensional position coordinates, which gives higher accuracy, allows quickly determining whether the user's posture is standard, and improves efficiency.
  • When performing fitness teaching and training in this disclosure, the user is guided through three stages: fitness action demonstration, slow-motion teaching, and normal follow-along. The first stage checks whether the user performed an action, the second whether the user followed along, and the third whether the user's actions are up to standard, with the scoring result fed back to the user.
  • the identification and comparison is performed according to the first time period or the second time period corresponding to the preset first standard posture or the second standard posture, and it is not necessary to perform identification and comparison on the entire video.
  • the effect is better, the result is obtained faster, the user's fitness effect is effectively guaranteed, and the user's fitness situation can be obtained at each stage, which improves the efficiency of fitness use and is more convenient for long-term use.
  • Fig. 1 is the schematic flow chart of the action evaluation method based on fitness teaching training
  • Fig. 2 is a schematic diagram when two postures belong to different types
  • Figure 3 is a schematic diagram when two postures belong to the same type
  • Fig. 4 is ROC curve
  • Fig. 5 is a schematic flow chart of a fitness training method based on a fitness device
  • Fig. 6 is a schematic composition diagram of an action evaluation system based on fitness teaching and training or a fitness training system based on a fitness device.
  • The term "a" should be understood as "at least one" or "one or more"; that is, in one embodiment the number of an element can be one, while in another embodiment the number can be multiple, and the term "a" cannot be understood as a limitation on the quantity.
  • FIG. 1 is a schematic flowchart of an action evaluation method based on fitness teaching and training.
  • the present disclosure provides an action evaluation method based on fitness teaching and training. The method includes:
  • comparing several user postures and standard postures, and judging whether the user postures and standard postures are of the same type specifically include:
  • The Euclidean distance threshold T is obtained based on the standard gesture recognition model and is used to judge whether the user gesture and the standard gesture are of the same type: if the Euclidean distance output by the model is less than or equal to the threshold T, the user gesture and the standard gesture are of the same type; if the Euclidean distance is greater than the threshold T, the user gesture and the standard gesture are of different types.
  • the standard pose and the user pose both include 16 bone key points, and the 16 bone key points correspond to a two-dimensional position coordinate.
  • the 16 bone key points include the top of the head, the bottom of the head, the neck, the right shoulder, the right elbow, the right hand, the left shoulder, the left elbow, the left hand, the right hip, the right knee, the right foot, the left hip, the left knee, the left foot, and the patella.
  • Operation 1 obtains a fitness video, and a standard posture is preset in the fitness video;
  • the fitness device is a fitness mirror, and the fitness video is played on the mirror surface of the fitness mirror;
  • Operation 2 obtains the time period corresponding to the standard posture in the fitness video, and obtains the frame images of the continuous moments corresponding to the fitness video in this time period;
  • Operation 3 recognizes several user gestures corresponding to the frame images at consecutive moments in the time period
  • the fitness mirror recognizes the user's actions in the target fitness area, extracts the features of the target fitness area, and obtains the user's posture;
  • Operation 4 compares several user postures with standard postures, and determines whether the user postures and standard postures are of the same type
  • Operation 4.1 trains the Siamese neural network model to obtain the trained standard gesture recognition model;
  • Operation 4.2 Obtain the bone key points of the standard pose and the user pose and the position coordinates corresponding to each bone key point.
  • The standard pose and the user pose both include 16 bone key points, each corresponding to a two-dimensional position coordinate; the 16 bone key points include the top of the head, the bottom of the head, the neck, the right shoulder, the right elbow, the right hand, the left shoulder, the left elbow, the left hand, the right hip, the right knee, the right foot, the left hip, the left knee, the left foot, and the patella;
  • Operation 4.3 Input the position coordinates corresponding to each bone key point of the standard posture and the user posture into the trained standard posture recognition model, and obtain the output vector V1 of the standard posture and the output vector V2 of the user posture respectively;
  • The neural network has a total of 4 layers, and the number of nodes in each layer from input to output is 32->64->128->100; that is, a 32-dimensional vector is input, a 100-dimensional vector is output, and the Euclidean distance is then computed in the n-dimensional (here 100-dimensional) output space.
  • The standard gesture recognition model outputs two high-dimensional vectors, which in this embodiment are 100-dimensional. If the two gestures belong to different types, as shown in Figure 2, the Euclidean distance between the points to which they are mapped in the high-dimensional space is very large; conversely, if the two poses belong to the same type, as shown in Figure 3, that distance is very small.
  • Operation 4.4 Obtain the Euclidean distance threshold T based on the standard gesture recognition model, and the threshold T is used to judge whether the user gesture and the standard gesture are of the same type;
  • If the Euclidean distance output by the standard gesture recognition model is less than or equal to the threshold T, the user gesture is of the same type as the standard gesture; if the Euclidean distance is greater than the threshold T, the user gesture and the standard gesture are of different types;
  • Operation 4.5 converts the Euclidean distance into the similarity score between the user pose and the standard pose, and obtains the user pose with the highest similarity score as the scoring result.
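Operation 4.5 can be sketched as follows. The `score_of` conversion here (any monotone decreasing function of distance) is a stand-in for the piecewise mapping described earlier, and all names are assumptions:

```python
# Among the user poses captured in the time period, keep the one whose
# similarity score is highest, i.e. whose Euclidean distance is smallest.
def score_of(distance):
    return 100.0 / (1.0 + distance)   # stand-in distance -> score conversion

def best_user_pose(distances_by_frame):
    """distances_by_frame: {frame_index: euclidean_distance} -> (frame, score)."""
    frame = min(distances_by_frame, key=distances_by_frame.get)  # min distance
    return frame, score_of(distances_by_frame[frame])

frame, score = best_user_pose({3: 2.0, 7: 0.5, 9: 4.1})
print(frame)             # 7
print(round(score, 2))   # 66.67
```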
  • If the Euclidean distance between the two poses exceeds the threshold T, they are considered not to be of the same type; otherwise, they are considered to be of the same type.
  • The ROC curve can be drawn, as shown in Figure 4; the area under the ROC curve, called the AUC, is a value from 0 to 1, and the larger the AUC, the better the model performance.
  • A good model judges as many pose pairs that truly belong to the same class as the same class as possible, while misjudging as few pairs that belong to different classes as possible.
  • The mapping relationship is as follows: when the actual distance t is in the interval [0, T-best], the similarity score s lies in [100, 40]; when the actual distance t is in (T-best, infinity), the similarity score s lies in (40, 0).
  • the threshold T is 40 in this embodiment.
  • The action evaluation method based on fitness teaching and training in the present disclosure is used by a fitness training method based on a fitness device, which guides the user through three stages: fitness action demonstration, action teaching, and normal follow-along.
  • The user's actions are recognized to judge whether the user performed an action in the first stage, whether the user followed along in the second stage, and whether the user's actions are up to standard in the third stage, with the scoring result fed back to the user.
  • the identification and comparison is performed according to the first time period or the second time period corresponding to the preset first standard posture or the second standard posture, and it is not necessary to perform identification and comparison on the entire video.
  • the effect is better, the result is obtained faster, the user's fitness effect is effectively guaranteed, and the user's fitness situation can be obtained at each stage, which improves the efficiency of fitness use and is more convenient for long-term use.
  • Operation 1 obtains the first fitness video; in the present embodiment, the fitness device is a fitness mirror, and the first fitness video is played on the mirror surface of the fitness mirror;
  • Operation 2 Obtain the first user's posture according to the first fitness video, and determine whether the user is performing fitness training;
  • Operation 2.1 The user follows the first fitness video in the target fitness area of the fitness mirror;
  • Operation 2.2 Preset the first standard posture in the first fitness video
  • Operation 2.3 acquires the second time period when the first standard posture appears in the first fitness video
  • the fitness mirror recognizes the user's actions in the target fitness area, and performs feature extraction on the target fitness area.
  • The feature extraction covers 16 bone key points, each corresponding to a two-dimensional position coordinate; the 16 bone key points include the top of the head, the bottom of the head, the neck, the right shoulder, the right elbow, the right hand, the left shoulder, the left elbow, the left hand, the right hip, the right knee, the right foot, the left hip, the left knee, the left foot, and the patella, giving the first user pose;
  • Operation 2.5: if the first user posture obtained within the second time period shows changing actions, the user is following along, i.e. performing fitness training; if no first user posture is obtained within the second time period, or the first user posture produces no action, the user is not following along, i.e. not performing fitness training. In Operation 2, the user's action is recognized during the second time period, but no scoring is performed.
  • Operation 3 obtains the posture of the second user according to the second fitness video, scores the posture of the second user, and judges whether the user follows the second fitness video for fitness training according to the scoring result;
  • Operation 3.1 Play the second fitness video on the mirror surface of the fitness mirror
  • Operation 3.2 Preset the second standard posture according to the second fitness video
  • Operation 3.3 Obtain the first time period when the second standard posture appears in the second fitness video
  • Operation 3.4 In the first time period, obtain the video segment corresponding to the second fitness video in the first time period, perform frame processing on the video segment, and obtain frame images corresponding to several consecutive moments of the video segment;
  • the fitness mirror recognizes the user's actions in the target fitness area, and performs feature extraction on the target fitness area.
  • The feature extraction covers 16 bone key points, each corresponding to a two-dimensional position coordinate; the 16 bone key points include the top of the head, the bottom of the head, the neck, the right shoulder, the right elbow, the right hand, the left shoulder, the left elbow, the left hand, the right hip, the right knee, the right foot, the left hip, the left knee, the left foot, and the patella, obtaining several second user gestures corresponding one-to-one to the frame images;
  • Operation 3.6 compares corresponding several frame images and several second user poses, and obtains the acquaintance score of each second user pose based on the corresponding frame images;
  • Operation 3.61 trains the Siamese neural network model to obtain the trained standard pose recognition model;
  • Operation 3.62 obtains the bone key points of the second standard pose and the second user pose and the position coordinates corresponding to each bone key point, wherein the second standard pose and the second user pose both include 16 bone key points, each corresponding to a two-dimensional position coordinate; the 16 bone key points include the top of the head, the bottom of the head, the neck, the right shoulder, the right elbow, the right hand, the left shoulder, the left elbow, the left hand, the right hip, the right knee, the right foot, the left hip, the left knee, the left foot, and the patella. The position coordinates of each bone key point of the second standard pose and the second user pose are input into the trained standard pose recognition model to obtain the output vector V1 of the second standard pose and the output vector V2 of the second user pose respectively.
  • Whether the second user gesture is of the same type as the second standard gesture is judged by the Euclidean distance between the output vector V1 of the second standard gesture and the output vector V2 of the second user gesture.
  • The neural network has a total of 4 layers, and the number of nodes in each layer from input to output is 32->64->128->100; that is, a 32-dimensional vector is input, a 100-dimensional vector is output, and the Euclidean distance is then computed in the n-dimensional (here 100-dimensional) output space.
  • Operation 3.7 converts the Euclidean distance into the similarity score between the user pose and the standard pose, and takes the user pose with the highest similarity score as the scoring result;
  • Operation 3.8 takes the second user pose with the highest similarity score as the scoring result; if the scoring result is greater than or equal to the scoring threshold, the second standard pose and the second user pose are of the same class and the user is performing fitness training; if the scoring result is less than the scoring threshold, the second standard pose and the second user pose are not of the same class and the user is not performing fitness training.
  • the Euclidean distance threshold T is obtained based on the standard pose recognition model, and the threshold T is used to judge whether the user pose and the standard pose are of the same type; if the Euclidean distance output by the standard pose recognition model is less than or equal to the threshold T, the user pose and the standard pose are of the same type; if the Euclidean distance output by the standard pose recognition model is greater than the threshold T, the user pose and the standard pose are not of the same type.
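The same-type judgment described here can be sketched in Python. This is a minimal illustration, not the disclosed implementation: the embedding vectors and the threshold value are hypothetical, and a smaller Euclidean distance means more similar poses.

```python
import math

def euclidean_distance(v1, v2):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v1, v2)))

def same_type(v1, v2, threshold_t):
    """Two poses count as the same type when their embedding distance
    does not exceed the threshold T."""
    return euclidean_distance(v1, v2) <= threshold_t

# Hypothetical embeddings (the disclosure uses 100 dimensions; shortened here).
standard = [0.1, 0.2, 0.3]
user_close = [0.12, 0.19, 0.31]
user_far = [0.9, -0.5, 1.3]

print(same_type(standard, user_close, threshold_t=0.5))  # True
print(same_type(standard, user_far, threshold_t=0.5))    # False
```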
  • Operation 4 acquires the first user posture according to the first fitness video, scores the first user posture, and feeds back the scoring result to the user;
  • Operation 4.1 The user follows the first fitness video in the target fitness area of the fitness mirror;
  • Operation 4.2 Preset the first standard posture in the first fitness video
  • Operation 4.3 acquires the second time period when the first standard posture appears in the first fitness video
  • the fitness mirror recognizes the user's actions in the target fitness area, and performs feature extraction on the target fitness area.
  • the feature extraction covers 16 skeleton key points, each corresponding to a two-dimensional position coordinate (top of the head, bottom of the head, neck, right shoulder, right elbow, right hand, left shoulder, left elbow, left hand, right hip, right knee, right foot, left hip, left knee, left foot, and patella), from which the first user pose is obtained;
  • Operation 4.5 compares the first standard pose with the several first user poses, and obtains the similarity score of each first user pose against the first standard pose;
  • Operation 4.51 inputs the skeleton key points of the first standard pose and the first user pose, together with the position coordinates of each key point, into the standard pose recognition model, which outputs the Euclidean distance between the first standard pose and the first user pose;
  • Operation 4.52 converts the Euclidean distance into a similarity score;
  • Operation 4.53 takes the first user pose with the highest similarity score as the scoring result; if the scoring result is greater than or equal to the scoring threshold, the first standard pose and the first user pose are of the same class and the user's action meets the standard; if the scoring result is less than the scoring threshold, they are not of the same class and the user's action does not meet the standard.
  • the Euclidean distance threshold T is obtained based on the standard pose recognition model, and the threshold T is used to judge whether the user pose and the standard pose are of the same type; if the Euclidean distance output by the standard pose recognition model is less than or equal to the threshold T, the user pose and the standard pose are of the same type; if the Euclidean distance output by the standard pose recognition model is greater than the threshold T, the user pose and the standard pose are not of the same type.
  • Operation 4.54 takes the first user pose with the highest similarity score as the scoring result, and feeds the scoring result back to the user;
  • the scoring threshold is 40.
  • Suppose the fitness video has played to the 10,000 ms mark. Because the user is practicing along with the video, their movements may lag behind or run ahead of the course, so an interval is set around 10,000 ms, for example 800 ms before and 200 ms after, i.e. the time interval [10000-800, 10000+200]. Within this interval, whose total duration is 1 second, the similarity between the first standard pose and the first user pose is calculated for each frame, and the highest similarity score in the interval is output as the final score.
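The interval logic can be sketched as follows. The frame timestamps and per-frame similarity values are hypothetical; the 800 ms / 200 ms margins follow the example in the text.

```python
def window_score(frames, anchor_ms, before_ms=800, after_ms=200):
    """frames: list of (timestamp_ms, similarity) pairs.
    Returns the highest similarity among frames falling inside
    [anchor - before, anchor + after], or None if the window is empty."""
    lo, hi = anchor_ms - before_ms, anchor_ms + after_ms
    in_window = [s for t, s in frames if lo <= t <= hi]
    return max(in_window) if in_window else None

# Hypothetical per-frame similarity scores around the 10,000 ms mark.
frames = [(9100, 55), (9500, 62), (9900, 88), (10150, 74), (10300, 40)]
print(window_score(frames, 10000))  # 88  (9100 ms and 10300 ms fall outside)
```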
  • both the first standard pose and the second standard pose are static poses, not continuous movements.
  • the specific comparison method is to train a Siamese network structure model based on a convolutional neural network, which accepts two poses and maps each of them to a point in a high-dimensional space.
  • the second fitness video may be a slow-motion version of the first fitness video, or an action breakdown of the first fitness video.
  • FIG. 5 is a schematic flowchart of a fitness training method based on a fitness device.
  • the present disclosure provides a fitness training method based on a fitness device.
  • the fitness training method is based on the above-mentioned action evaluation method based on fitness teaching and training.
  • the fitness The training method can be implemented as an application method of the above-mentioned action evaluation method based on fitness teaching and training, and the fitness training method includes:
  • Compare the second standard pose with the second user pose, obtain the similarity score of the second user pose against the second standard pose, and judge from the scoring result whether the user is following the second fitness video for fitness training;
  • identifying the target fitness area of the second fitness video, performing feature extraction on the target fitness area, and obtaining the second user posture specifically including:
  • the Siamese neural network model is trained to obtain the trained standard pose recognition model;
  • the scoring result is greater than or equal to the scoring threshold, then the second standard posture and the second user posture are of the same type, and the user performs fitness training;
  • the scoring result is less than the scoring threshold, the second standard posture and the second user posture are not in the same category, and the user has not performed fitness training.
  • identifying the target fitness area of the first fitness video, performing feature extraction on the target fitness area, and obtaining the first user posture specifically including:
  • the scoring result is greater than or equal to the scoring threshold, the first standard gesture and the first user gesture are of the same category, and the user action is up to the standard;
  • if the scoring result is less than the scoring threshold, the first standard gesture and the first user gesture are not of the same category, and the user action does not meet the standard.
  • Operation 1 obtains the first fitness video for fitness training; in the present embodiment, the fitness device is a fitness mirror, and the first fitness video is played on the mirror surface of the fitness mirror;
  • Operation 2 Obtain the first user's posture according to the first fitness video, and determine whether the user is performing fitness training;
  • Operation 2.1 The user follows the first fitness video in the target fitness area of the fitness mirror;
  • Operation 2.2 Preset the first standard posture in the first fitness video
  • Operation 2.3 acquires the second time period when the first standard posture appears in the first fitness video
  • the fitness mirror recognizes the user's actions in the target fitness area, extracts features from the target fitness area, and obtains the first user posture;
  • Operation 2.5 If the first user poses obtained within the second time period show changing actions, the user is following along, that is, the user is performing fitness training; if no first user pose is obtained within the second time period, or the obtained poses show no movement, the user is not following along, that is, the user is not performing fitness training. In Operation 2, the user's actions are recognized during the second time period, but no scoring is performed.
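Operation 2.5 only checks whether the pose changes over time, without scoring. A minimal sketch of this follow-along check (the poses and the motion threshold are hypothetical; each pose is a flat list of keypoint coordinates):

```python
def is_following(poses, motion_threshold=5.0):
    """poses: first-user poses captured in the second time period, each a
    flat coordinate list. The user counts as following along when at least
    one pair of consecutive poses differs noticeably."""
    if len(poses) < 2:
        return False  # no pose captured, or nothing to compare
    for prev, cur in zip(poses, poses[1:]):
        displacement = sum(abs(a - b) for a, b in zip(prev, cur))
        if displacement > motion_threshold:
            return True
    return False

still = [[100, 200, 150, 250]] * 3                      # no movement at all
moving = [[100, 200, 150, 250], [110, 190, 160, 240]]   # visible movement
print(is_following(still))   # False
print(is_following(moving))  # True
```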
  • Operation 3 obtains the posture of the second user according to the second fitness video, scores the posture of the second user, and judges whether the user follows the second fitness video for fitness training according to the scoring result;
  • Operation 3.1 Play the second fitness video on the mirror surface of the fitness mirror
  • Operation 3.2 Preset the second standard posture according to the second fitness video
  • Operation 3.3 Obtain the first time period when the second standard posture appears in the second fitness video
  • Operation 3.4 In the first time period, obtain the video segment corresponding to the second fitness video in the first time period, perform frame processing on the video segment, and obtain frame images corresponding to several consecutive moments of the video segment;
  • the fitness mirror recognizes the user's actions in the target fitness area, performs feature extraction on the target fitness area, and obtains a number of second user postures corresponding to the frame images one-to-one;
  • Operation 3.6 compares the corresponding frame images with the several second user poses, and obtains the similarity score of each second user pose against its corresponding frame image;
  • Operation 3.61 trains the Siamese neural network model to obtain the trained standard pose recognition model;
  • Operation 3.62 inputs the second user pose and the second standard pose into the standard pose recognition model to obtain the similarity score;
  • Operation 3.7 takes the second user pose with the highest similarity score as the scoring result; if the scoring result is greater than or equal to the scoring threshold, the second standard pose and the second user pose are of the same class and the user is performing fitness training; if the scoring result is less than the scoring threshold, they are not of the same class and the user is not performing fitness training.
  • Operation 4 acquires the first user posture according to the first fitness video, scores the first user posture, and feeds back the scoring result to the user;
  • Operation 4.1 The user follows the first fitness video in the target fitness area of the fitness mirror;
  • Operation 4.2 Preset the first standard posture in the first fitness video
  • Operation 4.3 acquires the second time period when the first standard posture appears in the first fitness video
  • the fitness mirror recognizes the user's actions in the target fitness area, extracts features from the target fitness area, and obtains the first user posture;
  • Operation 4.5 compares the first standard pose with the several first user poses, and obtains the similarity score of each first user pose against the first standard pose;
  • Operation 4.51 inputs the first user pose and the first standard pose into the standard pose recognition model to obtain a similarity score;
  • Operation 4.52 takes the first user pose with the highest similarity score as the scoring result;
  • if the scoring result is greater than or equal to the scoring threshold, the first standard pose and the first user pose are of the same class and the user's action meets the standard; if the scoring result is less than the scoring threshold, they are not of the same class and the user's action does not meet the standard.
  • Suppose the fitness video has played to the 10,000 ms mark. Because the user is practicing along with the video, their movements may lag behind or run ahead of the course, so an interval is set around 10,000 ms, for example 800 ms before and 200 ms after, i.e. the time interval [10000-800, 10000+200]. Within this interval, whose total duration is 1 second, the similarity between the first standard pose and the first user pose is calculated for each frame, and the highest similarity score in the interval is output as the final score.
  • both the first standard pose and the second standard pose are static poses, not continuous movements.
  • the specific comparison method is to train a Siamese network structure model based on a convolutional neural network, which accepts two poses and maps each of them to a point in a high-dimensional space.
  • the user pose and the standard pose are input into the standard pose recognition model, and the similarity score is obtained as follows:
  • the 16 skeleton key points are the top of the head, the bottom of the head, the neck, the right shoulder, the right elbow, the right hand, the left shoulder, the left elbow, the left hand, the right hip, the right knee, the right foot, the left hip, the left knee, the left foot, and the patella;
  • a human body pose thus has 16 skeleton points with two-dimensional coordinates, each with an x and a y component, so a pose can be abstracted into a 32-dimensional skeleton-point vector, namely [x1, y1, x2, y2, x3, y3, ..., x16, y16].
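The abstraction of a pose into a 32-dimensional skeleton-point vector can be sketched as follows. The keypoint names follow the list above; the coordinate values are hypothetical.

```python
KEYPOINTS = [
    "head_top", "head_bottom", "neck", "right_shoulder", "right_elbow",
    "right_hand", "left_shoulder", "left_elbow", "left_hand", "right_hip",
    "right_knee", "right_foot", "left_hip", "left_knee", "left_foot",
    "patella",
]

def pose_to_vector(keypoints):
    """keypoints: dict mapping keypoint name -> (x, y).
    Returns the flat 32-dimensional vector [x1, y1, ..., x16, y16]."""
    vec = []
    for name in KEYPOINTS:
        x, y = keypoints[name]
        vec.extend([x, y])
    return vec

# Hypothetical detection: keypoint i placed at (i, i + 1) for illustration.
pose = {name: (i, i + 1) for i, name in enumerate(KEYPOINTS)}
v = pose_to_vector(pose)
print(len(v))   # 32
print(v[:4])    # [0, 1, 1, 2]
```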
  • the 32-dimensional skeleton-point vector is mapped into a higher-dimensional vector.
  • the output vectors in this disclosure are 100-dimensional, that is, the output vector V1 of the standard pose and the output vector V2 of the user pose are both 100-dimensional, namely [a1, a2, a3, ..., a100].
  • the trained model maps the standard pose and the user pose each into a 100-dimensional vector, namely V1 and V2, and then the Euclidean distance between V1 and V2 is calculated.
  • This disclosure uses a deep neural network that accepts a 32-dimensional vector (a human body pose as defined in this disclosure), applies a series of intermediate-layer operations such as nonlinear rectification and fully connected layers, and finally outputs a 100-dimensional vector.
  • This 100-dimensional vector is a highly abstract feature; in the end, if the two poses are very similar, the Euclidean distance between the two 100-dimensional vectors output by the network is very small, otherwise, the Euclidean distance is very large.
  • Our neural network has a total of 4 layers.
  • the number of nodes in each layer from input to output is 32->64->128->100, that is, a 32-dimensional vector is input and a 100-dimensional vector is output.
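The 4-layer network (32 -> 64 -> 128 -> 100) can be sketched as a plain fully connected forward pass. This is an illustrative reconstruction, not the trained model from the disclosure: the weights are randomly initialized, and ReLU is assumed as the nonlinear rectification on the hidden layers.

```python
import numpy as np

rng = np.random.default_rng(0)
LAYER_SIZES = [32, 64, 128, 100]  # input -> hidden -> hidden -> output

# Randomly initialized weights stand in for the trained Siamese branch.
weights = [rng.standard_normal((m, n)) * 0.1
           for m, n in zip(LAYER_SIZES, LAYER_SIZES[1:])]
biases = [np.zeros(n) for n in LAYER_SIZES[1:]]

def embed(pose_vec):
    """Map a 32-dim skeleton vector to a 100-dim embedding.
    ReLU on hidden layers; the final layer is left linear."""
    h = np.asarray(pose_vec, dtype=float)
    for i, (w, b) in enumerate(zip(weights, biases)):
        h = h @ w + b
        if i < len(weights) - 1:
            h = np.maximum(h, 0.0)  # nonlinear rectification
    return h

standard_vec = rng.standard_normal(32)
user_vec = standard_vec + 0.01 * rng.standard_normal(32)  # a similar pose
distance = np.linalg.norm(embed(standard_vec) - embed(user_vec))
print(embed(standard_vec).shape)  # (100,)
```

Both poses pass through the same branch (shared weights), which is what makes the structure Siamese; similar input poses land close together in the 100-dimensional output space.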
  • the Euclidean distance threshold T is obtained based on the standard pose recognition model, and the threshold T is used to judge whether the user pose and the standard pose are of the same type; if the Euclidean distance output by the standard pose recognition model is less than or equal to the threshold T, the user pose and the standard pose are of the same type; if the Euclidean distance output by the standard pose recognition model is greater than the threshold T, the user pose and the standard pose are not of the same type;
  • the Euclidean distance is converted into the similarity score between the user pose and the standard pose, and the user pose with the highest similarity score is taken as the scoring result. Specifically, if the Euclidean distance between the two poses exceeds the threshold T, they are considered not to be of the same type; otherwise, they are considered to be of the same type.
  • the ROC curve can be drawn, as shown in FIG. 4; the area under the ROC curve, called the AUC, is a value between 0 and 1, and the larger the AUC, the better the model performance. An optimal threshold T-best is then found on the test set.
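One common way to turn a labelled test set of pose-pair distances into an operating threshold is to sweep candidate thresholds along the ROC curve and keep the one with the best true-positive/false-positive trade-off (Youden's index). This is a sketch of that general technique with hypothetical data, not the disclosure's exact selection procedure:

```python
def best_threshold(distances, same_class):
    """distances: Euclidean distances for pose pairs; same_class: matching
    boolean labels (True = pair truly belongs to the same class).
    A pair is predicted 'same' when its distance <= threshold."""
    best_t, best_j = None, -1.0
    positives = sum(same_class)
    negatives = len(same_class) - positives
    for t in sorted(set(distances)):
        tp = sum(d <= t and s for d, s in zip(distances, same_class))
        fp = sum(d <= t and not s for d, s in zip(distances, same_class))
        tpr = tp / positives if positives else 0.0
        fpr = fp / negatives if negatives else 0.0
        j = tpr - fpr  # Youden's index: maximize hits, minimize false alarms
        if j > best_j:
            best_t, best_j = t, j
    return best_t

# Hypothetical test pairs: small distances are mostly same-class.
dists = [0.2, 0.4, 0.5, 1.1, 1.4, 2.0]
labels = [True, True, True, False, True, False]
print(best_threshold(dists, labels))  # 0.5
```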
  • with this threshold, the model judges as many pose pairs that truly belong to the same class as possible to be of the same class, while misjudging as few pose pairs from different classes as possible to be of the same class.
  • Given T-best, a critical score is set according to actual business needs, for example 40 points, meaning that at this score the model considers the two poses to be exactly at the critical point between similar and dissimilar. The mapping relationship is then: when the actual distance t is in the interval [0, T-best], the similarity score s lies in [100, 40]; when the actual distance t is in (T-best, infinity), the similarity score s lies in (40, 0).
  • the threshold T is 40 in this embodiment.
  • Both the standard pose and the user pose include 16 bone key points, each of which corresponds to a two-dimensional position coordinate.
  • the 16 skeleton key points include the top of the head, the bottom of the head, the neck, the right shoulder, the right elbow, the right hand, the left shoulder, the left elbow, the left hand, the right hip, the right knee, the right foot, the left hip, the left knee, the left foot, and the patella.
  • this embodiment is based on the third embodiment, and the method is further applied to somatosensory games.
  • in parkour games, a small character or animal that mirrors the user is set up.
  • obstacles that the small character or animal needs to avoid are placed on the road, requiring it to jump or lean left/right to get past them. The user twisting the waist to the left/right corresponds to the character or animal leaning left/right, and the user jumping in place corresponds to the character or animal jumping.
  • other actions can also be configured; for example, the user raising the legs corresponds to the character or animal running faster.
  • the user follows the first fitness video in the target fitness area of the fitness mirror;
  • the fitness mirror recognizes the user's actions in the target fitness area, extracts the features of the target fitness area, and obtains the first user poses corresponding to standing, twisting the waist left and right, jumping in place, and raising the legs;
  • if the first user poses acquired within the second time period show changing actions, the user is following along, that is, the user is performing fitness training; if no first user pose is acquired within the second time period, or the acquired poses show no movement, the user is not following along, that is, the user is not performing fitness training.
  • the user's actions are recognized, but scoring is not performed during this process.
  • the fitness mirror recognizes the user's actions in the target fitness area, extracts the features of the target fitness area, and obtains the second user poses corresponding to standing, twisting the waist left and right, jumping in place, and raising the legs.
  • each second user's posture is scored, and it is judged according to the scoring result whether the user follows the second fitness video for fitness training.
  • the fitness mirror recognizes the user's actions in the target fitness area, extracts the features of the target fitness area, and obtains the first user poses corresponding to standing, twisting the waist left and right, jumping in place, and raising the legs. On the basis of Embodiment 3, each first user pose is scored against the standard pose, and the scoring result is fed back to the user.
  • the fitness video is a game video
  • the movement of the small character or animal in the game video can be controlled through different standard poses, so when the present disclosure recognizes the user's actions, the fitness training method of the present disclosure can be used not only to identify actions and to evaluate and score them;
  • the user's poses can also control the movement of the small character or animal in the fitness video: standing controls it to move forward, twisting the waist left or right makes it lean its body to the left or right, jumping in place makes it jump, and raising the legs makes it run faster.
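The control scheme in this embodiment amounts to a dispatch from recognized pose to game command, which can be sketched as a simple lookup table. The pose labels and action names here are illustrative, not identifiers from the disclosure:

```python
# Illustrative mapping from a recognized user pose to a game command.
POSE_TO_ACTION = {
    "standing": "move_forward",
    "twist_left": "lean_left",
    "twist_right": "lean_right",
    "jump_in_place": "jump",
    "raise_legs": "run_faster",
}

def control_character(recognized_pose):
    """Return the game action for a recognized pose, or None to ignore
    poses with no configured meaning."""
    return POSE_TO_ACTION.get(recognized_pose)

print(control_character("jump_in_place"))  # jump
print(control_character("wave_hands"))     # None (unmapped pose is ignored)
```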
  • Embodiment five
  • FIG. 6 may be a schematic diagram of the composition of an action evaluation system based on fitness teaching and training.
  • Embodiment 5 of the present disclosure provides an action evaluation system based on fitness teaching and training.
  • the action evaluation system includes:
  • the obtaining module is used to obtain the fitness video, and obtain the time period corresponding to the preset standard posture in the fitness video, and obtain the frame images of the continuous moments corresponding to the fitness video in the time period;
  • An identification module configured to identify several user gestures corresponding to the frame images at consecutive moments in the above time period
  • a comparison module is used to compare several user postures and standard postures to obtain comparison results
  • the judging module is used to judge whether the user posture and the standard posture are of the same type according to the comparison result.
  • the comparison module compares the several user poses with the standard pose, specifically including training a Siamese neural network model to obtain a trained standard pose recognition model;
  • the judging module judges whether the user posture and the standard posture are of the same type according to the comparison result, specifically including inputting the user posture and the standard posture into the standard posture recognition model, and judging whether the user posture and the standard posture are of the same type.
  • the judging module inputs the user posture and the standard posture into the standard posture recognition model, and judges whether the user posture and the standard posture are of the same type, specifically including:
  • the judgment module obtains the Euclidean distance threshold T based on the standard posture recognition model, and the threshold T is used to judge whether the user posture and the standard posture are of the same type, wherein, if the Euclidean distance output by the standard posture recognition model is less than or equal to the threshold T, the user pose is of the same type as the standard pose, and if the Euclidean distance output by the standard pose recognition model is greater than the threshold T, then the user pose is of a different type from the standard pose.
  • the standard posture and the user posture both include 16 skeleton key points, and the 16 skeleton key points correspond to a two-dimensional position coordinate respectively, and the 16 skeleton key points include the top of the head, the bottom of the head, the neck, and the right shoulder , right elbow, right hand, left shoulder, left elbow, left hand, right hip, right knee, right foot, left hip, left knee, left foot, patella.
  • the judging module inputs the user pose and the standard pose into the standard pose recognition model, outputs the similarity score between the user pose and the standard pose, takes the user pose with the highest similarity score as the scoring result, and determines from the scoring result whether the user pose is of the same type as the standard pose.
  • Embodiment 6 of the present disclosure provides a fitness training system based on a fitness device.
  • the fitness training system in Embodiment 6 can be implemented as an application system of the above-mentioned action evaluation system.
  • the fitness training system in Embodiment 6 executes:
  • FIG. 6 can be a schematic diagram of the composition of a fitness training system based on a fitness device.
  • Embodiment 7 of the present disclosure provides a fitness training system based on a fitness device.
  • the fitness training system in Embodiment 7 can be implemented as an application system of the above-mentioned action evaluation system, and the fitness training system includes:
  • the obtaining module is used to obtain the fitness video and process the fitness video
  • the identification module is used to identify the target fitness area of the fitness video, and extracts the features of the target fitness area to obtain the user's posture;
  • a comparison module is used to compare the user's posture with a preset standard posture to obtain a comparison result
  • the judging module is used to judge whether the user performs fitness training or whether the action meets the standard according to the comparison result.
  • Embodiment 8 of the present disclosure provides an electronic device, including a memory, a processor, and a computer program stored in the memory and operable on the processor.
  • When the processor executes the computer program, the action evaluation method based on fitness teaching and training is implemented.
  • the processor may be a central processing unit, or another general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
  • the memory can be used to store the computer programs and/or modules, and the processor realizes the various functions of the disclosed action evaluation device based on fitness teaching and training by running or executing the computer programs and/or modules stored in the memory and invoking the data stored in the memory.
  • the memory may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, at least one application required by a function (such as a sound playback function, an image playback function, etc.) and the like.
  • the memory can include high-speed random access memory, and can also include non-volatile memory, such as a hard disk, internal memory, plug-in hard disk, smart memory card, secure digital card, flash memory card, at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device.
  • Embodiment 9 of the present disclosure provides an electronic device, including a memory, a processor, and a computer program stored in the memory and operable on the processor.
  • When the processor executes the computer program, the fitness training method based on a fitness device is implemented.
  • the processor may be a central processing unit, or another general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor, or the like.
  • the memory can be used to store the computer programs and/or modules, and the processor realizes the various functions of the disclosed fitness training device based on the fitness device by running or executing the computer programs and/or modules stored in the memory and invoking the data stored in the memory.
  • the memory may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, at least one application required by a function (such as a sound playback function, an image playback function, etc.) and the like.
  • the memory can include high-speed random access memory, and can also include non-volatile memory, such as a hard disk, internal memory, plug-in hard disk, smart memory card, secure digital card, flash memory card, at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device.
  • Embodiment 10 of the present disclosure provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and when the computer program is executed by a processor, the exercise evaluation method based on fitness teaching and training is realized.
  • the computer storage medium in the embodiments of the present disclosure may use any combination of one or more computer-readable media.
  • the computer readable medium may be a computer readable signal medium or a computer readable storage medium.
  • a computer readable storage medium may be, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • Embodiment 11 of the present disclosure provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the fitness training method based on a fitness device is implemented.
  • the computer storage medium in the embodiments of the present disclosure may use any combination of one or more computer-readable media.
  • the computer readable medium may be a computer readable signal medium or a computer readable storage medium.
  • a computer readable storage medium may be, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.

Abstract

A motion evaluation method, system, and apparatus based on fitness teaching training, and a medium, relating to the field of fitness. The method comprises: obtaining a fitness video, a standard posture being preset in the fitness video; obtaining a time period corresponding to the standard posture in the fitness video; obtaining image frames at continuous time points corresponding to the fitness video within the time period; identifying a plurality of user postures corresponding to the image frames at the continuous time points within the time period; and comparing the plurality of user postures with the standard posture, and determining whether a user posture and the standard posture are of the same type. During comparison, a model is constructed by means of a convolutional neural network, and identification is implemented by means of 16 skeleton key points of the human body and respective two-dimensional position coordinates thereof, such that the accuracy is high, whether a user posture is standard can be obtained quickly, and the use efficiency is improved.

Description

Motion evaluation method and system based on fitness teaching and training

Technical Field

The present disclosure relates to the field of fitness, and in particular to a motion evaluation method, system, apparatus, and medium based on fitness teaching and training.

Background
In recent years, national health awareness has risen steadily and demand for fitness exercise has kept growing, making the fitness industry a huge market. Intelligent fitness equipment of all kinds is developing rapidly. Existing mirror-type fitness devices house various components inside the body and use a front screen for display and/or mirroring, so that users can train against the displayed fitness content. In use, a fitness coach can teach multiple users through live video or pre-recorded video; such teaching is usually pitched as unified instruction at the users' approximate fitness level, such as level-2 flow yoga. However, this way of teaching has many problems: for example, there is no timely feedback on whether users are following along, and if a user does not follow along, or follows along in a non-standard way, the fitness effect is poor, making long-term use inconvenient.
During teaching, the user's movements need to be monitored and evaluated to see whether they meet the standard. At present, common human-behavior detection methods include electromyographic (EMG) detection, airbag-sensor information acquisition, and visual-image methods. EMG detection uses the bio-electrical signals generated by body movement to recognize human motion, but the user must wear sensors; it is mostly used for scientific research in specific scenarios and does not meet ordinary fitness needs. The airbag method is similar to EMG detection: relevant sensors must likewise be worn during exercise to acquire motion information. Therefore, neither EMG detection nor airbag-style methods are suitable for posture correction in everyday fitness exercise, since both require the user to wear sensors. Visual-image methods use images captured by the user's device to estimate the user's posture and movements, for example via user contour estimation or user skeleton-map estimation. The main application is OpenPose, which trains a graph neural network on a large amount of human-activity data and labels to recognize human poses. In actual use, however, its pose-recognition ability is poor, and it cannot reliably detect whether the user's movement is standard.
Summary of the Invention

The purpose of the present disclosure is, when training a user through a fitness video, to better capture the user's training situation, better capture the user's posture, and judge whether the user's posture meets the standard, thereby ensuring the fitness effect of training with the video.
To achieve the above purpose, the present disclosure provides a motion evaluation method based on fitness teaching and training, including:

obtaining a fitness video, in which a standard posture is preset;

obtaining the time period corresponding to the standard posture in the fitness video;

obtaining frame images at consecutive moments of the fitness video within that time period;

recognizing a number of user postures corresponding to the frame images at consecutive moments within that time period; and

comparing the user postures with the standard posture, and judging whether a user posture and the standard posture are of the same type.
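The steps above can be sketched as a minimal pipeline. The helper names `extract_frames`, `detect_pose`, and `same_type` are hypothetical placeholders for the frame extraction, pose recognition, and model-based comparison described later; this is an illustrative sketch under those assumptions, not the patented implementation.

```python
# Hypothetical sketch of the evaluation pipeline described above.
# extract_frames, detect_pose, and same_type are placeholder callables.
def evaluate_period(video, period, standard_pose, extract_frames, detect_pose, same_type):
    frames = extract_frames(video, period)            # frames within the preset time period
    user_poses = [detect_pose(f) for f in frames]     # one user pose per frame
    # The check passes if any pose in the period matches the standard posture's type.
    return any(same_type(p, standard_pose) for p in user_poses)

# Dummy stand-ins for the placeholders, purely for illustration:
result = evaluate_period(
    video=[10, 20, 30], period=None, standard_pose=20,
    extract_frames=lambda v, p: v,
    detect_pose=lambda f: f,
    same_type=lambda a, b: a == b,
)
```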
In the present disclosure, when evaluating whether the user's movement is of the same type, not every standard posture of the coach in the fitness video is compared against the user's posture; instead, only the preset standard postures are compared, and the user postures are captured within the time period in which a standard posture appears in the video. Since both the standard posture and the user posture are single static postures, the comparison is easier to perform.

Meanwhile, a standard posture of the present disclosure appears at a specific time point in the fitness video, but when recognizing the user's posture, the present disclosure captures it over the interval around that time point, i.e. a time period, and compares the user posture in every frame of that period against the standard posture. This is because the user trains along with the video: a novice who has never taken the course tends to lag behind it, whereas a user who is already familiar with the course may move ahead of it. Therefore, in use, the present disclosure expands the time point at which the standard posture appears into a time period, which makes the comparison more accurate and more efficient.
Further, the present disclosure compares the user postures with the standard posture and judges whether they are of the same type specifically by means of a Siamese neural network model, including:

training the Siamese neural network model to obtain a trained standard posture recognition model; and

inputting the user posture and the standard posture into the standard posture recognition model, and judging whether the user posture and the standard posture are of the same type.

When training the Siamese network, the model of the present disclosure is trained on a large number of posture samples, such as raising a hand or lifting a leg, so that the model learns how to extract posture features, i.e. the mapping from 32 dimensions to 100 dimensions.
Further, inputting the user posture and the standard posture into the standard posture recognition model and judging whether they are of the same type specifically includes:

obtaining the skeletal key points of the standard posture and the user posture and the position coordinates corresponding to each key point;

inputting the position coordinates of each skeletal key point of the standard posture and the user posture into the trained standard posture recognition model to obtain an output vector V1 for the standard posture and an output vector V2 for the user posture;

calculating the Euclidean distance between the output vector V1 of the standard posture and the output vector V2 of the user posture; and

judging from that Euclidean distance whether the user posture and the standard posture are of the same type.
In the present disclosure, a human posture comprises 16 skeletal points with two-dimensional coordinates; each point has x and y components, so a posture can be abstracted into a 32-dimensional skeletal-point vector, namely [x1, y1, x2, y2, x3, y3, ..., x16, y16]. After passing through the trained posture recognition model, this 32-dimensional vector is mapped into a higher-dimensional vector; in the present disclosure the output is 100-dimensional, i.e. both the standard-posture output vector V1 and the user-posture output vector V2 are 100-dimensional, of the form [a1, a2, a3, ..., a100]. When comparing postures, the standard posture and the user posture are each mapped by the trained model into a 100-dimensional vector, V1 and V2, and the Euclidean distance between V1 and V2 is then computed.
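As a concrete illustration of the 32-dimensional abstraction, the 16 two-dimensional keypoints can be flattened as follows. The keypoint ordering used here is an assumption; the disclosure only fixes the set of points, not their order in the vector.

```python
def flatten_pose(keypoints):
    """Flatten 16 (x, y) skeletal keypoints into the 32-dim vector
    [x1, y1, x2, y2, ..., x16, y16] described above."""
    if len(keypoints) != 16:
        raise ValueError("expected exactly 16 skeletal keypoints")
    vec = []
    for x, y in keypoints:
        vec.extend([float(x), float(y)])
    return vec

# Dummy coordinates, purely for illustration:
pose_vec = flatten_pose([(i, i + 0.5) for i in range(16)])
```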
The present disclosure uses a deep neural network that accepts a 32-dimensional vector (one human posture in this disclosure) and, after a series of intermediate-layer operations such as nonlinear rectification and full connection, finally outputs a 100-dimensional vector. This 100-dimensional vector is a highly abstract feature: if two postures are very similar, the Euclidean distance between the two 100-dimensional output vectors is small; otherwise, it is large.

The neural network has 4 layers in total, with node counts from input to output of 32 -> 64 -> 128 -> 100; that is, it takes a 32-dimensional vector and outputs a 100-dimensional vector. The Euclidean distance formula for n-dimensional space is used, with the present disclosure mapping to 100 dimensions, i.e. n = 100:
d(V1, V2) = sqrt((a1 - b1)^2 + (a2 - b2)^2 + ... + (an - bn)^2), where V1 = [a1, ..., an], V2 = [b1, ..., bn], and n = 100.
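A minimal numpy sketch of such a 32 -> 64 -> 128 -> 100 network and the n-dimensional Euclidean distance is shown below. The layer sizes and rectification follow the description above; the random weights are placeholders standing in for the trained Siamese branch, and everything else is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
# Fully connected layers 32 -> 64 -> 128 -> 100; random placeholder weights,
# NOT the trained model of the disclosure.
weights = [rng.normal(scale=0.1, size=s) for s in [(32, 64), (64, 128), (128, 100)]]

def embed(pose_vec):
    """Map a 32-dim pose vector to a 100-dim embedding."""
    h = np.asarray(pose_vec, dtype=float)
    for i, w in enumerate(weights):
        h = h @ w
        if i < len(weights) - 1:          # "nonlinear rectification" on hidden layers
            h = np.maximum(h, 0.0)
    return h

def euclidean(v1, v2):
    """n-dimensional Euclidean distance, as in the formula above."""
    diff = np.asarray(v1, dtype=float) - np.asarray(v2, dtype=float)
    return float(np.sqrt(np.sum(diff ** 2)))

v1 = embed(np.ones(32))                   # e.g. the standard posture
v2 = embed(np.zeros(32))                  # e.g. a user posture
```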
The user posture and the standard posture are input into the standard posture recognition model to judge whether they are of the same type. Specifically, the model outputs a similarity score between the user posture and the standard posture; the user posture with the highest similarity score is taken as the scoring result, and that result is used to judge whether the user posture and the standard posture are of the same type.

Preferably, a Euclidean-distance threshold T is obtained on the basis of the standard posture recognition model and is used to judge whether the user posture and the standard posture are of the same type: if the Euclidean distance output by the model is less than or equal to the threshold T, the user posture and the standard posture are of the same type; if it is greater than the threshold T, they are of different types.

Furthermore, after the Siamese network model is trained, a Euclidean-distance threshold T is sought on the test set: if the Euclidean distance between two postures exceeds T, they are considered not of the same type; otherwise, they are considered of the same type.
For each threshold T, an ROC curve can be drawn. The area under the ROC curve, called AUC, is a value between 0 and 1; the larger the AUC, the better the model performance. An optimal threshold T-best is found that maximizes the AUC on the test set. Informally, maximizing the AUC means the model judges as many genuinely same-type posture pairs as possible to be the same type while misjudging as few different-type pairs as possible. After the optimal distance threshold T-best is obtained, a critical score is set according to actual business needs, for example 40 points, meaning that at this score the model considers the two postures to be exactly at the boundary between similar and dissimilar. The mapping is as follows: when the actual distance t is in the interval [0, T-best], the similarity score s lies in [100, 40]; when t is in (T-best, infinity), s lies in (40, 0).
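The distance-to-score mapping above can be sketched as a piecewise function. The linear segment from [0, T-best] onto [100, 40] follows the text; how scores decay beyond T-best is not specified in the disclosure, so the exponential tail below is purely an assumed choice.

```python
import math

def similarity_score(t, t_best):
    """Map a Euclidean distance t to a similarity score in (0, 100]."""
    if t < 0:
        raise ValueError("distance must be non-negative")
    if t <= t_best:
        # [0, t_best] maps linearly onto [100, 40], per the disclosure
        return 100.0 - 60.0 * (t / t_best)
    # (t_best, inf) maps onto (40, 0); exponential decay is an assumption
    return 40.0 * math.exp(-(t - t_best) / t_best)
```

At t = T-best the score is exactly the critical 40 points, and it falls toward 0 as the distance grows without ever going negative.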
Moreover, the standard posture and the user posture each comprise 16 skeletal key points, each corresponding to a two-dimensional position coordinate. The 16 key points are: top of the head, base of the head, neck, right shoulder, right elbow, right hand, left shoulder, left elbow, left hand, right hip, right knee, right foot, left hip, left knee, left foot, and patella.
The present disclosure also provides a fitness training method based on a fitness device. The fitness training method is based on the above motion evaluation method based on fitness teaching and training, and includes:

obtaining a first user posture according to a first fitness video, and judging whether the user is performing fitness training;

obtaining a second user posture according to a second fitness video, scoring the second user posture, and judging from the scoring result whether the user is following the second fitness video for fitness training; and

obtaining the first user posture according to the first fitness video, scoring the first user posture, and feeding the scoring result back to the user.

In the present disclosure, the fitness video is played three times. The first playback mainly serves as a demonstration: by playing the video, the user learns the training content. At this stage the system only checks whether the user is making a simple attempt to follow along; it judges whether any movement occurs, but does not score that movement or judge whether it is the same movement. The second playback is the first fitness video in slow motion: the second fitness video may be a slow-motion playback of the first, or a movement-by-movement breakdown of it. During this playback, the system monitors whether the user follows along with the second video and scores the user's movements, judging whether the user performed the same movement as in the video and hence whether the user followed along. The third playback uses the same video as the first, but this time the user's movements are recognized and then scored to judge whether they meet the standard, thereby improving the user's fitness results. In the present disclosure, the videos of the first, second, and third playbacks all have the same content, except that the video of the second playback is a slow-motion or movement-breakdown version of that of the first.
Obtaining the first user posture according to the first fitness video and judging whether the user is performing fitness training specifically includes:

obtaining the first fitness video: the first fitness video is played directly on the fitness device, which may be a smart fitness mirror or any other fitness device capable of playing video;

while playing the first fitness video, the fitness device recognizes the target fitness area of the first fitness video; since the user exercises within the target fitness area, feature extraction on that area yields the first user posture; and

judging from the first user posture whether the user is performing fitness training, i.e. judging whether any movement occurs: if movement occurs, the user is training; if no movement occurs, the user is not training, the effect of the first playback is poor, and there is no guarantee that the user will quickly master the movements during the second or third playback.
After the first playback, the user has a preliminary familiarity with the movements in the fitness video. Then, obtaining the second user posture according to the second fitness video, scoring it, and judging from the scoring result whether the user is following the second video for fitness training specifically includes:

presetting a second standard posture according to the second fitness video;

recognizing the target fitness area of the second fitness video, and performing feature extraction on the target fitness area to obtain the second user posture; and

comparing the second standard posture with the second user posture to obtain a similarity score of the second user posture against the second standard posture, and judging from the scoring result whether the user is following the second fitness video for fitness training.
Recognizing the target fitness area of the second fitness video and performing feature extraction on it to obtain the second user posture specifically includes:

obtaining a first time period in which the second standard posture appears in the second fitness video;

obtaining the video segment of the second fitness video corresponding to the first time period, and splitting it into frames to obtain frame images at a number of consecutive moments;

recognizing the target fitness area of the second fitness video within the first time period and performing feature extraction on it to obtain a number of second user postures in one-to-one correspondence with the frame images;

comparing the corresponding frame images and second user postures to obtain a similarity score of each second user posture against its corresponding frame image; and

taking the second user posture with the highest similarity score as the scoring result.

During the second playback, the user's movements are scored to judge whether the user is following the second fitness video. In this process, the present disclosure does not score every frame of the second video; instead, a second standard posture is preset, the first time period in which it appears in the second video is identified, and the second user posture corresponding to each frame image in that period is obtained. The corresponding frame images and second user postures are then compared to obtain a similarity score of each second user posture against its corresponding frame image. When judging whether the user is following along, obtaining the user's second postures within the time period given by the preset second standard posture, and comparing each frame image with its corresponding second user posture, allows a better judgment of whether the user is following along.
For the present disclosure, the specific scoring process includes:

training the Siamese neural network model to obtain a trained standard posture recognition model;

inputting the second user posture and the second standard posture into the standard posture recognition model to obtain a similarity score;

if the scoring result is greater than or equal to the scoring threshold, the second standard posture and the second user posture are of the same type, and the user is performing fitness training; and

if the scoring result is less than the scoring threshold, the second standard posture and the second user posture are not of the same type, and the user is not performing fitness training.
The third playback, i.e. playing the first fitness video again, specifically includes:

obtaining the first fitness video, i.e. playing the first fitness video on the fitness device, and presetting a first standard posture according to the first fitness video;

recognizing the target fitness area of the first fitness video and performing feature extraction on it to obtain the first user posture; and

comparing the first standard posture with the first user posture to obtain a similarity score of the first user posture against the first standard posture, and feeding the scoring result back to the user.
Recognizing the target fitness area of the first fitness video and performing feature extraction on it to obtain the first user posture specifically includes:

obtaining a second time period in which the first standard posture appears in the first fitness video;

recognizing the target fitness area of the first fitness video within the second time period and performing feature extraction on it to obtain a number of first user postures;

comparing the first standard posture with the first user postures to obtain a similarity score of each first user posture against the first standard posture; and

taking the first user posture with the highest similarity score as the scoring result.
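The per-frame scoring and best-frame selection above can be sketched as follows; `score_fn` stands in for the model-based similarity scoring described earlier, and all names are illustrative.

```python
def best_pose_score(user_poses, standard_pose, score_fn):
    """Score each user pose captured in the time period against the standard
    posture and keep the highest score as the evaluation result."""
    if not user_poses:
        raise ValueError("no user poses captured in the time period")
    return max(score_fn(p, standard_pose) for p in user_poses)

# Illustrative stand-in scorer: closer poses score higher.
toy_score = lambda p, s: 100 - abs(p - s)
best = best_pose_score([10, 18, 25], 20, toy_score)
```

Taking the maximum over the period tolerates the user lagging behind or running ahead of the video, which is the motivation for expanding the time point into a time period.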
Comparing the second and third playbacks: in the second playback, each frame image within the first time period is compared with the corresponding second user posture, whereas in the third playback, the first user postures within the second time period are compared with the first standard posture. In this way, during the third playback it can be better confirmed whether the movements the user follows meet the standard, improving the efficiency of fitness teaching and training.
Corresponding to the motion evaluation method based on fitness teaching and training in the present disclosure, the present disclosure also provides a motion evaluation system based on fitness teaching and training, including:

an obtaining module, configured to obtain a fitness video, obtain the time period corresponding to the standard posture preset in the fitness video, and obtain frame images at consecutive moments of the fitness video within that time period;

a recognition module, configured to recognize a number of user postures corresponding to the frame images at consecutive moments within the above time period;

a comparison module, configured to compare the user postures with the standard posture to obtain a comparison result; and

a judgment module, configured to judge from the comparison result whether a user posture and the standard posture are of the same type.

Further, the comparison module compares the user postures with the standard posture specifically by training a Siamese neural network model to obtain a trained standard posture recognition model; and

the judgment module judges from the comparison result whether the user posture and the standard posture are of the same type, specifically by inputting the user posture and the standard posture into the standard posture recognition model and judging whether they are of the same type.
Further, the judgment module inputs the user posture and the standard posture into the standard posture recognition model and judges whether they are of the same type, specifically including:

obtaining the skeletal key points of the standard posture and the user posture and the position coordinates corresponding to each key point;

inputting the position coordinates of each skeletal key point of the standard posture and the user posture into the trained standard posture recognition model to obtain an output vector V1 for the standard posture and an output vector V2 for the user posture;

calculating the Euclidean distance between the output vector V1 of the standard posture and the output vector V2 of the user posture; and

judging from that Euclidean distance whether the user posture and the standard posture are of the same type.
Further, the judgment module obtains a Euclidean-distance threshold T on the basis of the standard posture recognition model; the threshold T is used to judge whether the user posture and the standard posture are of the same type: if the Euclidean distance output by the model is less than or equal to the threshold T, the user posture and the standard posture are of the same type; if it is greater than the threshold T, they are of different types.
Further, the standard posture and the user posture each comprise 16 skeletal key points, each corresponding to a two-dimensional position coordinate. The 16 key points are: top of the head, base of the head, neck, right shoulder, right elbow, right hand, left shoulder, left elbow, left hand, right hip, right knee, right foot, left hip, left knee, left foot, and patella.
Further, the judgment module inputs the user posture and the standard posture into the standard posture recognition model, outputs the similarity score between the user posture and the standard posture, takes the user posture with the highest similarity score as the scoring result, and judges from the scoring result whether the user posture and the standard posture are of the same type.
Corresponding to the fitness training method based on a fitness device in the present disclosure, the present disclosure also provides a fitness training system based on a fitness device. The fitness training system is based on the above motion evaluation system based on fitness teaching and training, and executes:

obtaining a first user posture according to a first fitness video, and judging whether the user is performing fitness training;

obtaining a second user posture according to a second fitness video, scoring the second user posture, and judging from the scoring result whether the user is following the second fitness video for fitness training; and

obtaining the first user posture according to the first fitness video, scoring the first user posture, and feeding the scoring result back to the user.
The present disclosure further provides another fitness-device-based fitness training system, based on the above action evaluation system for fitness teaching and training, the fitness training system comprising:
an acquisition module, configured to acquire a fitness video and process the fitness video;
an identification module, configured to identify the target fitness area of the fitness video and perform feature extraction on the target fitness area to obtain the user pose;
a comparison module, configured to compare the user pose with a preset standard pose to obtain a comparison result; and
a judging module, configured to judge, according to the comparison result, whether the user is performing fitness training or whether the action meets the standard.
Corresponding to the action evaluation method based on fitness teaching and training in the present disclosure, the present disclosure further provides an electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the above action evaluation method based on fitness teaching and training.
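As a minimal illustration of how the four modules above could fit together, the following Python sketch wires acquisition, identification, comparison, and judging into one class. Every name in it (FitnessTrainingSystem, acquire, identify, compare, judge, and the default threshold of 40) is an assumption chosen for illustration, not an identifier from the disclosure.

```python
class FitnessTrainingSystem:
    """Illustrative skeleton of the four modules described above."""

    def acquire(self, video_source):
        # Acquisition module: obtain the fitness video and split it into frames.
        return list(video_source)

    def identify(self, frame):
        # Identification module: locate the target fitness area and extract
        # the user pose (16 skeletal key points, each a 2-D coordinate).
        raise NotImplementedError

    def compare(self, user_pose, standard_pose):
        # Comparison module: return a similarity score between the two poses.
        raise NotImplementedError

    def judge(self, score, threshold=40):
        # Judging module: decide whether the action meets the standard.
        return score >= threshold
```

A concrete system would fill in identify and compare with the skeletal-key-point extraction and the recognition-model scoring described in the embodiments below.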
Corresponding to the fitness-device-based fitness training method of the present disclosure, the present disclosure further provides an electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the above fitness-device-based fitness training method.
Corresponding to the action evaluation method based on fitness teaching and training in the present disclosure, the present disclosure further provides a storage medium, wherein the computer-readable storage medium stores a computer program that, when executed by a processor, implements the above action evaluation method based on fitness teaching and training.
Corresponding to the fitness-device-based fitness training method of the present disclosure, the present disclosure further provides a storage medium, wherein the computer-readable storage medium stores a computer program that, when executed by a processor, implements the above fitness-device-based fitness training method.
One or more technical solutions provided by the present disclosure have at least the following technical effects or advantages:
When evaluating whether the user's action is of the same type, the present disclosure does not compare every standard pose of the coach in the fitness video against the user pose. Instead, only the preset standard poses are used, and the user pose is captured for comparison only during the time period in which each standard pose appears in the fitness video. Both the standard pose and the user pose are single static poses, which makes the comparison easier to operate and use. Moreover, the preset standard pose can be a fixed-point action, so the user's training state can be grasped more accurately when the action is evaluated.
Second, when performing the comparison, the present disclosure builds a model on a convolutional neural network and performs recognition using the 16 skeletal key points of the human body and their corresponding two-dimensional position coordinates. This yields higher accuracy, quickly determines whether the user pose meets the standard, and improves efficiency of use.
In addition, during fitness teaching and training, the present disclosure takes the user through three stages: demonstration of the training action, slow-motion teaching, and normal follow-along. Throughout this process, the user's actions are recognized to judge whether the user produces any movement in the first stage, whether the user follows along in the second stage, and whether the user's action meets the standard in the third stage, with the scoring result fed back to the user.
Furthermore, when recognizing and comparing user actions, the present disclosure does so only within the first or second time period corresponding to the preset first or second standard pose, rather than over the entire video. This gives better comparison results and faster output, effectively guarantees the user's training effect, captures the user's training state at every stage, improves the efficiency of use, and makes the system better suited to long-term use.
Description of the drawings
The drawings described here are provided for further understanding of the embodiments of the present disclosure and constitute a part of this application; they do not limit the embodiments of the present disclosure. In the drawings:
Fig. 1 is a schematic flowchart of the action evaluation method based on fitness teaching and training;
Fig. 2 is a schematic diagram of two poses belonging to different types;
Fig. 3 is a schematic diagram of two poses belonging to the same type;
Fig. 4 is an ROC curve;
Fig. 5 is a schematic flowchart of the fitness-device-based fitness training method;
Fig. 6 is a schematic diagram of the composition of the action evaluation system based on fitness teaching and training, or of the fitness-device-based fitness training system.
Detailed description of the embodiments
For a clearer understanding of the above objects, features, and advantages of the present disclosure, the present disclosure is described in further detail below with reference to the drawings and specific embodiments. It should be noted that, where they do not conflict, the embodiments of the present disclosure and the features within them may be combined with one another.
Many specific details are set forth in the following description to facilitate a full understanding of the present disclosure; however, the present disclosure may also be implemented in ways other than those described here, and its scope of protection is therefore not limited by the specific embodiments disclosed below.
It should be understood by those skilled in the art that, in the present disclosure, orientation or position terms such as "longitudinal", "transverse", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", and "outer" are based on the orientations or positional relationships shown in the drawings. They are used only for convenience and simplicity of description, and do not indicate or imply that the device or element referred to must have a particular orientation or be constructed and operated in a particular orientation; these terms are therefore not to be construed as limiting the present disclosure.
It should be understood that the term "a" means "at least one" or "one or more"; that is, in one embodiment the number of an element may be one, while in another embodiment the number may be plural, and the term "a" is not to be construed as a limitation on quantity.
Embodiment 1
Referring to Fig. 1, which is a schematic flowchart of the action evaluation method based on fitness teaching and training, the present disclosure provides an action evaluation method based on fitness teaching and training, the method comprising:
acquiring a fitness video in which standard poses are preset;
acquiring the time period corresponding to a standard pose in the fitness video;
acquiring, within that time period, frame images of the fitness video at consecutive moments;
identifying several user poses corresponding to the frame images at consecutive moments within that time period;
comparing the several user poses with the standard pose to judge whether the user pose and the standard pose are of the same type. The user poses and the standard pose are input into the standard pose recognition model, which outputs a similarity score between each user pose and the standard pose; the user pose with the highest similarity score is taken as the scoring result, and the scoring result is used to judge whether the user pose and the standard pose are of the same type.
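The five steps above can be sketched end to end as follows. Here extract_pose and pose_similarity stand in for the skeletal-key-point extraction and the recognition-model scoring detailed later in this embodiment, and all names, as well as the default score threshold of 40, are illustrative assumptions.

```python
def evaluate_action(frames, standard_pose, extract_pose, pose_similarity,
                    score_threshold=40):
    """frames: consecutive frame images inside the standard pose's time period.

    Returns (best_score, same_type): the highest per-frame similarity score
    and whether it reaches the threshold for the same-type judgment.
    """
    user_poses = [extract_pose(f) for f in frames]                    # identify user poses
    scores = [pose_similarity(p, standard_pose) for p in user_poses]  # compare each pose
    best = max(scores) if scores else 0                               # best pose wins
    return best, best >= score_threshold
```

In this sketch only the single best-scoring user pose in the window determines the judgment, mirroring the "highest similarity score as the scoring result" rule stated above.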
Comparing the several user poses with the standard pose to judge whether the user pose and the standard pose are of the same type specifically comprises:
training a Siamese neural network model to obtain a trained standard pose recognition model;
inputting the user pose and the standard pose into the standard pose recognition model to judge whether they are of the same type.
Inputting the user pose and the standard pose into the standard pose recognition model to judge whether they are of the same type specifically comprises:
acquiring the skeletal key points of the standard pose and the user pose, together with the position coordinates of each key point;
inputting the position coordinates of each skeletal key point of the standard pose and the user pose into the trained standard pose recognition model to obtain the output vector V1 of the standard pose and the output vector V2 of the user pose, respectively;
computing the Euclidean distance between the output vector V1 of the standard pose and the output vector V2 of the user pose;
judging, from the Euclidean distance between the output vector V1 of the standard pose and the output vector V2 of the user pose, whether the user pose and the standard pose are of the same type.
Preferably, a Euclidean distance threshold T is obtained on the basis of the standard pose recognition model, and the threshold T is used to judge whether the user pose and the standard pose are of the same type: if the Euclidean distance output by the standard pose recognition model is less than or equal to the threshold T, the user pose and the standard pose are of the same type; if it is greater than the threshold T, they are of different types.
The standard pose and the user pose each comprise 16 skeletal key points, each corresponding to a two-dimensional position coordinate. The 16 skeletal key points are the top of the head, the base of the head, the neck, the right shoulder, the right elbow, the right hand, the left shoulder, the left elbow, the left hand, the right hip, the right knee, the right foot, the left hip, the left knee, the left foot, and the patella.
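As a concrete illustration, the 16 key points with their 2-D coordinates can be flattened into the 32-dimensional input vector used by the network described below. The key-point names follow the list above; their ordering and the dictionary representation are assumptions made for this sketch.

```python
# The 16 skeletal key points listed in the disclosure (ordering assumed).
KEYPOINTS = [
    "head_top", "head_base", "neck",
    "right_shoulder", "right_elbow", "right_hand",
    "left_shoulder", "left_elbow", "left_hand",
    "right_hip", "right_knee", "right_foot",
    "left_hip", "left_knee", "left_foot",
    "patella",
]

def pose_to_vector(pose):
    """Flatten a {key point name: (x, y)} pose into a 32-dimensional vector."""
    vec = []
    for name in KEYPOINTS:
        x, y = pose[name]
        vec.extend([x, y])
    return vec  # 16 key points x 2 coordinates = 32 dimensions
```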
The action evaluation method based on fitness teaching and training of the present disclosure is introduced below with a specific example:
Operation 1: acquire a fitness video in which standard poses are preset. In this embodiment the fitness device is a fitness mirror, and the fitness video is played on the mirror surface of the fitness mirror.
Operation 2: acquire the time period corresponding to a standard pose in the fitness video, and acquire, within that period, frame images of the fitness video at consecutive moments.
Operation 3: identify the several user poses corresponding to the frame images at consecutive moments within that period.
Operation 3.1: the fitness mirror recognizes the user's action within the target fitness area and performs feature extraction on the target fitness area to obtain the user pose.
Operation 4: compare the several user poses with the standard pose to judge whether the user pose and the standard pose are of the same type.
Operation 4.1: train a Siamese neural network model to obtain a trained standard pose recognition model.
Operation 4.2: acquire the skeletal key points of the standard pose and the user pose and the position coordinates of each key point, where both poses comprise 16 skeletal key points, each corresponding to a two-dimensional position coordinate: the top of the head, the base of the head, the neck, the right shoulder, the right elbow, the right hand, the left shoulder, the left elbow, the left hand, the right hip, the right knee, the right foot, the left hip, the left knee, the left foot, and the patella.
Operation 4.3: input the position coordinates of each skeletal key point of the standard pose and the user pose into the trained standard pose recognition model to obtain the output vector V1 of the standard pose and the output vector V2 of the user pose, respectively;
compute the Euclidean distance between the output vector V1 of the standard pose and the output vector V2 of the user pose;
judge, from the Euclidean distance between V1 and V2, whether the user pose and the standard pose are of the same type.
The neural network has four layers, and the node counts from input to output are 32 -> 64 -> 128 -> 100; that is, it takes a 32-dimensional vector as input and outputs a 100-dimensional vector. The Euclidean distance in n-dimensional space is computed by the following formula; the present disclosure maps to 100 dimensions, i.e. n = 100:
d(V1, V2) = sqrt( Σ_{i=1}^{n} (V1_i - V2_i)² ), with n = 100
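A minimal sketch of the 32 -> 64 -> 128 -> 100 embedding network and the 100-dimensional Euclidean distance is given below in plain Python with random weights. The Siamese training procedure itself is not shown, and the fully connected layers with ReLU activations are an assumption standing in for whatever layer type the disclosure actually uses.

```python
import math
import random

LAYER_SIZES = [32, 64, 128, 100]  # node counts from input to output

def make_layers(seed=0):
    """Random weight matrices for each layer (untrained, for illustration)."""
    rng = random.Random(seed)
    return [[[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
            for n_in, n_out in zip(LAYER_SIZES, LAYER_SIZES[1:])]

def embed(vec, layers):
    """Map a 32-d pose vector to a 100-d embedding (ReLU on hidden layers)."""
    for i, weights in enumerate(layers):
        vec = [sum(w * x for w, x in zip(row, vec)) for row in weights]
        if i < len(layers) - 1:
            vec = [max(0.0, x) for x in vec]  # no activation on the output layer
    return vec

def euclidean(v1, v2):
    """d(V1, V2) = sqrt(sum_i (V1_i - V2_i)^2); here n = 100."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v1, v2)))
```

In a Siamese arrangement, the same layers are applied to both the standard pose vector and the user pose vector before the distance is taken.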
The standard pose recognition model outputs two high-dimensional vectors, 100-dimensional in this embodiment. If the two poses belong to different types, as shown in Fig. 2, the points to which they are mapped in the high-dimensional space are far apart in Euclidean distance; conversely, if the two poses belong to the same type, as shown in Fig. 3, the mapped points are close together.
Operation 4.4: obtain the Euclidean distance threshold T on the basis of the standard pose recognition model; the threshold T is used to judge whether the user pose and the standard pose are of the same type.
If the Euclidean distance corresponding to the scoring result output by the standard pose recognition model is less than or equal to the threshold T, the user pose and the standard pose are of the same type; if it is greater than the threshold T, they are of different types.
Operation 4.5: convert the Euclidean distance into a similarity score between the user pose and the standard pose, and take the user pose with the highest similarity score as the scoring result.
Specifically, if the Euclidean distance between the two poses exceeds the threshold T, they are considered not to be of the same type; otherwise they are considered the same type. For each threshold T an ROC curve can be drawn, as shown in Fig. 4. The area under the ROC curve, called the AUC, is a value between 0 and 1; the larger the AUC, the better the model performance. An optimal threshold T-best is found that maximizes the AUC on the test set. Maximizing the AUC means the model classifies as many pose pairs that truly belong to the same type as same-type as possible, while misclassifying as few pairs of different types as same-type as possible. After the optimal distance threshold T-best is obtained, a critical score is set according to actual business needs, for example 40 points, meaning that at this score the model considers the two poses to lie exactly at the boundary between similar and dissimilar. The mapping is then as follows: when the actual distance t lies in [0, T-best], the similarity score s lies in [100, 40]; when t lies in (T-best, infinity), s lies in (40, 0). In this embodiment the threshold is 40.
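The distance-to-score mapping stated above (t in [0, T-best] maps to s in [100, 40]; t beyond T-best decays from 40 toward 0) fixes only the endpoints, so its exact shape is not specified. The sketch below uses a linear segment up to T-best and a hyperbolic tail beyond it; this matches the stated endpoints but is otherwise an assumption.

```python
def distance_to_score(t, t_best):
    """Map a Euclidean distance t >= 0 to a similarity score in (0, 100]."""
    if t <= t_best:
        # Linear segment: t = 0 -> 100 points, t = t_best -> 40 points.
        return 100.0 - 60.0 * (t / t_best)
    # Beyond t_best the score decays from 40 toward 0 (hyperbolic tail assumed).
    return 40.0 * t_best / t

def same_type(t, t_best):
    """Distances at or below the optimal threshold count as the same type."""
    return t <= t_best
```

With the critical score set to 40, a score at or above 40 corresponds exactly to a distance at or below T-best, so the score threshold and the distance threshold give the same same-type judgment.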
Embodiment 2
Building on Embodiment 1, the action evaluation method based on fitness teaching and training of the present disclosure is introduced below with a specific example. In actual use, the action evaluation method of the present disclosure is used in a fitness-device-based fitness training method: the user is taken through three stages, namely demonstration of the training action, action teaching, and normal follow-along. Throughout this process, the user's actions are recognized to judge whether the user produces any movement in the first stage, whether the user follows along in the second stage, and whether the user's actions meet the standard in the third stage, with the scoring result fed back to the user. Moreover, when recognizing and comparing user actions, the present disclosure does so within the first or second time period corresponding to the preset first or second standard pose, rather than over the entire video, giving better comparison results and faster output, effectively guaranteeing the user's training effect, capturing the user's training state at every stage, improving efficiency of use, and making the system better suited to long-term use. The method specifically comprises:
Operation 1: acquire a first fitness video. In this embodiment the fitness device is a fitness mirror, and the first fitness video is played on the mirror surface of the fitness mirror.
Operation 2: acquire a first user pose from the first fitness video, and judge whether the user is performing fitness training.
Operation 2.1: the user follows along with the first fitness video in the target fitness area of the fitness mirror.
Operation 2.2: a first standard pose is preset in the first fitness video.
Operation 2.3: acquire the second time period in which the first standard pose appears in the first fitness video.
Operation 2.4: during the second time period, the fitness mirror recognizes the user's action within the target fitness area and performs feature extraction on it. The extracted features comprise the 16 skeletal key points, each corresponding to a two-dimensional position coordinate: the top of the head, the base of the head, the neck, the right shoulder, the right elbow, the right hand, the left shoulder, the left elbow, the left hand, the right hip, the right knee, the right foot, the left hip, the left knee, the left foot, and the patella. This yields the first user pose.
Operation 2.5: if the first user poses acquired during the second time period show different movements, the user is following along, i.e. the user is performing fitness training; if no first user pose is acquired during the second time period, or the first user pose shows no movement, the user is not following along, i.e. the user is not performing fitness training. In Operation 2, the user's actions are recognized during the second time period, but no scoring is performed in this process.
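Operation 2.5 only asks whether the poses captured in the window differ at all, i.e. whether any movement occurred; no scoring is involved. A simple way to test this, assuming each pose is a list of 16 (x, y) key-point coordinates, is to check whether any key point moves by more than a small tolerance between consecutive frames. The tolerance value below is an illustrative assumption.

```python
def user_is_moving(poses, tol=1e-3):
    """Return True if any key point moves more than tol between frames."""
    if len(poses) < 2:
        return False  # no usable pose sequence captured: no training detected
    for prev, curr in zip(poses, poses[1:]):
        for (x0, y0), (x1, y1) in zip(prev, curr):
            if abs(x1 - x0) > tol or abs(y1 - y0) > tol:
                return True  # at least one key point displaced: movement
    return False
```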
Operation 3: acquire a second user pose from the second fitness video, score the second user pose, and judge from the scoring result whether the user is following the second fitness video in training.
Operation 3.1: play the second fitness video on the mirror surface of the fitness mirror.
Operation 3.2: preset a second standard pose according to the second fitness video.
Operation 3.3: acquire the first time period in which the second standard pose appears in the second fitness video.
Operation 3.4: acquire the video segment of the second fitness video corresponding to the first time period, split the segment into frames, and obtain frame images of the segment at several consecutive moments.
Operation 3.5: during the first time period, the fitness mirror recognizes the user's action within the target fitness area and performs feature extraction on it. The extracted features comprise the 16 skeletal key points, each corresponding to a two-dimensional position coordinate: the top of the head, the base of the head, the neck, the right shoulder, the right elbow, the right hand, the left shoulder, the left elbow, the left hand, the right hip, the right knee, the right foot, the left hip, the left knee, the left foot, and the patella. This yields several second user poses in one-to-one correspondence with the frame images.
Operation 3.6: compare the corresponding frame images and second user poses to obtain, for each second user pose, a similarity score against its corresponding frame image.
Operation 3.61: train a Siamese neural network model to obtain a trained standard pose recognition model.
Operation 3.62: acquire the skeletal key points of the second standard pose and the second user pose and the position coordinates of each key point, where both poses comprise the 16 skeletal key points, each corresponding to a two-dimensional position coordinate: the top of the head, the base of the head, the neck, the right shoulder, the right elbow, the right hand, the left shoulder, the left elbow, the left hand, the right hip, the right knee, the right foot, the left hip, the left knee, the left foot, and the patella. Input the position coordinates of each skeletal key point of the second standard pose and the second user pose into the trained standard pose recognition model to obtain the output vector V1 of the second standard pose and the output vector V2 of the second user pose, respectively;
compute the Euclidean distance between the output vector V1 of the second standard pose and the output vector V2 of the second user pose;
judge, from the Euclidean distance between V1 and V2, whether the second user pose and the second standard pose are of the same type.
The neural network has four layers, and the node counts from input to output are 32 -> 64 -> 128 -> 100; that is, it takes a 32-dimensional vector as input and outputs a 100-dimensional vector. The Euclidean distance in n-dimensional space is computed by the following formula; the present disclosure maps to 100 dimensions, i.e. n = 100:
d(V1, V2) = sqrt( Σ_{i=1}^{n} (V1_i - V2_i)² ), with n = 100
Operation 3.7: convert the Euclidean distance into a similarity score between the user pose and the standard pose, and take the user pose with the highest similarity score as the scoring result.
Operation 3.8: take the second user pose with the highest similarity score as the scoring result. If the scoring result is greater than or equal to the scoring threshold, the second standard pose and the second user pose are of the same type, and the user is performing fitness training; if the scoring result is below the scoring threshold, they are not of the same type, and the user is not performing fitness training. The Euclidean distance threshold T is obtained on the basis of the standard pose recognition model and is used to judge whether the user pose and the standard pose are of the same type: if the Euclidean distance corresponding to the scoring result is less than or equal to the threshold T, the user pose and the standard pose are of the same type; if it is greater than the threshold T, they are of different types.
Operation 4: acquire a first user pose from the first fitness video, score the first user pose, and feed the scoring result back to the user.
Operation 4.1: the user follows along with the first fitness video in the target fitness area of the fitness mirror.
Operation 4.2: a first standard pose is preset in the first fitness video.
Operation 4.3: acquire the second time period in which the first standard pose appears in the first fitness video.
Operation 4.4: during the second time period, the fitness mirror recognizes the user's action within the target fitness area and performs feature extraction on it. The extracted features comprise the 16 skeletal key points, each corresponding to a two-dimensional position coordinate: the top of the head, the base of the head, the neck, the right shoulder, the right elbow, the right hand, the left shoulder, the left elbow, the left hand, the right hip, the right knee, the right foot, the left hip, the left knee, the left foot, and the patella. This yields the first user pose.
Operation 4.5: compare the first standard pose with the several first user poses to obtain, for each first user pose, a similarity score against the first standard pose.
Operation 4.51: input the skeletal key points of the first standard pose and the first user pose, and the position coordinates of each key point, into the standard pose recognition model, and output the Euclidean distance between the first standard pose and the first user pose.
Operation 4.52: convert the Euclidean distance into a similarity score.
Operation 4.53: take the first user pose with the highest similarity score as the scoring result. If the scoring result is greater than or equal to the scoring threshold, the first standard pose and the first user pose are of the same type, and the user's action meets the standard; if the scoring result is below the scoring threshold, they are not of the same type, and the user's action does not meet the standard. The Euclidean distance threshold T is obtained on the basis of the standard pose recognition model and is used to judge whether the user pose and the standard pose are of the same type: if the Euclidean distance corresponding to the scoring result is less than or equal to the threshold T, the user pose and the standard pose are of the same type; if it is greater than the threshold T, they are of different types.
操作4.54获取相识度评分最高的第一用户姿态作为评分结果,并将评分结果反馈给用户;Operation 4.54 obtains the first user gesture with the highest acquaintance score as the scoring result, and feeds back the scoring result to the user;
若评分结果大于或等于评分阈值,则第一标准姿态和第一用户姿态为同一类,则用户动作达标;若评分结果小于或等于评分阈值,则第一标准姿态和第一用户姿态不为同一类,则用户动作未达标。在本实施例中评分阈值为40。If the scoring result is greater than or equal to the scoring threshold, the first standard gesture and the first user gesture are of the same category, and the user action is up to the standard; if the scoring result is less than or equal to the scoring threshold, the first standard gesture and the first user gesture are not the same class, the user action is not up to standard. In this embodiment, the scoring threshold is 40.
在本实施例中，设第一标准姿态在健身视频中出现的时刻为10000ms处。当健身视频播放到10000毫秒处时，由于用户是跟着视频练习，其动作相比课程可能存在滞后或超前的情况，因此在10000ms附近设置一个区间，例如前800ms和后200ms，即时间区间[10000-800,10000+200]。在这个总时长为1秒的区间里，每一帧都计算第一标准姿态和第一用户姿态的相似度，然后把区间内相似度最高的分数作为最终的分数输出。In this embodiment, suppose the first standard posture appears in the fitness video at 10000 ms. When the video reaches 10000 ms, since the user is following the video, the user's movements may lag behind or run ahead of the course, so an interval is set around 10000 ms, for example 800 ms before and 200 ms after, i.e. the time interval [10000-800, 10000+200]. Within this 1-second interval, the similarity between the first standard posture and the first user posture is computed for every frame, and the highest similarity score in the interval is output as the final score.
在本实施例中，设第二标准姿态在健身视频中出现的时刻为10000ms处。同样在时间区间[10000-800,10000+200]这个总时长为1秒的区间里，每一帧都计算第二标准姿态和第二用户姿态的相似度，然后把区间内相似度最高的分数作为最终的分数输出。In this embodiment, suppose the second standard posture appears in the fitness video at 10000 ms. Likewise, within the 1-second interval [10000-800, 10000+200], the similarity between the second standard posture and the second user posture is computed for every frame, and the highest similarity score in the interval is output as the final score.
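The interval-based scoring described above can be sketched as follows; a minimal illustration, in which the frame timestamps and per-frame scores are assumed example values (the 800 ms/200 ms margins follow the embodiment):

```python
def window_best_score(frame_scores, t_ms, pre_ms=800, post_ms=200):
    """frame_scores: {timestamp_ms: similarity score of the user pose at that frame}.
    Returns the highest similarity inside [t_ms - pre_ms, t_ms + post_ms],
    or None if no frame falls in the window."""
    lo, hi = t_ms - pre_ms, t_ms + post_ms
    in_window = [s for ts, s in frame_scores.items() if lo <= ts <= hi]
    return max(in_window) if in_window else None
```

For a standard posture at 10000 ms, frames between 9200 ms and 10200 ms are considered, and the best per-frame similarity is reported as the final score, so a user slightly behind or ahead of the course is not penalized.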
在具体计算时,第一标准姿态和第二标准姿态均为静止的姿态,并不是一个连续的动作。比对的具体方法是训练一个基于卷积神经网络的孪生网络结构模型,该模型接受两个姿态,并分别将这两个姿态映射到高维空间的一个点。在本实施例中,第二健身视频可以为将第一次播放的健身视频进行慢动作播放,也可以为第一健身视频的动作分解。In specific calculation, both the first standard posture and the second standard posture are static postures, not a continuous movement. The specific method of comparison is to train a Siamese network structure model based on convolutional neural network, which accepts two poses and maps the two poses to a point in high-dimensional space. In this embodiment, the second exercise video may be played in slow motion from the exercise video played for the first time, or may be an action breakdown of the first exercise video.
实施例三Embodiment three
请参考图5，图5为基于健身装置的健身训练方法的流程示意图。本公开提供了基于健身装置的健身训练方法，所述健身训练方法基于上述基于健身教学训练的动作评价方法，可以实施为上述动作评价方法的应用方法，所述健身训练方法包括：Please refer to FIG. 5, which is a schematic flowchart of a fitness training method based on a fitness device. The present disclosure provides a fitness training method based on a fitness device; the fitness training method is based on the above motion evaluation method based on fitness teaching and training, and can be implemented as an application of that motion evaluation method. The fitness training method includes:
获取第一健身视频；Obtaining the first fitness video;
识别第一健身视频的目标健身区域,对目标健身区域进行特征提取,得到第一用户姿态;Identifying the target fitness area of the first fitness video, performing feature extraction on the target fitness area, and obtaining the first user posture;
根据第一用户姿态,判断用户是否进行健身训练;According to the posture of the first user, it is judged whether the user performs fitness training;
根据第二健身视频预设第二标准姿态;Presetting a second standard posture according to the second fitness video;
识别第二健身视频的目标健身区域,对目标健身区域进行特征提取,得到第二用户姿态;Identifying the target fitness area of the second fitness video, performing feature extraction on the target fitness area, and obtaining the second user posture;
对比第二标准姿态和第二用户姿态，得到第二用户姿态基于第二标准姿态的相似度评分，根据评分结果判断用户是否跟随第二健身视频进行健身训练；Comparing the second standard posture with the second user posture to obtain a similarity score of the second user posture relative to the second standard posture, and judging from the scoring result whether the user follows the second fitness video for fitness training;
获取第一健身视频，根据第一健身视频预设第一标准姿态；Obtaining the first fitness video, and presetting the first standard posture according to the first fitness video;
识别第一健身视频的目标健身区域,对目标健身区域进行特征提取,得到第一用户姿态;Identifying the target fitness area of the first fitness video, performing feature extraction on the target fitness area, and obtaining the first user posture;
对比第一标准姿态和第一用户姿态，得到第一用户姿态基于第一标准姿态的相似度评分，并将评分结果反馈给用户。Comparing the first standard posture with the first user posture to obtain a similarity score of the first user posture relative to the first standard posture, and feeding the scoring result back to the user.
其中,识别第二健身视频的目标健身区域,对目标健身区域进行特征提取,得到第二用户姿态,具体包括:Among them, identifying the target fitness area of the second fitness video, performing feature extraction on the target fitness area, and obtaining the second user posture, specifically including:
获取第二标准姿态在第二健身视频中出现的第一时间段;Obtain the first time period when the second standard posture appears in the second fitness video;
获取第一时间段内第二健身视频对应的视频片段,对视频片段进行分帧处理,得到视频片段对应的若干连续时刻的帧图像;Obtain the video segment corresponding to the second fitness video in the first time period, and process the video segment into frames to obtain frame images corresponding to several consecutive moments of the video segment;
在第一时间段内识别第二健身视频的目标健身区域,对目标健身区域进行特征提取,得到若干与帧图像一一对应的第二用户姿态;Identifying the target fitness area of the second fitness video in the first time period, performing feature extraction on the target fitness area, and obtaining a plurality of second user postures corresponding to the frame images one-to-one;
对比相对应的若干帧图像和若干第二用户姿态，得到每个第二用户姿态基于对应的帧图像的相似度评分；Comparing the corresponding frame images with the several second user postures to obtain a similarity score of each second user posture relative to its corresponding frame image;
获取相似度评分最高的第二用户姿态作为评分结果。其中，训练孪生神经网络模型，得到训练好的标准姿态识别模型；Taking the second user posture with the highest similarity score as the scoring result. Here, a Siamese neural network model is trained to obtain a trained standard posture recognition model;
将第二用户姿态和第二标准姿态输入标准姿态识别模型，获得相似度评分；Inputting the second user posture and the second standard posture into the standard posture recognition model to obtain a similarity score;
若评分结果大于或等于评分阈值,则第二标准姿态和第二用户姿态为同一类,则用户进行健身训练;If the scoring result is greater than or equal to the scoring threshold, then the second standard posture and the second user posture are of the same type, and the user performs fitness training;
若评分结果小于评分阈值,则第二标准姿态和第二用户姿态不为同一类,则用户未进行健身训练。If the scoring result is less than the scoring threshold, the second standard posture and the second user posture are not in the same category, and the user has not performed fitness training.
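The best-score selection and threshold decision above can be sketched as follows (the default threshold of 40 follows the embodiment; the candidate scores in the usage are illustrative):

```python
def judge_poses(candidate_scores, threshold=40.0):
    """candidate_scores: similarity of each candidate user pose against the
    standard pose.  Returns (best score, whether the user is judged to be
    performing the fitness training)."""
    best = max(candidate_scores)
    return best, best >= threshold
```

For example, `judge_poses([12.0, 55.0, 31.0])` picks the best candidate (55.0), which clears the threshold, so the user is judged to be training.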
其中,识别第一健身视频的目标健身区域,对目标健身区域进行特征提取,得到第一用户姿态,具体包括:Among them, identifying the target fitness area of the first fitness video, performing feature extraction on the target fitness area, and obtaining the first user posture, specifically including:
获取第一标准姿态在第一健身视频中出现的第二时间段;Obtain the second time period when the first standard posture appears in the first fitness video;
在第二时间段内识别第一健身视频的目标健身区域,对目标健身区域进行特征提取,得到若干第一用户姿态;Identifying the target fitness area of the first fitness video in the second time period, performing feature extraction on the target fitness area, and obtaining several first user postures;
对比第一标准姿态和若干第一用户姿态，得到每个第一用户姿态基于第一标准姿态的相似度评分；Comparing the first standard posture with the several first user postures to obtain a similarity score of each first user posture relative to the first standard posture;
获取相似度评分最高的第一用户姿态作为评分结果；Taking the first user posture with the highest similarity score as the scoring result;
训练孪生神经网络模型，得到训练好的标准姿态识别模型；Training a Siamese neural network model to obtain a trained standard posture recognition model;
将第一用户姿态和第一标准姿态输入标准姿态识别模型，获得相似度评分；Inputting the first user posture and the first standard posture into the standard posture recognition model to obtain a similarity score;
若评分结果大于或等于评分阈值，则第一标准姿态和第一用户姿态为同一类，则用户动作达标；If the scoring result is greater than or equal to the scoring threshold, the first standard posture and the first user posture are of the same class, and the user action meets the standard;
若评分结果小于评分阈值，则第一标准姿态和第一用户姿态不为同一类，则用户动作不达标。If the scoring result is less than the scoring threshold, the first standard posture and the first user posture are not of the same class, and the user action does not meet the standard.
下面结合具体的例子对本公开中的基于健身装置的健身训练方法进行介绍:The fitness training method based on the fitness device in the present disclosure will be introduced in conjunction with specific examples below:
操作1获取第一健身视频，在本实施例中健身装置为健身镜，在健身镜的镜面上播放第一健身视频；Operation 1 obtains the first fitness video; in this embodiment the fitness device is a fitness mirror, and the first fitness video is played on the mirror surface of the fitness mirror;
操作2根据第一健身视频获取第一用户姿态,判断用户是否进行健身训练;Operation 2 Obtain the first user's posture according to the first fitness video, and determine whether the user is performing fitness training;
操作2.1用户在健身镜的目标健身区域中,根据第一健身视频进行跟做;Operation 2.1 The user follows the first fitness video in the target fitness area of the fitness mirror;
操作2.2第一健身视频中预设第一标准姿态;Operation 2.2 Preset the first standard posture in the first fitness video;
操作2.3获取第一标准姿态在第一健身视频中出现的第二时间段;Operation 2.3 acquires the second time period when the first standard posture appears in the first fitness video;
操作2.4在第二时间段内,健身镜在目标健身区域中对用户动作进行识别,对目标健身区域进行特征提取,得到第一用户姿态;Operation 2.4 During the second period of time, the fitness mirror recognizes the user's actions in the target fitness area, extracts features from the target fitness area, and obtains the first user posture;
操作2.5若在第二时间段内获取的第一用户姿态有不同的动作产生，即表明用户在进行跟做，即用户在进行健身训练；若在第二时间段内没有获取到第一用户姿态，或第一用户姿态没有动作产生，则用户没有跟做，即用户没有进行健身训练。在操作2的第二时间段中，对用户的动作进行识别，但此过程中不进行打分。Operation 2.5 If the first user postures acquired within the second time period show different movements, the user is following along, i.e. the user is performing fitness training; if no first user posture is acquired within the second time period, or the first user posture shows no movement, the user is not following along, i.e. the user is not performing fitness training. In operation 2, during the second time period, the user's movements are recognized, but no scoring is performed.
操作3根据第二健身视频获取第二用户姿态,对第二用户姿态进行评分,根据评分结果判断用户是否跟随第二健身视频进行健身训练;Operation 3 obtains the posture of the second user according to the second fitness video, scores the posture of the second user, and judges whether the user follows the second fitness video for fitness training according to the scoring result;
操作3.1在健身镜的镜面上播放第二健身视频;Operation 3.1 Play the second fitness video on the mirror surface of the fitness mirror;
操作3.2根据第二健身视频预设第二标准姿态;Operation 3.2 Preset the second standard posture according to the second fitness video;
操作3.3获取第二标准姿态在第二健身视频中出现的第一时间段;Operation 3.3 Obtain the first time period when the second standard posture appears in the second fitness video;
操作3.4在第一时间段内,获取第一时间段内第二健身视频对应的视频片段,对视频片段进行分帧处理,得到视频片段对应的若干连续时刻的帧图像;Operation 3.4 In the first time period, obtain the video segment corresponding to the second fitness video in the first time period, perform frame processing on the video segment, and obtain frame images corresponding to several consecutive moments of the video segment;
操作3.5在第一时间段内,健身镜在目标健身区域中对用户动作进行识别,对目标健身区域进行特征提取,得到若干与帧图像一一对应的第二用户姿态;Operation 3.5 During the first period of time, the fitness mirror recognizes the user's actions in the target fitness area, performs feature extraction on the target fitness area, and obtains a number of second user postures corresponding to the frame images one-to-one;
操作3.6对比相对应的若干帧图像和若干第二用户姿态，得到每个第二用户姿态基于对应的帧图像的相似度评分；Operation 3.6 compares the corresponding frame images with the several second user postures to obtain a similarity score of each second user posture relative to its corresponding frame image;
操作3.61训练孪生神经网络模型，得到训练好的标准姿态识别模型；Operation 3.61 trains a Siamese neural network model to obtain a trained standard posture recognition model;
操作3.62将第二用户姿态和第二标准姿态输入标准姿态识别模型，获得相似度评分；Operation 3.62 inputs the second user posture and the second standard posture into the standard posture recognition model to obtain a similarity score;
操作3.7获取相似度评分最高的第二用户姿态作为评分结果；若评分结果大于或等于评分阈值，则第二标准姿态和第二用户姿态为同一类，用户在进行健身训练；若评分结果小于评分阈值，则第二标准姿态和第二用户姿态不为同一类，用户未进行健身训练。Operation 3.7 takes the second user posture with the highest similarity score as the scoring result; if the scoring result is greater than or equal to the scoring threshold, the second standard posture and the second user posture are of the same class and the user is performing fitness training; if the scoring result is less than the scoring threshold, they are not of the same class and the user is not performing fitness training.
操作4根据第一健身视频获取第一用户姿态,对第一用户姿态进行评分,并将评分结果反馈给用户;Operation 4 acquires the first user posture according to the first exercise video, scores the first user posture, and feeds back the scoring result to the user;
操作4.1用户在健身镜的目标健身区域中,根据第一健身视频进行跟做;Operation 4.1 The user follows the first fitness video in the target fitness area of the fitness mirror;
操作4.2第一健身视频中预设第一标准姿态;Operation 4.2 Preset the first standard posture in the first fitness video;
操作4.3获取第一标准姿态在第一健身视频中出现的第二时间段;Operation 4.3 acquires the second time period when the first standard posture appears in the first fitness video;
操作4.4在第二时间段内,健身镜在目标健身区域中对用户动作进行识别,对目标健身区域进行特征提取,得到第一用户姿态;Operation 4.4 In the second time period, the fitness mirror recognizes the user's actions in the target fitness area, extracts features from the target fitness area, and obtains the first user posture;
操作4.5对比第一标准姿态和若干第一用户姿态，得到每个第一用户姿态基于第一标准姿态的相似度评分；Operation 4.5 compares the first standard posture with the several first user postures to obtain a similarity score of each first user posture relative to the first standard posture;
操作4.51将第一用户姿态和第一标准姿态输入标准姿态识别模型，获得相似度评分；Operation 4.51 inputs the first user posture and the first standard posture into the standard posture recognition model to obtain a similarity score;
操作4.52获取相似度评分最高的第一用户姿态作为评分结果；Operation 4.52 takes the first user posture with the highest similarity score as the scoring result;
若评分结果大于或等于评分阈值，则第一标准姿态和第一用户姿态为同一类，用户动作达标；若评分结果小于评分阈值，则第一标准姿态和第一用户姿态不为同一类，用户动作未达标。If the scoring result is greater than or equal to the scoring threshold, the first standard posture and the first user posture are of the same class and the user action meets the standard; if the scoring result is less than the scoring threshold, they are not of the same class and the user action does not meet the standard.
在本实施例中，设第一标准姿态在健身视频中出现的时刻为10000ms处。当健身视频播放到10000毫秒处时，由于用户是跟着视频练习，其动作相比课程可能存在滞后或超前的情况，因此在10000ms附近设置一个区间，例如前800ms和后200ms，即时间区间[10000-800,10000+200]。在这个总时长为1秒的区间里，每一帧都计算第一标准姿态和第一用户姿态的相似度，然后把区间内相似度最高的分数作为最终的分数输出。In this embodiment, suppose the first standard posture appears in the fitness video at 10000 ms. When the video reaches 10000 ms, since the user is following the video, the user's movements may lag behind or run ahead of the course, so an interval is set around 10000 ms, for example 800 ms before and 200 ms after, i.e. the time interval [10000-800, 10000+200]. Within this 1-second interval, the similarity between the first standard posture and the first user posture is computed for every frame, and the highest similarity score in the interval is output as the final score.
在本实施例中，设第二标准姿态在健身视频中出现的时刻为10000ms处。同样在时间区间[10000-800,10000+200]这个总时长为1秒的区间里，每一帧都计算第二标准姿态和第二用户姿态的相似度，然后把区间内相似度最高的分数作为最终的分数输出。In this embodiment, suppose the second standard posture appears in the fitness video at 10000 ms. Likewise, within the 1-second interval [10000-800, 10000+200], the similarity between the second standard posture and the second user posture is computed for every frame, and the highest similarity score in the interval is output as the final score.
在具体计算时,第一标准姿态和第二标准姿态均为静止的姿态,并不是一个连续的动作。比对的具体方法是训练一个基于卷积神经网络的孪生网络结构模型,该模型接受两个姿态,并分别将这两个姿态映射到高维空间的一个点。In specific calculation, both the first standard posture and the second standard posture are static postures, not a continuous movement. The specific method of comparison is to train a Siamese network structure model based on convolutional neural network, which accepts two poses and maps the two poses to a point in high-dimensional space.
在本实施例中，将用户姿态和标准姿态输入标准姿态识别模型，获得相似度评分的具体方法为：In this embodiment, the user posture and the standard posture are input into the standard posture recognition model, and the specific method for obtaining the similarity score is as follows:
获取标准姿态和用户姿态的骨骼关键点及每个骨骼关键点对应的位置坐标，其中，标准姿态和用户姿态均包括16个骨骼关键点，16个骨骼关键点分别对应一个二维位置坐标；16个骨骼关键点包括头顶、头底、颈部、右肩、右肘、右手、左肩、左肘、左手、右胯、右膝、右脚、左胯、左膝、左脚、髌骨；Obtaining the skeleton key points of the standard posture and the user posture and the position coordinates of each key point, where both the standard posture and the user posture include 16 skeleton key points, each corresponding to a two-dimensional position coordinate; the 16 skeleton key points include the top of the head, the bottom of the head, the neck, the right shoulder, the right elbow, the right hand, the left shoulder, the left elbow, the left hand, the right hip, the right knee, the right foot, the left hip, the left knee, the left foot, and the patella;
将标准姿态和用户姿态每个骨骼关键点对应的位置坐标输入训练好的标准姿态识别模型,分别得到标准姿态的输出向量V1和用户姿态的输出向量V2;Input the position coordinates corresponding to each bone key point of the standard pose and the user pose into the trained standard pose recognition model, and obtain the output vector V1 of the standard pose and the output vector V2 of the user pose respectively;
计算标准姿态的输出向量V1和用户姿态的输出向量V2的欧式距离;Calculate the Euclidean distance between the output vector V1 of the standard posture and the output vector V2 of the user posture;
其中，在本公开中一个人体姿态共有16个二维坐标的骨骼点，每个骨骼点有x和y坐标分量，那么一个人体姿态可以抽象成一个32维的骨骼点向量，即[x1,y1,x2,y2,x3,y3,…,x16,y16]。在通过训练好的姿态识别模型后，这个32维的骨骼点向量会映射成一个更高维的向量，在本公开中输出的向量为100维，即标准姿态的输出向量V1和用户姿态的输出向量V2均为100维，即[a1,a2,a3,…,a100]。在进行姿态比对时，标准姿态和用户姿态经过训练好的模型各自映射成一个100维的向量，即V1和V2，再计算V1和V2的欧式距离。In the present disclosure a human posture consists of 16 skeleton points with two-dimensional coordinates; each point has x and y components, so a posture can be abstracted into a 32-dimensional skeleton-point vector, i.e. [x1, y1, x2, y2, x3, y3, …, x16, y16]. After passing through the trained posture recognition model, this 32-dimensional vector is mapped to a higher-dimensional vector; in the present disclosure the output is 100-dimensional, i.e. the output vector V1 of the standard posture and the output vector V2 of the user posture are both 100-dimensional, i.e. [a1, a2, a3, …, a100]. During posture comparison, the standard posture and the user posture are each mapped by the trained model to a 100-dimensional vector, V1 and V2, and the Euclidean distance between V1 and V2 is then computed.
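The 32-dimensional skeleton-point vector can be assembled as below; the English keypoint names are illustrative labels for the 16 points listed in the disclosure:

```python
KEYPOINTS = [
    "head_top", "head_bottom", "neck",
    "right_shoulder", "right_elbow", "right_hand",
    "left_shoulder", "left_elbow", "left_hand",
    "right_hip", "right_knee", "right_foot",
    "left_hip", "left_knee", "left_foot", "patella",
]

def flatten_pose(pose):
    """pose: {keypoint name: (x, y)} -> 32-d vector [x1, y1, ..., x16, y16]."""
    vec = []
    for name in KEYPOINTS:
        x, y = pose[name]
        vec.extend([x, y])
    return vec
```

The fixed keypoint order matters: both the standard pose and the user pose must be flattened with the same ordering before being fed to the model.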
本公开使用一个深度神经网络，该网络接受一个32维的向量，即本公开中的一个人体姿态，然后经过一系列的中间层操作，比如非线性矫正、全连接等，最终输出一个100维的向量。这个100维的向量是一个高度抽象的特征；最终使得如果两个姿态很相似，经过网络输出后的两个100维向量的欧式距离很小；反之，欧式距离很大。This disclosure uses a deep neural network that accepts a 32-dimensional vector, i.e. one human posture in this disclosure, and after a series of intermediate-layer operations such as nonlinear rectification and full connection, finally outputs a 100-dimensional vector. This 100-dimensional vector is a highly abstract feature; ultimately, if two postures are very similar, the Euclidean distance between their two 100-dimensional output vectors is small, and otherwise it is large.
我们的神经网络共4层，从输入到输出每一层的节点数分别是32->64->128->100，即输入32维的向量，输出一个100维的向量。n维空间的欧式距离计算公式如下，本公开映射到100维，即n=100：Our neural network has 4 layers in total; the number of nodes in each layer from input to output is 32->64->128->100, i.e. a 32-dimensional vector is input and a 100-dimensional vector is output. The Euclidean distance formula in n-dimensional space is as follows; this disclosure maps to 100 dimensions, i.e. n = 100:
d(V1, V2) = √( Σ_{i=1}^{n} (V1_i − V2_i)² ),  n = 100
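A minimal pure-Python sketch of this mapping and comparison; the random placeholder weights stand in for the trained Siamese branch, and only the layer sizes 32->64->128->100 and the ReLU-style rectification come from the disclosure:

```python
import math
import random

LAYER_SIZES = [32, 64, 128, 100]  # per the disclosure: input 32-d, output 100-d

def mlp_embed(x, weights):
    """Forward pass: 32-d skeleton vector -> 100-d embedding.
    weights: list of (W, b) per layer; hidden layers use nonlinear rectification."""
    h = x
    for i, (W, b) in enumerate(weights):
        h = [sum(w * v for w, v in zip(row, h)) + bj for row, bj in zip(W, b)]
        if i < len(weights) - 1:
            h = [max(0.0, v) for v in h]  # ReLU on hidden layers
    return h

def euclidean(v1, v2):
    """n-dimensional Euclidean distance (n = 100 for the embeddings here)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v1, v2)))

def placeholder_weights(sizes, seed=0):
    # Random weights for illustration only; the real model is trained.
    rnd = random.Random(seed)
    return [
        ([[rnd.uniform(-0.1, 0.1) for _ in range(m)] for _ in range(n)],
         [0.0] * n)
        for m, n in zip(sizes, sizes[1:])
    ]
```

Both the standard pose and the user pose pass through the same shared branch (the defining property of a Siamese network), and the distance between the two 100-d outputs is then converted to a score.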
基于标准姿态识别模型获取欧式距离阈值T，阈值T用于判断用户姿态与标准姿态是否为同一类型；若标准姿态识别模型输出的欧式距离小于或等于阈值T，则用户姿态与标准姿态为同一类型；若该欧式距离大于阈值T，则用户姿态与标准姿态为不同类型；A Euclidean distance threshold T is obtained on the basis of the standard posture recognition model, and the threshold T is used to judge whether the user posture and the standard posture are of the same type; if the Euclidean distance output by the model is less than or equal to the threshold T, the user posture and the standard posture are of the same type; if the Euclidean distance is greater than the threshold T, they are of different types;
将欧式距离转化为用户姿态与标准姿态的相似度评分，获取相似度评分最高的用户姿态作为评分结果。具体的，如果两个姿态的欧式距离超过阈值T，则认为不是同一类型；反之则认为是同一类型。对于每一个阈值T，都可以绘制出ROC曲线，如图4所示。ROC曲线下的面积称为AUC，是一个0-1的值，AUC越大，模型性能越好。找到一个最优的阈值T-best，使得AUC在测试集上最大。若AUC最大，就是模型会尽可能多地将两个原本属于同一类的姿态判定为同一类，同时尽量少地将两个不属于同类的姿态误判为同一类。得到最优距离阈值T-best后，我们根据实际业务需求设置一个临界分数，比如40分，表示在此时模型认为这两个姿态刚好处于相似和不相似的临界点。映射关系如下：实际距离t在[0,T-best]区间内时，相似度分数s为[100,40]；实际距离t在(T-best,无穷大)时，相似度分数s为(40,0)。在本实施例中临界分数为40分。The Euclidean distance is converted into a similarity score between the user posture and the standard posture, and the user posture with the highest similarity score is taken as the scoring result. Specifically, if the Euclidean distance between two postures exceeds the threshold T, they are considered not to be of the same type; otherwise they are considered to be of the same type. For each threshold T an ROC curve can be drawn, as shown in FIG. 4. The area under the ROC curve, called the AUC, is a value between 0 and 1; the larger the AUC, the better the model performance. An optimal threshold T-best is found that maximizes the AUC on the test set. When the AUC is maximal, the model judges as many pairs of postures that truly belong to the same class as the same class as possible, while misjudging as few pairs from different classes as possible. After obtaining the optimal distance threshold T-best, a critical score is set according to actual business needs, for example 40 points, meaning that at this point the model considers the two postures to be exactly at the boundary between similar and dissimilar. The mapping is as follows: when the actual distance t lies in [0, T-best], the similarity score s lies in [100, 40]; when t lies in (T-best, infinity), s lies in (40, 0). In this embodiment the critical score is 40 points.
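The piecewise mapping from distance to score can be sketched as follows. The linear segment on [0, T-best] follows the disclosure; the reciprocal decay beyond T-best is an assumed concrete choice, since the disclosure only requires scores in (40, 0) there:

```python
def distance_to_score(t, t_best):
    """Map Euclidean distance t to a similarity score.
    t in [0, t_best]   -> score in [100, 40]  (linear, per the disclosure)
    t in (t_best, inf) -> score in (40, 0)    (assumed reciprocal decay)"""
    if t <= t_best:
        return 100.0 - 60.0 * (t / t_best)
    return 40.0 * t_best / t
```

At t = 0 the score is 100, at t = T-best it is exactly the critical score 40, and it decays toward 0 as the distance grows without bound.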
标准姿态和用户姿态均包括16个骨骼关键点,16个骨骼关键点分别对应一个二维位置坐标。其中,16个骨骼关键点包括头顶、头底、颈部、右肩、右肘、右手、左肩、左肘、左手、右胯、右膝、右脚、左胯、左膝、左脚、髌骨。Both the standard pose and the user pose include 16 bone key points, each of which corresponds to a two-dimensional position coordinate. Among them, 16 bone key points include the top of the head, the bottom of the head, the neck, the right shoulder, the right elbow, the right hand, the left shoulder, the left elbow, the left hand, the right hip, the right knee, the right foot, the left hip, the left knee, the left foot, and the patella .
实施例四Embodiment four
操作1在实施例三的基础上，进一步地应用到体感游戏上。对于跑酷类游戏，我们设置一个模拟用户的小人或者小动物；对于该游戏，会在路上设置一些小人或小动物需要躲避的障碍物，需要小人或小动物跳跃或向左/向右倾斜身体躲过。对应地，我们将用户的站立对应小人或小动物的自动向前走，将用户的向左/右扭腰对应小人或小动物的向左/右倾斜身体，将用户的原地跳跃对应小人或小动物的跳跃；额外地，还可以设置一些其他的动作，如将用户的高抬腿对应小人或小动物的加速奔跑。用户在健身镜的目标健身区域中，根据第一健身视频进行跟做；Operation 1 builds on Embodiment 3 and further applies it to motion-sensing games. For a parkour-style game, we set up a little figure or small animal that represents the user; the game places obstacles on the track that the figure must avoid by jumping or leaning left/right. Correspondingly, the user standing maps to the figure walking forward automatically, the user twisting the waist left/right maps to the figure leaning left/right, and the user jumping in place maps to the figure jumping; additionally, other actions can be set, e.g. the user's high-knee raise maps to the figure sprinting. The user follows the first fitness video in the target fitness area of the fitness mirror;
第一健身视频中将站立、左右扭腰、原地跳跃和高抬腿作为第一标准姿态;In the first fitness video, standing, twisting the waist left and right, jumping in place and raising the legs are the first standard postures;
获取每个第一标准姿态在第一健身视频中出现的第二时间段;Obtain the second time period when each first standard posture appears in the first fitness video;
在第二时间段内，健身镜在目标健身区域中对用户动作进行识别，对目标健身区域进行特征提取，得到站立、左右扭腰、原地跳跃和高抬腿对应的第一用户姿态；若在第二时间段内获取的第一用户姿态有不同的动作产生，即表明用户在进行跟做，即用户在进行健身训练；若在第二时间段内没有获取到第一用户姿态，或第一用户姿态没有动作产生，则用户没有跟做，即用户没有进行健身训练。在第二时间段中，对用户的动作进行识别，但此过程中不进行打分。In the second time period, the fitness mirror recognizes the user's movements in the target fitness area, performs feature extraction on the target fitness area, and obtains the first user postures corresponding to standing, twisting the waist left and right, jumping in place, and raising the knees high. If the first user postures acquired within the second time period show different movements, the user is following along, i.e. performing fitness training; if no first user posture is acquired within the second time period, or the first user posture shows no movement, the user is not following along, i.e. not performing fitness training. In the second time period, the user's movements are recognized, but no scoring is performed.
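The posture-to-avatar mapping of this embodiment can be sketched as a lookup table; the English labels for the recognized postures and avatar commands are illustrative:

```python
ACTION_MAP = {
    "stand":       "walk_forward",  # standing -> figure walks forward
    "twist_left":  "lean_left",     # twist waist left -> figure leans left
    "twist_right": "lean_right",    # twist waist right -> figure leans right
    "jump":        "jump",          # jump in place -> figure jumps
    "high_knees":  "sprint",        # high-knee raise -> figure sprints
}

def avatar_command(pose_label):
    """Translate a recognized user posture into a game-avatar command."""
    return ACTION_MAP.get(pose_label)
```

An unrecognized posture yields no command, so the avatar simply keeps its current behavior.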
操作2第二健身视频中将站立、左右扭腰、原地跳跃和高抬腿作为第二标准姿态;In the second fitness video of operation 2, standing, twisting the waist left and right, jumping in situ and raising the legs are used as the second standard posture;
获取每个第二标准姿态在第二健身视频中出现的第一时间段;Obtain the first time period when each second standard posture appears in the second fitness video;
在第一时间段内,健身镜在目标健身区域中对用户动作进行识别,对目标健身区域进行特征提取,得到站立、左右扭腰、原地跳跃和高抬腿对应的第二用户姿态,在实施例三的基础上对每个第二用户姿态进行评分,根据评分结果判断用户是否跟随第二健身视频进行健身训练。In the first period of time, the fitness mirror recognizes the user's actions in the target fitness area, extracts the features of the target fitness area, and obtains the second user posture corresponding to standing, twisting the waist left and right, jumping in situ, and raising the legs. On the basis of the third embodiment, each second user's posture is scored, and it is judged according to the scoring result whether the user follows the second fitness video for fitness training.
操作3第一健身视频中将站立、左右扭腰、原地跳跃和高抬腿作为第一标准姿态;In the first fitness video of operation 3, standing, twisting the waist left and right, jumping in situ and raising the legs are taken as the first standard posture;
获取每个第一标准姿态在第一健身视频中出现的第二时间段;Obtain the second time period when each first standard posture appears in the first fitness video;
在第二时间段内,健身镜在目标健身区域中对用户动作进行识别,对目标健身区域进行特征提取,得到站立、左右扭腰、原地跳跃和高抬腿对应的第一用户姿态;在实施例三的基础上对每个第一用户标准姿态进行评分,并将评分结果反馈给用户。In the second time period, the fitness mirror recognizes the user's actions in the target fitness area, extracts the features of the target fitness area, and obtains the first user posture corresponding to standing, twisting left and right, jumping in situ, and raising legs; On the basis of Embodiment 3, each first user's standard posture is scored, and the scoring result is fed back to the user.
在本实施例中，由于健身视频为游戏视频，通过不同的标准姿态能够控制游戏视频中的小人或小动物运动。因此本公开对用户的动作进行识别时，本公开的健身训练方法不仅能够用于识别动作并对动作进行评价打分，同时基于标准姿态，用户姿态还能够控制健身视频中的小人或小动物运动：其中站立为控制小人或小动物向前走，左右扭腰为控制小人或小动物向左或向右倾斜身体，原地跳跃为控制小人或小动物跳跃，高抬腿为控制小人或小动物加速奔跑。In this embodiment, since the fitness video is a game video, the movement of the little figure or small animal in the game can be controlled through the different standard postures. Therefore, while recognizing the user's movements, the fitness training method of the present disclosure can not only recognize and score the movements but also, based on the standard postures, let the user postures control the figure in the fitness video: standing makes the figure walk forward, twisting the waist left or right makes it lean left or right, jumping in place makes it jump, and raising the knees high makes it sprint.
实施例五Embodiment five
请参考图6，图6为基于健身教学训练的动作评价系统的组成示意图。本公开实施例五提供了基于健身教学训练的动作评价系统，所述动作评价系统包括：Please refer to FIG. 6, which is a schematic diagram of the composition of a motion evaluation system based on fitness teaching and training. Embodiment 5 of the present disclosure provides a motion evaluation system based on fitness teaching and training, the motion evaluation system including:
获取模块,用于获取健身视频,并获取健身视频中预设的标准姿态对应的时间段,以及该时间段中获取健身视频对应的连续时刻的帧图像;The obtaining module is used to obtain the fitness video, and obtain the time period corresponding to the preset standard posture in the fitness video, and obtain the frame images of the continuous moments corresponding to the fitness video in the time period;
识别模块,用于识别上述时间段中连续时刻的帧图像对应的若干用户姿态;An identification module, configured to identify several user gestures corresponding to the frame images at consecutive moments in the above time period;
对比模块,用于对比若干用户姿态和标准姿态,得到对比结果;A comparison module is used to compare several user postures and standard postures to obtain comparison results;
判断模块,用于根据对比结果判断用户姿态与标准姿态是否为同一类型。The judging module is used to judge whether the user posture and the standard posture are of the same type according to the comparison result.
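The four modules above can be sketched as one class (a minimal illustration; the `recognizer` and `scorer` callables stand in for the feature-extraction and Siamese-model components, and are not part of the disclosure's API):

```python
class MotionEvaluationSystem:
    """Acquire -> identify -> compare -> judge, as in the module list above."""

    def __init__(self, recognizer, scorer, threshold=40.0):
        self.recognizer = recognizer  # frame image -> user pose
        self.scorer = scorer          # (user pose, standard pose) -> score
        self.threshold = threshold    # scoring threshold from the embodiment

    def evaluate(self, frames, standard_pose):
        poses = [self.recognizer(f) for f in frames]             # identify module
        scores = [self.scorer(p, standard_pose) for p in poses]  # compare module
        best = max(scores)
        return best, best >= self.threshold                      # judge module
```

Separating the recognizer and scorer as injected callables mirrors the modular decomposition above and makes each module testable in isolation.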
本实施例中，进一步的，所述对比模块对比若干用户姿态和标准姿态，具体包括训练孪生神经网络模型，得到训练好的标准姿态识别模型；In this embodiment, further, the comparison module compares the several user postures with the standard posture, which specifically includes training a Siamese neural network model to obtain a trained standard posture recognition model;
所述判断模块根据对比结果判断用户姿态与标准姿态是否为同一类型,具体包括将用户姿态和标准姿态输入标准姿态识别模型,判断用户姿态与标准姿态是否为同一类型。The judging module judges whether the user posture and the standard posture are of the same type according to the comparison result, specifically including inputting the user posture and the standard posture into the standard posture recognition model, and judging whether the user posture and the standard posture are of the same type.
In this embodiment, further, the judgment module inputs the user pose and the standard pose into the standard pose recognition model and judges whether the user pose and the standard pose are of the same type, which specifically includes:
obtaining the skeleton key points of the standard pose and the user pose, and the position coordinates corresponding to each skeleton key point;
inputting the position coordinates corresponding to each skeleton key point of the standard pose and of the user pose into the trained standard pose recognition model to obtain an output vector V1 of the standard pose and an output vector V2 of the user pose, respectively;
calculating the Euclidean distance between the output vector V1 of the standard pose and the output vector V2 of the user pose;
judging, from the Euclidean distance between the output vector V1 of the standard pose and the output vector V2 of the user pose, whether the user pose and the standard pose are of the same type.
In this embodiment, further, the judgment module obtains a Euclidean distance threshold T based on the standard pose recognition model. The threshold T is used to judge whether the user pose and the standard pose are of the same type: if the Euclidean distance output by the standard pose recognition model is less than or equal to the threshold T, the user pose and the standard pose are of the same type; if the Euclidean distance output by the standard pose recognition model is greater than the threshold T, the user pose and the standard pose are of different types.
In this embodiment, further, the standard pose and the user pose each include 16 skeleton key points, each corresponding to a two-dimensional position coordinate. The 16 skeleton key points are the top of the head, the base of the head, the neck, the right shoulder, the right elbow, the right hand, the left shoulder, the left elbow, the left hand, the right hip, the right knee, the right foot, the left hip, the left knee, the left foot, and the patella.
In this embodiment, further, the judgment module inputs the user poses and the standard pose into the standard pose recognition model, outputs a similarity score between each user pose and the standard pose, takes the user pose with the highest similarity score as the scoring result, and judges from the scoring result whether the user pose and the standard pose are of the same type.
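As an illustration of the comparison just described, the following is a minimal sketch (not the disclosed implementation) of a Siamese-style pose comparison: the 16 two-dimensional skeleton key points of each pose are flattened into a 32-element vector, passed through a shared embedding branch, and the Euclidean distance between the two output vectors V1 and V2 is compared against a threshold T. The layer sizes, random weights, and helper names are illustrative assumptions; a real standard pose recognition model would be trained on labeled pose pairs.

```python
import numpy as np

N_KEYPOINTS = 16  # head top, head base, neck, shoulders, elbows, hands, hips, knees, feet, patella

def embed(pose_xy, w1, b1, w2, b2):
    """Shared embedding branch of the Siamese model: maps the 16 (x, y)
    keypoint coordinates (flattened to a 32-vector) to an output vector."""
    x = pose_xy.reshape(-1)            # (32,)
    h = np.tanh(w1 @ x + b1)           # hidden layer
    return w2 @ h + b2                 # output vector (V1 or V2)

def same_type(standard_pose, user_pose, params, threshold_T):
    """Embed both poses with the *same* weights, then compare the
    Euclidean distance of V1 and V2 against the threshold T."""
    v1 = embed(standard_pose, *params)
    v2 = embed(user_pose, *params)
    distance = np.linalg.norm(v1 - v2)
    return distance <= threshold_T     # True: same type as the standard pose

# Toy, untrained weights so the sketch runs end to end:
rng = np.random.default_rng(0)
params = (rng.standard_normal((8, 32)), np.zeros(8),
          rng.standard_normal((4, 8)), np.zeros(4))
standard = rng.random((N_KEYPOINTS, 2))
print(same_type(standard, standard, params, threshold_T=0.5))  # identical poses → True
```

Because both branches share the same weights, identical poses always produce a distance of zero, which is what makes the distance-versus-threshold decision meaningful.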
Embodiment 6
Embodiment 6 of the present disclosure provides a fitness training system based on a fitness device. The fitness training system of Embodiment 6 may be implemented as an application system of the above motion evaluation system, and executes the following:
obtaining a first user pose according to a first fitness video, and judging whether the user is performing fitness training;
obtaining a second user pose according to a second fitness video, scoring the second user pose, and judging, according to the scoring result, whether the user is following the second fitness video for fitness training;
obtaining the first user pose according to the first fitness video, scoring the first user pose, and feeding the scoring result back to the user.
Embodiment 7
Referring again to FIG. 6, which may also serve as a schematic diagram of the composition of the fitness training system based on a fitness device, Embodiment 7 of the present disclosure provides a fitness training system based on a fitness device. The fitness training system of Embodiment 7 may be implemented as an application system of the above motion evaluation system, and includes:
an acquisition module, configured to obtain a fitness video and process the fitness video;
a recognition module, configured to identify the target fitness region of the fitness video and perform feature extraction on the target fitness region to obtain a user pose;
a comparison module, configured to compare the user pose with a preset standard pose to obtain a comparison result;
a judgment module, configured to judge, according to the comparison result, whether the user is performing fitness training or whether a motion meets the standard.
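The module chain of this embodiment (acquisition → recognition → comparison → judgment) can be sketched as the following pipeline. Every helper passed in (`detect_region`, `extract_pose`, `compare`) is a hypothetical stand-in, not the disclosed implementation; the toy lambdas at the bottom exist only so the sketch runs end to end.

```python
from dataclasses import dataclass

@dataclass
class EvaluationResult:
    is_training: bool       # judgment: is the user performing fitness training?
    meets_standard: bool    # judgment: does the motion meet the standard?

def evaluate_frame(frame, standard_pose, *, detect_region, extract_pose,
                   compare, threshold):
    """Mirrors the module chain: identify the target fitness region,
    extract the user pose from it, compare it with the preset standard
    pose, and judge the result against a scoring threshold."""
    region = detect_region(frame)                  # recognition module, step 1
    user_pose = extract_pose(region)               # recognition module, step 2
    score = compare(user_pose, standard_pose)      # comparison module
    return EvaluationResult(is_training=user_pose is not None,
                            meets_standard=score >= threshold)

# Toy stand-ins so the sketch is executable:
result = evaluate_frame(
    frame="frame-0",
    standard_pose=[0.0] * 32,
    detect_region=lambda f: f,                     # whole frame as target region
    extract_pose=lambda r: [0.0] * 32,             # pretend pose extractor
    compare=lambda u, s: 1.0 if u == s else 0.0,   # pretend similarity score
    threshold=0.8,
)
print(result.meets_standard)  # True: toy pose matches the toy standard
```

Keeping each stage behind a function boundary is one way the four modules could remain independently replaceable, e.g. swapping the comparison stage for the Siamese-distance check of the earlier embodiments.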
Embodiment 8
Embodiment 8 of the present disclosure provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the motion evaluation method based on fitness teaching and training.
The processor may be a central processing unit, or another general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory may be used to store the computer program and/or modules, and the processor implements the various functions of the disclosed motion evaluation apparatus based on fitness teaching and training by running or executing the computer program and/or modules stored in the memory and invoking the data stored in the memory. The memory may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and the application programs required by at least one function (such as a sound playback function or an image playback function). In addition, the memory may include a high-speed random access memory, and may also include a non-volatile memory, such as a hard disk, an internal memory, a plug-in hard disk, a smart media card, a secure digital card, a flash card, at least one magnetic disk storage device, a flash memory device, or another solid-state storage device.
Embodiment 9
Embodiment 9 of the present disclosure provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the fitness training method based on a fitness device.
The processor may be a central processing unit, or another general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory may be used to store the computer program and/or modules, and the processor implements the various functions of the disclosed fitness training apparatus based on a fitness device by running or executing the computer program and/or modules stored in the memory and invoking the data stored in the memory. The memory may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and the application programs required by at least one function (such as a sound playback function or an image playback function). In addition, the memory may include a high-speed random access memory, and may also include a non-volatile memory, such as a hard disk, an internal memory, a plug-in hard disk, a smart media card, a secure digital card, a flash card, at least one magnetic disk storage device, a flash memory device, or another solid-state storage device.
Embodiment 10
Embodiment 10 of the present disclosure provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the motion evaluation method based on fitness teaching and training.
The computer storage medium of the embodiments of the present disclosure may be any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: an electrical connection having one or more conductors, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM) or flash memory, an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by, or in connection with, an instruction execution system, apparatus, or device.
Embodiment 11
Embodiment 11 of the present disclosure provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the fitness training method based on a fitness device.
The computer storage medium of the embodiments of the present disclosure may be any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: an electrical connection having one or more conductors, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM) or flash memory, an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by, or in connection with, an instruction execution system, apparatus, or device.
Although preferred embodiments of the present disclosure have been described, those skilled in the art may make additional changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the present disclosure.
Obviously, those skilled in the art can make various changes and variations to the present disclosure without departing from its spirit and scope. If these modifications and variations fall within the scope of the claims of the present disclosure and their technical equivalents, the present disclosure is also intended to cover them.

Claims (20)

  1. A motion evaluation method based on fitness teaching and training, comprising:
    obtaining a fitness video, wherein a standard pose is preset in the fitness video;
    obtaining a time period corresponding to the standard pose in the fitness video;
    obtaining frame images at consecutive moments of the fitness video within the time period;
    recognizing a plurality of user poses corresponding to the frame images at consecutive moments within the time period; and
    comparing the plurality of user poses with the standard pose to judge whether a user pose and the standard pose are of the same type.
  2. The motion evaluation method based on fitness teaching and training according to claim 1, wherein comparing the plurality of user poses with the standard pose to judge whether a user pose and the standard pose are of the same type specifically comprises:
    training a Siamese neural network model to obtain a trained standard pose recognition model; and
    inputting the user pose and the standard pose into the standard pose recognition model to judge whether the user pose and the standard pose are of the same type.
  3. The motion evaluation method based on fitness teaching and training according to claim 2, wherein inputting the user pose and the standard pose into the standard pose recognition model to judge whether the user pose and the standard pose are of the same type specifically comprises:
    obtaining the skeleton key points of the standard pose and the user pose, and the position coordinates corresponding to each skeleton key point;
    inputting the position coordinates corresponding to each skeleton key point of the standard pose and of the user pose into the trained standard pose recognition model to obtain an output vector V1 of the standard pose and an output vector V2 of the user pose, respectively;
    calculating the Euclidean distance between the output vector V1 of the standard pose and the output vector V2 of the user pose; and
    judging, from the Euclidean distance between the output vector V1 of the standard pose and the output vector V2 of the user pose, whether the user pose and the standard pose are of the same type.
  4. The motion evaluation method based on fitness teaching and training according to claim 3, wherein a Euclidean distance threshold T is obtained based on the standard pose recognition model, the threshold T being used to judge whether the user pose and the standard pose are of the same type, wherein if the Euclidean distance output by the standard pose recognition model is less than or equal to the threshold T, the user pose and the standard pose are of the same type, and if the Euclidean distance output by the standard pose recognition model is greater than the threshold T, the user pose and the standard pose are of different types.
  5. The motion evaluation method based on fitness teaching and training according to claim 3, wherein the standard pose and the user pose each include 16 skeleton key points, each corresponding to a two-dimensional position coordinate, the 16 skeleton key points being the top of the head, the base of the head, the neck, the right shoulder, the right elbow, the right hand, the left shoulder, the left elbow, the left hand, the right hip, the right knee, the right foot, the left hip, the left knee, the left foot, and the patella.
  6. The motion evaluation method based on fitness teaching and training according to claim 3, wherein the user poses and the standard pose are input into the standard pose recognition model, a similarity score between each user pose and the standard pose is output, the user pose with the highest similarity score is taken as the scoring result, and whether the user pose and the standard pose are of the same type is judged from the scoring result.
  7. A fitness training method based on a fitness device, the fitness training method being based on the motion evaluation method based on fitness teaching and training according to claim 1, the fitness training method comprising:
    obtaining a first user pose according to a first fitness video, and judging whether the user is performing fitness training;
    obtaining a second user pose according to a second fitness video, scoring the second user pose, and judging, according to the scoring result, whether the user is following the second fitness video for fitness training; and
    obtaining the first user pose according to the first fitness video, scoring the first user pose, and feeding the scoring result back to the user.
  8. The fitness training method based on a fitness device according to claim 7, wherein obtaining the first user pose according to the first fitness video and judging whether the user is performing fitness training specifically comprises:
    obtaining the first fitness video;
    identifying the target fitness region of the first fitness video, and performing feature extraction on the target fitness region to obtain the first user pose; and
    judging, according to the first user pose, whether the user is performing fitness training.
  9. The fitness training method based on a fitness device according to claim 7, wherein obtaining the second user pose according to the second fitness video, scoring the second user pose, and judging according to the scoring result whether the user is following the second fitness video for fitness training specifically comprises:
    presetting a second standard pose according to the second fitness video;
    identifying the target fitness region of the second fitness video, and performing feature extraction on the target fitness region to obtain the second user pose; and
    comparing the second standard pose with the second user pose to obtain a similarity score of the second user pose with respect to the second standard pose, and judging, according to the scoring result, whether the user is following the second fitness video for fitness training.
  10. The fitness training method based on a fitness device according to claim 9, wherein identifying the target fitness region of the second fitness video and performing feature extraction on the target fitness region to obtain the second user pose specifically comprises:
    obtaining a first time period in which the second standard pose appears in the second fitness video;
    obtaining the video segment of the second fitness video corresponding to the first time period, and dividing the video segment into frames to obtain frame images at a plurality of consecutive moments of the video segment;
    identifying the target fitness region of the second fitness video within the first time period, and performing feature extraction on the target fitness region to obtain a plurality of second user poses in one-to-one correspondence with the frame images;
    comparing the corresponding frame images and second user poses to obtain a similarity score of each second user pose with respect to its corresponding frame image; and
    taking the second user pose with the highest similarity score as the scoring result.
  11. The fitness training method based on a fitness device according to claim 7, wherein obtaining the first user pose according to the first fitness video and scoring the first user pose specifically comprises:
    obtaining the first fitness video, and presetting a first standard pose according to the first fitness video;
    identifying the target fitness region of the first fitness video, and performing feature extraction on the target fitness region to obtain the first user pose; and
    comparing the first standard pose with the first user pose to obtain a similarity score of the first user pose with respect to the first standard pose, and feeding the scoring result back to the user.
  12. The fitness training method based on a fitness device according to claim 11, wherein identifying the target fitness region of the first fitness video and performing feature extraction on the target fitness region to obtain the first user pose specifically comprises:
    obtaining a second time period in which the first standard pose appears in the first fitness video;
    identifying the target fitness region of the first fitness video within the second time period, and performing feature extraction on the target fitness region to obtain a plurality of first user poses;
    comparing the first standard pose with the plurality of first user poses to obtain a similarity score of each first user pose with respect to the first standard pose; and
    taking the first user pose with the highest similarity score as the scoring result.
  13. The fitness training method based on a fitness device according to claim 11 or 12, wherein comparing the first standard pose with the first user pose specifically comprises:
    training a Siamese neural network model to obtain a trained standard pose recognition model;
    inputting the first user pose and the first standard pose into the standard pose recognition model to obtain a similarity score;
    if the scoring result is greater than or equal to a scoring threshold, the first standard pose and the first user pose are of the same class, and the user's motion meets the standard; and
    if the scoring result is less than the scoring threshold, the first standard pose and the first user pose are not of the same class, and the user's motion does not meet the standard.
  14. A motion evaluation system based on fitness teaching and training, comprising:
    an acquisition module, configured to obtain a fitness video, obtain a time period corresponding to a standard pose preset in the fitness video, and obtain frame images at consecutive moments of the fitness video within the time period;
    a recognition module, configured to recognize a plurality of user poses corresponding to the frame images at consecutive moments within the time period;
    a comparison module, configured to compare the plurality of user poses with the standard pose to obtain a comparison result; and
    a judgment module, configured to judge, according to the comparison result, whether a user pose and the standard pose are of the same type.
  15. The motion evaluation system based on fitness teaching and training according to claim 14, wherein the comparison module compares the plurality of user poses with the standard pose, which specifically includes training a Siamese neural network model to obtain a trained standard pose recognition model; and
    the judgment module judges, according to the comparison result, whether the user pose and the standard pose are of the same type, which specifically includes inputting the user pose and the standard pose into the standard pose recognition model and judging whether the user pose and the standard pose are of the same type.
  16. The motion evaluation system based on fitness teaching and training according to claim 15, wherein the judgment module inputs the user pose and the standard pose into the standard pose recognition model and judges whether the user pose and the standard pose are of the same type, which specifically includes:
    obtaining the skeleton key points of the standard pose and the user pose, and the position coordinates corresponding to each skeleton key point;
    inputting the position coordinates corresponding to each skeleton key point of the standard pose and of the user pose into the trained standard pose recognition model to obtain an output vector V1 of the standard pose and an output vector V2 of the user pose, respectively;
    calculating the Euclidean distance between the output vector V1 of the standard pose and the output vector V2 of the user pose; and
    judging, from the Euclidean distance between the output vector V1 of the standard pose and the output vector V2 of the user pose, whether the user pose and the standard pose are of the same type.
  17. The motion evaluation system based on fitness teaching and training according to claim 16, wherein the judgment module obtains a Euclidean distance threshold T based on the standard pose recognition model, the threshold T being used to judge whether the user pose and the standard pose are of the same type, wherein if the Euclidean distance output by the standard pose recognition model is less than or equal to the threshold T, the user pose and the standard pose are of the same type, and if the Euclidean distance output by the standard pose recognition model is greater than the threshold T, the user pose and the standard pose are of different types.
  18. The motion evaluation system based on fitness teaching and training according to claim 16, wherein the standard pose and the user pose each include 16 skeleton key points, each corresponding to a two-dimensional position coordinate, the 16 skeleton key points being the top of the head, the base of the head, the neck, the right shoulder, the right elbow, the right hand, the left shoulder, the left elbow, the left hand, the right hip, the right knee, the right foot, the left hip, the left knee, the left foot, and the patella.
  19. The motion evaluation system based on fitness teaching and training according to claim 16, wherein the judgment module inputs the user poses and the standard pose into the standard pose recognition model, outputs a similarity score between each user pose and the standard pose, takes the user pose with the highest similarity score as the scoring result, and judges from the scoring result whether the user pose and the standard pose are of the same type.
  20. A fitness training system based on a fitness device, the fitness training system being based on the motion evaluation system based on fitness teaching and training according to claim 14, the fitness training system executing:
    obtaining a first user pose according to a first fitness video, and judging whether the user is performing fitness training;
    obtaining a second user pose according to a second fitness video, scoring the second user pose, and judging, according to the scoring result, whether the user is following the second fitness video for fitness training; and
    obtaining the first user pose according to the first fitness video, scoring the first user pose, and feeding the scoring result back to the user.
PCT/CN2022/070026 2021-12-14 2022-01-04 Motion evaluation method and system based on fitness teaching training WO2023108842A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202111523985.2A CN116262171A (en) 2021-12-14 2021-12-14 Body-building training method, system and device based on body-building device and medium
CN202111523962.1 2021-12-14
CN202111523962.1A CN116266415A (en) 2021-12-14 2021-12-14 Action evaluation method, system and device based on body building teaching training and medium
CN202111523985.2 2021-12-14

Publications (1)

Publication Number Publication Date
WO2023108842A1 true WO2023108842A1 (en) 2023-06-22

Family

ID=86775097

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/070026 WO2023108842A1 (en) 2021-12-14 2022-01-04 Motion evaluation method and system based on fitness teaching training

Country Status (1)

Country Link
WO (1) WO2023108842A1 (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110427900A (en) * 2019-08-07 2019-11-08 广东工业大学 A kind of method, apparatus and equipment of intelligent guidance body-building
CN110867099A (en) * 2019-08-23 2020-03-06 广东工业大学 Device and method for scoring and teaching exercise and fitness postures
CN110751050A (en) * 2019-09-20 2020-02-04 郑鸿 Motion teaching system based on AI visual perception technology
US20210093920A1 (en) * 2019-09-26 2021-04-01 True Adherence, Inc. Personal Fitness Training System With Biomechanical Feedback
CN112560665A (en) * 2020-12-13 2021-03-26 同济大学 Professional dance evaluation method for realizing human body posture detection based on deep migration learning

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117392762A (en) * 2023-12-13 2024-01-12 中国石油集团川庆钻探工程有限公司 Characteristic behavior recognition method based on human body key point gesture coding
CN117746305A (en) * 2024-02-21 2024-03-22 四川大学华西医院 Medical care operation training method and system based on automatic evaluation
CN117746305B (en) * 2024-02-21 2024-04-19 四川大学华西医院 Medical care operation training method and system based on automatic evaluation

Similar Documents

Publication Publication Date Title
WO2021051579A1 (en) Body pose recognition method, system, and apparatus, and storage medium
Rudovic et al. Context-sensitive dynamic ordinal regression for intensity estimation of facial action units
Hoffman et al. Breaking the status quo: Improving 3D gesture recognition with spatially convenient input devices
Chen et al. Computer-assisted self-training system for sports exercise using kinects
WO2023108842A1 (en) Motion evaluation method and system based on fitness teaching training
WO2017161734A1 (en) Correction of human body movements via television and motion-sensing accessory and system
Zhang et al. A kinect based golf swing score and grade system using gmm and svm
Monir et al. Rotation and scale invariant posture recognition using Microsoft Kinect skeletal tracking feature
CN112749684A (en) Cardiopulmonary resuscitation training and evaluating method, device, equipment and storage medium
CN113409651B (en) Live broadcast body building method, system, electronic equipment and storage medium
CN115331314A (en) Exercise effect evaluation method and system based on APP screening function
Morel et al. Automatic evaluation of sports motion: A generic computation of spatial and temporal errors
Zhou et al. Skeleton-based human keypoints detection and action similarity assessment for fitness assistance
Furley et al. Coding body language in sports: the nonverbal behavior coding system for soccer penalties
CN116266415A (en) Action evaluation method, system and device based on body building teaching training and medium
Pai et al. Home Fitness and Rehabilitation Support System Implemented by Combining Deep Images and Machine Learning Using Unity Game Engine.
CN113392744A (en) Dance motion aesthetic feeling confirmation method and device, electronic equipment and storage medium
CN116262171A (en) Body-building training method, system and device based on body-building device and medium
CN111507555A (en) Human body state detection method, classroom teaching quality evaluation method and related device
Potigutsai et al. Hand and fingertip detection for game-based hand rehabilitation
Chen et al. Research on Table Tennis Swing Recognition Based on Lightweight OpenPose
CN115862810B (en) VR rehabilitation training method and system with quantitative evaluation function
Patel et al. Gesture Recognition Using MediaPipe for Online Realtime Gameplay
Wagh et al. Virtual Yoga System Using Kinect Sensor
Porwal et al. ASL Language Translation using ML

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22905616

Country of ref document: EP

Kind code of ref document: A1