CN116311536A - Video action scoring method, computer-readable storage medium and system - Google Patents

Video action scoring method, computer-readable storage medium and system

Info

Publication number
CN116311536A
Authority
CN
China
Prior art keywords
action
video
specific
images
motion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310561082.6A
Other languages
Chinese (zh)
Other versions
CN116311536B (en)
Inventor
Zong Qiang
Wu Jialun
Li Junli
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xunlong Guangdong Intelligent Technology Co ltd
Original Assignee
Xunlong Guangdong Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xunlong Guangdong Intelligent Technology Co ltd filed Critical Xunlong Guangdong Intelligent Technology Co ltd
Priority to CN202310561082.6A priority Critical patent/CN116311536B/en
Publication of CN116311536A publication Critical patent/CN116311536A/en
Application granted granted Critical
Publication of CN116311536B publication Critical patent/CN116311536B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/23 Recognition of whole body movements, e.g. for sport training
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames

Abstract

The invention provides a video action scoring method, a computer-readable storage medium, and a system. The method acquires the body data and the three-dimensional action videos of an action trainer, identifies in each three-dimensional action video all continuous frame images of a specific action together with the key frame image that best reflects the specific action, and then identifies the action category, specification coefficient, and difficulty coefficient of the specific action in the selected key frame image. The number of all continuous frame images is compared with the number of all standard action continuous frame images and combined with the trainer's body data to obtain an exercise coefficient, and the action exercise score is calculated from the specification coefficient, difficulty coefficient, exercise coefficient, and power coefficient of the specific action. The scoring therefore comprehensively considers the action difficulty, the power coefficient reflected by the trainer's body data, the action category, and the specification coefficient, so that the action scoring result is no longer one-sided.

Description

Video action scoring method, computer-readable storage medium and system
Technical Field
The present invention relates to the field of motion recognition technologies, and in particular, to a video motion scoring method, a computer-readable storage medium, and a system.
Background
With its rapid development, intelligent action recognition technology has been widely applied across industries. In a martial arts exercise scene, the martial arts actions of a trainer can first be recognized with intelligent action recognition and then scored: a martial arts exercise video of the trainer's actions is recorded, a scoring system recognizes the martial arts actions in the video using a video action recognition algorithm model, and the recognized actions are scored. Existing scoring systems generally compare the similarity between the recognized martial arts action and a standard action and score according to the comparison result. However, simple similarity comparison ignores the influence of factors such as the trainer's body data, the action category, and the action difficulty on the whole set of martial arts actions, so the resulting action score is one-sided.
Disclosure of Invention
The technical problem to be solved by the invention is that existing action scoring results are one-sided.
In order to solve the technical problems, the invention provides a video action scoring method, which comprises the following steps:
A. acquiring body data of an action trainer, wherein the body data comprises sex coefficients, height, weight and arm length;
B. shooting a front action video and a plurality of side action videos of an action exerciser, wherein the shooting angles of the side action videos are different;
C. inputting each side action video and the front action video into a three-dimensional reconstruction algorithm model together to construct a plurality of three-dimensional action videos corresponding to each side action video respectively;
D. respectively inputting the plurality of three-dimensional action videos into a video action recognition algorithm model to recognize specific actions, obtaining all continuous frame images from a start frame image to an end frame image of the specific actions in each three-dimensional action video, and recognizing a key frame image which can reflect the specific actions most in all the continuous frame images;
E. acquiring a preset standard action library, wherein the standard action library comprises all standard action continuous frame images of each specific action;
F. inputting each key frame image into a human body key point regression algorithm respectively to obtain a human body key point feature image corresponding to a specific action in each key frame image, comparing the human body key point feature image of each key frame image with all standard action continuous frame images of each specific action in the standard action library respectively, selecting one which is most in line with the standard action and a corresponding three-dimensional action video thereof from each key frame image according to a comparison result, and identifying an action category, a specification coefficient and a difficulty coefficient to which the specific action in the selected key frame image belongs;
G. according to the action category to which the specific action in the selected key frame image belongs, acquiring from the standard action library the number of all standard action continuous frame images corresponding to that action category, comparing it with the number of all continuous frame images of the specific action in the three-dimensional action video corresponding to the selected key frame image, and combining the comparison with the body data of the action trainer to obtain an exercise coefficient;
H. and calculating to obtain a power coefficient according to the action category, the exercise coefficient and the body data of the specific action in the selected key frame image, and calculating to obtain an action exercise score according to the specification coefficient, the difficulty coefficient, the exercise coefficient and the power coefficient of the specific action in the selected key frame image.
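For orientation, steps A to H can be read as a single pipeline. The sketch below (Python) shows one possible control flow with every model and formula injected as a callable, since the patent's actual models and formulas are not reproduced in this text; all names are illustrative placeholders, not APIs defined by the patent.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

@dataclass
class Detection:
    frames: Sequence       # all continuous frame images of the specific action (step D)
    key_frame: object      # the key frame image that best reflects the action (step D)

def score_action(
    body_data: dict,                        # step A: sex coefficient, height, weight, arm length
    detections: List[Detection],            # steps B-D: one detection per 3D action video
    similarity: Callable[[object], float],  # step F: best library similarity for a key frame
    lookup: Callable[[object], Tuple[str, float, float]],  # step F: -> (category, spec, difficulty)
    std_frame_count: Callable[[str], int],  # step G: standard frame count for a category
    exercise_coef: Callable[[int, int, dict], float],      # step G: preset formula (image in source)
    power_coef: Callable[[str, float, dict], float],       # step H: preset formula (image in source)
    final_score: Callable[[float, float, float, float], float],  # step H: preset formula
) -> float:
    # F. choose the key frame (and its 3D video) that best matches a standard action
    best = max(detections, key=lambda d: similarity(d.key_frame))
    category, spec, difficulty = lookup(best.key_frame)
    # G. compare frame counts and fold in the trainer's body data
    e = exercise_coef(len(best.frames), std_frame_count(category), body_data)
    # H. power coefficient, then the action exercise score
    p = power_coef(category, e, body_data)
    return final_score(spec, difficulty, e, p)
```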
Preferably, in the step G, the exercise coefficient of the specific action in the selected key frame image is calculated by a preset formula (presented in the original document as an image, not reproduced here) from: the number of all continuous frame images of the specific action in the three-dimensional action video corresponding to the selected key frame image, the number of all standard action continuous frame images acquired from the standard action library, and a body coefficient. The body coefficient is fitted from the body data of the action trainer by a fitting formula (likewise presented as an image) whose inputs are the sex coefficient, weight, height, and arm length of the action trainer together with five preset fitting constants.
Preferably, in the step H, the power coefficient of the specific action in the selected key frame image is calculated by a preset formula (presented in the original document as an image) from: the exercise coefficient of the specific action in the selected key frame image, the body coefficient fitted from the body data of the action trainer, and a preset power constant corresponding to the action category to which the specific action belongs.
Preferably, in the step H, the action exercise score is calculated by a preset formula (presented in the original document as an image) from the specification coefficient, the difficulty coefficient, the exercise coefficient, and the power coefficient of the specific action in the selected key frame image.
Preferably, the video motion recognition algorithm model is trained for specific-action recognition, and the training method comprises the following steps S1, S2, S3, S4, and S5:
s1, acquiring an original video, and segmenting the original video according to a segmentation instruction input by a user to obtain a plurality of segmented videos;
s2, extracting images of each segmented video for multiple times to obtain multiple video images;
s3, dividing the extracted video images according to specific actions defined by a user, wherein the division results comprise specific action images and non-specific action images;
s4, synthesizing the specific action images obtained by dividing into specific action videos, and respectively carrying out image extraction on the specific action videos for a plurality of times according to a plurality of preset different interval frequencies, wherein each image extraction obtains a plurality of sample images;
s5, respectively converting a plurality of sample images obtained by extracting each image into a sample file format of a video motion recognition algorithm model, and inputting all sample images after the format conversion into the video motion recognition algorithm model to train the video motion recognition algorithm model until the video motion recognition algorithm model has the capability of recognizing specific motions in video.
Preferably, in the step S1, the original video is divided into equal parts according to the segmentation instruction input by the user, obtaining a plurality of segmented videos of the same duration.
Preferably, in the step S4, the number of times of image extraction of the specific motion video is three or more.
Preferably, in the step S4, the extraction time corresponding to the interval frequency of each image extraction does not completely overlap with the extraction time corresponding to the interval frequency of other image extraction.
The present invention also provides a computer readable storage medium having stored thereon a computer program which when executed by a processor implements the steps in a video action scoring method as described above.
The invention also provides a video action scoring system which comprises a main control device, a height and weight measuring instrument and a plurality of cameras, wherein the main control device is respectively connected with the height and weight measuring instrument and the cameras, and the main control device comprises a processor and the computer readable storage medium which are mutually connected.
The invention has the following beneficial effects: when scoring a video action, the body data of the action trainer and a plurality of three-dimensional action videos are acquired; a video action recognition algorithm model identifies, in each three-dimensional action video, all continuous frame images of the specific action and the key frame image that best reflects it; the key frame image that best matches a standard action, together with its three-dimensional action video, is selected; the action category, specification coefficient, and difficulty coefficient of the specific action in the selected key frame image are then identified against a preset standard action library; the number of standard action continuous frame images for that action category is retrieved from the library, compared with the number of continuous frame images of the specific action in the selected three-dimensional action video, and combined with the trainer's body data to obtain the exercise coefficient; a power coefficient is then computed from the action category, the exercise coefficient, and the body data; finally, the action exercise score is computed from the specification coefficient, difficulty coefficient, exercise coefficient, and power coefficient. The score therefore comprehensively considers the trainer's body data, the action category, the action specification and difficulty, and the power coefficient, so the action scoring result is no longer one-sided.
Drawings
FIG. 1 is a block diagram of the connections of a video action scoring system.
Fig. 2 is a schematic diagram of a martial arts stage.
Fig. 3 is a flow chart of a video action scoring method.
Fig. 4 is a flow chart of a training method of a video motion recognition algorithm model.
Detailed Description
The invention is described in further detail below with reference to specific embodiments.
This embodiment provides a video action scoring system applied to a martial arts exercise scene. As shown in fig. 1, the system comprises a main control device 1, a height and weight measuring instrument 2, and seven cameras 3, the main control device 1 being connected to the height and weight measuring instrument 2 and to each of the seven cameras 3. In the martial arts exercise scene, the action trainer mounts the martial arts stage 4 to perform the exercise. As shown in fig. 2, the stage 4 is rectangular, and a 5 m x 5 m square-grid calibration plate 5 is arranged behind it; each grid square measures 0.1 m x 0.1 m, i.e. the grid side length is 0.1 m. The height and weight measuring instrument 2 is arranged in front of the calibration plate 5. The seven cameras 3 are arranged directly in front of, front-left of, directly left of, rear-left of, front-right of, directly right of, and rear-right of the martial arts stage, all facing the center of the stage 4. The main control device 1 comprises a computer-readable storage medium and a processor connected to each other; the storage medium stores a computer program which, when executed by the processor, implements the video action scoring method shown in fig. 3, comprising the following steps A, B, C, D, E, F, G, H.
A. Body data of the action trainer, including the sex coefficient, height, weight, and arm length, are acquired.
Before performing the martial arts exercise, the action trainer stands on the height and weight measuring instrument 2, and the main control device 1 acquires the body data of the trainer standing on the instrument; the body data comprise a sex coefficient X, a height h, a weight W, and an arm length L. Specifically, the main control device 1 measures the height h and the weight W directly through the height and weight measuring instrument 2. At the same time, it uses the camera 3 arranged directly in front of the martial arts stage to capture an image of the trainer standing on the instrument in front of the calibration plate 5, and identifies the trainer's sex by face recognition; different sexes produce different sex coefficients X. For example, if the main control device 1 recognizes the trainer as male, it may generate a sex coefficient X of 1; if female, a sex coefficient X of 1.1. The main control device 1 then inputs the front image into the human body key point regression algorithm DEKR to obtain the key point coordinates of the trainer's upper arm and forearm and computes the arm pixel length from them, and also inputs the front image into a contour extraction algorithm based on the computer vision library OpenCV to obtain the pixel length of one grid square on the calibration plate 5. Since the real side length of a grid square is known to be 0.1 m, the arm length L of the trainer can be calculated by combining the grid side length, the arm pixel length, and the grid pixel length (i.e. arm length = grid side length x arm pixel length / grid pixel length).
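A minimal sketch of the pixel-to-metric conversion described above: because one grid square of the calibration plate is known to be 0.1 m on a side, a pixel distance measured in the same image can be converted to metres. The keypoint coordinates below are illustrative stand-ins for DEKR output, which is not reproduced here.

```python
import math

GRID_SIDE_M = 0.1  # known real side length of one calibration-plate square

def pixel_dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def arm_length_m(shoulder_px, elbow_px, wrist_px, grid_side_px):
    """Arm length = (upper-arm + forearm pixel length) * metres-per-pixel."""
    arm_px = pixel_dist(shoulder_px, elbow_px) + pixel_dist(elbow_px, wrist_px)
    return arm_px * GRID_SIDE_M / grid_side_px

# Illustrative keypoints (pixels) and grid pixel length:
print(arm_length_m((420, 310), (455, 430), (470, 545), grid_side_px=38.0))
```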
B. The method comprises the steps of shooting a front action video and a plurality of side action videos of an action exerciser, wherein shooting angles of the side action videos are different.
When the action trainer performs the martial arts exercise, the main control device 1 captures a front action video with the camera 3 directly in front of the trainer, and captures three side action videos from different angles with the three cameras 3 located on the same side. For example, the main control device 1 may capture the front action video with the camera 3 directly in front and the left-side action videos with the three cameras 3 at front-left, directly left, and rear-left, yielding four action videos in total: a front video, a front-left video, a directly-left video, and a rear-left video. Alternatively, it may capture the front video with the camera 3 directly in front and the right-side videos with the cameras 3 at front-right, directly right, and rear-right, yielding a front video, a front-right video, a directly-right video, and a rear-right video.
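A hedged sketch of capturing the four synchronized views with OpenCV; the device indices (0 for the front camera, 1 to 3 for the three side cameras) and the output names are assumptions, not part of the patent.

```python
import cv2

caps = [cv2.VideoCapture(i) for i in range(4)]  # assumed: 0 = front, 1-3 = side views
fourcc = cv2.VideoWriter_fourcc(*"mp4v")
writers = [cv2.VideoWriter(f"action_cam{i}.mp4", fourcc, 30.0, (1280, 720))
           for i in range(4)]

while all(c.isOpened() for c in caps):
    grabs = [c.read() for c in caps]
    if not all(ok for ok, _ in grabs):
        break  # stop as soon as any camera stops delivering frames
    for (ok, frame), w in zip(grabs, writers):
        w.write(cv2.resize(frame, (1280, 720)))

for c in caps:
    c.release()
for w in writers:
    w.release()
```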
C. And inputting each side action video and the front action video into a three-dimensional reconstruction algorithm model, and constructing a plurality of three-dimensional action videos corresponding to each side action video.
Taking the captured front, front-left, directly-left, and rear-left action videos as an example, the main control device 1 inputs each side action video together with the front action video into the three-dimensional reconstruction algorithm model Nvdiffrec, which combines the two views to construct a corresponding three-dimensional action video. Specifically, the front-left action video and the front action video are input together to construct a first three-dimensional action video, the directly-left action video and the front action video are input together to construct a second, and the rear-left action video and the front action video are input together to construct a third, yielding three three-dimensional action videos in total.
D. The plurality of three-dimensional action videos are respectively input into a video action recognition algorithm model for specific-action recognition, obtaining all continuous frame images from the start frame image to the end frame image of the specific action in each three-dimensional action video, and identifying the key frame image that best reflects the specific action among all the continuous frame images.
In this embodiment, before the video motion recognition algorithm model SlowFast is used for specific-action recognition on the three-dimensional action videos, it is trained by a training system so that it can recognize the specific actions in video. The training system comprises a computer-readable storage medium and a processor connected to each other; the storage medium stores a computer program which, when executed by the processor, implements the training method of the video motion recognition algorithm model shown in fig. 4, comprising steps S1, S2, S3, S4, and S5.
S1, acquiring an original video, and segmenting the original video according to segmentation instructions input by a user to obtain a plurality of segmented videos.
In order to enable the video motion recognition algorithm model SlowFast to recognize specific actions in the three-dimensional action videos, an original video containing a set of standard actions performed by a martial arts coach must be acquired to train the model. During training, the training system extracts images from the video multiple times to obtain a plurality of video images; a folder is generated for the video and the extracted images are stored in it. However, performing a full set of standard actions generally takes a long time, so the recorded original video is long and occupies a large amount of memory. If images were extracted directly from the original video, an extremely large number of video images would accumulate under a single folder, making screen refreshes and folder opening very slow, or even freezing the computer. Therefore, before image extraction, the training system first acquires a segmentation instruction input by the user (for example, an instruction to divide the original video into ten equal parts) and divides the original video into a plurality of shorter segmented videos according to that instruction. A folder is generated for each segmented video, images are extracted from each segmented video multiple times, and the resulting video images are stored under the corresponding folders, so that the number of video images in each folder stays small.
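A minimal sketch of S1's equal-part segmentation with OpenCV, assuming a ten-part split as in the example; paths and naming are placeholders.

```python
import cv2

def split_video(path, n_parts=10):
    cap = cv2.VideoCapture(path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
            int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    per_part = total // n_parts
    fourcc = cv2.VideoWriter_fourcc(*"mp4v")
    for part in range(n_parts):
        out = cv2.VideoWriter(f"segment_{part:02d}.mp4", fourcc, fps, size)
        for _ in range(per_part):
            ok, frame = cap.read()
            if not ok:
                break
            out.write(frame)
        out.release()
    cap.release()

split_video("original.mp4")
```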
S2, extracting images of each segmented video for multiple times to obtain multiple video images.
After the original video is segmented into a plurality of segmented videos and a folder is generated for each of them, the training system extracts images from each segmented video multiple times at a preset interval frequency, generating a plurality of video images under each segmented video's folder. For example, after a 60-minute original video is divided into ten equal parts according to the user's segmentation instruction, the training system obtains ten segmented videos of 6 minutes each and generates a folder for each; it then extracts images from each segmented video at an interval frequency of 1 frame per second, generating 360 video images under each folder.
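A sketch of S2's extraction at a fixed interval frequency (1 frame per second, as in the example), writing the images into one folder per segmented video; file names are assumptions.

```python
import os
import cv2

def extract_frames(segment_path, out_dir, frames_per_second=1.0):
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(segment_path)
    step = max(1, round(cap.get(cv2.CAP_PROP_FPS) / frames_per_second))
    idx = saved = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:  # keep every step-th native frame
            cv2.imwrite(os.path.join(out_dir, f"img_{saved:05d}.jpg"), frame)
            saved += 1
        idx += 1
    cap.release()
    return saved  # about 360 images for a 6-minute segment at 1 fps

extract_frames("segment_00.mp4", "segment_00_frames")
```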
S3, dividing the extracted video images according to specific actions defined by a user, wherein the division result comprises specific action images and non-specific action images.
After the video images are extracted, the user inputs the defined specific actions into the training system, and the training system divides the 360 video images extracted under each segmented video's folder according to the specific actions defined by the user, obtaining specific action images in which the specific action appears and non-specific action images in which it does not.
S4, synthesizing the specific action images obtained through division into specific action videos, and respectively carrying out image extraction on the specific action videos for a plurality of times according to a plurality of preset different interval frequencies, wherein each image extraction obtains a plurality of sample images.
After dividing out the specific action images in which the specific action appears, the training system synthesizes them into a specific action video using a video synthesis algorithm based on the computer vision library OpenCV. The training system then performs image extraction on the specific action video several times, at several preset and mutually different interval frequencies, each extraction yielding a number of sample images. For example, a first image extraction at an interval frequency of 4 frames per second yields 4m sample images, a second at 10 frames per second yields 10m sample images, and a third at 30 frames per second yields 30m sample images.
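A sketch of S4: the specific action images are written back into a video with OpenCV's VideoWriter, then sampled again at several interval frequencies (4, 10, and 30 frames per second, as in the example).

```python
import cv2

def synthesize(image_paths, out_path, fps=30.0):
    first = cv2.imread(image_paths[0])
    h, w = first.shape[:2]
    out = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
    for p in image_paths:
        out.write(cv2.imread(p))
    out.release()

def sample_at(video_path, frames_per_second):
    """Return the frames sampled at the given interval frequency."""
    cap = cv2.VideoCapture(video_path)
    step = max(1, round(cap.get(cv2.CAP_PROP_FPS) / frames_per_second))
    frames, idx = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % step == 0:
            frames.append(frame)
        idx += 1
    cap.release()
    return frames

# Three extractions at different interval frequencies, per the example above:
samples = [sample_at("specific_action.mp4", f) for f in (4, 10, 30)]
```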
In this embodiment, where the specific action video is extracted at interval frequencies of 4, 10, and 30 frames per second, the extraction times of the first image extraction do not completely overlap those of the second, and the extraction times of the second do not completely overlap those of the third.
In other embodiments, if a first image extraction were performed on the specific action video at an interval frequency of 2 frames per second, the corresponding extraction times would be 0.5 seconds, 1 second, 1.5 seconds, 2 seconds …; a second extraction at 4 frames per second would have extraction times 0.25 seconds, 0.5 seconds, 0.75 seconds, 1 second …; and a third extraction at 10 frames per second would have extraction times 0.1 seconds, 0.2 seconds, 0.3 seconds, 0.4 seconds …. The extraction times of the first extraction would then be fully covered by those of the second; that is, when the extraction times at one interval frequency completely overlap the extraction times at another, the combination is not acceptable.
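In the idealized timing used by the example above (sampling at frequency f hits times k/f), the "completely overlapped" case can be tested arithmetically: every sampling time of frequency f1 also occurs at frequency f2 exactly when f2 is an integer multiple of f1. A small sketch:

```python
def completely_overlapped(f1: int, f2: int) -> bool:
    """True if all sampling times k/f1 also appear among the times k/f2."""
    return f2 % f1 == 0

print(completely_overlapped(2, 4))   # True  -> 2 fps times all covered at 4 fps (disallowed)
print(completely_overlapped(4, 10))  # False -> the two schedules only partly coincide
```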
S5, respectively converting a plurality of sample images obtained by extracting each image into a sample file format of a video motion recognition algorithm model, and inputting all sample images after the format conversion into the video motion recognition algorithm model to train the sample images until the video motion recognition algorithm model has the capability of recognizing specific motions in video.
After the sample images are extracted, the training system converts the sample images obtained from each extraction into the sample file format of the video motion recognition algorithm model SlowFast, specifically COCO-format JSON result files annotated with labelme, and then inputs all the converted sample images into the model SlowFast to train it until it can recognize the specific actions in video. The total number of sample images input into the model is therefore 4m + 10m + 30m = 44m, more than any single extraction yields (4m, 10m, or 30m). Because each extraction uses a different interval frequency, the sample images differ from one another, which increases the feature diversity of the training set; training the model SlowFast on these samples avoids overfitting, so the trained model's predictions do not deviate greatly from actual results.
After the three three-dimensional action videos are obtained, the main control device 1 inputs each of them into the trained video action recognition algorithm model SlowFast for specific-action recognition. Note that when the trainer performs a specific action, a continuous change of body pose takes place; for example, a specific action of swinging the right hand half a turn from front to back passes through a pose-change process from a front start point to a back end point, and this process corresponds to a sequence of continuous frame images. After recognition, the model SlowFast therefore obtains all continuous frame images between the start frame image and the end frame image of the specific action in each three-dimensional action video. Moreover, among all the continuous frame images of a specific action, one frame reflects the action best, and the trained model SlowFast identifies this key frame image in each three-dimensional action video.
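For context only, a minimal sketch of running a pretrained SlowFast network from PyTorchVideo; this is not the patent's trained model or its specific-action classes. The two-pathway input packing (slow pathway taking every 4th frame of a 32-frame clip) is the standard one for the slowfast_r50 hub model.

```python
import torch

# Pretrained SlowFast (Kinetics-400) via PyTorchVideo's torch.hub entry point.
model = torch.hub.load("facebookresearch/pytorchvideo", "slowfast_r50",
                       pretrained=True)
model.eval()

# Dummy clip: batch 1, RGB, 32 frames, 224x224. Real use would feed normalized
# video frames rather than random noise.
fast = torch.randn(1, 3, 32, 224, 224)
slow = fast[:, :, ::4]  # slow pathway: every 4th frame (alpha = 4)

with torch.no_grad():
    logits = model([slow, fast])
print(logits.softmax(dim=-1).topk(5).indices)  # top-5 Kinetics class ids
```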
E. And acquiring a preset standard action library, wherein the standard action library comprises all standard action continuous frame images of each specific action.
In this embodiment, a standard action library is preset for the martial arts exercise scene. The library contains all standard action continuous frame images of every standard action performed by a coach, and for each standard action it records the action category, the specification coefficient, and the difficulty coefficient assigned by the coach. After the main control device 1 obtains the preset standard action library, it compares each key frame image against the standard action continuous frame images in the library by similarity, so that the key frame image that best matches a standard action can be identified.
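One possible in-memory shape for such a standard action library (a sketch; the field names are assumptions, not the patent's schema):

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class StandardAction:
    category: str            # action category assigned by the coach
    spec_coef: float         # specification coefficient
    diff_coef: float         # difficulty coefficient
    frames: List[object] = field(default_factory=list)  # standard-action continuous frames

library: Dict[str, StandardAction] = {
    "right-hand half-turn swing": StandardAction(
        category="right-hand half-turn swing", spec_coef=0.95, diff_coef=1.2),
}

def std_frame_count(category: str) -> int:
    """Number of standard-action continuous frame images for a category (step G)."""
    return len(library[category].frames)
```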
F. Inputting each key frame image into a human body key point regression algorithm respectively to obtain a human body key point feature map corresponding to a specific action in each key frame image, comparing the human body key point feature map of each key frame image with all standard action continuous frame images of each specific action in a standard action library respectively, selecting one which is most in line with the standard action and a corresponding three-dimensional action video thereof from each key frame image according to a comparison result, and identifying an action category, a specification coefficient and a difficulty coefficient to which the specific action in the selected key frame image belongs.
After the key frame image that best reflects the specific action has been identified among all the continuous frame images of each three-dimensional action video, the main control device 1 inputs each key frame image into the human body key point regression algorithm DEKR to obtain the human body key point feature map corresponding to the specific action in that key frame image. The feature map of each key frame image is then compared with all standard action continuous frame images of each specific action in the standard action library, and according to the comparison results the key frame image that best matches a standard action is selected, together with its corresponding three-dimensional action video. For example, suppose that in step D the video action recognition algorithm model SlowFast identified a first key frame image from the first three-dimensional action video, a second key frame image from the second, and a third key frame image from the third. The three key frame images are input into DEKR to obtain their key point feature maps, which are compared with all standard action continuous frame images in the library. If the specific action in the first key frame image has a similarity of 0.99 to the standard action of some standard action continuous frame image, the specific action in the second key frame image a similarity of 0.98, and the specific action in the third key frame image a similarity of 0.97, then the main control device 1 selects from the three the first key frame image, which has the highest similarity and thus best matches the standard action, and acquires the first three-dimensional action video corresponding to it.
Then, based on the standard action continuous frame image whose similarity to the first key frame image was 0.99, the main control device 1 identifies through the standard action library the action category, specification coefficient, and difficulty coefficient of the corresponding standard action, and takes these as the action category, specification coefficient, and difficulty coefficient of the specific action in the selected first key frame image.
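A hedged sketch of the similarity comparison: the key point coordinates of a key frame are flattened and compared with those of each library frame by cosine similarity. The patent does not disclose its exact similarity metric; cosine similarity is used here only as a common stand-in.

```python
import numpy as np

def keypoint_similarity(kp_a: np.ndarray, kp_b: np.ndarray) -> float:
    """Cosine similarity between two (num_keypoints, 2) coordinate arrays.
    A production system would first normalize for translation and scale."""
    a, b = kp_a.ravel().astype(float), kp_b.ravel().astype(float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def best_match(key_frame_kps, library_frames_kps):
    """Return (index, similarity) of the best-matching standard-action frame."""
    scores = [keypoint_similarity(key_frame_kps, f) for f in library_frames_kps]
    i = int(np.argmax(scores))
    return i, scores[i]
```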
G. According to the action category of the specific action in the selected key frame image, acquiring the quantity of all standard action continuous frame images corresponding to the action category from a standard action library, comparing the quantity of all continuous frame images of the specific action in the three-dimensional action video corresponding to the selected key frame image with the acquired quantity of all standard action continuous frame images, and combining body data of an action trainer to obtain the exercise coefficient.
From the action category to which the specific action in the selected first key frame image belongs, the main control device 1 acquires from the standard action library the number of all standard action continuous frame images corresponding to that action category, i.e. the number of continuous frame images of the standard action whose similarity to the first key frame image was 0.99. It also acquires the number of all continuous frame images in which the specific action occurs in the three-dimensional action video corresponding to the selected first key frame image. The two numbers are then compared and combined with the body data of the action trainer to obtain the exercise coefficient. The exercise coefficient is calculated by a preset formula (presented in the original document as an image) from: the number of all continuous frame images of the specific action in the three-dimensional action video corresponding to the selected key frame image, the number of all standard action continuous frame images acquired from the standard action library, and a body coefficient. The body coefficient is fitted from the body data of the action trainer (sex coefficient X, height h, weight W, and arm length L) by a fitting formula (likewise presented as an image) involving five preset fitting constants.
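The body-coefficient fitting formula appears in the source only as an image. Purely as a labeled assumption, the sketch below uses a linear combination of the four body measurements with five fitting constants, matching the count of constants the text mentions; the real fitted form may differ.

```python
# ASSUMPTION: linear form a*X + b*W + c*h + d*L + e, chosen only because the
# patent says the fit uses the four body measurements and five preset fitting
# constants; the actual formula is an image not reproduced in this text.
def body_coefficient(X, W, h, L, a=0.5, b=0.01, c=0.2, d=0.3, e=1.0):
    """X: sex coefficient, W: weight (kg), h: height (m), L: arm length (m)."""
    return a * X + b * W + c * h + d * L + e

print(body_coefficient(X=1.0, W=70.0, h=1.75, L=0.72))
```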
H. And calculating to obtain a power coefficient according to the action category, the exercise coefficient and the body data of the specific action in the selected key frame image, and calculating to obtain an action exercise score according to the specification coefficient, the difficulty coefficient, the exercise coefficient and the power coefficient of the specific action in the selected key frame image.
Next, the main control device 1 calculates the power coefficient from the action category to which the specific action in the selected first key frame image belongs and from the exercise coefficient. The power coefficient is given by a preset formula (presented in the original document as an image) whose inputs are: the exercise coefficient of the specific action in the selected key frame image, the body coefficient fitted from the body data of the action trainer, and a preset power constant corresponding to the action category to which the specific action belongs.
Finally, the main control device 1 calculates the action exercise score from the specification coefficient, the difficulty coefficient, the exercise coefficient, and the power coefficient of the specific action in the selected first key frame image, using a preset formula (presented in the original document as an image) whose inputs are exactly these four coefficients.
The action exercise score finally obtained in this way comprehensively considers the body data of the action trainer (sex coefficient X, height h, weight W, arm length L), the action category of the specific action, the action specification reflected by the specification coefficient, the action difficulty reflected by the difficulty coefficient, and the power coefficient obtained from the action category and the exercise coefficient, so that the action scoring result is no longer one-sided.
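The scoring formula itself is likewise only an image in the source. As a loudly labeled assumption, the sketch below combines the four coefficients multiplicatively on a 100-point base; the patent's actual combination may differ.

```python
# ASSUMPTION: multiplicative combination for illustration only; the patent's
# real action-exercise-score formula is an image not reproduced in this text.
def action_exercise_score(spec_coef, diff_coef, exercise_coef, power_coef,
                          base=100.0):
    return base * spec_coef * diff_coef * exercise_coef * power_coef

print(action_exercise_score(0.95, 1.2, 0.9, 1.05))  # about 107.7
```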
The above-described embodiments are provided only to illustrate the present invention and are not intended to limit the scope of patent protection. Insubstantial changes and substitutions made by one skilled in the art in light of the teachings of the invention still fall within the scope of the claims.

Claims (10)

1. The video action scoring method is characterized by comprising the following steps of:
A. acquiring body data of an action trainer, wherein the body data comprises sex coefficients, height, weight and arm length;
B. shooting a front action video and a plurality of side action videos of an action exerciser, wherein the shooting angles of the side action videos are different;
C. inputting each side action video and the front action video into a three-dimensional reconstruction algorithm model together to construct a plurality of three-dimensional action videos corresponding to each side action video respectively;
D. respectively inputting the plurality of three-dimensional action videos into a video action recognition algorithm model to recognize specific actions, obtaining all continuous frame images from a start frame image to an end frame image of the specific actions in each three-dimensional action video, and recognizing a key frame image which can reflect the specific actions most in all the continuous frame images;
E. acquiring a preset standard action library, wherein the standard action library comprises all standard action continuous frame images of each specific action;
F. inputting each key frame image into a human body key point regression algorithm respectively to obtain a human body key point feature image corresponding to a specific action in each key frame image, comparing the human body key point feature image of each key frame image with all standard action continuous frame images of each specific action in the standard action library respectively, selecting one which is most in line with the standard action and a corresponding three-dimensional action video thereof from each key frame image according to a comparison result, and identifying an action category, a specification coefficient and a difficulty coefficient to which the specific action in the selected key frame image belongs;
G. according to the action category to which the specific action in the selected key frame image belongs, acquiring from the standard action library the number of all standard action continuous frame images corresponding to that action category, comparing it with the number of all continuous frame images of the specific action in the three-dimensional action video corresponding to the selected key frame image, and combining the comparison with the body data of the action trainer to obtain an exercise coefficient;
H. and calculating to obtain a power coefficient according to the action category, the exercise coefficient and the body data of the specific action in the selected key frame image, and calculating to obtain an action exercise score according to the specification coefficient, the difficulty coefficient, the exercise coefficient and the power coefficient of the specific action in the selected key frame image.
2. The video action scoring method according to claim 1, wherein in the step G the exercise coefficient of the specific action in the selected key frame image is calculated by a preset formula (presented in the original document as an image) from: the number of all continuous frame images of the specific action in the three-dimensional action video corresponding to the selected key frame image, the number of all standard action continuous frame images acquired from the standard action library, and a body coefficient fitted from the body data of the action trainer (sex coefficient, weight, height, and arm length) by a fitting formula (likewise presented as an image) involving five preset fitting constants.
3. The video action scoring method according to claim 2, wherein in the step H the power coefficient of the specific action in the selected key frame image is calculated by a preset formula (presented in the original document as an image) from: the exercise coefficient of the specific action in the selected key frame image, the body coefficient fitted from the body data of the action trainer, and a preset power constant corresponding to the action category to which the specific action belongs.
4. The video action scoring method according to claim 3, wherein in the step H the action exercise score is calculated by a preset formula (presented in the original document as an image) from the specification coefficient, the difficulty coefficient, the exercise coefficient, and the power coefficient of the specific action in the selected key frame image.
5. The video motion scoring method according to claim 1, wherein the video motion recognition algorithm model is trained for specific motion recognition, and the specific training method comprises the following steps S1, S2, S3, S4, S5:
s1, acquiring an original video, and segmenting the original video according to a segmentation instruction input by a user to obtain a plurality of segmented videos;
s2, extracting images of each segmented video for multiple times to obtain multiple video images;
s3, dividing the extracted video images according to specific actions defined by a user, wherein the division results comprise specific action images and non-specific action images;
s4, synthesizing the specific action images obtained by dividing into specific action videos, and respectively carrying out image extraction on the specific action videos for a plurality of times according to a plurality of preset different interval frequencies, wherein each image extraction obtains a plurality of sample images;
s5, respectively converting a plurality of sample images obtained by extracting each image into a sample file format of a video motion recognition algorithm model, and inputting all sample images after the format conversion into the video motion recognition algorithm model to train the video motion recognition algorithm model until the video motion recognition algorithm model has the capability of recognizing specific motions in video.
6. The video action scoring method according to claim 5, wherein in step S1, the original video is equally divided according to a segmentation command input by a user to obtain a plurality of segmented videos with the same duration.
7. The video motion scoring method according to claim 5, wherein in step S4, the number of times of image extraction of the specific motion video is three or more.
8. The video action scoring method according to claim 7, wherein in step S4, the extraction time corresponding to the interval frequency of each image extraction does not completely overlap with the extraction time corresponding to the interval frequency of the other image extraction.
9. A computer readable storage medium having stored thereon a computer program, which when executed by a processor performs the steps in the video action scoring method of any one of claims 1 to 8.
10. A video action scoring system, comprising a main control device (1), a height and weight measuring instrument (2) and a plurality of cameras (3), wherein the main control device (1) is respectively connected with the height and weight measuring instrument (2) and each camera (3), and the main control device (1) comprises a processor and a computer readable storage medium according to claim 9 which are mutually connected.
CN202310561082.6A 2023-05-18 2023-05-18 Video action scoring method, computer-readable storage medium and system Active CN116311536B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310561082.6A CN116311536B (en) 2023-05-18 2023-05-18 Video action scoring method, computer-readable storage medium and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310561082.6A CN116311536B (en) 2023-05-18 2023-05-18 Video action scoring method, computer-readable storage medium and system

Publications (2)

Publication Number Publication Date
CN116311536A (en) 2023-06-23
CN116311536B CN116311536B (en) 2023-08-08

Family

ID=86801733

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310561082.6A Active CN116311536B (en) 2023-05-18 2023-05-18 Video action scoring method, computer-readable storage medium and system

Country Status (1)

Country Link
CN (1) CN116311536B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117216313A (en) * 2023-09-13 2023-12-12 中关村科学城城市大脑股份有限公司 Attitude evaluation audio output method, attitude evaluation audio output device, electronic equipment and readable medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107349594A (en) * 2017-08-31 2017-11-17 华中师范大学 A kind of action evaluation method of virtual Dance System
US20190126145A1 (en) * 2014-10-22 2019-05-02 Activarium, LLC Exercise motion system and method
US20200394413A1 (en) * 2019-06-17 2020-12-17 The Regents of the University of California, Oakland, CA Athlete style recognition system and method
WO2021096669A1 (en) * 2019-11-15 2021-05-20 Microsoft Technology Licensing, Llc Assessing a pose-based sport
CN113947809A (en) * 2021-09-18 2022-01-18 杭州电子科技大学 Dance action visual analysis system based on standard video
CN114187654A (en) * 2021-11-24 2022-03-15 东南大学 Micro-inertia martial arts action identification method and system based on machine learning
CN114550027A (en) * 2022-01-18 2022-05-27 清华大学 Vision-based motion video fine analysis method and device
US20220358310A1 (en) * 2021-05-06 2022-11-10 Kuo-Yi Lin Professional dance evaluation method for implementing human pose estimation based on deep transfer learning
CN115527080A (en) * 2022-09-09 2022-12-27 阿里巴巴(中国)有限公司 Method for generating video motion recognition model and electronic equipment

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190126145A1 (en) * 2014-10-22 2019-05-02 Activarium, LLC Exercise motion system and method
CN107349594A (en) * 2017-08-31 2017-11-17 华中师范大学 A kind of action evaluation method of virtual Dance System
US20200394413A1 (en) * 2019-06-17 2020-12-17 The Regents of the University of California, Oakland, CA Athlete style recognition system and method
WO2021096669A1 (en) * 2019-11-15 2021-05-20 Microsoft Technology Licensing, Llc Assessing a pose-based sport
US20220358310A1 (en) * 2021-05-06 2022-11-10 Kuo-Yi Lin Professional dance evaluation method for implementing human pose estimation based on deep transfer learning
CN113947809A (en) * 2021-09-18 2022-01-18 杭州电子科技大学 Dance action visual analysis system based on standard video
CN114187654A (en) * 2021-11-24 2022-03-15 东南大学 Micro-inertia martial arts action identification method and system based on machine learning
CN114550027A (en) * 2022-01-18 2022-05-27 清华大学 Vision-based motion video fine analysis method and device
CN115527080A (en) * 2022-09-09 2022-12-27 阿里巴巴(中国)有限公司 Method for generating video motion recognition model and electronic equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JOSÉ RUI FIGUEIRA et al.: "ELECTRE-Score: A first outranking based method for scoring actions", European Journal of Operational Research, pages 986-1005 *
ZHOU Shuai et al.: "Research on martial arts action recognition and matching based on Kinect acquisition", Automation Technology and Applications, vol. 39, no. 03, pages 94-97 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117216313A (en) * 2023-09-13 2023-12-12 中关村科学城城市大脑股份有限公司 Attitude evaluation audio output method, attitude evaluation audio output device, electronic equipment and readable medium

Also Published As

Publication number Publication date
CN116311536B (en) 2023-08-08

Similar Documents

Publication Publication Date Title
Rogez et al. Mocap-guided data augmentation for 3d pose estimation in the wild
CN110555434B (en) Method for detecting visual saliency of three-dimensional image through local contrast and global guidance
CN110544301A (en) Three-dimensional human body action reconstruction system, method and action training system
CN109903331B (en) Convolutional neural network target detection method based on RGB-D camera
CN108388882B (en) Gesture recognition method based on global-local RGB-D multi-mode
CN110448870B (en) Human body posture training method
CN116311536B (en) Video action scoring method, computer-readable storage medium and system
KR20090084563A (en) Method and apparatus for generating the depth map of video image
EP4072147A1 (en) Video stream processing method, apparatus and device, and medium
CN109117753A (en) Position recognition methods, device, terminal and storage medium
CN109274883A (en) Posture antidote, device, terminal and storage medium
CN110378234A (en) Convolutional neural networks thermal imagery face identification method and system based on TensorFlow building
US11810366B1 (en) Joint modeling method and apparatus for enhancing local features of pedestrians
CN111723687A (en) Human body action recognition method and device based on neural network
Zhang et al. Automatic calibration of the fisheye camera for egocentric 3d human pose estimation from a single image
CN113065506B (en) Human body posture recognition method and system
CN113177940A (en) Gastroscope video part identification network structure based on Transformer
CN112509129B (en) Spatial view field image generation method based on improved GAN network
JP2007199864A (en) Method for image sequence generation and image column generation device
CN112070181A (en) Image stream-based cooperative detection method and device and storage medium
CN116311537A (en) Training method, storage medium and system for video motion recognition algorithm model
CN111260555A (en) Improved image splicing method based on SURF
US20220207261A1 (en) Method and apparatus for detecting associated objects
CN114943746A (en) Motion migration method utilizing depth information assistance and contour enhancement loss
CN115035007A (en) Face aging system for generating countermeasure network based on pixel level alignment and establishment method

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: A video action rating method, computer-readable storage medium, and system

Granted publication date: 20230808

Pledgee: Guangdong Provincial Bank of Communications Co.,Ltd.

Pledgor: Xunlong (Guangdong) Intelligent Technology Co.,Ltd.

Registration number: Y2024980002437