CN116311536A - Video action scoring method, computer-readable storage medium and system - Google Patents
- Publication number: CN116311536A (application CN202310561082.6A)
- Authority: CN (China)
- Legal status: Granted (an assumption, not a legal conclusion; Google has not performed a legal analysis)
Classifications
- G06V40/23—Recognition of whole body movements, e.g. for sport training
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
- G06V10/764—Image or video recognition using pattern recognition or machine learning, using classification, e.g. of video objects
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
Abstract
The invention provides a video action scoring method, a computer-readable storage medium and a system. The method acquires the body data of an action trainee together with several three-dimensional action videos, identifies all continuous frame images of a specific action in each three-dimensional action video as well as the key frame image that best reflects that action, and then identifies the action category, specification coefficient and difficulty coefficient of the specific action in the selected key frame image. The number of continuous frame images is compared with the number of standard-action continuous frame images and combined with the trainee's body data to obtain a drill coefficient, and an action drill score is calculated from the specification coefficient, difficulty coefficient, drill coefficient and power coefficient of the specific action. Scoring therefore takes into account the action's difficulty, the power coefficient reflected by the trainee's body data, the action category and the specification coefficient, so the scoring result is no longer one-sided.
Description
Technical Field
The present invention relates to the field of motion recognition technologies, and in particular, to a video motion scoring method, a computer-readable storage medium, and a system.
Background
With the rapid development of intelligent motion recognition technology, it has come to be applied widely across many industries. In a martial arts training scene, the martial arts actions of a trainee can first be recognized by an intelligent motion recognition technique and then scored. In this process, a video of the trainee performing the actions is recorded first; a scoring system recognizes the martial arts actions in the video using a video action recognition algorithm model, and then scores the recognized actions. Existing scoring systems generally compare the similarity between each recognized action and a standard action and score according to the comparison result. A simple similarity comparison, however, ignores the influence of factors such as the trainee's body data, the action category and the action difficulty on the whole routine, so the resulting score is rather one-sided.
Disclosure of Invention
The technical problem the invention aims to solve is that existing action scoring results are rather one-sided.
In order to solve the technical problems, the invention provides a video action scoring method, which comprises the following steps:
A. acquiring body data of the action trainee, the body data comprising a sex coefficient, height, weight and arm length;
B. shooting a front action video and a plurality of side action videos of the action trainee, the side action videos being shot from different angles;
C. inputting each side action video together with the front action video into a three-dimensional reconstruction algorithm model to construct a plurality of three-dimensional action videos, one corresponding to each side action video;
D. inputting the plurality of three-dimensional action videos respectively into a video action recognition algorithm model to recognize a specific action, obtaining all continuous frame images from the start frame image to the end frame image of the specific action in each three-dimensional action video, and identifying among them the key frame image that best reflects the specific action;
E. acquiring a preset standard action library, the standard action library comprising all standard-action continuous frame images of each specific action;
F. inputting each key frame image into a human-body key-point regression algorithm to obtain the key-point feature map of the specific action in that key frame image, comparing each feature map with all standard-action continuous frame images of each specific action in the standard action library, selecting from the key frame images, according to the comparison result, the one that best matches the standard action together with its corresponding three-dimensional action video, and identifying the action category, specification coefficient and difficulty coefficient of the specific action in the selected key frame image;
G. according to the action category of the specific action in the selected key frame image, obtaining from the standard action library the number of standard-action continuous frame images corresponding to that category, comparing the number of continuous frame images of the specific action in the three-dimensional action video corresponding to the selected key frame image with the number of standard-action continuous frame images, and combining the result with the body data of the action trainee to obtain a drill coefficient;
H. calculating a power coefficient from the action category of the specific action in the selected key frame image, the drill coefficient and the body data, and calculating an action drill score from the specification coefficient, difficulty coefficient, drill coefficient and power coefficient of the specific action in the selected key frame image.
Preferably, in step G, the drill coefficient is calculated by a formula (given only as an image in the original document) in which K denotes the drill coefficient of the specific action in the selected key frame image, n denotes the number of continuous frame images of the specific action in the three-dimensional action video corresponding to the selected key frame image, N denotes the number of standard-action continuous frame images obtained from the standard action library, and P is a body coefficient fitted from the body data of the action trainee. The fitting formula (likewise given as an image) takes as inputs the sex coefficient X, the weight W, the height h and the arm length L of the action trainee, together with five preset fitting constants.
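Both the drill-coefficient formula and the body-coefficient fit are reproduced in the source only as images, so their exact forms are unknown. The sketch below is illustrative only: the ratio-based frame-count comparison, the linear fit, and the constant values a through e all stand in for the undisclosed formulas.

```python
def body_coefficient(X, W, h, L, a=1.0, b=0.2, c=0.005, d=0.05, e=0.1):
    """Hypothetical body-coefficient fit P from sex coefficient X, weight W
    (kg), height h (m) and arm length L (m); a..e stand in for the five
    preset fitting constants (the real fit is not disclosed in the text)."""
    return a + b * X + c * W + d * h + e * L

def drill_coefficient(n, N, P):
    """Hypothetical drill coefficient K: the trainee's frame count n is
    compared with the standard action's frame count N (a duration close to
    the standard scores highest), weighted by the body coefficient P."""
    return P * min(n, N) / max(n, N)
```

With this assumed form, a trainee whose action spans exactly as many frames as the standard gets the full body coefficient, while a faster or slower execution is penalized proportionally.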
Preferably, in step H, the power coefficient is calculated by a formula (given only as an image in the original document) in which F denotes the power coefficient of the specific action in the selected key frame image, K denotes its drill coefficient, P is the body coefficient fitted from the body data of the action trainee, and C is a preset power constant corresponding to the action category to which the specific action belongs.
Preferably, in step H, the action drill score is calculated by a formula (given only as an image in the original document) in which S denotes the action drill score, and G, D, K and F denote respectively the specification coefficient, difficulty coefficient, drill coefficient and power coefficient of the specific action in the selected key frame image.
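The score formula itself survives only as an image in the source; a simple multiplicative combination of the specification, difficulty, drill and power coefficients is sketched below purely as an assumption, not as the patent's disclosed formula.

```python
def action_drill_score(G, D, K, F, base=10.0):
    """Hypothetical action drill score combining the specification
    coefficient G, difficulty coefficient D, drill coefficient K and power
    coefficient F; the multiplicative form and the base value of 10 are
    assumptions made for illustration only."""
    return base * G * D * K * F
```

Under this assumed form, a perfectly specified, standard-difficulty action with neutral drill and power coefficients would score the base value.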
Preferably, the video action recognition algorithm model is trained beforehand for specific-action recognition; the training method comprises the following steps S1, S2, S3, S4 and S5:
S1, acquiring an original video, and segmenting the original video according to a segmentation instruction input by a user to obtain a plurality of segmented videos;
S2, performing image extraction on each segmented video multiple times to obtain a plurality of video images;
S3, dividing the extracted video images according to a specific action defined by the user, the division results comprising specific-action images and non-specific-action images;
S4, synthesizing the specific-action images obtained by the division into a specific-action video, and performing image extraction on the specific-action video several times at a plurality of preset, mutually different interval frequencies, each image extraction yielding a number of sample images;
S5, converting the sample images obtained by each image extraction into the sample file format of the video action recognition algorithm model, and inputting all format-converted sample images into the model for training, until the model is able to recognize the specific action in video.
Preferably, in step S1, the original video is divided equally according to the segmentation instruction input by the user, yielding a plurality of segmented videos of the same duration.
Preferably, in step S4, the specific-action video is subjected to image extraction three or more times.
Preferably, in step S4, the extraction times corresponding to the interval frequency of any one image extraction do not completely coincide with the extraction times corresponding to the interval frequency of any other image extraction.
The present invention also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the video action scoring method described above.
The invention also provides a video action scoring system comprising a main control device, a height-and-weight measuring instrument and a plurality of cameras, the main control device being connected to the measuring instrument and to each camera, and comprising a processor and the above computer-readable storage medium connected to each other.
The invention has the following beneficial effects: when scoring a video action, the body data and several three-dimensional action videos of the action trainee are acquired; a video action recognition algorithm model identifies all continuous frame images of the specific action in each three-dimensional action video and the key frame image that best reflects it; the key frame image that best matches the standard action and its corresponding three-dimensional action video are selected; the action category, specification coefficient and difficulty coefficient of the specific action in the selected key frame image are then identified against a preset standard action library; the number of standard-action continuous frame images for that category is obtained from the library and compared with the number of continuous frame images of the specific action, which, combined with the trainee's body data, yields the drill coefficient; the power coefficient is then computed from the action category, the drill coefficient and the body data; and finally the action drill score is calculated from the specification coefficient, difficulty coefficient, drill coefficient and power coefficient. Scoring thus comprehensively considers the action's difficulty, the power reflected by the trainee's body data, the action category and the degree to which the action meets the specification, so the scoring result is no longer one-sided.
Drawings
FIG. 1 is a connection block diagram of the video action scoring system.
Fig. 2 is a schematic diagram of a martial arts station.
Fig. 3 is a flow chart of a video action scoring method.
Fig. 4 is a flow chart of a training method of a video motion recognition algorithm model.
Detailed Description
The invention is described in further detail below with reference to specific embodiments.
This embodiment provides a video action scoring system applied to a martial arts training scene. As shown in fig. 1, the system comprises a main control device 1, a height-and-weight measuring instrument 2 and seven cameras 3, the main control device 1 being connected to the measuring instrument 2 and to each of the seven cameras 3. In the training scene the action trainee mounts the martial arts stage 4 to perform the exercises. As shown in fig. 2, the martial arts stage 4 is rectangular, and a 5 m × 5 m square-grid calibration plate 5 is arranged behind it; each grid square measures 0.1 m × 0.1 m, i.e. the square side length is 0.1 m. The height-and-weight measuring instrument 2 is placed in front of the calibration plate 5, and the seven cameras 3 are arranged directly in front of, at the front left of, directly to the left of, at the rear left of, at the front right of, directly to the right of, and at the rear right of the martial arts stage, all facing the centre of the stage 4. The main control device 1 comprises a computer-readable storage medium and a processor connected to each other; the storage medium stores a computer program which, when executed by the processor, implements the video action scoring method shown in fig. 3, comprising the following steps A to H.
A. Body data of the action trainee, including a sex coefficient, height, weight and arm length, are acquired.
Before the martial arts exercise begins, the action trainee stands on the height-and-weight measuring instrument 2, and the main control device 1 acquires the body data of the trainee standing on the instrument; the body data include a sex coefficient X, a height h, a weight W and an arm length L. Specifically, the main control device 1 measures the height h and the weight W directly through the instrument 2; at the same time, it uses the camera 3 placed directly in front of the martial arts stage to photograph the trainee standing on the instrument against the calibration plate 5 behind, and recognizes the trainee's sex in the image by face recognition. Different sexes yield different sex coefficients X: for example, if the main control device 1 recognizes the trainee as male, it may set the sex coefficient X to 1, and if female, it may set X to 1.1.
The main control device 1 then inputs the image captured by the front camera 3 into the human-body key-point regression algorithm DEKR to obtain the key-point coordinates of the trainee's upper and lower arm and, from these, the arm length in pixels. The same image is input into a contour-extraction algorithm based on the computer-vision library OpenCV to obtain the pixel side length of a calibration square. Since the real side length of a square on the calibration plate 5 is known to be 0.1 m, the square side length can be combined with the arm pixel length and the square pixel length to calculate the trainee's arm length L, i.e. L = 0.1 × (arm pixel length / square pixel length) metres.
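The pixel-to-metre conversion in this step follows directly from the known 0.1 m square side length; a minimal sketch (the variable names are ours, not the patent's):

```python
def arm_length_m(arm_px: float, square_px: float, square_m: float = 0.1) -> float:
    """Arm length in metres from the arm's pixel length and the pixel side
    length of one calibration-plate square: L = square_m * arm_px / square_px."""
    if square_px <= 0:
        raise ValueError("square pixel length must be positive")
    return square_m * arm_px / square_px
```

For example, an arm spanning 350 px against 50 px squares corresponds to 0.7 m.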
B. A front action video and a plurality of side action videos of the action trainee are shot, the side action videos being taken from different shooting angles.
While the trainee performs the martial arts exercise, the main control device 1 captures a front action video with the camera 3 directly in front, and captures three side action videos from different angles with the three cameras 3 on the same side. For example, if it uses the camera 3 directly in front for the front video and the three cameras at the front left, directly left and rear left for the side videos, four action videos are obtained: the front video, a front-left video, a directly-left video and a rear-left video. Alternatively, it may use the camera directly in front for the front video and the cameras at the front right, directly right and rear right for the side videos, again obtaining four action videos: the front video, a front-right video, a directly-right video and a rear-right video.
C. Each side action video is input, together with the front action video, into a three-dimensional reconstruction algorithm model, constructing a three-dimensional action video corresponding to each side action video.
Taking the front, front-left, directly-left and rear-left action videos as an example, the main control device 1 inputs each side action video together with the front action video into the three-dimensional reconstruction algorithm model Nvdiffrec, combining the two to construct a corresponding three-dimensional action video. Specifically, the front-left video with the front video gives a first three-dimensional action video, the directly-left video with the front video a second, and the rear-left video with the front video a third, so that three three-dimensional action videos are obtained in total.
D. The several three-dimensional action videos are input respectively into a video action recognition algorithm model for specific-action recognition; all continuous frame images from the start frame image to the end frame image of the specific action in each three-dimensional action video are obtained, and the key frame image that best reflects the specific action is identified among all the continuous frame images.
In this embodiment, before the video action recognition algorithm model SlowFast is used for specific-action recognition on the three-dimensional action videos, it is trained by a training system so that it acquires the ability to recognize the specific action in video. The training system comprises a computer-readable storage medium and a processor connected to each other; the storage medium stores a computer program which, when executed by the processor, implements the training method shown in fig. 4, comprising steps S1, S2, S3, S4 and S5.
S1, acquiring an original video, and segmenting the original video according to segmentation instructions input by a user to obtain a plurality of segmented videos.
To enable the model SlowFast to recognize the specific action in the three-dimensional action videos, an original video containing a full routine of standard actions performed by a martial arts coach must be acquired for training. During training, the system could extract images from the original video many times, generate a folder for the original video and store the extracted video images in it. However, performing a full routine usually takes a long time, so the recorded original video is long and occupies considerable memory; extracting images directly from it would place a very large number of video images under one folder, making on-screen refreshes and folder opening very slow or even freezing the machine. Therefore, before image extraction, the training system first receives a segmentation instruction input by the user (for example, an instruction to divide the original video into ten equal parts), divides the original video into several shorter segmented videos accordingly, creates a folder for each segmented video, performs image extraction on each segmented video multiple times, and stores the resulting video images under the corresponding folders, so that the number of video images in each folder stays small.
S2. Images are extracted from each segmented video multiple times to obtain a plurality of video images.
After the original video has been segmented into a plurality of segmented videos, a folder is created for each segmented video; the training system then performs image extraction on each segmented video multiple times at a preset interval frequency, generating video images under the corresponding folders. For example, after the user's instruction divides the original video into ten equal parts, the training system produces ten segmented videos of 6 minutes each, creates a folder for each, extracts images from each segmented video at an interval frequency of 1 frame per second, and generates 360 video images under each folder.
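The segment durations and per-folder image counts in this example can be checked with a small helper (timings only; the actual frame grabbing, e.g. with OpenCV, is omitted):

```python
def segment_bounds(duration_s: float, parts: int):
    """(start, end) times of `parts` equal segments: a 3600 s original
    video divided into ten parts gives ten 360 s (6-minute) segments."""
    step = duration_s / parts
    return [(i * step, (i + 1) * step) for i in range(parts)]

def frames_per_segment(segment_s: float, fps: float) -> int:
    """Number of images produced by sampling one segment at `fps` frames
    per second: 360 s at 1 frame per second gives 360 images per folder."""
    return int(segment_s * fps)
```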
S3, dividing the extracted video images according to specific actions defined by a user, wherein the division result comprises specific action images and non-specific action images.
After the video images have been extracted, the user inputs the defined specific action to the training system, which divides the 360 video images under each segmented video's folder according to that specific action, obtaining specific-action images, in which the specific action appears, and non-specific-action images, in which it does not.
S4, synthesizing the specific action images obtained through division into specific action videos, and respectively carrying out image extraction on the specific action videos for a plurality of times according to a plurality of preset different interval frequencies, wherein each image extraction obtains a plurality of sample images.
After dividing out the specific-action images, the training system synthesizes them into a specific-action video using a video-synthesis algorithm based on the computer-vision library OpenCV. It then performs image extraction on the specific-action video several times at several preset, mutually different interval frequencies, each extraction yielding a batch of sample images. For example, a first image extraction at an interval frequency of 4 frames per second yields 4m sample images, a second at 10 frames per second yields 10m, and a third at 30 frames per second yields 30m, where m is the duration of the specific-action video in seconds.
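These sample counts follow directly from interval frequency times duration; a minimal check, taking m as the duration of the specific-action video in seconds:

```python
def sample_counts(m: int, frequencies=(4, 10, 30)):
    """Sample images yielded per extraction pass at each interval frequency
    (frames per second) over an m-second specific-action video."""
    return {f: f * m for f in frequencies}
```

For instance, sample_counts(60) gives {4: 240, 10: 600, 30: 1800}, totalling 44 × 60 samples, matching the 4m + 10m + 30m = 44m figure used later.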
In this embodiment, continuing the example in which the specific-action video is first extracted at an interval frequency of 4 frames per second, the extraction times of the first image extraction do not completely coincide with those of the second image extraction, and neither of them completely coincides with those of the third image extraction.
In other embodiments, if the first extraction were made at 2 frames per second, its extraction times would be 0.5 s, 1 s, 1.5 s, 2 s and so on; a second extraction at 4 frames per second has extraction times 0.25 s, 0.5 s, 0.75 s, 1 s and so on; and a third at 10 frames per second has extraction times 0.1 s, 0.2 s, 0.3 s, 0.4 s and so on. Every extraction time of the first extraction would then be covered by the extraction times of the second, which is not acceptable: the extraction times at one interval frequency must not completely coincide with those at any other interval frequency.
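The acceptability condition, that no frequency's extraction times be wholly contained in another's, can be checked exactly with rational timestamps; a sketch:

```python
from fractions import Fraction

def extraction_times(fps: int, duration_s: int):
    """Exact timestamps produced by sampling at `fps` frames per second
    for duration_s seconds (fractions avoid floating-point error)."""
    return {Fraction(k + 1, fps) for k in range(fps * duration_s)}

def fully_covered(fps_a: int, fps_b: int, duration_s: int = 2) -> bool:
    """True if every timestamp of the fps_a extraction also occurs in the
    fps_b extraction, i.e. the unacceptable case described above."""
    return extraction_times(fps_a, duration_s) <= extraction_times(fps_b, duration_s)
```

Here fully_covered(2, 4) is true (2 frames per second is swallowed by 4, as in the example above), while fully_covered(4, 10) is false, so 4 and 10 frames per second may be used together.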
S5. The sample images obtained by each image extraction are converted into the sample file format of the video action recognition algorithm model, and all format-converted sample images are input into the model for training, until the model is able to recognize the specific action in video.
After the sample images have been extracted, the training system converts those obtained from each image extraction into the sample file format of the video action recognition algorithm model SlowFast, specifically COCO-format JSON result files annotated with labelme, and inputs all the converted sample images into SlowFast for training, until SlowFast can recognize the specific action in video. The total number of sample images input into SlowFast is therefore 4m + 10m + 30m = 44m, more than any single image extraction yields (4m, 10m or 30m); and because each extraction uses a different interval frequency, the sample images differ, increasing the feature diversity of the training set. Training SlowFast on these samples avoids overfitting, so the trained model's predictions do not deviate greatly from actual results.
After obtaining the three three-dimensional action videos, the main control device 1 inputs each of them into the trained model SlowFast for specific-action recognition. It should be noted that performing a specific action involves a continuous change of posture: for example, if the specific action is turning the right hand half a turn from front to back, the posture changes continuously from the front starting point to the back end point, and this change corresponds to a sequence of continuous frame images. After recognition, SlowFast therefore obtains, for each three-dimensional action video, all continuous frame images between the start frame image and the end frame image of the specific action. Moreover, among all continuous frame images of the specific action in each three-dimensional action video, one frame best reflects the specific action, and the trained SlowFast identifies this key frame image in each video.
E. Acquiring a preset standard action library, wherein the standard action library comprises all standard action continuous frame images of each specific action.
In this embodiment, a standard action library is preset for the martial arts exercise scene. The standard action library contains all standard action continuous frame images of every standard action performed by a trainer during martial arts exercise, and for each standard action it records the action category, the specification coefficient and the difficulty coefficient assigned by the trainer. After the main control device 1 obtains the preset standard action library, it compares each key frame image with the standard action continuous frame images in the library for similarity, so that the key frame image that best conforms to the standard action can be identified.
F. Inputting each key frame image into a human body key point regression algorithm respectively to obtain a human body key point feature map corresponding to a specific action in each key frame image, comparing the human body key point feature map of each key frame image with all standard action continuous frame images of each specific action in a standard action library respectively, selecting one which is most in line with the standard action and a corresponding three-dimensional action video thereof from each key frame image according to a comparison result, and identifying an action category, a specification coefficient and a difficulty coefficient to which the specific action in the selected key frame image belongs.
After the key frame image that best reflects the specific action has been identified among all continuous frame images of each three-dimensional action video, the main control device 1 inputs each key frame image into the human body key point regression algorithm DEKR to obtain the human body key point feature map corresponding to the specific action in that key frame image. It then compares the human body key point feature map of each key frame image with all standard action continuous frame images of each specific action in the standard action library, selects from the key frame images the one that best conforms to the standard action, and acquires the three-dimensional action video corresponding to the selected key frame image. For example, suppose that in step D the SlowFast model recognizes a first key frame image from the first three-dimensional action video, a second key frame image from the second three-dimensional action video, and a third key frame image from the third three-dimensional action video. The three key frame images are input into DEKR to obtain their human body key point feature maps, which are compared with all standard action continuous frame images of each specific action in the standard action library. If the similarity between the specific action in the first key frame image and the standard action of a certain standard action continuous frame image is 0.99, the similarity for the second key frame image is 0.98, and the similarity for the third key frame image is 0.97, then the main control device 1 selects from the three key frame images the first key frame image, which has the highest similarity and thus best conforms to the standard action, and acquires the corresponding first three-dimensional action video.
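The similarity comparison and selection described above can be sketched as follows. The similarity measure used here (one minus the mean key point distance between normalized poses) and the helper names are illustrative assumptions, not the actual DEKR-based comparison of the patent.

```python
import numpy as np

def keypoint_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity between two (K, 2) arrays of normalized human-body key
    point coordinates: 1 minus the mean Euclidean distance between
    corresponding key points (1.0 means identical poses)."""
    return float(1.0 - np.linalg.norm(a - b, axis=1).mean())

def best_keyframe(keyframe_maps, standard_maps):
    """For each candidate key frame, take its highest similarity against
    any standard-action frame; return (index, similarity) of the winner."""
    scores = [max(keypoint_similarity(k, s) for s in standard_maps)
              for k in keyframe_maps]
    i = int(np.argmax(scores))
    return i, scores[i]

# One standard-action pose with 17 COCO-style key points, and three
# candidate key frames at increasing distance from it.
standard_maps = [np.zeros((17, 2))]
candidates = [np.full((17, 2), d) for d in (0.01, 0.02, 0.03)]
idx, sim = best_keyframe(candidates, standard_maps)
print(idx)  # 0 -- the first key frame best matches the standard action
```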
Then, according to the standard action continuous frame image in the standard action library whose similarity with the first key frame image is 0.99, the main control device 1 identifies the action category, the specification coefficient and the difficulty coefficient of the standard action corresponding to that standard action continuous frame image, and takes these as the action category, the specification coefficient and the difficulty coefficient of the specific action in the selected first key frame image.
G. According to the action category of the specific action in the selected key frame image, acquiring the quantity of all standard action continuous frame images corresponding to the action category from a standard action library, comparing the quantity of all continuous frame images of the specific action in the three-dimensional action video corresponding to the selected key frame image with the acquired quantity of all standard action continuous frame images, and combining body data of an action trainer to obtain the exercise coefficient.
According to the action category of the specific action in the selected first key frame image, the main control device 1 obtains from the standard action library the number of all standard action continuous frame images corresponding to that action category, i.e. the number of all standard action continuous frame images whose similarity with the first key frame image is 0.99. It also acquires the number of all continuous frame images in which the specific action occurs in the three-dimensional action video corresponding to the selected first key frame image. The number of all continuous frame images of the specific action is then compared with the number of all standard action continuous frame images, and the body data of the action exerciser are combined to obtain the exercise coefficient. The exercise coefficient of the specific action in the selected key frame image is thus calculated from the number of all continuous frame images of the specific action in the corresponding three-dimensional action video, the number of all standard action continuous frame images obtained from the standard action library, and a body coefficient fitted from the body data of the action exerciser (sex coefficient X, height h, weight W and arm length L) together with five preset fitting constants.
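The exact formulas for the body coefficient and the exercise coefficient are defined in the granted publication. As a hedged illustration of the quantities involved, the sketch below assumes a linear fit for the body coefficient and a frame-count ratio for the exercise coefficient; both functional forms, the constants `a` through `e`, and all numeric values are placeholders, not the patented formulas.

```python
def body_coefficient(X, h, W, L, a=0.2, b=0.002, c=0.003, d=0.004, e=0.5):
    """Hypothetical linear fit of the body coefficient from the sex
    coefficient X, height h, weight W and arm length L; a..e stand in
    for the five preset fitting constants."""
    return a * X + b * h + c * W + d * L + e

def exercise_coefficient(n_action, n_standard, S):
    """Hypothetical form: ratio of the performed action's frame count to
    the standard action's frame count, scaled by the body coefficient S."""
    return S * n_action / n_standard

# Illustrative values: 45 action frames vs. 50 standard frames.
S = body_coefficient(X=1.0, h=175, W=70, L=60)
Y = exercise_coefficient(n_action=45, n_standard=50, S=S)
```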
H. Calculating a power coefficient according to the action category, the exercise coefficient and the body data of the specific action in the selected key frame image, and calculating an action exercise score according to the specification coefficient, the difficulty coefficient, the exercise coefficient and the power coefficient of the specific action in the selected key frame image.
Then, the main control device 1 calculates the power coefficient according to the action category of the specific action in the selected first key frame image and its exercise coefficient. The power coefficient of the specific action in the selected key frame image is calculated from the exercise coefficient of the specific action, the body coefficient fitted from the body data of the action exerciser, and a preset power constant corresponding to the action category to which the specific action belongs.
Then, the main control device 1 calculates the action exercise score according to the specification coefficient, the difficulty coefficient, the exercise coefficient and the power coefficient of the specific action in the selected first key frame image. The action exercise score is thus determined jointly by the specification coefficient, the difficulty coefficient, the exercise coefficient and the power coefficient of the specific action in the selected key frame image.
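Continuing the placeholder sketch above, the power coefficient and the final score can be illustrated as follows. The multiplicative forms, the per-category power constant `K` and the 100-point scaling are assumptions for illustration only; the patented formulas are defined in the granted publication.

```python
def power_coefficient(Y, S, K):
    """Hypothetical form: exercise coefficient Y scaled by the body
    coefficient S and the preset per-category power constant K."""
    return K * Y * S

def exercise_score(G, D, Y, P):
    """Hypothetical multiplicative combination of the specification
    coefficient G, difficulty coefficient D, exercise coefficient Y and
    power coefficient P, scaled to a 100-point score."""
    return 100.0 * G * D * Y * P

# Illustrative values only.
P = power_coefficient(Y=0.9, S=1.2, K=1.1)
score = exercise_score(G=0.99, D=0.8, Y=0.9, P=P)
```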
The finally calculated action exercise score therefore comprehensively considers the body data of the action exerciser (sex coefficient X, height h, weight W, arm length L), the action category of the specific action, the action specification reflected by the specification coefficient, the action difficulty reflected by the difficulty coefficient, and the power coefficient obtained from the exercise coefficient according to the action category, so that the action scoring result is no longer one-sided.
The above-described embodiments are merely embodiments of the present invention and are not intended to limit the scope of patent protection. Insubstantial changes and substitutions made by those skilled in the art in light of the teachings of the invention still fall within the scope of the claims.
Claims (10)
1. A video action scoring method, characterized by comprising the following steps:
A. acquiring body data of an action exerciser, wherein the body data comprises a sex coefficient, height, weight and arm length;
B. shooting a front action video and a plurality of side action videos of an action exerciser, wherein the shooting angles of the side action videos are different;
C. inputting each side action video and the front action video into a three-dimensional reconstruction algorithm model together to construct a plurality of three-dimensional action videos corresponding to each side action video respectively;
D. respectively inputting the plurality of three-dimensional action videos into a video action recognition algorithm model to recognize specific actions, obtaining all continuous frame images from a start frame image to an end frame image of the specific actions in each three-dimensional action video, and recognizing a key frame image which can reflect the specific actions most in all the continuous frame images;
E. acquiring a preset standard action library, wherein the standard action library comprises all standard action continuous frame images of each specific action;
F. inputting each key frame image into a human body key point regression algorithm respectively to obtain a human body key point feature image corresponding to a specific action in each key frame image, comparing the human body key point feature image of each key frame image with all standard action continuous frame images of each specific action in the standard action library respectively, selecting one which is most in line with the standard action and a corresponding three-dimensional action video thereof from each key frame image according to a comparison result, and identifying an action category, a specification coefficient and a difficulty coefficient to which the specific action in the selected key frame image belongs;
G. according to the action category of the specific action in the selected key frame image, acquiring the number of all standard action continuous frame images corresponding to the action category from the standard action library, comparing the number of all continuous frame images of the specific action in the three-dimensional action video corresponding to the selected key frame image with the number of all standard action continuous frame images, and obtaining an exercise coefficient by combining body data of the action exerciser;
H. and calculating to obtain a power coefficient according to the action category, the exercise coefficient and the body data of the specific action in the selected key frame image, and calculating to obtain an action exercise score according to the specification coefficient, the difficulty coefficient, the exercise coefficient and the power coefficient of the specific action in the selected key frame image.
2. The video action scoring method according to claim 1, wherein in the step G, the exercise coefficient of the specific action in the selected key frame image is calculated from the number of all continuous frame images of the specific action in the three-dimensional action video corresponding to the selected key frame image, the number of all standard action continuous frame images obtained from the standard action library, and a body coefficient fitted from the body data of the action exerciser.
3. The video action scoring method according to claim 2, wherein in the step H, the power coefficient of the specific action in the selected key frame image is calculated from the exercise coefficient of the specific action in the selected key frame image, the body coefficient fitted from the body data of the action exerciser, and a preset power constant corresponding to the action category to which the specific action belongs.
4. The video action scoring method according to claim 3, wherein in the step H, the action exercise score is calculated from the specification coefficient, the difficulty coefficient, the exercise coefficient and the power coefficient of the specific action in the selected key frame image.
5. The video action scoring method according to claim 1, wherein the video action recognition algorithm model is trained for specific action recognition, the training comprising the following steps S1, S2, S3, S4, S5:
s1, acquiring an original video, and segmenting the original video according to a segmentation instruction input by a user to obtain a plurality of segmented videos;
s2, extracting images of each segmented video for multiple times to obtain multiple video images;
s3, dividing the extracted video images according to specific actions defined by a user, wherein the division results comprise specific action images and non-specific action images;
s4, synthesizing the specific action images obtained by dividing into specific action videos, and respectively carrying out image extraction on the specific action videos for a plurality of times according to a plurality of preset different interval frequencies, wherein each image extraction obtains a plurality of sample images;
s5, respectively converting a plurality of sample images obtained by extracting each image into a sample file format of a video motion recognition algorithm model, and inputting all sample images after the format conversion into the video motion recognition algorithm model to train the video motion recognition algorithm model until the video motion recognition algorithm model has the capability of recognizing specific motions in video.
6. The video action scoring method according to claim 5, wherein in step S1, the original video is equally divided according to a segmentation command input by a user to obtain a plurality of segmented videos with the same duration.
7. The video action scoring method according to claim 5, wherein in step S4, the number of times of image extraction performed on the specific action video is three or more.
8. The video action scoring method according to claim 7, wherein in step S4, the extraction time corresponding to the interval frequency of each image extraction does not completely overlap with the extraction time corresponding to the interval frequency of the other image extraction.
9. A computer readable storage medium having stored thereon a computer program, which when executed by a processor performs the steps in the video action scoring method of any one of claims 1 to 8.
10. A video action scoring system, comprising a main control device (1), a height and weight measuring instrument (2) and a plurality of cameras (3), wherein the main control device (1) is respectively connected with the height and weight measuring instrument (2) and each camera (3), and the main control device (1) comprises a processor and a computer readable storage medium according to claim 9 which are mutually connected.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310561082.6A CN116311536B (en) | 2023-05-18 | 2023-05-18 | Video action scoring method, computer-readable storage medium and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310561082.6A CN116311536B (en) | 2023-05-18 | 2023-05-18 | Video action scoring method, computer-readable storage medium and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116311536A true CN116311536A (en) | 2023-06-23 |
CN116311536B CN116311536B (en) | 2023-08-08 |
Family
ID=86801733
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310561082.6A Active CN116311536B (en) | 2023-05-18 | 2023-05-18 | Video action scoring method, computer-readable storage medium and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116311536B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117216313A (en) * | 2023-09-13 | 2023-12-12 | 中关村科学城城市大脑股份有限公司 | Attitude evaluation audio output method, attitude evaluation audio output device, electronic equipment and readable medium |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107349594A (en) * | 2017-08-31 | 2017-11-17 | 华中师范大学 | A kind of action evaluation method of virtual Dance System |
US20190126145A1 (en) * | 2014-10-22 | 2019-05-02 | Activarium, LLC | Exercise motion system and method |
US20200394413A1 (en) * | 2019-06-17 | 2020-12-17 | The Regents of the University of California, Oakland, CA | Athlete style recognition system and method |
WO2021096669A1 (en) * | 2019-11-15 | 2021-05-20 | Microsoft Technology Licensing, Llc | Assessing a pose-based sport |
CN113947809A (en) * | 2021-09-18 | 2022-01-18 | 杭州电子科技大学 | Dance action visual analysis system based on standard video |
CN114187654A (en) * | 2021-11-24 | 2022-03-15 | 东南大学 | Micro-inertia martial arts action identification method and system based on machine learning |
CN114550027A (en) * | 2022-01-18 | 2022-05-27 | 清华大学 | Vision-based motion video fine analysis method and device |
US20220358310A1 (en) * | 2021-05-06 | 2022-11-10 | Kuo-Yi Lin | Professional dance evaluation method for implementing human pose estimation based on deep transfer learning |
CN115527080A (en) * | 2022-09-09 | 2022-12-27 | 阿里巴巴(中国)有限公司 | Method for generating video motion recognition model and electronic equipment |
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20190126145A1 (en) * | 2014-10-22 | 2019-05-02 | Activarium, LLC | Exercise motion system and method |
CN107349594A (en) * | 2017-08-31 | 2017-11-17 | 华中师范大学 | A kind of action evaluation method of virtual Dance System |
US20200394413A1 (en) * | 2019-06-17 | 2020-12-17 | The Regents of the University of California, Oakland, CA | Athlete style recognition system and method |
WO2021096669A1 (en) * | 2019-11-15 | 2021-05-20 | Microsoft Technology Licensing, Llc | Assessing a pose-based sport |
US20220358310A1 (en) * | 2021-05-06 | 2022-11-10 | Kuo-Yi Lin | Professional dance evaluation method for implementing human pose estimation based on deep transfer learning |
CN113947809A (en) * | 2021-09-18 | 2022-01-18 | 杭州电子科技大学 | Dance action visual analysis system based on standard video |
CN114187654A (en) * | 2021-11-24 | 2022-03-15 | 东南大学 | Micro-inertia martial arts action identification method and system based on machine learning |
CN114550027A (en) * | 2022-01-18 | 2022-05-27 | 清华大学 | Vision-based motion video fine analysis method and device |
CN115527080A (en) * | 2022-09-09 | 2022-12-27 | 阿里巴巴(中国)有限公司 | Method for generating video motion recognition model and electronic equipment |
Non-Patent Citations (2)
Title |
---|
JOSÉ RUI FIGUEIRA et al.: "ELECTRE-Score: A first outranking based method for scoring actions", European Journal of Operational Research, pages 986-1005 *
ZHOU Shuai et al.: "Research on martial arts action recognition and matching based on Kinect acquisition", Techniques of Automation and Applications, vol. 39, no. 03, pages 94-97 *
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117216313A (en) * | 2023-09-13 | 2023-12-12 | 中关村科学城城市大脑股份有限公司 | Attitude evaluation audio output method, attitude evaluation audio output device, electronic equipment and readable medium |
Also Published As
Publication number | Publication date |
---|---|
CN116311536B (en) | 2023-08-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Rogez et al. | Mocap-guided data augmentation for 3d pose estimation in the wild | |
CN110555434B (en) | Method for detecting visual saliency of three-dimensional image through local contrast and global guidance | |
CN110544301A (en) | Three-dimensional human body action reconstruction system, method and action training system | |
CN109903331B (en) | Convolutional neural network target detection method based on RGB-D camera | |
CN108388882B (en) | Gesture recognition method based on global-local RGB-D multi-mode | |
CN110448870B (en) | Human body posture training method | |
CN116311536B (en) | Video action scoring method, computer-readable storage medium and system | |
KR20090084563A (en) | Method and apparatus for generating the depth map of video image | |
EP4072147A1 (en) | Video stream processing method, apparatus and device, and medium | |
CN109117753A (en) | Position recognition methods, device, terminal and storage medium | |
CN109274883A (en) | Posture antidote, device, terminal and storage medium | |
CN110378234A (en) | Convolutional neural networks thermal imagery face identification method and system based on TensorFlow building | |
US11810366B1 (en) | Joint modeling method and apparatus for enhancing local features of pedestrians | |
CN111723687A (en) | Human body action recognition method and device based on neural network | |
Zhang et al. | Automatic calibration of the fisheye camera for egocentric 3d human pose estimation from a single image | |
CN113065506B (en) | Human body posture recognition method and system | |
CN113177940A (en) | Gastroscope video part identification network structure based on Transformer | |
CN112509129B (en) | Spatial view field image generation method based on improved GAN network | |
JP2007199864A (en) | Method for image sequence generation and image column generation device | |
CN112070181A (en) | Image stream-based cooperative detection method and device and storage medium | |
CN116311537A (en) | Training method, storage medium and system for video motion recognition algorithm model | |
CN111260555A (en) | Improved image splicing method based on SURF | |
US20220207261A1 (en) | Method and apparatus for detecting associated objects | |
CN114943746A (en) | Motion migration method utilizing depth information assistance and contour enhancement loss | |
CN115035007A (en) | Face aging system for generating countermeasure network based on pixel level alignment and establishment method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
PE01 | Entry into force of the registration of the contract for pledge of patent right | ||
Denomination of invention: A video action rating method, computer-readable storage medium, and system Granted publication date: 20230808 Pledgee: Guangdong Provincial Bank of Communications Co.,Ltd. Pledgor: Xunlong (Guangdong) Intelligent Technology Co.,Ltd. Registration number: Y2024980002437 |