CN114550071A - Method, device and medium for automatically identifying and capturing track and field video action key frames - Google Patents


Info

Publication number
CN114550071A
Authority
CN
China
Prior art keywords
action, key, frame, frames, group
Prior art date
Legal status
Granted
Application number
CN202210280271.1A
Other languages
Chinese (zh)
Other versions
CN114550071B (en)
Inventor
林平
李瀚懿
Current Assignee
One Body Technology Co.,Ltd.
Original Assignee
Beijing Yiti Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Beijing Yiti Technology Co ltd
Priority to CN202210280271.1A
Publication of CN114550071A
Application granted
Publication of CN114550071B
Legal status: Active

Abstract

The invention discloses a method, a device, and a medium for automatically identifying and capturing track and field video action key frames. The method comprises the following steps: automatically identifying and capturing four groups of action key frames based on a human body key frame skeleton posture model, a key action model, an action trend model, and a predefined position area model, respectively; for each key action, selecting from the four groups of action key frames in sequence to form a first set; removing out-of-range action key frames using the key frame confidence space; and outputting the final result of automatic key frame identification and capture according to the number of action key frames remaining in the first set. By automatically identifying and capturing action key frames, the method and the device improve the efficiency of identifying and capturing track and field video action key frames, and improve identification accuracy through the key frame confidence space and matching against key action semantics.

Description

Method, device and medium for automatically identifying and capturing track and field video action key frames
Technical Field
The invention relates to the technical field of video analysis, in particular to a method and a device for automatically identifying and capturing track and field video action key frames, and a computer readable medium.
Background
With the development of science and technology, more and more Artificial Intelligence (AI) analysis technologies are applied to track and field training. By calculating an athlete's running parameters, such as cadence, stride length, instantaneous speed, ground-contact time, and flight time, daily training can be guided scientifically and reasonably, improving the training effect.
AI analysis of track and field videos is based on the key actions of each event, such as the start and mid-race running of the 100m event; the hop, the step, and the jump, the peak of flight, and the landing of the triple jump; and key actions such as the start, hurdle clearance, flight, and landing of the 110m hurdles.
At present, the action key frames used in AI analysis of track and field videos are marked manually, which is inefficient.
In view of this, the manual marking of action key frames in existing track and field video AI analysis needs to be improved, so that key frames can be automatically identified and captured from track and field videos and the efficiency of AI analysis of key track and field actions improved.
Disclosure of Invention
In view of the above drawbacks, the technical problem to be solved by the present invention is to provide a method and a device for automatically identifying and capturing track and field video action key frames, so as to solve the current problem of low efficiency caused by manually marking action key frames.
Therefore, the invention provides a method for automatically identifying and capturing track and field video action key frames, which comprises the following steps:
automatically identifying and capturing a first group, a second group, a third group, and a fourth group of action key frames of each key action based on a human body key frame skeleton posture model, a key action model, an action trend model, and a predefined position area model, respectively;
for each key action preset for each sport event, selecting corresponding action key frames in sequence from the first, second, third, and fourth groups of action key frames to form a first set corresponding to that key action;
removing action key frames that fall outside the confidence space from the first set, using the key frame confidence space of each key action;
outputting the final result of automatic key frame identification and capture according to the number of action key frames remaining in the first set; wherein, if only one action key frame remains in the first set, that action key frame is the final result; otherwise, one action key frame is selected as the final result according to the degree of matching between the skeleton information in each action key frame in the first set and the key action semantics, the key action semantics being obtained through deep learning of human body key frame skeleton postures and key actions.
In the above method, preferably, if a unique action key frame cannot be selected as the final result according to the degree of matching between the skeleton information in each action key frame in the first set and the key action semantics, the frame numbers of all action key frames in the first set are averaged, and the action key frame whose frame number is closest to the average is taken as the final result.
In the above method, preferably, the automatic identification and capture of action key frames based on the key action model comprises the following steps:
presetting a key action model for each key action, the key action model comprising the motion direction of each key skeleton point in the key action, each defined as a preset motion direction;
judging the motion direction of each corresponding key skeleton point in the current video frame based on the difference between its position in the current video frame and its position in the next frame, each defined as a current motion direction;
and judging whether the current video frame is an action key frame or not by comparing whether the current motion direction is the same as the preset motion direction or not.
In the above method, preferably, auxiliary judgment is applied to the automatic identification and capture of action key frames based on the key action model, using the sport event, the height of the centroid of the human skeleton, and the position of the athlete.
In the above method, preferably, the automatic identification and capture of the action key frame based on the action trend includes the following steps:
presetting an action trend model comprising the possible directions of the centroid's movement track;
capturing the athlete's activity region from the track and field video with a rectangular box;
within the activity region, obtaining the centroid position of the athlete in each video frame using a human body skeleton posture algorithm;
based on the action trend model, using a seed optimization algorithm to obtain the seed distribution of the skeleton centroids of a preset number of track and field video frames along the possible track directions, and obtaining the athlete's action trend from that seed distribution; the initial starting point of the action trend is an action key frame.
In the above method, preferably, the automatic identification and capture of action key frames based on the predefined position area comprises the following steps:
establishing each key position area of each sport event through big data analysis of track and field videos;
and intercepting the clearest and most complete video frame within the key position area as an action key frame.
In the above-described method, it is preferable that,
the key actions of the 100m running event include: a start action and a mid-race running action;
the key actions of the triple jump event include: a start action and a take-off action, wherein the take-off action comprises the hop, the step, and the jump;
the key actions of the 110m hurdles event include: a hurdling action and an inter-hurdle running action, wherein the hurdling action includes attacking the hurdle, clearing the hurdle, and landing.
In the above method, preferably, the key action semantics corresponding to the skeleton information of each key action are obtained through deep learning of the key actions of each track and field event.
The invention also provides a device for automatically identifying and capturing track and field video action key frames, comprising:
the identification and capture module, used for automatically identifying and capturing a first group, a second group, a third group, and a fourth group of action key frames of each key action based on a human body key frame skeleton posture model, a key action model, an action trend model, and a predefined position area model, respectively;
the key frame collection module, used for selecting, for each key action preset for each sport event, corresponding action key frames in sequence from the first, second, third, and fourth groups of action key frames to form a first set corresponding to that key action;
the key frame removal module, used for removing action key frames that fall outside the confidence space from the first set, using the key frame confidence space of each key action;
the key frame output module, used for outputting the final result of automatic key frame identification and capture according to the number of action key frames remaining in the first set; wherein, if only one action key frame remains in the first set, that action key frame is the final result; otherwise, one action key frame is selected as the final result according to the degree of matching between the skeleton information in each action key frame in the first set and the key action semantics, the key action semantics being obtained through deep learning of human body key frame skeleton postures and key actions.
In the above-described apparatus, it is preferable that,
the key action model comprises the motion direction of each key skeleton point in each key action, each defined as a preset motion direction;
the identification and capture module judges the motion direction of each corresponding key skeleton point in the current video frame based on the difference between its position in the current video frame and its position in the next frame, each defined as a current motion direction; and judges whether the current video frame is a key frame by comparing whether each current motion direction in the current video frame is the same as the corresponding preset motion direction in the key action model.
The present invention also provides a computer readable medium having stored thereon a computer program which, when executed by a processor, implements the above-described method for automatic recognition and capture of track and field video action key frames.
According to the technical solutions above, the method, device, and computer readable medium for automatically identifying and capturing track and field video action key frames provided by the present invention solve the current problem of low efficiency caused by manually marking action key frames. Compared with the prior art, the invention has the following beneficial effects:
Firstly, action key frames are automatically identified and captured based on a human body key frame skeleton posture model, a key action model, an action trend model, and a predefined position area model, which improves efficiency.
Secondly, for each key action preset for each sport event, corresponding action key frames are selected in sequence from the first, second, third, and fourth groups of action key frames to form a first set corresponding to that key action; action key frames falling outside the confidence space are removed from the first set using the key frame confidence space of each key action; and the final result of automatic key frame identification and capture is output according to the number of action key frames remaining in the first set, improving the accuracy of action key frame identification.
Thirdly, the best-matching action key frame is selected as the final result according to the degree of matching between the skeleton information in each action key frame in the first set and the key action semantics, further improving the accuracy of action key frame identification.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in their description are briefly introduced below. It is obvious that the drawings described below show only some embodiments of the invention, and that other drawings can be derived from them by a person of ordinary skill in the art without inventive effort.
FIG. 1 is a flowchart of an automatic track and field video action key frame recognition and capture method provided by the present invention;
FIG. 2 is a schematic diagram of the key frame skeleton information corresponding to the start action of the 100m event in the present invention;
FIG. 3 is a schematic diagram of the key frame skeleton information corresponding to the mid-race running action of the 100m event in the present invention;
FIG. 4 is a schematic diagram of the key frame skeleton information corresponding to the hop action of the triple jump event in the present invention;
FIG. 5 is a schematic diagram of the key frame skeleton information corresponding to the step action of the triple jump event in the present invention;
FIG. 6 is a schematic diagram of the key frame skeleton information corresponding to the jump action of the triple jump event in the present invention;
FIG. 7 is a schematic diagram of the key frame skeleton information corresponding to the hurdle-attacking action of the hurdles event in the present invention;
FIG. 8 is a schematic diagram of the key frame skeleton information corresponding to the hurdle-clearing action of the hurdles event in the present invention;
FIG. 9 is a schematic diagram of the key frame skeleton information corresponding to the landing action of the hurdles event in the present invention;
FIG. 10 is a diagram of a key motion model in the present invention;
FIG. 11 is a diagram illustrating a preset action trend model according to the present invention;
FIG. 12 is a schematic diagram of a motion region for capturing a human image according to the present invention;
FIG. 13 is a schematic diagram of the seed distribution in each direction obtained using a seed optimization algorithm in the present invention.
Detailed Description
The technical solutions of the embodiments of the present invention will be described clearly and completely with reference to the accompanying drawings. It is to be understood that the embodiments described below are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those skilled in the art without creative effort based on the embodiments of the present invention fall within the protection scope of the present invention.
The realization principle of the invention is as follows:
automatically identify and capture a first group, a second group, a third group, and a fourth group of action key frames of each key action based on a human body key frame skeleton posture model, a key action model, an action trend model, and a predefined position area model, respectively;
for each key action preset for each sport event, select corresponding action key frames in sequence from the first, second, third, and fourth groups of action key frames to form a first set corresponding to that key action;
remove action key frames that fall outside the confidence space from the first set, using the key frame confidence space of each key action;
and output the final result of automatic key frame identification and capture according to the number of action key frames remaining in the first set.
The scheme provided by the invention realizes automatic identification and capture of the track and field video action key frames, and improves the identification accuracy rate through key frame confidence space and key action semantic analysis.
In order to make the technical solution and implementation of the present invention more clearly explained and illustrated, several preferred embodiments for implementing the technical solution of the present invention are described below.
It should be noted that the terms of orientation such as "inside, outside", "front, back" and "left and right" are used herein as reference objects, and it is obvious that the use of the corresponding terms of orientation does not limit the scope of protection of the present invention.
Referring to fig. 1, fig. 1 is a flowchart of a method for automatically identifying and capturing a track and field video action key frame, which includes the following steps:
and step 110, automatically identifying and capturing a first group of action key frames of each key action from the track and field video based on the human body key frame skeleton posture model respectively. And automatically identifying and capturing a second group of action key frames of each key action from the track and field video based on the key action model. And automatically identifying and capturing a third group of action key frames of each key action from the track and field video based on the action trend model, and automatically identifying and capturing a fourth group of action key frames of each key action from the track and field video based on the predefined position area.
The name and frame number of each action key frame in each group are recorded, e.g., start key frame, 00012.
Step 120: for each key action preset for each sport event, select corresponding action key frames in sequence from the first, second, third, and fourth groups of action key frames to form a first set corresponding to that key action.
For example: start key frame → [ start key frame, 00012; start key frame, 00015; … ].
Step 130, using the key frame confidence space of each key action, removing action key frames beyond the confidence space range from the first set.
Step 140: output the final result of automatic action key frame identification and capture according to the number of action key frames remaining in the first set.
If only one action key frame remains for a key action in the first set, it is taken as the final result, and the process continues with automatic identification and capture of the next key action or ends; otherwise, step 150 is performed.
Step 150: select an action key frame as the final result according to the degree of matching between the skeleton information in each action key frame in the first set and the key action semantics. The key action semantics are obtained through deep learning of human body key frame skeleton postures and key actions.
For example:
The key actions of the 100m running event include: a start action and a mid-race running action.
The key actions of the triple jump event include: a start action, which is similar to that of the 100m running event, and a take-off action, which includes the hop, the step, and the jump.
The key actions of the 110m hurdles event include: a hurdling action and an inter-hurdle running action, wherein the hurdling action includes attacking the hurdle, clearing the hurdle, and landing.
Key action semantics corresponding to the skeleton information of each key action are obtained through deep learning of the key actions of track and field events such as the 100m running event, the triple jump event, and the 110m hurdles event.
For example: fig. 2 corresponds to the start action of the 100m event, fig. 3 to the mid-race running action of the 100m event, fig. 4 to the hop action of the triple jump event, fig. 5 to the step action of the triple jump event, fig. 6 to the jump action of the triple jump event, fig. 7 to the hurdle-attacking action of the hurdles event, fig. 8 to the hurdle-clearing action of the hurdles event, and fig. 9 to the landing action of the hurdles event.
Deep learning of key action semantics can be implemented with existing techniques.
If only one action key frame conforms to the key action semantics, it is output as the final result, and automatic identification and capture ends. If more than one action key frame conforms to the key action semantics, the frame numbers of those action key frames are averaged, and the action key frame closest to the average is output as the final result, ending automatic identification and capture.
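As a minimal sketch of this tie-breaking rule (the order used to break exact ties between frames equidistant from the mean is an assumption; the patent does not specify it, and the function name is hypothetical):

```python
def pick_by_mean(frame_numbers):
    """Among candidate keyframes that all match the key-action semantics,
    return the frame number closest to the mean frame number.
    Exact ties are broken by the earlier frame (an assumption)."""
    mean = sum(frame_numbers) / len(frame_numbers)
    return min(frame_numbers, key=lambda f: (abs(f - mean), f))
```

For candidates 10, 12, and 20 the mean is 14, so frame 12 would be output.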
The automatic identification and capture based on the human body key frame skeleton posture model, the key action model, the action trend model, and the predefined position area are described in detail below.
First, automatic identification and capture of key frames based on deep learning of human skeleton postures: a key frame human skeleton posture model is obtained mainly by training on athletes' human skeleton postures with each key frame annotated, and the model is then used to automatically identify and capture key frames from an athlete's running video. This can be implemented with existing techniques.
Specifically, take the 100m start as an example.
First, a start-action key frame as shown in fig. 2 is obtained through deep learning based on human skeleton postures, and the angle of each joint in the skeleton posture is calculated from that key frame. Then an athlete's running video is input, and each frame of the video is compared with the start-action key frame joint angle by joint angle, with an angle tolerance that can be set to 1-1.5 degrees; video frames that satisfy the condition are added to the first group of action key frames.
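The joint-by-joint angle comparison can be sketched as follows (a minimal illustration: the 1.5-degree tolerance comes from the description, while the function names and the choice of keypoint triples per joint are hypothetical):

```python
import math

def joint_angle(a, b, c):
    """Angle at joint b (in degrees) formed by keypoints a-b-c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    cos_t = dot / (math.hypot(*v1) * math.hypot(*v2))
    # Clamp against floating-point drift before taking acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))

def matches_reference(frame_angles, ref_angles, tol_deg=1.5):
    """True if every joint angle is within the tolerance of the reference keyframe."""
    return all(abs(fa - ra) <= tol_deg for fa, ra in zip(frame_angles, ref_angles))
```

A video frame whose joint angles all fall within the tolerance of the reference start-action keyframe would then be added to the first group.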
Secondly, as shown in fig. 10, automatically identifying and capturing action key frames based on key action analysis includes the following steps:
and presetting a key action model of each key action, wherein the key action model comprises the motion direction of each key skeleton point in each key action, and the motion direction is respectively defined as the preset motion direction. Wherein, the selection of key skeleton point includes: left, right hand, left, right elbow, left, right knee, and left, right foot.
And judging the motion direction of the corresponding key skeleton point in the current video frame based on the difference between the key skeleton point position in the current video frame and the key skeleton point position in the next frame, and respectively defining the motion direction as each current motion direction.
And judging whether the current video frame is a key frame or not by comparing whether each current motion direction in the current video frame is the same as each preset motion direction in each key action model or not.
The preset movement direction and the current movement direction both comprise a movement direction and an included angle with a horizontal line, and the included angle is provided with a tolerance of 1-2 degrees.
In this step, auxiliary judgment can be performed based on the sport event, the centroid height, and the athlete's position. For example, in the triple jump event, if the centroid height in the current video frame is low (more than 50% below the average centroid height) and the athlete is located in front of the sand pit, the athlete is in the take-off stage.
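The direction comparison in this step can be sketched as follows (the 2-degree tolerance is the upper end of the range stated above; the function names are hypothetical):

```python
import math

def motion_direction(p_curr, p_next):
    """Angle with the horizontal (degrees) of a key skeleton point's
    displacement between the current frame and the next frame."""
    return math.degrees(math.atan2(p_next[1] - p_curr[1], p_next[0] - p_curr[0]))

def direction_matches(curr_deg, preset_deg, tol_deg=2.0):
    """Compare a current motion direction with a preset one, modulo 360 degrees."""
    diff = abs(curr_deg - preset_deg) % 360.0
    return min(diff, 360.0 - diff) <= tol_deg

def is_action_keyframe(curr_points, next_points, preset_dirs, tol_deg=2.0):
    """The frame is a candidate keyframe only if every key skeleton point
    moves in its preset direction within the tolerance."""
    return all(
        direction_matches(motion_direction(c, n), d, tol_deg)
        for c, n, d in zip(curr_points, next_points, preset_dirs)
    )
```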
Thirdly, as shown in fig. 11, 12 and 13, the automatic identification and capturing of action key frames based on action trends includes the following steps:
and presetting an action trend model, wherein the action trend model is used for representing the possible moving track of the centroid, and as shown in fig. 11, the action trend model comprises a first direction, a second direction, a third direction, a fourth direction and a fifth direction, wherein the first direction is vertically upward, the second direction is inclined upwards by 45 degrees relative to the horizontal direction, the third direction is horizontal direction, the fourth direction is inclined downwards by 45 degrees relative to the horizontal direction, and the fifth direction is vertically downward.
As shown in fig. 12, a portrait session is captured from a sports video of an athlete using a rectangular box.
And in the portrait activity area, acquiring the centroid position of the athlete of each frame by using a skeleton algorithm.
As shown in fig. 13, based on the action trend model, a seed optimization algorithm is used to obtain a seed distribution in a corresponding direction, so as to accurately obtain an action trend of the athlete, where an initial starting point of the action trend is a key frame.
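Leaving the seed optimization itself aside, the five-direction trend model can be illustrated by classifying each centroid displacement into the nearest preset direction and locating where a sustained trend begins (a simplified sketch assuming a y-up coordinate system and left-to-right motion; the `min_run` threshold and direction labels are assumptions):

```python
import math

# The five preset directions of the action trend model, as angles with the horizontal.
TREND_DIRS = {"up": 90.0, "up45": 45.0, "level": 0.0, "down45": -45.0, "down": -90.0}

def classify_step(p0, p1):
    """Assign a centroid displacement to the nearest of the five preset directions."""
    ang = math.degrees(math.atan2(p1[1] - p0[1], p1[0] - p0[0]))
    return min(TREND_DIRS, key=lambda k: abs(TREND_DIRS[k] - ang))

def trend_start(centroids, direction, min_run=3):
    """Index of the frame where a run of at least `min_run` consecutive steps
    in `direction` begins (the initial starting point of the trend), or None."""
    run = 0
    for i in range(len(centroids) - 1):
        if classify_step(centroids[i], centroids[i + 1]) == direction:
            run += 1
            if run >= min_run:
                return i - min_run + 1
        else:
            run = 0
    return None
```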
Fourthly, automatic identification and capture of key frames at key positions based on the predefined position area method comprises the following steps:
Each key position area of each sport event is established through big data analysis; when the athlete enters a key position area, the clearest and most complete picture is automatically intercepted as a key frame.
In step 130, the specific steps of removing the action key frames beyond the confidence space range from the first set by using the key frame confidence space of each key action are as follows:
each key action is a set of key action frame sequences comprising a plurality of key action frames starting to ending from the key action, such as: the start key action includes a total of 50 frames of key action frames from crouch to upright. Each frame of key action frame comprises a plurality of human body key point coordinates, and the coordinates use the gravity center of the human body as the origin of coordinates.
For example, the following 17 key points are commonly used: 0: nose, 1: left eye, 2: right eye, 3: left ear, 4: right ear, 5: left shoulder, 6: right shoulder, 7: left elbow, 8: right elbow, 9: left wrist, 10: right wrist, 11: left crotch, 12: right crotch, 13: left knee, 14: right knee, 15: left ankle, 16: right ankle.
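The 17-point indexing above matches the common COCO keypoint order and can be written as a lookup table (the identifier names are illustrative; "crotch" in the machine translation is rendered as hip):

```python
# The 17 key points listed above, indexed 0-16 (COCO keypoint order).
KEYPOINTS = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]
KP_INDEX = {name: i for i, name in enumerate(KEYPOINTS)}
```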
The first step: in the key action model, the 17 key points of each frame of the key action frame sequence are stored in a two-dimensional table, with frame numbers as rows and key points as columns. The key action model is a standard model selected and processed manually in advance; it can be cut from a competition video or a training video, or captured manually from a recorded broadcast.
Nose [302.12, 305.15, 310.23, 330.56, 210.45, 250.65, …]
Left eye [ … ]
Right eye [ … ]
Left ear [ … ]
Right ear [ … ]
Left shoulder [ … ]
Right shoulder [ … ]
Left elbow [ … ]
Right elbow [ … ]
Left wrist [ … ]
Right wrist [ … ]
Left hip [ … ]
Right hip [ … ]
Left knee [ … ]
Right knee [ … ]
Left ankle [ … ]
Right ankle [ … ]
The data for the other key points, similar to the nose data, is obtained by human skeleton posture recognition such as OpenPose. These exemplary values are omitted here, which does not affect the implementation of the technical solution.
The values in the above table may differ depending on the size of the key action model images, but because the center of gravity of the human body is used as the coordinate origin and the calculation uses differences, this has essentially no influence on the result.
The second step: obtain the coordinates of the 17 corresponding key points in each frame of the athlete's running video (the video to be analyzed). Using the length of the key action frame sequence as a sliding window, compute the coordinate difference of each corresponding key point in each frame within the window, and from it a score for each key point: for example, a difference of 0 scores 100, the x and y coordinates are weighted equally, and the score decreases by 1 for every 10 pixels of difference.
For example, the perfect score for the 17 key points in one frame of image is 1700 points. A confidence score can be preset according to the required accuracy of the action; for example, a frame whose total key point score is at least 85%, i.e. 1700 × 85% = 1445 points, is regarded as confident. A window thus produces a score curve over time (the window duration), and the range within which the curve fluctuates is the confidence space, for example fluctuating up and down by 5 points.
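The scoring rule can be sketched as follows (a minimal interpretation: averaging the x and y differences is an assumption, since the description only states that the two coordinates carry equal weight; the function names are hypothetical):

```python
def keypoint_score(ref_pt, obs_pt):
    """100 points minus 1 point per 10 pixels of difference; x and y weighted equally."""
    diff = (abs(ref_pt[0] - obs_pt[0]) + abs(ref_pt[1] - obs_pt[1])) / 2.0
    return max(0.0, 100.0 - diff / 10.0)

def frame_score(ref_frame, obs_frame):
    """Sum over the 17 key points; a perfect match scores 1700."""
    return sum(keypoint_score(r, o) for r, o in zip(ref_frame, obs_frame))

def is_confident(ref_frame, obs_frame, ratio=0.85):
    """The description's example threshold: 1700 x 85% = 1445 points."""
    return frame_score(ref_frame, obs_frame) >= 1700 * ratio
```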
The third step: by sliding the window over the sports video, obtain the video sequence lying within the confidence space, and record the frame numbers of its start frame and end frame.
The fourth step: for each action key frame identified in the first set, keep it if its frame number lies between the start and end frame numbers; otherwise, delete it from the first set.
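The fourth step amounts to a simple range filter over the first set (the (name, frame number) tuple layout is illustrative):

```python
def filter_by_confidence_window(candidates, start_frame, end_frame):
    """Keep only the (name, frame_number) candidates whose frame number
    lies within the confident sequence recorded in the third step."""
    return [(name, fn) for name, fn in candidates if start_frame <= fn <= end_frame]
```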
Based on the method, the invention also provides a device for automatically identifying and capturing the track and field video action key frames, which comprises the following steps:
the recognition and capture module, used for automatically recognizing and capturing the first, second, third and fourth groups of action key frames based on the human body key frame skeleton posture model, the key action model, the action trend model and the predefined position area model, respectively;
the key frame collection module is used for sequentially selecting corresponding action key frames from the first group of action key frames, the second group of action key frames, the third group of action key frames and the fourth group of action key frames according to each key action preset by each motion item to form a first set corresponding to each key action;
the key frame removing module is used for removing action key frames exceeding the range of the confidence space from the first set by utilizing the key frame confidence space of each key action;
the key frame output module is used for outputting the final result of automatic key frame identification and capture according to the number of the remaining action key frames in the first set; wherein if only one action key frame remains in the first set, the action key frame is the final result; otherwise, selecting one action key frame as a final result according to the matching degree of skeleton information in each action key frame in the first set and key action semantics, wherein the key action semantics are obtained by deep learning of the skeleton posture and key actions of the human body key frame.
In this device, the automatic identification and capture based on the human body key frame skeleton posture model, the key action model, the action trend model and the predefined position area can each be carried out according to the method described above.
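A minimal sketch of how the collection, removal and output modules fit together is given below. The semantic match score is passed in as an opaque function, the confidence span is assumed to have been computed as described above, and the frame-number averaging fall-back of claim 2 is used when no unique best semantic match exists. All names here are illustrative assumptions, not the patented implementation.

```python
def select_final_keyframe(groups, confidence_span, match_score):
    """groups: candidate key-frame numbers from the four models (None if a
    model produced no candidate). confidence_span: (start, end) frame
    numbers of the confidence space. match_score: function scoring how
    well a frame's skeleton information matches the key action semantics."""
    # Key frame collection module: gather candidates from the four groups.
    first_set = [g for g in groups if g is not None]
    # Key frame removal module: drop frames outside the confidence space.
    start, end = confidence_span
    first_set = [k for k in first_set if start <= k <= end]
    # Key frame output module: decide by the number of remaining frames.
    if not first_set:
        return None
    if len(first_set) == 1:
        return first_set[0]                     # unique survivor is the result
    scores = [match_score(k) for k in first_set]
    best = max(scores)
    matched = [k for k, s in zip(first_set, scores) if s == best]
    if len(matched) == 1:                       # unique best semantic match
        return matched[0]
    # Claim 2 fall-back: take the frame closest to the average frame number.
    mean = sum(first_set) / len(first_set)
    return min(first_set, key=lambda k: abs(k - mean))
```

For example, with candidates 10, 12 and 50 and a confidence span of (0, 20), frame 50 is removed first; the semantic score then decides between 10 and 12, and only a tie falls through to the averaging rule.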
The method for automatically identifying and capturing the track and field video action key frames can be realized as a computer software program. For example, the present invention also provides a computer readable medium having stored thereon a computer program which, when executed by a processor, implements the above-described track and field video action key frame automatic identification and capture method.
As can be seen from the above description of the specific embodiments, compared with the prior art, the method, device and computer-readable medium for automatically identifying and capturing track and field video action key frames provided by the invention have the following advantages:
Firstly, action key frames are automatically identified and captured based on the human body key frame skeleton posture model, the key action model, the action trend model and the predefined position area model, which improves the efficiency of identifying and capturing track and field video action key frames.
Secondly, according to each key action preset for each sport event, corresponding action key frames are selected in sequence from the first, second, third and fourth groups of action key frames to form a first set corresponding to each key action; action key frames beyond the range of the confidence space are removed from the first set by using the key frame confidence space of each key action; and the final result of automatic key frame identification and capture is output according to the number of action key frames remaining in the first set. This improves the accuracy of action key frame identification.
Thirdly, according to the degree of matching between the skeleton information in each action key frame in the first set and the key action semantics, the best-matching action key frame is selected as the final result, further improving the accuracy of action key frame identification.
Fourthly, in the process of automatically identifying and capturing action key frames based on key action analysis, auxiliary judgment is carried out using the sport event, the height of the center of mass and the athlete's position, which reduces the identification error rate.
Fifthly, in the automatic identification and capture of action key frames based on the action trend, a seed optimization algorithm is used to obtain the seed distribution in the corresponding direction, so that the athlete's action trend is obtained accurately; this is the first application of the technique to the field of video capture, and it further improves the targeted identification of action key frames.
Sixthly, automatic identification and capture of action key frames based only on the human body key frame skeleton posture model gives poor recognition accuracy and requires a large amount of training data and a long modeling time. Therefore, in the scheme of the present application, on the basis of the action key frames obtained by automatic identification and capture based on the human body key frame skeleton posture model, action key frames are also obtained by automatic identification and capture based on the key action model, the action trend model and the predefined position area model, and the optimal action key frame is selected adaptively from the captured candidates by combining the confidence space, key action semantic matching, averaging and other means. This raises the recognition rate of action key frames to above 95% and improves the feasibility and reliability of the method in practical application.
Finally, it should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The present invention is not limited to the above preferred embodiments; any structural changes made under the teaching of the present invention that are the same as or similar to the technical solutions of the present invention fall within the scope of protection of the present invention.

Claims (10)

1. A track and field video action key frame automatic identification and capture method is characterized by comprising the following steps:
automatically identifying and capturing a first group, a second group, a third group and a fourth group of action key frames of each key action based on a human body key frame skeleton posture model, a key action model, an action trend model and a predefined position area model, respectively;
selecting corresponding action key frames from the first group of action key frames, the second group of action key frames, the third group of action key frames and the fourth group of action key frames in sequence according to each key action preset by each motion item to form a first set corresponding to each key action;
removing action key frames beyond the range of the confidence space from the first set by using the key frame confidence space of each key action;
outputting a final result of automatic identification and capture of the key frames according to the number of the residual action key frames in the first set; wherein if only one action key frame remains in the first set of each key action, the action key frame is taken as a final result; otherwise, selecting one action key frame as a final result according to the matching degree of skeleton information in each action key frame in the first set and key action semantics, wherein the key action semantics are obtained by deep learning of the skeleton posture and key actions of the human body key frame.
2. The method according to claim 1, wherein if a unique action key frame cannot be selected as the final result according to the degree of matching between the skeleton information and the key action semantics in each action key frame in the first set of each key action, the frame numbers of all the action key frames in the first set are averaged, and the action key frame closest to the average value is taken as the final result.
3. The method of claim 1, wherein automatically identifying and grabbing motion key frames based on a key motion model comprises the steps of:
presetting a key action model of each key action, wherein the key action model comprises the movement direction of each key skeleton point in each key action and is respectively defined as the preset movement direction;
judging the motion direction of the corresponding key skeleton point in the current video frame based on the difference between the key skeleton point position in the current video frame and the key skeleton point position in the next frame, and respectively defining the motion direction as the current motion direction;
and judging whether the current video frame is an action key frame or not by comparing whether the current motion direction is the same as the preset motion direction or not.
4. The method of claim 3, wherein the automatic identification and capture of action key frames based on the key action model is assisted by auxiliary judgment based on the sport event, the height of the centroid of the human skeleton and the athlete's position.
5. The method of claim 1, wherein automatically identifying and grabbing motion key frames based on motion trends, comprises the steps of:
presetting an action trend model which comprises a track direction for representing the possible movement of the centroid;
capturing a portrait activity area from the track and field video by using a rectangular frame;
in the portrait activity area, acquiring the centroid position of the athlete in each frame of video image by using a human body skeleton posture algorithm;
based on the action trend model, the seed distribution of the centroids of the skeletons in the track direction of the possible movement in a preset number of track and field video frames is obtained by utilizing a seed optimization algorithm, the action trend of the athlete is obtained according to the seed distribution of the centroids in the track direction of the possible movement, and the initial starting point of the action trend is an action key frame.
6. The method of claim 1, wherein automatically identifying and grabbing action key frames based on predefined location areas comprises the steps of:
establishing each key position area of each motion item by performing big data analysis on the track and field video;
and intercepting the clearest and most complete video frame in the track and field video in the key position area as an action key frame.
7. The method of claim 3,
key actions for the 100-meter sprint event include: a starting action and a mid-race running action;
the key actions of the triple jump event comprise: a take-off action and a jumping action, wherein the jumping action comprises a hop (single-foot jump), a step (striding jump) and a jump;
the key actions for the 110-meter hurdles event include: a hurdle-clearing action and an inter-hurdle running action, wherein the hurdle-clearing action includes take-off at the hurdle, flight over the hurdle, and landing after the hurdle.
8. The method according to claim 7, wherein the key action semantics corresponding to the corresponding key action skeleton information are obtained through deep learning of the key actions of each track and field project.
9. An automatic identification and grabbing device for track and field video action key frames is characterized by comprising the following components:
the recognition and capture module is used for automatically recognizing and capturing a first group, a second group, a third group and a fourth group of action key frames of each key action based on a human body key frame skeleton posture model, a key action model, an action trend model and a predefined position area model;
the key frame collection module is used for sequentially selecting corresponding action key frames from the first group of action key frames, the second group of action key frames, the third group of action key frames and the fourth group of action key frames according to each key action preset by each motion item to form a first set corresponding to each key action;
the key frame removing module is used for removing action key frames exceeding the range of the confidence space from the first set by utilizing the key frame confidence space of each key action;
the key frame output module is used for outputting the final result of automatic key frame identification and capture according to the number of the remaining action key frames in the first set; wherein if only one action key frame remains in the first set, the action key frame is the final result; and otherwise, selecting one action key frame as a final result according to the matching degree of the skeleton information and the key action semantics in each action key frame in the first set, wherein the key action semantics are obtained by deeply learning the skeleton posture and the key action of the human body key frame.
10. A computer-readable medium, on which a computer program is stored, which, when executed by a processor, implements the track and field video action key frame automatic identification and grabbing method according to any one of claims 1 to 8.
CN202210280271.1A 2022-03-22 2022-03-22 Method, device and medium for automatically identifying and capturing track and field video action key frames Active CN114550071B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210280271.1A CN114550071B (en) 2022-03-22 2022-03-22 Method, device and medium for automatically identifying and capturing track and field video action key frames

Publications (2)

Publication Number Publication Date
CN114550071A true CN114550071A (en) 2022-05-27
CN114550071B CN114550071B (en) 2022-07-19

Family

ID=81666472

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210280271.1A Active CN114550071B (en) 2022-03-22 2022-03-22 Method, device and medium for automatically identifying and capturing track and field video action key frames

Country Status (1)

Country Link
CN (1) CN114550071B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115665359A (en) * 2022-10-09 2023-01-31 Xihua County Environmental Supervision Brigade Intelligent compression method for environmental monitoring data

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012048362A (en) * 2010-08-25 2012-03-08 Kddi Corp Device and method for human body pose estimation, and computer program
CN103218824A (en) * 2012-12-24 2013-07-24 大连大学 Motion key frame extracting method based on distance curve amplitudes
CN113762133A (en) * 2021-09-01 2021-12-07 哈尔滨工业大学(威海) Self-weight fitness auxiliary coaching system, method and terminal based on human body posture recognition

Also Published As

Publication number Publication date
CN114550071B (en) 2022-07-19

Similar Documents

Publication Publication Date Title
CN108256433B (en) Motion attitude assessment method and system
CN110705390A (en) Body posture recognition method and device based on LSTM and storage medium
Hu et al. Real-time human movement retrieval and assessment with kinect sensor
CN110674785A (en) Multi-person posture analysis method based on human body key point tracking
JP6082101B2 (en) Body motion scoring device, dance scoring device, karaoke device, and game device
Chaudhari et al. Yog-guru: Real-time yoga pose correction system using deep learning methods
KR102106135B1 (en) Apparatus and method for providing application service by using action recognition
CN112819852A (en) Evaluating gesture-based motion
CN110298218B (en) Interactive fitness device and interactive fitness system
Suzuki et al. Enhancement of gross-motor action recognition for children by CNN with OpenPose
CN114550071B (en) Method, device and medium for automatically identifying and capturing track and field video action key frames
CN115331314A (en) Exercise effect evaluation method and system based on APP screening function
Yang et al. Human exercise posture analysis based on pose estimation
WO2023108842A1 (en) Motion evaluation method and system based on fitness teaching training
Tang et al. Research on sports dance movement detection based on pose recognition
Yang et al. Research on face recognition sports intelligence training platform based on artificial intelligence
CN114049590A (en) Video-based ski-jump analysis method
CN110996178B (en) Intelligent interactive data acquisition system for table tennis game video
CN110070036B (en) Method and device for assisting exercise motion training and electronic equipment
CN116271757A (en) Auxiliary system and method for basketball practice based on AI technology
CN115497170A (en) Method for identifying and scoring formation type parachuting training action
JP7074727B2 (en) Sport behavior recognition devices, methods and programs
CN113255450A (en) Human motion rhythm comparison system and method based on attitude estimation
Kim et al. Implementation of golf swing analysis system based on swing trajectories analysis
Murthy et al. DiveNet: Dive Action Localization and Physical Pose Parameter Extraction for High Performance Training

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220902

Address after: Room 2310, 23rd Floor, No. 24, Jianguomenwai Street, Chaoyang District, Beijing 100010

Patentee after: One Body Technology Co.,Ltd.

Address before: Room zt1009, science and technology building, No. 45, Zhaitang street, Mentougou District, Beijing 102300 (cluster registration)

Patentee before: Beijing Yiti Technology Co.,Ltd.

PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Method, device and medium for automatic recognition and capture of motion key frames in track and field video

Effective date of registration: 20230112

Granted publication date: 20220719

Pledgee: Haidian Beijing science and technology enterprise financing Company limited by guarantee

Pledgor: One Body Technology Co.,Ltd.

Registration number: Y2023110000017