CN114821812B - Deep learning-based skeleton point action recognition method for figure skaters - Google Patents


Info

Publication number
CN114821812B
CN114821812B (application CN202210721105.0A)
Authority
CN
China
Prior art keywords
action
video set
track
training
motion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210721105.0A
Other languages
Chinese (zh)
Other versions
CN114821812A (en)
Inventor
虞博文
翟天泰
熊三玥
林馨怡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southwest Petroleum University
Original Assignee
Southwest Petroleum University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southwest Petroleum University filed Critical Southwest Petroleum University
Priority to CN202210721105.0A priority Critical patent/CN114821812B/en
Publication of CN114821812A publication Critical patent/CN114821812A/en
Application granted granted Critical
Publication of CN114821812B publication Critical patent/CN114821812B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

The invention relates to a deep learning-based skeleton point action recognition method for figure skaters. An action video set to be classified is randomly divided into a training video set and a testing video set, which are used respectively to compute a training track and a testing track of each action; the training track is input into an action recognition model for action recognition, and the model is then optimized with the testing track to obtain a deep action recognition model that extracts actions quickly. The deep action recognition model then acquires an action video set to be tested and recognizes the limb actions and tracks of the subjects in it, yielding a recognition result for the test set. The method enables fast action recognition, overcomes the non-uniform standards and low evaluation efficiency of the purely manual evaluation used in the prior art, and improves both the efficiency and the consistency of evaluation.

Description

Deep learning-based skeleton point action recognition method for figure skaters
Technical Field
The invention relates to the technical field of sports assistance, and in particular to a deep learning-based skeleton point action recognition method for figure skaters.
Background
Human action recognition is an important research direction at the intersection of computer vision, pattern recognition, image processing, artificial intelligence and related disciplines, with great application value and theoretical significance in human-computer interaction, intelligent surveillance and medical treatment. It analyzes and processes motion image sequences containing people, extracts features and classifies moving objects, so as to recognize and understand individual human actions as well as the interactions between people and the external environment.
In recent years, many action recognition methods based on the human skeleton have been proposed. Their basic principle is to combine key skeletal posture features into action sequences and to distinguish different actions by comparing the probabilities with which different postures appear in an action, or the differences between postures. Compared with earlier silhouette- or contour-based methods, static skeleton modeling improves the recognition rate to some extent, but it does not fully exploit the temporal and spatial characteristics of the skeleton, struggles to distinguish similar actions such as waving a hand and drawing a symbol, and is therefore of limited use in real environments.
Methods that model the skeleton dynamically have also been proposed, in which the action sequence is treated as a spatio-temporal dynamic problem: the motion features of skeletal nodes are extracted, and the recognition result is obtained through feature analysis and classification.
These methods clearly improve recognition accuracy, but because the spatio-temporal characteristics of the skeleton are complex, robust motion features are difficult to obtain, and many researchers are therefore working on effective models for feature extraction. Moreover, if the skeletal data are inaccurate because of occlusion or viewpoint changes, the recognition result is also strongly affected.
Figure skating is a highly athletic and highly watchable sport that has always held an important place in international competition. When figure skating actions are coached and improved, however, the judgment of action accuracy depends heavily on the experience of practitioners, which easily leads to subjective judgments, low efficiency and contradictory scores.
Disclosure of Invention
In order to overcome at least some of the defects of the prior art, the deep learning-based skeleton point action recognition method for figure skaters can improve the recognition of athletes' actions and can score an action according to the defects found in it.
The deep learning-based skeleton point action recognition method for figure skaters of the invention comprises the following steps:
S1, acquiring the action video set to be classified through a client, uploading it to a cache area of the server, and storing it there;
S2, randomly dividing the action video set to be classified in the cache area into a training video set and a testing video set, using them respectively to compute a training track and a testing track of the action, inputting the training track into an action recognition model for action recognition, and then optimizing the action recognition model with the testing track to obtain a deep action recognition model that extracts actions quickly;
S3, acquiring an action video set to be tested through the deep action recognition model, and recognizing the limb actions and tracks of the subjects in it to obtain a recognition result for the test set;
and S4, decomposing and comparing the recognition result of the test set with a standard figure skating action model through a scoring system, scoring according to the degree of match of the tracks, and outputting the scoring result, wherein the test set is a single action to be tested or a set of several actions to be tested.
Further, the method comprises acquiring athletes' daily training videos through a coaching system, uploading them to the deep action recognition model deployed on a cloud server for skeletal point action analysis, and obtaining improvement suggestions from the scores and action defects given by the deep action recognition model. The daily training videos are stored in the coaching system, and a user can access the coaching system through a human-computer interaction interface, so that action replay and/or posture viewing are possible by accessing the daily videos.
Further, the method further comprises filtering out the background and invalid actions in the video set to be detected through a first filtering module disposed at the client, and performing feature extraction, comprising the following steps:
A1, extracting the three-dimensional coordinates of 16 relatively active skeletal joint points in the training video set or testing video set, the 16 joint points being the head, shoulder center, spine, hip center, left shoulder, left elbow, left wrist, right shoulder, right elbow, right wrist, left hip, left knee, left ankle, right hip, right knee and right ankle;
A2, calculating the translation matrix and quaternion rotation of the 16 skeletal joint points: the translation matrix represents the change in position of a skeletal joint point between the current frame and the previous frame; the quaternion rotation represents the change in angle between the current frame and the previous frame; together, the position change and the angle change form the motion features of the skeletal joint point;
A3, forming motion features based on human body parts: dividing the human body into 9 parts and fusing the motion features of the skeletal joint points belonging to each part to form part-based motion features; the 9 parts are the trunk, left upper arm, left lower arm, right upper arm, right lower arm, left upper leg, left lower leg, right upper leg and right lower leg.
Further, the deep action recognition model comprises an ST-GCN skeleton point classification model and a noise reduction encoder, and the ST-GCN skeleton point classification model is constructed as follows:
B1: preprocessing the acquired data in the training track with the noise reduction encoder, removing unsmooth end points and incomplete tracks and separating different action groups from one another, to obtain several segments of smooth training tracks;
B2: establishing an ST-GCN network and an action track fitting unit, and embedding the action track fitting unit behind the ST-GCN network convolution layer to build the overall network;
B3: training the network with the training set and optimizing its parameters to obtain a skeleton behavior recognition network based on action tracks;
B4: inputting the test set into the network obtained in step B3 for prediction, which gives the corresponding action category.
Furthermore, the client is connected to at least three three-dimensional cameras installed around the target to be detected, each of which tracks the target as it moves. The captured video is first cached in the cameras, then time-marked by time period, split according to the time marks, shuffled, and uploaded to the cache area of the server.
Further, the method also comprises constructing a standard figure skating action model, comprising the following steps:
C1, acquiring a standard action video set of figure skaters whose actions are standard;
and C2, taking the standard action video set as a training set, computing a training track of the action, and inputting the training track into the action recognition model for action recognition to obtain a standard figure skating action model, which serves as the scoring reference of the scoring system.
Furthermore, the scoring system comprises a plurality of scoring modules that score the action along different evaluation dimensions, the dimensions being the degree of completion of a single action, the degree of completion of the overall action, the fluency of action transitions, the difficulty of a single action and the difficulty of the overall action.
Further, the method further comprises performing manifold mapping on the training action video set or testing action video set through the noise reduction encoder, specifically as follows: each action in the set is represented as a collection of the motion features of the 9 parts; these features are mapped onto a low-dimensional manifold by a locally linear embedding algorithm, so that each action forms 9 part tracks corresponding to the 9 parts, a part track involved in the action being a curve and an uninvolved one being a point, thereby obtaining the training track and the testing track.
Furthermore, the ST-GCN network is used to predict the track; the action track fitting unit fits the track predicted by the ST-GCN network to the track of the test set obtained through the noise reduction encoder to obtain a fitted track; the predicted track and the fitted track are then differenced to obtain difference data, which is transmitted to the scoring system, and the scoring system scores the action along the different dimensions.
Further, the ST-GCN skeleton point classification model is constructed by the following steps:
D1, inputting a skeleton sequence, normalizing the input matrix and constructing a topological graph structure;
D2, transforming the temporal and spatial dimensions through ST-GCN units that alternately apply GCN and TCN;
D3, classifying the features with average pooling and a fully connected layer, and outputting the action classification result through the improved Softmax.
The invention has the advantages that a deep learning algorithm is introduced to recognize figure skating actions quickly and effectively, which improves the recognition of athletes' actions and allows actions to be scored according to the defects found in them; at the same time, targeted guidance based on the recognition results can strengthen the athletes' training and reduce the sports injuries caused by long-term nonstandard movements.
In order to make the aforementioned and other objects, features and advantages of the invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the embodiments or the prior art descriptions will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a schematic diagram of the steps of the deep learning-based skeleton point action recognition method for figure skaters.
FIG. 2 is a schematic diagram of a construction method of an ST-GCN skeletal point classification model.
FIG. 3 is a schematic diagram of the structure and flow of the ST-GCN cell of FIG. 2.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In a preferred embodiment of the present invention, a deep learning-based skeleton point action recognition method for figure skaters comprises the following steps:
S1, acquiring the action video set to be classified through a client, uploading it to a cache area of the server, and storing it there;
S2, randomly dividing the action video set to be classified in the cache area into a training video set and a testing video set, using them respectively to compute a training track and a testing track of the action, inputting the training track into an action recognition model for action recognition, and then optimizing the action recognition model with the testing track to obtain a deep action recognition model that extracts actions quickly;
S3, acquiring an action video set to be tested through the deep action recognition model, and recognizing the limb actions and tracks of the subjects in it to obtain a recognition result for the test set;
and S4, decomposing and comparing the recognition result of the test set with the standard figure skating action model through a scoring system, scoring according to the degree of match of the tracks, and outputting the scoring result, wherein the test set is a single action to be tested or a set of several actions.
In this embodiment, the method further comprises acquiring athletes' daily training videos through a coaching system, uploading them to the deep action recognition model deployed on a cloud server for skeletal point action analysis, and obtaining improvement suggestions from the scores and action defects given by the model. The daily training videos are stored in the coaching system, and a user can access it through a human-computer interaction interface for action replay and/or posture viewing. In practice, the coaching system can be mounted on the server side or on the user's terminal device. With voice and/or text guidance preset in the coaching system, guidance can be given to the athlete once the deep action recognition model has identified the difference between the athlete's action and the standard action, so that the action is corrected in time and the training effect is improved.
In the above embodiment, the method further includes filtering out the background and invalid actions in the video set to be detected through a first filtering module disposed at the client, and performing feature extraction, comprising the following steps:
A1, extracting the three-dimensional coordinates of 16 relatively active skeletal joint points in the training video set or testing video set, the 16 joint points being the head, shoulder center, spine, hip center, left shoulder, left elbow, left wrist, right shoulder, right elbow, right wrist, left hip, left knee, left ankle, right hip, right knee and right ankle;
A2, calculating the translation matrix and quaternion rotation of the 16 skeletal joint points: the translation matrix represents the change in position of a skeletal joint point between the current frame and the previous frame; the quaternion rotation represents the change in angle between the current frame and the previous frame; together, the position change and the angle change form the motion features of the skeletal joint point;
A3, forming motion features based on human body parts: dividing the human body into 9 parts and fusing the motion features of the skeletal joint points belonging to each part to form part-based motion features; the 9 parts are the trunk, left upper arm, left lower arm, right upper arm, right lower arm, left upper leg, left lower leg, right upper leg and right lower leg.
In the above embodiment, the deep action recognition model includes an ST-GCN skeleton point classification model and a noise reduction encoder, and the ST-GCN skeleton point classification model is constructed as follows:
B1: preprocessing the acquired data in the training track with the noise reduction encoder, removing unsmooth end points and incomplete tracks and separating different action groups from one another, to obtain several segments of smooth training tracks;
B2: establishing an ST-GCN network and an action track fitting unit, and embedding the action track fitting unit behind the ST-GCN network convolution layer to build the overall network;
B3: training the network with the training set and optimizing its parameters to obtain a skeleton behavior recognition network based on action tracks;
B4: inputting the test set into the network obtained in step B3 for prediction, which gives the corresponding action category.
In this embodiment, the client is connected to at least three three-dimensional cameras installed around the target to be detected, each of which tracks the target as it moves. The captured video is first cached in the cameras, then time-marked by time period, split according to the time marks, shuffled, and uploaded to the cache area of the server. In practice, at least 4 three-dimensional cameras are used: one serves as the reference, the positions of the other 3 or more are calibrated against it, and the calibrated cameras track and film the athlete's actual motion track, so that more complete limb action data can be acquired.
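The capture pipeline above (time-mark, split by period, shuffle, upload) can be sketched in a few lines. The function name, the use of frame indices as timestamps, and the fixed seed are illustrative assumptions, not part of the patent:

```python
import random
from itertools import groupby

def split_and_shuffle(frames, period_s, fps, seed=0):
    """Time-mark a cached frame stream by period, split it into
    per-period segments, and shuffle the segments before uploading
    them to the server cache."""
    # time mark: which period (of period_s seconds) each frame falls into
    marked = [(i // (period_s * fps), f) for i, f in enumerate(frames)]
    segments = [[f for _, f in grp]
                for _, grp in groupby(marked, key=lambda m: m[0])]
    random.Random(seed).shuffle(segments)   # out-of-order processing
    return segments

# 12 frames at 4 fps, 1-second periods -> 3 shuffled 4-frame segments
segments = split_and_shuffle(list(range(12)), period_s=1, fps=4)
```

Every frame survives the split, so the server can reassemble the original order from the time marks after upload.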
In the above embodiment, the method further comprises constructing a standard figure skating action model, as follows:
C1, acquiring a standard action video set of figure skaters whose actions are standard;
and C2, taking the standard action video set as a training set, computing a training track of the action, and inputting the training track into the action recognition model for action recognition to obtain a standard figure skating action model, which serves as the scoring reference of the scoring system. In practice, the figure skating action model is optimized through different combinations of the same standard action video set and divided into sub-modules according to the length of each action, so that when the athlete's actions are evaluated by the scoring system, different actions can be evaluated independently and the response time of the evaluation is reduced.
In the above embodiment, the scoring system comprises a plurality of scoring modules that score the action along different evaluation dimensions, the dimensions being the degree of completion of a single action, the degree of completion of the overall action, the fluency of action transitions, the difficulty of a single action and the difficulty of the overall action. In practice, the total score can be obtained by adding the five dimension scores, or multiplicatively by weighting each dimension with its evaluation coefficient.
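A minimal sketch of the two aggregation rules described above (additive total, or multiplicative with per-dimension evaluation coefficients); the dimension names, weights and numeric scores are invented for illustration:

```python
def total_score(scores, weights=None, mode="additive"):
    """Aggregate the five evaluation-dimension scores into a total:
    plain addition, or multiplication with per-dimension coefficients."""
    if mode == "additive":
        return sum(scores.values())
    total = 1.0
    for dim, s in scores.items():
        # a missing coefficient defaults to 1.0 (no reweighting)
        total *= s * (weights or {}).get(dim, 1.0)
    return total

dims = {
    "single_completion": 8.0,
    "overall_completion": 7.5,
    "transition_fluency": 9.0,
    "single_difficulty": 6.0,
    "overall_difficulty": 7.0,
}
add_total = total_score(dims)                                  # 37.5
mult_total = total_score(dims, weights={"single_difficulty": 1.2},
                         mode="multiplicative")
```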
In the above embodiment, the method further includes performing manifold mapping on the training action video set or testing action video set through the noise reduction encoder, specifically as follows: each action in the set is represented as a collection of the motion features of the 9 parts; these features are mapped onto a low-dimensional manifold by a locally linear embedding algorithm, so that each action forms 9 part tracks corresponding to the 9 parts, a part track involved in the action being a curve and an uninvolved one being a point, thereby obtaining the training track and the testing track.
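As a dependency-free sketch of turning per-part motion features into low-dimensional part tracks: PCA via SVD stands in here for the locally linear embedding named in the text, and the synthetic "involved" and "uninvolved" parts are assumptions. As described, an involved part yields a spread-out curve while an uninvolved part collapses to a point:

```python
import numpy as np

def part_trajectory(features, dim=2):
    """Project one body part's per-frame motion features (T, D) onto a
    low-dimensional track. PCA stands in for locally linear embedding
    to keep the sketch dependency-free."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:dim].T                 # (T, dim) part track

frames = 50
t = np.linspace(0, 2 * np.pi, frames)
moving_part = np.stack([np.cos(t), np.sin(t), 0.1 * t], axis=1)  # involved part
static_part = np.zeros((frames, 3))                              # uninvolved part

curve = part_trajectory(moving_part)   # spreads out: a curve
point = part_trajectory(static_part)   # collapses: a single point
```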
In the above embodiment, the ST-GCN network is configured to predict the track; the action track fitting unit fits the track predicted by the ST-GCN network to the track of the test set obtained through the noise reduction encoder to obtain a fitted track; the predicted track and the fitted track are then differenced to obtain difference data, which is transmitted to the scoring system, and the scoring system scores the action along the different dimensions.
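A toy sketch of the fit-then-difference step feeding the scoring system. Averaging the predicted and observed tracks is an assumed fitting rule (the patent does not specify one), and all names and numbers are illustrative:

```python
import numpy as np

def fit_and_diff(predicted, observed):
    """Fit the ST-GCN-predicted track to the test-set track obtained
    through the noise reduction encoder (here: simple averaging), then
    difference the predicted track against the fitted one to obtain
    the deviation data handed to the scoring system."""
    fitted = 0.5 * (predicted + observed)
    difference = predicted - fitted       # equals (predicted - observed) / 2
    return fitted, difference

pred = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])  # predicted track
obs = np.array([[0.0, 0.0], [1.0, 0.8], [2.0, 1.6]])   # observed test track
fitted, diff = fit_and_diff(pred, obs)
deviation = float(np.abs(diff).sum())   # scalar deviation fed to scoring
```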
Referring to FIG. 2, in practical implementation the ST-GCN skeleton point classification model is constructed as follows: a skeleton sequence is input, the input matrix is normalized and a topological graph structure is constructed; ST-GCN units transform the temporal and spatial dimensions by alternately applying GCN and TCN; the features are classified with average pooling and a fully connected layer, and the action classification result is output through the improved Softmax. In actual use, 9 ST-GCN units are used, numbered 1 to 9, and the strides of the 4th and 7th temporal convolution layers are 2; the input and output of each ST-GCN unit are shown in FIG. 3.
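The spatial-then-temporal transform performed by one ST-GCN unit can be sketched with NumPy on a toy 5-joint chain. The moving-average "TCN", the identity weight matrix and the graph itself are illustrative stand-ins for the learned layers:

```python
import numpy as np

def normalized_adjacency(A):
    """Symmetrically normalized skeleton adjacency with self-loops:
    D^(-1/2) (A + I) D^(-1/2)."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    return d_inv_sqrt @ A_hat @ d_inv_sqrt

def st_gcn_unit(X, A_norm, W, t_kernel=3, stride=1):
    """One ST-GCN-style unit on features X of shape (T, V, C): a spatial
    graph convolution followed by a temporal moving average standing in
    for the learned temporal convolution (TCN)."""
    # spatial step: aggregate each joint's neighbours on the skeleton graph
    S = np.einsum("uv,tvc,cd->tud", A_norm, X, W)
    frames = S.shape[0]
    # temporal step: slide a t_kernel-frame window along time with `stride`
    return np.stack([S[i:i + t_kernel].mean(axis=0)
                     for i in range(0, frames - t_kernel + 1, stride)])

V, C = 5, 4                                    # toy skeleton: 5 joints, 4 channels
A = np.zeros((V, V))
for u, v in [(0, 1), (1, 2), (2, 3), (3, 4)]:  # a simple chain of bones
    A[u, v] = A[v, u] = 1.0
X = np.random.default_rng(0).normal(size=(8, V, C))
out = st_gcn_unit(X, normalized_adjacency(A), np.eye(C), stride=2)
```

With 8 input frames, a 3-frame kernel and stride 2, the temporal dimension shrinks from 8 to 3, mirroring how the stride-2 layers of the real network downsample time.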
An improved Dropout is adopted in constructing the ST-GCN skeleton point classification model; the Huber loss function is selected, accuracy is measured by top-1 and top-5, and stochastic gradient descent with momentum is used as the optimizer. The weights are initialized, the data, model and optimizer are loaded, and end-to-end training is performed. In practice, to prevent outliers from having a large influence on the result and to improve the robustness of the model, the Huber loss function is used; the corrected formula is as follows:
$$L_{\delta}\left(v_{t+1},\hat{v}_{t+1}\right)=\begin{cases}\dfrac{1}{2}\left(v_{t+1}-\hat{v}_{t+1}\right)^{2}, & \left|v_{t+1}-\hat{v}_{t+1}\right|\le\delta\\[4pt]\delta\left|v_{t+1}-\hat{v}_{t+1}\right|-\dfrac{1}{2}\delta^{2}, & \text{otherwise}\end{cases}$$
wherein $v_{t+1}$ is the actual value, $\hat{v}_{t+1}$ is the value predicted by the model, $\delta$ is the threshold, and $L_{\delta}$ is the Huber loss function value.
when the MSE is used as a loss function, the model is often forced to fit singular point data because the loss function value needs to be reduced, so that the prediction result is influenced; huber Loss is a parameterized piecewise Loss function used in the regression problem, which has the advantage of enhancing the robustness of the mean square error Loss function (MSE) to outliers. Given a delta, it takes the squared error when the prediction bias is less than delta and reduces Loss when the prediction bias is greater than delta, using a linear function. The method can reduce the weight of singular data points to the Loss calculation, avoids linear regression of model overfitting compared with least square, and reduces the punishment degree of outliers by HuberLoss.
All channels in ST-GCN share one adjacency matrix, which means all channels share the same aggregation kernel; this is called coupled aggregation. In a convolutional neural network, by contrast, the convolution kernel parameters of each channel differ, which guarantees the diversity of the extracted features. Different groups of channels are therefore processed with different adjacency matrices; like convolution kernels, these adjacency matrices are trainable, which greatly increases their diversity. When n = C, every channel has its own spatial aggregation kernel, producing a large number of redundant parameters; when n = 1, the operation degenerates to the coupled graph convolution. The corrected graph convolution formula is as follows:
$$
X' = \left[\,\tilde A^{(1)}\,(XW)_{:,\;0:\lfloor C/n\rfloor}\;\Big\|\;\tilde A^{(2)}\,(XW)_{:,\;\lfloor C/n\rfloor:\lfloor 2C/n\rfloor}\;\Big\|\;\cdots\;\Big\|\;\tilde A^{(n)}\,(XW)_{:,\;\lfloor (n-1)C/n\rfloor:C}\,\right]
$$

wherein $X'$ is the new feature map information obtained by calculation; $C$ is the number of channels of the original feature map, and $n$ is the number of groups into which the original feature map is divided by channel; $\lfloor\cdot\rfloor$ means rounding down, and "," separates the two ends of a ":" slice. $\tilde A$ is the decoupled adjacency matrix, and $\tilde A^{(g)}$ is the partial adjacency matrix obtained by slicing it, corresponding to the selected $g$-th group of channels. $XW$ calculates the information between different channels, i.e. the channel correlation, where $X$ is the feature map and $W$ is the weight of each key point in the channel, i.e. the trainable convolution kernel. $(XW)_{:,\,0:\lfloor C/n\rfloor}$ represents the 1st group of channels, in which the first ":" selects all the joint information in each channel and the second index gives the channel grouping, i.e. the 1st to the $\lfloor C/n\rfloor$-th channel (the same below); $(XW)_{:,\,\lfloor C/n\rfloor:\lfloor 2C/n\rfloor}$ represents the 2nd group of channels, from the $(\lfloor C/n\rfloor+1)$-th to the $\lfloor 2C/n\rfloor$-th channel; and $(XW)_{:,\,\lfloor (n-1)C/n\rfloor:C}$ represents the $n$-th group, from the $(\lfloor (n-1)C/n\rfloor+1)$-th channel to the last channel. $\tilde A$ and the slices are expressed using python notation; $\|$ indicates per-channel concatenation, $C$ being the number of image channels.
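A minimal sketch of this decoupled graph convolution (illustrative function and variable names, not the patented implementation): the channels are split into n groups and each group is aggregated with its own trainable adjacency matrix, then the groups are concatenated per channel.

```python
import numpy as np

def decoupled_gcn(X, W, A_tilde, n_groups):
    """Decoupled graph convolution sketch.

    X       : (V, C_in) node features (V joints, C_in channels)
    W       : (C_in, C_out) trainable point-wise weights ("channel correlation")
    A_tilde : (n_groups, V, V) one trainable adjacency matrix per channel group
    Returns : (V, C_out) new feature map
    """
    XW = X @ W                          # channel correlation, shared by all groups
    C = XW.shape[1]
    outs = []
    for g in range(n_groups):
        lo = (g * C) // n_groups        # floor division, python slice notation
        hi = ((g + 1) * C) // n_groups
        outs.append(A_tilde[g] @ XW[:, lo:hi])   # group-specific aggregation
    return np.concatenate(outs, axis=1)          # per-channel concatenation "||"

V, C_in, C_out, n = 16, 8, 8, 4
rng = np.random.default_rng(0)
X = rng.standard_normal((V, C_in))
W = rng.standard_normal((C_in, C_out))
A = rng.standard_normal((n, V, V))
Y = decoupled_gcn(X, W, A, n)
```

With `n_groups = 1` the function reduces to the coupled case `A @ X @ W`, matching the degenerate case described above.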
Meanwhile, because the graph neural network is adapted to a non-Euclidean space structure, the features of a node and its neighbour nodes are mixed during graph convolution, so overfitting cannot be avoided by deleting only one node. The Dropout mechanism is therefore changed to enhance the regularization result: when a node is deleted, part of the nodes around it are deleted at the same time. Two parameters are introduced for this, namely the node drop probability $\gamma$ and the neighbour-node range $k$ of the dropped node, the surrounding nodes being dropped with the same probability; here $k = 1$ is used directly. The hyper-parameters passed in are the probability $p_{\text{keep}}$ that a node is retained and the node drop probability $1 - p_{\text{keep}}$. Assuming the average degree of each node is $2e/n$, where $n$ represents the total number of nodes and $e$ the total number of edges, the average number of nodes discarded together with one dropped node is $1 + 2e/n$ (the node itself plus its direct neighbours). Requiring the overall expected drop rate $\gamma' \cdot (1 + 2e/n)$, where $\cdot$ denotes a multiplication operation, to equal $1 - p_{\text{keep}}$, the probability of each node being dropped can be calculated as:

$$
\gamma' = \frac{1 - p_{\text{keep}}}{1 + 2e/n}
$$
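The scaling above can be sketched in a few lines (illustrative names; the patent does not give code). The seed-drop probability is reduced by the expected size of the dropped neighbourhood so that the overall drop rate still equals $1 - p_{\text{keep}}$:

```python
def dropgraph_seed_probability(p_keep, n_nodes, n_edges):
    """With k = 1, dropping a seed node also drops its direct neighbours,
    so on average 1 + 2e/n nodes disappear per seed.  The seed probability
    is scaled down so the overall drop rate still equals 1 - p_keep."""
    avg_degree = 2 * n_edges / n_nodes      # average degree = 2e/n
    return (1 - p_keep) / (1 + avg_degree)

# A 16-joint skeleton graph is a tree with 15 edges; keep 90% of nodes overall.
gamma = dropgraph_seed_probability(p_keep=0.9, n_nodes=16, n_edges=15)
```

Here each node is chosen as a seed with probability of roughly 0.035 rather than 0.1, because each seed takes about 1.9 neighbours down with it on average.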
When classifying the features, average pooling is adopted; after the average pooling the weights are L2-normalized and the bias is set to 0, and an angular margin coefficient m and a cosine margin distance are introduced. Because $\cos(\theta + m)$ is inconvenient to differentiate during back-propagation, while the derivative of $\cos\theta - m$ does not change during back-propagation, the Softmax function is changed to:

$$
L = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{e^{\,s(\cos\theta_{y_i}-m)}}{e^{\,s(\cos\theta_{y_i}-m)}+\sum_{j\neq y_i}e^{\,s\cos\theta_j}}
$$

where $s$ is a scale factor that scales the cosine values; $s$ needs to be set relatively large, and 25 is used here to speed up and stabilize the optimization. $y_i$ indicates the class to which sample $i$ belongs, and $\theta_{y_i}$ is the corresponding target angle of that class in sample space; $\cos\theta_{y_i} - m$ is the cosine margin distance. $N$ is the data volume of the training samples, $e$ is the natural constant, and $c$ represents the number of outputs, i.e. the number of classes, of the neural network. $j$ ranges over all classification categories other than $y_i$, and $\theta_j$ is the angle in sample space corresponding to class $j$.
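A minimal sketch of this cosine-margin softmax loss (illustrative names; the margin value m = 0.35 is an assumption, since the patent does not fix it). The margin is subtracted only from the target-class cosine before scaling, exactly as in the modified formula:

```python
import numpy as np

def margin_softmax_loss(cos_theta, labels, s=25.0, m=0.35):
    """Cosine-margin softmax loss sketch.

    cos_theta : (N, c) cosine similarity between each sample's L2-normalised
                feature and each of the c class weight vectors
    labels    : (N,) index of the class each sample belongs to
    """
    N = cos_theta.shape[0]
    logits = s * cos_theta.copy()
    # subtract the margin on the target class only: s * (cos(theta_y) - m)
    logits[np.arange(N), labels] = s * (cos_theta[np.arange(N), labels] - m)
    # numerically stable log-softmax
    logits -= logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(N), labels].mean()

cos_theta = np.array([[0.9, 0.1, -0.2],
                      [0.2, 0.8, 0.0]])
labels = np.array([0, 1])
loss = margin_softmax_loss(cos_theta, labels)
```

Because the margin shrinks the target logit, the loss with m > 0 is always at least as large as plain softmax, which is what forces a wider angular separation between classes.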
The principles and implementation of the present invention have been explained herein using specific embodiments; the description of the embodiments is only intended to help understand the method and core idea of the invention. Meanwhile, for those skilled in the art, there may be variations in the specific embodiments and the scope of application according to the idea of the invention. In summary, the content of this specification should not be construed as limiting the invention.

Claims (7)

1. A deep learning-based method for recognizing skeletal point actions of pattern skaters is characterized by comprising the following steps of:
s1, acquiring the action video set to be classified through the client, uploading the action video set to a cache area of the server, and storing the action video set to the cache area of the server;
s2, randomly dividing the action video set to be classified in the cache area into a training video set and a testing video set, respectively using the training video set and the testing video set to calculate a training track and a testing track of an action, inputting the training track into an action recognition model for action recognition, and then optimizing the action recognition model through the testing track to obtain a deep action recognition model for quickly extracting the action;
s3, acquiring a motion video set to be tested through the depth motion recognition model, and recognizing the limb motion and the track of an object in the motion video set to be tested to obtain a recognition result of the test set;
s4, decomposing and comparing the recognition result of the test set with a standard fancy skating motion model through a grading system, grading according to the matching degree of the track, and simultaneously outputting a grading result, wherein the test set is a single motion to be tested or a set of a plurality of motions to be tested;
the method comprises the steps that a coach system is used for obtaining daily training videos of athletes, the obtained daily training videos are uploaded to a depth action recognition model arranged on a cloud server for bone point action analysis, then improved opinions are obtained according to scores of the actions of the athletes and action defects given by the depth action recognition model, the daily training videos are stored in the coach system, a user can access the coach system through a human-computer interaction interface, and action reply and/or posture check can be carried out through accessing the daily videos;
the depth motion recognition model comprises an ST-GCN skeleton point classification model and a noise reduction encoder, and the construction method of the ST-GCN skeleton point classification model comprises the following steps:
b1: preprocessing the acquired data of the training video set and the test video set by a noise reduction encoder to remove unsmooth end points and incomplete tracks in a training track, and mutually separating different action groups to acquire a plurality of sections of smooth training tracks;
b2: establishing an ST-GCN network and an action track fitting unit, and embedding the action track fitting unit into the back of an ST-GCN network convolution layer to build an overall network;
b3: training an ST-GCN network by using a training set, optimizing parameters and obtaining a skeleton behavior recognition network based on an action track;
b4: inputting the test set into the network obtained in the step B3 for prediction, and giving out a corresponding action type;
the ST-GCN network is used for predicting tracks; the action track fitting unit is used for fitting the track predicted by the ST-GCN network with the track of the test set to be tested obtained through the noise reduction encoder to obtain a fitted track; the predicted track and the fitted track are then subjected to difference processing to obtain difference data, the obtained difference data are transmitted to the scoring system, and the actions are scored by the scoring system according to different dimensions.
2. The method for recognizing skeletal points of pattern skaters based on deep learning of claim 1, further comprising filtering out background and invalid actions in a video set to be tested by a noise reduction encoder arranged at a server end, and performing feature extraction, comprising the following steps:
a1, extracting three-dimensional coordinates of 16 relatively active bone joint points in a training video set or a test video set, wherein the 16 bone joint points are respectively head, middle shoulder, spine, middle hip, left shoulder, left elbow, left wrist, right shoulder, right elbow, right wrist, left hip, left knee, left ankle, right hip, right knee and right ankle;
a2, calculating the translation matrix and quaternion rotation of 16 bone joint points: the translation matrix represents the position change of the current frame and the previous frame of the skeletal joint point; the quaternion rotation represents the angle change of the current frame and the previous frame of the skeleton joint point, and the position change and the angle change of the current frame and the previous frame of the skeleton joint point form the motion characteristics of the skeleton joint point;
a2, forming motion characteristics based on human body parts: dividing a human body into 9 parts, and fusing the motion characteristics of skeletal joint points related to the 9 parts respectively to form motion characteristics based on the human body parts; the 9 parts of the human body are a trunk, a left upper arm, a left lower arm, a right upper arm, a right lower arm, a left upper leg, a left lower leg, a right upper leg and a right lower leg respectively.
3. The method for recognizing the bone point actions of pattern skating players based on deep learning as claimed in claim 1, wherein the client is connected with at least three three-dimensional cameras installed around the target to be detected, and the positions of the three-dimensional cameras can track the target to be detected; the shot videos are first cached in the three-dimensional cameras, time-stamped according to time periods, split according to the time stamps, then shuffled and uploaded to the cache area of the server.
4. The deep learning-based pattern skating player skeletal point action recognition method according to claim 1, further comprising the construction of a standard fancy skating action model, comprising the steps of:
c1, acquiring a standard action video set of the pattern skater with standard action;
and C2, calculating a training track of the action by taking the standard action video set as a training set, inputting the training track into the action recognition model for action recognition to obtain a standard fancy skating action model, and taking the obtained standard fancy skating action model as a grading reference of the grading system.
5. The deep learning-based skeletal point motion recognition method for figure skating players as claimed in claim 1, wherein the scoring system comprises a plurality of scoring modules for scoring the performances of different evaluation dimensions of the motion, and the evaluation dimensions comprise the completion degree of a single motion, the completion degree of an overall motion, the fluency of motion engagement, the difficulty of a single motion and the difficulty of an overall motion.
6. The method for recognizing skeletal points of a pattern skater based on deep learning according to claim 2, further comprising performing manifold mapping on a training motion video set or a test motion video set by the noise reduction encoder, specifically comprising the following steps: and each action in the training action video set or the test action video set is represented as a set based on the motion characteristics of the 9 parts, the motion characteristics of the 9 parts in each action in the training action video set or the test action video set are mapped onto a low-dimensional manifold through a local linear embedding algorithm, each action forms 9 parts of track corresponding to the 9 parts, the part track related to the action is a curve, and the part track unrelated to the action is a point, so that the training track and the test track are obtained.
7. The deep learning-based pattern skating player skeletal point action recognition method as claimed in claim 1, wherein the ST-GCN skeletal point classification model construction method comprises the steps of:
d1, inputting a skeleton sequence, normalizing the input matrix and constructing a topological graph structure;
d2, transforming the time and space dimensions by ST-GCN unit alternately using GCN and TCN;
d3, classifying the features by using an average pooling and full connection layer, and then outputting the action classification result through the improved Softmax.
CN202210721105.0A 2022-06-24 2022-06-24 Deep learning-based skeleton point action recognition method for pattern skating players Active CN114821812B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210721105.0A CN114821812B (en) 2022-06-24 2022-06-24 Deep learning-based skeleton point action recognition method for pattern skating players


Publications (2)

Publication Number Publication Date
CN114821812A CN114821812A (en) 2022-07-29
CN114821812B true CN114821812B (en) 2022-09-13

Family

ID=82520787

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210721105.0A Active CN114821812B (en) 2022-06-24 2022-06-24 Deep learning-based skeleton point action recognition method for pattern skating players

Country Status (1)

Country Link
CN (1) CN114821812B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106384093A (en) * 2016-09-13 2017-02-08 东北电力大学 Human action recognition method based on noise reduction automatic encoder and particle filter
CN109063568A (en) * 2018-07-04 2018-12-21 复旦大学 A method of the figure skating video auto-scoring based on deep learning
CN109948459A (en) * 2019-02-25 2019-06-28 广东工业大学 A kind of football movement appraisal procedure and system based on deep learning
CN111784121A (en) * 2020-06-12 2020-10-16 清华大学 Action quality evaluation method based on uncertainty score distribution learning
CN112529941A (en) * 2020-12-17 2021-03-19 深圳市普汇智联科技有限公司 Multi-target tracking method and system based on depth trajectory prediction
CN112734808A (en) * 2021-01-19 2021-04-30 清华大学 Trajectory prediction method for vulnerable road users in vehicle driving environment
WO2022032652A1 (en) * 2020-08-14 2022-02-17 Intel Corporation Method and system of image processing for action classification

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3652642B1 (en) * 2017-07-11 2022-06-22 Telefonaktiebolaget LM Ericsson (Publ) Methods and arrangements for robot device control in a cloud
WO2021178914A1 (en) * 2020-03-06 2021-09-10 Northwell Health, Inc. System and method for determining user intention from limb or body motion or trajectory to control neuromuscular stimuation or prosthetic device operation
CN112634325B (en) * 2020-12-10 2022-09-09 重庆邮电大学 Unmanned aerial vehicle video multi-target tracking method
CN113568410B (en) * 2021-07-29 2023-05-12 西安交通大学 Heterogeneous intelligent body track prediction method, system, equipment and medium


Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
Learning to Score Figure Skating Sport Videos;Chengming Xu等;《IEEE Transactions on Circuits and Systems for Video Technology》;20190705;第30卷(第12期);第4578-4590页 *
Skeleton Based Action Quality Assessment of Figure Skating Videos;Hui-Ying Li 等;《2021 11th International Conference on Information Technology in Medicine and Education (ITME)》;20220415;第196-200页 *
Skeleton-Based Action Recognition with Dense Spatial Temporal Graph Network;Lin Feng等;《International Conference on Neural Information Processing, ICONIP 2020》;20201117;第188-194页 *
Skeleton-point-feature-based recognition of human skating motion; Jiang Dong; China Master's Theses Full-text Database, Social Sciences II; 20220115 (No. 01, 2022); pp. H134-39 *


Similar Documents

Publication Publication Date Title
CN106778854B (en) Behavior identification method based on trajectory and convolutional neural network feature extraction
CN105512289B (en) Image search method based on deep learning and Hash
CN115661943B (en) Fall detection method based on lightweight attitude assessment network
Bu Human motion gesture recognition algorithm in video based on convolutional neural features of training images
CN108875586B (en) Functional limb rehabilitation training detection method based on depth image and skeleton data multi-feature fusion
Liu Objects detection toward complicated high remote basketball sports by leveraging deep CNN architecture
CN110490109A (en) A kind of online human body recovery action identification method based on monocular vision
WO2022183805A1 (en) Video classification method, apparatus, and device
Liu et al. Viewpoint invariant action recognition using rgb-d videos
Wang et al. Basketball shooting angle calculation and analysis by deeply-learned vision model
Zhang et al. A Gaussian mixture based hidden Markov model for motion recognition with 3D vision device
CN116844084A (en) Sports motion analysis and correction method and system integrating blockchain
CN114821812B (en) Deep learning-based skeleton point action recognition method for pattern skating players
CN115546491B (en) Fall alarm method, system, electronic equipment and storage medium
Pan et al. Analysis and Improvement of Tennis Motion Recognition Algorithm Based on Human Body Sensor Network
Tsai et al. Temporal-variation skeleton point correction algorithm for improved accuracy of human action recognition
CN114299279A (en) Unmarked group rhesus monkey motion amount estimation method based on face detection and recognition
Liu et al. The visual movement analysis of physical education teaching considering the generalized hough transform model
CN111274908A (en) Human body action recognition method
CN111209433A (en) Video classification algorithm based on feature enhancement
Skublewska-Paszkowska et al. Attention Temporal Graph Convolutional Network for Tennis Groundstrokes Phases Classification
Nayak et al. Learning a sparse dictionary of video structure for activity modeling
Zhang et al. Application of optimized BP neural network based on genetic algorithm in rugby tackle action recognition
CN110705367A (en) Human body balance ability classification method based on three-dimensional convolutional neural network
He et al. Recognition and prediction of badminton attitude based on video image analysis

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant