CN117253290B - Rope skipping counting implementation method and device based on yolopose model and storage medium

Rope skipping counting implementation method and device based on yolopose model and storage medium

Info

Publication number
CN117253290B
CN117253290B (application CN202311331557.9A)
Authority
CN
China
Prior art keywords
rope skipping
counting
key points
human body
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311331557.9A
Other languages
Chinese (zh)
Other versions
CN117253290A (en)
Inventor
韩宇娇
倪非非
张波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Scenery Wisdom Beijing Information Technology Co ltd
Original Assignee
Scenery Wisdom Beijing Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Scenery Wisdom Beijing Information Technology Co ltd filed Critical Scenery Wisdom Beijing Information Technology Co ltd
Priority to CN202311331557.9A priority Critical patent/CN117253290B/en
Publication of CN117253290A publication Critical patent/CN117253290A/en
Application granted granted Critical
Publication of CN117253290B publication Critical patent/CN117253290B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/0464: Convolutional networks [CNN, ConvNet]
    • G06N 3/08: Learning methods
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/40: Extraction of image or video features
    • G06V 10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; connectivity analysis, e.g. of connected components
    • G06V 10/70: Recognition or understanding using pattern recognition or machine learning
    • G06V 10/764: Recognition using classification, e.g. of video objects
    • G06V 10/77: Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA], independent component analysis [ICA] or self-organising maps [SOM]; blind source separation
    • G06V 10/80: Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V 10/806: Fusion of extracted features
    • G06V 10/82: Recognition using neural networks
    • G06V 40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/20: Movements or behaviour, e.g. gesture recognition
    • G06V 40/23: Recognition of whole body movements, e.g. for sport training

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Human Computer Interaction (AREA)
  • Social Psychology (AREA)
  • Psychiatry (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a rope skipping counting implementation method, device and storage medium based on the yolopose model, in the technical field of AI-assisted fitness, comprising the following steps: acquiring video data in real time through a video capture device, performing human pose estimation on each frame with a yolopose algorithm and storing the results; determining each subject's position from the human keypoint information; constructing a motion feature vector from the human keypoint features; and classifying the constructed motion features to judge whether a rope skipping action occurred before counting. The counting method exploits the strengths of the yolopose family of models to detect multiple people simultaneously while greatly reducing computational cost, and is the first to count rope skips with a feature-engineering approach, which improves counting robustness, ensures counting accuracy, and overcomes the inaccuracy and heavy labor of traditional manual counting.

Description

Rope skipping counting implementation method and device based on yolopose model and storage medium
Technical Field
The application relates to the technical field of AI sports, in particular to a rope skipping counting implementation method and device based on the yolopose model, and a storage medium.
Background
With the continuous maturation of artificial intelligence and big-data analysis technology, "AI + X" has become a trend, and AI + sports in particular is developing rapidly. Simple and efficient human keypoint algorithms such as openpose continue to improve, and such algorithms can greatly help realize AI + sports.
In the rope skipping field, rope skipping is a fast cyclic motion. Traditional manual counting wastes personnel resources and is error-prone, leading to inaccurate counts. Counting with current AI technology can significantly reduce manpower while achieving higher accuracy.
There are two common AI deployment modes: one based on cloud computing and one based on the edge. Cloud computing has the drawback of requiring a network to transmit video, and information lost in transmission can cause over- or under-counting. Edge deployment avoids network delay, but edge devices have limited computing power, so selecting an efficient algorithm is an important step in realizing rope skipping counting.
Current AI rope skipping counters rely on a single feature, such as height, and often use single-person pose estimation algorithms, which makes multi-person rope skipping at the edge difficult. A single feature lowers counting robustness: a person walking back and forth or jitter in skeleton points causes miscounts, and even with normalization it is hard to set a threshold that works for people at different positions in the frame; these limitations make multi-person rope skipping hard to realize. For example: patent CN116563922A, an artificial-intelligence-based automatic rope skipping counting method, derives height features from human keypoint information and counts using shoulder height; patent CN115471906A discloses a multi-mode rope skipping recognition and counting method that films the person with two cameras, counts jumps by height, and corrects the count by combining rope recognition with timing, improving accuracy; patent CN116311523A, a fancy rope skipping recognition algorithm based on image recognition, dynamically selects key points, counts from smoothed and normalized keypoint height information, and judges fancy actions; patent CN116416551A discloses a tracking-based adaptive multi-person rope skipping counting method for video that uses a two-stage algorithm, first target detection and then human pose estimation, and counts by height features alone. The last realizes multi-person counting, but pose estimation runs n times for n people, consuming substantial computing resources, and the counting feature remains single.
Disclosure of Invention
The application provides a rope skipping counting implementation method, device and storage medium based on the yolopose model, which solves the counting problems of the prior art, successfully realizes multi-person rope skipping at the edge, avoids interference from other body motions, and improves counting accuracy.
In a first aspect, a rope skipping counting implementation method based on the yolopose model includes:
collecting single-person rope skipping video data, generating rope skipping pictures, annotating key points on the pictures with labelme, and training a pre-established yolopose model on the keypoint-annotated picture data;
acquiring video data of multiple people skipping rope through a camera, and dividing the video data by person region; wherein each divided video stream is single-person rope skipping video data;
for each divided single-person video stream, obtaining the human-body keypoint features corresponding to each region from the trained yolopose model; wherein the human key points are the leg key points;
determining the motion state of the rope skip in each video frame from the obtained keypoint features, and constructing a corresponding feature vector from the motion states of all video frames;
and determining the number of rope skips of the person in each region according to the feature vector corresponding to that region.
Optionally, in training the pre-established yolopose model on the keypoint-annotated rope skipping picture data, leg keypoint recognition is improved by modifying the keypoint weights in the OKS term of the loss function;
the OKS in the loss function is specifically:
OKS = [ Σ_{n=1..N_kpts} exp(−d_n² / (2·s²·k_n²)) · δ(v_n > 0) ] / [ Σ_{n=1..N_kpts} δ(v_n > 0) ]
where k_n is the per-keypoint weight to be adjusted; d_n is the Euclidean distance between the predicted point and the ground-truth point; δ(v_n > 0) is the visibility indicator of the key point, equal to 1 when the condition v_n > 0 holds and 0 otherwise; v_n is the visibility flag (0 = unlabeled, 1 = labeled and not occluded, 2 = labeled but occluded); and s is a scale factor whose value is the square root of the detection-box area.
Optionally, the motion state of the rope skip in each video frame is determined; the motion states specifically include take-off, hover, drop, two-foot landing, single-foot landing, and other, where each state is defined by the displacement and speed of the action across adjacent frames, specifically:
take-off is defined as: the y value of the human-body feature key points decreases across adjacent frames and the upward speed is large, where the y value represents vertical position in the image frame;
hover is defined as: the y-value difference of the key points across adjacent frames is essentially unchanged and the speed is below a set speed threshold;
drop is defined as: the y value of the key points increases across adjacent frames and the downward speed is large;
two-foot landing is defined as: the y and x values of the key points are unchanged across adjacent frames, the speed is below a set speed threshold, and the height difference between the y coordinates of the two ankles is less than a height threshold; the x value represents horizontal position in the image frame;
single-foot landing is defined as: the y and x values of the key points are unchanged across adjacent frames, the speed is below a set speed threshold, and the height difference between the y coordinates of the two ankles is greater than the height threshold;
other is defined as: any state other than the above.
Optionally, constructing the corresponding feature vector from the motion states of all video frames includes:
mapping the motion state in each video frame to a corresponding number;
arranging the numbers mapped from the motion states of all video frames in time order to obtain the state sequence of the video;
and determining the count of each state in the state sequence and taking each count as a feature value, thereby obtaining the feature vector over all video frames.
Optionally, after obtaining the feature vector over all video frames, the method further comprises:
setting corresponding weights for the different feature values in the feature vector.
Optionally, determining the number of rope skips of the person in each region according to the feature vector of that region includes counting by similarity:
when counting by similarity, the cosine similarity is computed between the current feature vector and the mean of the feature vectors of historically successful counts;
and when the result is greater than a preset threshold, a rope skip is counted.
Optionally, determining the number of rope skips of the person in each region according to the feature vector of that region includes counting with a classifier:
when counting with a classifier, a counting classifier is constructed; the counting classifier may include a decision tree and a support vector machine;
binary classification is performed on the input feature vectors over all video frames, and rope skips are counted according to the classification result.
In a second aspect, a rope skipping counting implementation device based on the yolopose model includes:
a training module for collecting single-person rope skipping video data, generating rope skipping pictures, annotating key points on the pictures with labelme, and training a pre-established yolopose model on the keypoint-annotated picture data;
an acquisition module for acquiring video data of multiple people skipping rope through a camera and dividing the video data by person region; wherein each divided video stream is single-person rope skipping video data;
a first processing module for obtaining, for each divided single-person video stream, the human-body keypoint features corresponding to each region from the trained yolopose model; wherein the human key points are the leg key points;
a second processing module for determining the motion state of the rope skip in each video frame from the obtained keypoint features and constructing a corresponding feature vector from the motion states of all video frames;
and a determining module for determining the number of rope skips of the person in each region according to the feature vector corresponding to that region.
In a third aspect, there is provided an electronic device comprising a memory and a processor, the memory storing a computer program, the processor implementing the rope skipping count implementing method of any of the first aspects described above when executing the computer program.
In a fourth aspect, a computer readable storage medium is provided, on which a computer program is stored, which when executed by a processor implements the rope skipping count implementation method of any of the first aspects above.
Compared with the prior art, the application has at least the following beneficial effects:
Through the yolopose family of algorithms, the application realizes target detection and human pose estimation for multiple people simultaneously, greatly reducing the computing resources required for edge computing.
The application is the first to propose a rope skipping feature vector and its construction method, which better integrates the motion characteristics of rope skipping and has stronger feature expressiveness.
The application is the first to propose counting by cosine similarity or by a classifier, which largely eliminates interference from redundant information in rope skipping counting and enables asynchronous multi-person rope skipping with people standing at different positions.
The multi-person rope skipping method based on the yolopose algorithm successfully realizes multi-person rope skipping and improves counting accuracy.
Drawings
FIG. 1 is a flow chart of the steps of the multi-person rope skipping counting method according to the application;
FIG. 2 is a schematic flow chart of a multi-person rope skipping counting method according to an embodiment of the present application;
FIG. 3 is a schematic diagram of an actual multi-person rope skipping scene according to one embodiment of the application;
FIG. 4 is a schematic diagram showing characteristics of an abnormal state and a normal state of rope skipping counting according to an embodiment of the present application;
Fig. 5 is a block diagram of a module architecture of a rope skipping count implementation device according to an embodiment of the present application;
fig. 6 is an internal structural diagram of a computer device in one embodiment.
Detailed Description
The present application will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present application more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
In the description of the present application: the expressions "comprising", "including", "having", etc. also mean "not limited to" (certain units, components, materials, steps, etc.).
In one embodiment, as shown in fig. 1, a rope skipping counting implementation method based on yolopose model is provided, which comprises the following steps:
step one, collecting single rope skipping video data, generating rope skipping pictures, marking key points of the rope skipping pictures by utilizing lableme, and training a pre-established yolopose model through the rope skipping picture data marked by the key points.
In this step, the yolopose model is pre-trained first. Here, yolopose refers to the pose variants of the yolo-series target detection models (yolov5, yolox, yolov7 and part of the yolo family) used for human pose estimation. A pre-trained model based on the COCO dataset is available, but rope skipping data can produce cases it fails to recognize, so a COCO-style 17-keypoint dataset is built with labelme to train the model. The training process is specifically as follows:
Collect single-person or multi-person rope skipping video data and generate pictures, covering a variety of indoor and outdoor environments;
then perform keypoint and bbox annotation with labelme;
process the labelme-annotated data into the yolo-series data format required for model training (a conversion sketch follows).
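The following is a minimal, hedged sketch of that conversion, assuming one person per image, labelme point/rectangle shapes, and a yolo-pose style label line (class, normalized bbox, then x, y, visibility per keypoint); the exact label layout of the yolo pose fork in use may differ.

```python
import json
from pathlib import Path

# COCO 17-keypoint order, as used later in the embodiment.
KEYPOINT_ORDER = ["nose", "left_eye", "right_eye", "left_ear", "right_ear",
                  "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
                  "left_wrist", "right_wrist", "left_hip", "right_hip",
                  "left_knee", "right_knee", "left_ankle", "right_ankle"]

def convert(json_path):
    """Turn one labelme JSON file into one yolo-pose label line (assumed layout)."""
    ann = json.loads(Path(json_path).read_text())
    w, h = ann["imageWidth"], ann["imageHeight"]
    pts = {s["label"]: s["points"][0] for s in ann["shapes"] if s["shape_type"] == "point"}
    box = next(s["points"] for s in ann["shapes"] if s["shape_type"] == "rectangle")
    (x1, y1), (x2, y2) = box
    # class 0 (person), then normalized center/size of the bbox
    line = [0, (x1 + x2) / 2 / w, (y1 + y2) / 2 / h, abs(x2 - x1) / w, abs(y2 - y1) / h]
    for name in KEYPOINT_ORDER:
        if name in pts:
            line += [pts[name][0] / w, pts[name][1] / h, 2]  # 2 = labeled and visible (COCO convention)
        else:
            line += [0, 0, 0]                                # keypoint not annotated
    return " ".join(f"{v:.6g}" for v in line)
```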
In rope skipping, the facial keypoint information of the person is non-essential, so to improve recognition accuracy of the leg key points during training, the keypoint weights in the OKS term of the loss function are modified to better recognize leg features.
The OKS-based loss function is as follows:
OKS = [ Σ_{n=1..N_kpts} exp(−d_n² / (2·s²·k_n²)) · δ(v_n > 0) ] / [ Σ_{n=1..N_kpts} δ(v_n > 0) ]
where k_n is the per-keypoint weight; d_n is the Euclidean distance between the predicted point and the ground-truth point; and δ(v_n > 0) is the visibility indicator of each key point.
Adjusting the weights means adjusting the values of k_n. δ(v_n > 0) = 1 when the condition v_n > 0 holds and 0 otherwise; v_n is the visibility flag (0 = unlabeled, 1 = labeled and not occluded, 2 = labeled but occluded); s is a scale factor whose value is the square root of the detection-box area.
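A minimal numpy sketch of this OKS computation (the names and the 1 − OKS loss form are illustrative assumptions, not the patent's exact implementation):

```python
import numpy as np

def oks(d, k, v, s):
    """Object Keypoint Similarity as defined above.
    d: per-keypoint Euclidean distances between predicted and true points
    k: per-keypoint weights k_n (these are what the method tunes to
       emphasize the leg keypoints)
    v: visibility flags (0 = unlabeled; positive = labeled, per the flags above)
    s: scale factor, the square root of the detection-box area"""
    vis = (v > 0).astype(float)                    # delta(v_n > 0)
    e = np.exp(-d**2 / (2 * s**2 * k**2)) * vis    # per-keypoint similarity
    return e.sum() / np.maximum(vis.sum(), 1.0)    # average over visible points

def oks_loss(d, k, v, s):
    # Assumed loss form: higher keypoint similarity gives lower loss.
    return 1.0 - oks(d, k, v, s)
```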
Step two: acquiring video data of multiple people skipping rope through a camera, and dividing the video data by person region.
Each divided video stream is single-person rope skipping video data. This step mainly realizes recognition of human key points: using the trained model, data are collected through the camera and the skeleton-point information of everyone in the video is recognized.
In this step, the rope skipping regions are set. Region setting means that, since each person stands at a different position in multi-person rope skipping, the camera view is divided into regions; the leg key points output by yolopose are then associated with the divided regions by position, and each region is keyed to exactly one person, i.e. only one person may stand in a region (see the sketch below).
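A minimal sketch of this region association, assuming the view is split into fixed vertical strips and a person is assigned by ankle x position (the strip layout is an assumption for illustration):

```python
def assign_region(ankle_x, regions):
    """regions: list of (x_min, x_max) ground strips in pixel coordinates,
    one jumper per strip."""
    for idx, (x_min, x_max) in enumerate(regions):
        if x_min <= ankle_x < x_max:
            return idx        # region index doubles as the person key
    return None               # person stands outside all jump regions
```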
Step three: for each divided single-person rope skipping video stream, obtaining the human-body keypoint features corresponding to each region from the trained yolopose model.
The human key points here are the leg key points.
And step four, determining the motion state of rope skipping in each video frame according to the obtained human body key point characteristics, and constructing corresponding feature vectors according to the motion states of all the video frames.
In this step a feature vector is constructed: from the obtained human keypoint features, a feature vector is built according to the actions and states of the rope skip. The specific construction method is as follows:
The motion of a rope skip can be defined by 6 states (take-off, hover, drop, two-foot landing, single-foot landing, and other), each determined by displacement and velocity; a code sketch follows this list:
Take-off is defined as: the Y value of the human-body feature key points decreases across adjacent frames, i.e. Y_t − Y_{t−1} < −θ_y0, and the upward speed is large, V_0 > θ_v0; the downward direction in the image is positive, and the Y value represents vertical position in the image frame;
Hover is defined as: the Y-value difference of the key points across adjacent frames is essentially unchanged, |Y_t − Y_{t−1}| < θ_y1, and the speed is approximately 0, |V_1| < θ_v1;
Drop is defined as: the Y value of the key points increases across adjacent frames, Y_t − Y_{t−1} > θ_y2, and the downward speed is large, V_2 > θ_v2;
Two-foot landing is defined as: the Y and X values of the key points are unchanged across adjacent frames, |X_t − X_{t−1}| < θ_x3 and |Y_t − Y_{t−1}| < θ_y3, the speed is approximately 0, i.e. |V_3| < θ_v3, and the height difference between the Y coordinates of the two ankles is below a threshold, |Y_t[0] − Y_t[1]| < δ_0; the X value represents horizontal position in the image frame;
Single-foot landing is defined as: the Y and X values of the key points are unchanged across adjacent frames, |X_t − X_{t−1}| < θ_x3 and |Y_t − Y_{t−1}| < θ_y3, the speed is approximately 0, and the height difference between the Y coordinates of the two ankles exceeds a threshold, |Y_t[0] − Y_t[1]| > δ_1;
Other is defined as: any state other than the above.
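A minimal per-frame sketch of these rules (all thresholds, the down-positive image coordinates, and the use of per-frame displacement as a speed proxy are assumptions to be tuned):

```python
TAKEOFF, HOVER, DROP, TWO_FOOT, ONE_FOOT, OTHER = range(6)

def frame_state(y_prev, y_curr, x_prev, x_curr, y_feet,
                th_y0=3, th_y1=2, th_y2=3, th_x3=2, th_y3=2, d0=5, d1=5):
    """y_* are mean ankle heights of adjacent frames (image y grows downward);
    y_feet holds the (left, right) ankle y values of the current frame."""
    dy, dx = y_curr - y_prev, x_curr - x_prev   # per-frame displacement ~ speed
    if dy < -th_y0:
        return TAKEOFF                          # y shrinks fast: moving up
    if dy > th_y2:
        return DROP                             # y grows fast: moving down
    if abs(dy) < th_y3 and abs(dx) < th_x3:     # nearly static in both axes
        if abs(y_feet[0] - y_feet[1]) < d0:
            return TWO_FOOT                     # ankles level: both feet down
        if abs(y_feet[0] - y_feet[1]) > d1:
            return ONE_FOOT                     # ankles apart: one foot down
    if abs(dy) < th_y1:
        return HOVER                            # near-zero vertical speed mid-air
    return OTHER
```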
The human-body feature key points used are several or all of indices 11-16 of the COCO keypoint dataset (hips, knees, and ankles).
Each video frame corresponds to one state, mapped to a number in [0,1,2,3,4,5]: a frame's state is determined by the formulas above, with take-off, hover, drop, two-foot landing, single-foot landing, and other corresponding to 0, 1, 2, 3, 4, 5 respectively.
The feature vector is defined over the span from one detected take-off state to the next. Assuming the state sequence over such a period is [0,0,0,1,1,2,4,3,3], the feature vector for the period is [3,2,1,2,1,0]: the first element is the number of 0-states (the sequence contains three 0s), the second element the number of 1-states, and so on.
A weight vector W = [w0, w1, w2, w3, w4, w5] may be set for the feature vector; the weighted feature is then obtained by weighting the state-count vector A element-wise with W, F = A·Wᵀ (see the sketch below).
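A minimal sketch of the construction, using the worked example above (the element-wise weighting and the example W are illustrative):

```python
import numpy as np

def feature_vector(state_seq, n_states=6):
    """Count how often each state occurs between consecutive take-offs."""
    return np.bincount(state_seq, minlength=n_states).astype(float)

W = np.array([1, 1, 1, 1, 0.1, 1])    # e.g. down-weight single-foot landings

A = feature_vector([0, 0, 0, 1, 1, 2, 4, 3, 3])   # -> [3, 2, 1, 2, 1, 0]
F = A * W                                         # weighted feature vector
```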
Step five: determining the number of rope skips of the person in each region according to the feature vector corresponding to each region.
In this step counting is implemented. Counting can be performed by two methods: by similarity, or with a classifier.
Similarity counting: a Markov-style independence assumption is made here, namely that the current rope skipping state is correlated only with the previous ones. Cosine similarity is selected for the similarity calculation.
The current feature vector F_t is compared against the previous n successfully counted vectors F_1, F_2, …, F_n by computing the cosine similarity between F_t and their mean.
If the similarity value is greater than a threshold, a jump is counted.
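A minimal sketch of this check (the history seeding and the 0.9 threshold follow the embodiment below; names are illustrative):

```python
import numpy as np

def is_jump(F_t, history, threshold=0.9):
    """history: feature vectors of previously counted (successful) jumps."""
    F_mean = np.mean(history, axis=0)
    cos = F_t @ F_mean / (np.linalg.norm(F_t) * np.linalg.norm(F_mean) + 1e-9)
    return cos > threshold

history = [np.array([2, 1, 2, 1, 0, 0])]   # seeded with the init feature F0
F_t = np.array([3, 2, 1, 2, 1, 0]) * np.array([1, 1, 1, 1, 0.1, 1])
if is_jump(F_t, history):
    history.append(F_t)                    # count the jump and update history
```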
Counting with a classification algorithm: no Markov assumption is made here, but a classifier must be constructed whose input is the features constructed above. A traditional machine learning algorithm, such as a decision tree or support vector machine, is selected, and binary classification is used: successful jump vs. failure (1, 0). A dataset is collected by the above method, with normal and abnormal data split at roughly 2:1 for better classification; for better data collection, the recorded videos contain as many abnormal actions as possible alongside all normal jump actions.
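A minimal sketch of the classifier variant (the tiny dataset here is illustrative only; the embodiment trains on real normal/abnormal clips):

```python
from sklearn.tree import DecisionTreeClassifier

X = [[3, 2, 1, 2, 1, 0],    # normal jump cycle
     [2, 2, 1, 2, 0, 1],    # normal jump cycle
     [0, 0, 0, 5, 0, 4]]    # walking about, no real take-off
y = [1, 1, 0]               # 1 = successful jump, 0 = failure/other

clf = DecisionTreeClassifier(max_depth=3).fit(X, y)
if clf.predict([[3, 1, 1, 2, 0, 1]])[0] == 1:
    pass                    # increment the rope skip counter
```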
In summary, the yolopose-based automatic multi-person rope skipping counting method belongs to the field of edge computing and the related technical field of AI-assisted fitness, and comprises the following steps: acquiring video data in real time through a video capture device; performing human pose estimation on each frame with a yolopose algorithm and storing the results; determining each subject's position from the human keypoint information; constructing a motion feature vector from the human keypoint features; and classifying the constructed motion features to judge whether a rope skipping action occurred before counting.
The counting method exploits the strengths of the yolopose family of models: it detects multiple people simultaneously while greatly reducing computational cost, and it is the first to count rope skips with a feature-engineering approach, improving counting robustness and ensuring counting accuracy. It overcomes the inaccuracy and heavy labor of traditional manual counting.
Meanwhile, the skeleton-point information of everyone in the image is obtained by yolopose in a single pass, reducing computation; edge computing greatly reduces communication delay; and counting uses the rope skipping motion features built by feature engineering rather than height alone. Asynchronous multi-person rope skipping at the edge is thus realized; fusing multiple features improves counting accuracy and reduces the difficulty of threshold setting.
A specific example of the application of the above method is given below:
As shown in fig. 2, the implementation of the method according to the embodiment of the present invention includes the following steps:
Initializing parameters: first the position parameters are set and regions are marked on the ground; as shown in fig. 3, 5 regions are marked within the camera view. Next, the initial feature F0 = [2,1,2,1,0,0] is initialized. Finally, the weight W is initialized to [1,1,1,1,0.1,1], which effectively switches off the single-foot landing state.
Training the yolopose model: the yolov5 pose model is selected for fine-tuning;
the model selected in step two is the yolov5-s structure, with input size (640 × 640);
the keypoint weights for training in step two are set to: [0.26, 0.25, 0.25, 0.35, 0.35, 0.79, 0.79, 0.72, 0.72, 0.62, 0.62, 0.7, 0.7, 0.6, 0.6, 0.6, 0.6];
the training data in step two come from labelme-annotated data, mainly adding annotations of people in motion, 5630 images in total;
the training hyperparameters in step two are: batch size 8, 100 epochs, learning rate 0.001, with data augmentation.
In step two, to better verify the fine-tuning effect, a test set of 1000 pictures was constructed, also taken directly from rope skipping videos; AP (Average Precision) computed with the OKS weights improved from 41.2% to 43.6%, a measurable improvement;
Face recognition authentication is performed first; only after successful authentication can the user raise a hand to indicate the start of rope skipping.
In step three, the face snapshot is based on the yolo target detection box, taking the top 1/3 of the box image;
in step three, face recognition adopts a public algorithm, such as the ArcSoft open API, and hand-raise recognition is judged from key points, i.e. the wrist skeleton point is higher than the head (a sketch of this check follows).
The face recognition process requires user information to be entered in advance for face comparison.
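A minimal sketch of the hand-raise check (the nose-as-head reference and the COCO index layout are assumptions; image y grows downward, so a smaller y means higher):

```python
def hand_raised(keypoints):
    """keypoints: array of (x, y) per COCO joint; 0 = nose, 9/10 = wrists."""
    head_y = keypoints[0][1]
    return keypoints[9][1] < head_y or keypoints[10][1] < head_y
```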
After the hand is raised, the data of each frame are recorded using the yolov5-pose algorithm and the feature engineering described above.
The parameters of the feature engineering in step four are shown in Table 1:
Table 1: Feature engineering parameter table
The picture input size for the parameters in step four is 640 × 640.
In step three, the human key points mainly comprise the 17 COCO keypoints: (nose, left_eye, right_eye, left_ear, right_ear, left_shoulder, right_shoulder, left_elbow, right_elbow, left_wrist, right_wrist, left_hip, right_hip, left_knee, right_knee, left_ankle, right_ankle).
In step four, the key points for the take-off and drop states are those with indices 15 and 16, i.e. (left_ankle, right_ankle); the other states use the key points with indices 11-16. Indices start from 0.
The recorded features then enter the counter for similarity calculation, computed against the initialized feature F0. If the similarity is greater than 0.9, the jump is judged successful.
The embodiment of the invention runs on a Jetson Orin T edge device, with a DS-2CD7A4XYZ123A camera. yolov5-pose is quantized to FP16 and inference uses a tensorRT engine; yolov5-pose alone reaches about 30 FPS, and the whole program runs at 20-25 FPS.
According to the embodiment, the method realizes multi-person rope skipping and accurate counting through the yolopose algorithm and the feature construction method.
Fig. 4 shows the features of the abnormal and normal rope skipping counting states; when multiple people skip rope, the skeleton points of the person at each position are correctly recognized and counted. The upper part of fig. 4 shows a foot being lifted and moved forward during an interruption of jumping; the lower part shows the feature counts during normal jumping. The similarity between the two features is very low, so the invention can better reject abnormal actions and better ensure counting accuracy.
In one embodiment, as shown in fig. 5, a rope skipping counting implementation device is provided, comprising the following program modules: a training module, an acquisition module, a first processing module, a second processing module and a determining module, wherein:
the training module is used for collecting single-person rope skipping video data, generating rope skipping pictures, annotating key points on the pictures with labelme, and training a pre-established yolopose model on the keypoint-annotated picture data;
the acquisition module is used for acquiring video data of multiple people skipping rope through a camera and dividing the video data by person region; each divided video stream is single-person rope skipping video data;
the first processing module is used for obtaining, for each divided single-person video stream, the human-body keypoint features corresponding to each region from the trained yolopose model; the human key points are the leg key points;
the second processing module is used for determining the motion state of the rope skip in each video frame from the obtained keypoint features and constructing a corresponding feature vector from the motion states of all video frames;
and the determining module is used for determining the number of rope skips of the person in each region according to the feature vector corresponding to that region.
For the specific implementation of each module, reference may be made to the limitations of the rope skipping counting implementation method above, which are not repeated here.
In one embodiment, a computer device is provided, which may be a terminal, whose internal structure may be as shown in fig. 6. The computer device includes a processor, a memory, a communication interface, a display screen and an input device connected by a system bus. The processor provides computing and control capabilities; the communication interface performs wired or wireless communication with external terminals, where wireless communication can be realized through WIFI, an operator network, NFC (near field communication) or other technologies. The computer device loads and runs a computer program to realize the rope skipping counting implementation method. The display screen may be a liquid crystal display or an electronic-ink display; the input device may be a camera, through which the rope skipping video data are obtained; and the memory stores the process data and result data of the rope skipping counting implementation method.
It will be appreciated by those skilled in the art that the structure shown in FIG. 6 is merely a block diagram of some of the structures associated with the present inventive arrangements and is not limiting of the computer device to which the present inventive arrangements may be applied, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.
In an embodiment, a computer readable storage medium is also provided, on which a computer program is stored that implements all or part of the flow of the methods of the above embodiments.
The technical features of the above embodiments may be arbitrarily combined, and all possible combinations of the technical features in the above embodiments are not described for brevity of description, however, as long as there is no contradiction between the combinations of the technical features, they should be considered as the scope of the description.

Claims (6)

1. A rope skipping counting implementation method based on the yolopose model, characterized in that the method comprises:
collecting single-person rope skipping video data, generating rope skipping pictures, annotating key points on the pictures with labelme, and training a pre-established yolopose model on the keypoint-annotated picture data;
acquiring video data of multiple people skipping rope through a camera, and dividing the video data by person region; wherein each divided video stream is single-person rope skipping video data;
for each divided single-person video stream, obtaining the human-body keypoint features corresponding to each region from the trained yolopose model; wherein the human key points are the leg key points;
determining the motion state of the rope skip in each video frame from the obtained keypoint features, and constructing a corresponding feature vector from the motion states of all video frames;
determining the number of rope skips of the person in each region according to the feature vector corresponding to each region;
in training the pre-established yolopose model on the keypoint-annotated rope skipping picture data, leg keypoint recognition is performed by modifying the keypoint weights in the OKS term of the loss function;
the OKS in the loss function is specifically:
OKS = [ Σ_{n=1..N_kpts} exp(−d_n² / (2·s²·k_n²)) · δ(v_n > 0) ] / [ Σ_{n=1..N_kpts} δ(v_n > 0) ]
where N_kpts represents the total number of key points; k_n is the per-keypoint weight to be adjusted; d_n is the Euclidean distance between the predicted point and the ground-truth point; δ(v_n > 0) is the visibility indicator of the key point, equal to 1 when the condition v_n > 0 holds and 0 otherwise; v_n is the visibility flag (0 = unlabeled, 1 = labeled and not occluded, 2 = labeled but occluded); and s is a scale factor whose value is the square root of the detection-box area;
determining the motion state of the rope skip in each video frame, wherein the motion states specifically comprise take-off, hover, drop, two-foot landing, single-foot landing, and other, and each state is defined by the displacement and speed of the action across adjacent frames, specifically:
take-off is defined as: the y value of the human-body feature key points decreases across adjacent frames and the upward speed is large, where the y value represents vertical position in the image frame;
hover is defined as: the y-value difference of the key points across adjacent frames is unchanged and the speed is below a set speed threshold;
drop is defined as: the y value of the key points increases across adjacent frames and the downward speed is large;
two-foot landing is defined as: the y and x values of the key points are unchanged across adjacent frames, the speed is below a set speed threshold, and the height difference between the y coordinates of the two ankles is less than a height threshold; the x value represents horizontal position in the image frame;
single-foot landing is defined as: the y and x values of the key points are unchanged across adjacent frames, the speed is below a set speed threshold, and the height difference between the y coordinates of the two ankles is greater than the height threshold;
other is defined as: any state other than the above;
constructing the corresponding feature vector from the motion states of all video frames comprises: mapping the motion state in each video frame to a corresponding number; arranging the numbers mapped from the motion states of all video frames in time order to obtain the state sequence of the video; and determining the count of each state in the state sequence and taking each count as a feature value, thereby obtaining the feature vector over all video frames;
after obtaining the feature vector over all video frames, the method further comprises: setting corresponding weights for the different feature values in the feature vector.
2. The method of claim 1, wherein determining the number of rope skips of the person in each region based on the feature vector corresponding to each region comprises counting by similarity:
when counting by similarity, computing the cosine similarity between the current feature vector and the mean of the feature vectors of historically successful counts;
and counting a rope skip when the result is greater than a preset threshold.
3. The method of claim 1, wherein determining the number of rope skips of the person in each region based on the feature vector corresponding to each region comprises counting with a classifier:
when counting with a classifier, constructing a counting classifier, the counting classifier comprising a decision tree and a support vector machine;
performing binary classification on the input feature vectors over all video frames, and counting rope skips according to the classification result.
4. A rope skipping counting implementation device based on the yolopose model, characterized in that the device comprises:
a training module for collecting single-person rope skipping video data, generating rope skipping pictures, annotating key points on the pictures with labelme, and training a pre-established yolopose model on the keypoint-annotated picture data;
an acquisition module for acquiring video data of multiple people skipping rope through a camera and dividing the video data by person region; wherein each divided video stream is single-person rope skipping video data;
a first processing module for obtaining, for each divided single-person video stream, the human-body keypoint features corresponding to each region from the trained yolopose model; wherein the human key points are the leg key points;
a second processing module for determining the motion state of the rope skip in each video frame from the obtained keypoint features and constructing a corresponding feature vector from the motion states of all video frames;
a determining module for determining the number of rope skips of the person in each region according to the feature vector corresponding to each region;
wherein, in training the pre-established yolopose model on the keypoint-annotated rope skipping picture data, leg keypoint recognition is performed by modifying the keypoint weights in the OKS term of the loss function;
the OKS in the loss function is specifically:
OKS = [ Σ_{n=1..N_kpts} exp(−d_n² / (2·s²·k_n²)) · δ(v_n > 0) ] / [ Σ_{n=1..N_kpts} δ(v_n > 0) ]
where N_kpts represents the total number of key points; k_n is the per-keypoint weight to be adjusted; d_n is the Euclidean distance between the predicted point and the ground-truth point; δ(v_n > 0) is the visibility indicator of the key point, equal to 1 when the condition v_n > 0 holds and 0 otherwise; v_n is the visibility flag (0 = unlabeled, 1 = labeled and not occluded, 2 = labeled but occluded); and s is a scale factor whose value is the square root of the detection-box area;
the motion state of the rope skip in each video frame is determined, the motion states specifically comprising take-off, hover, drop, two-foot landing, single-foot landing, and other, each state being defined by the displacement and speed of the action across adjacent frames, specifically:
take-off is defined as: the y value of the human-body feature key points decreases across adjacent frames and the upward speed is large, where the y value represents vertical position in the image frame;
hover is defined as: the y-value difference of the key points across adjacent frames is unchanged and the speed is below a set speed threshold;
drop is defined as: the y value of the key points increases across adjacent frames and the downward speed is large;
two-foot landing is defined as: the y and x values of the key points are unchanged across adjacent frames, the speed is below a set speed threshold, and the height difference between the y coordinates of the two ankles is less than a height threshold; the x value represents horizontal position in the image frame;
single-foot landing is defined as: the y and x values of the key points are unchanged across adjacent frames, the speed is below a set speed threshold, and the height difference between the y coordinates of the two ankles is greater than the height threshold;
other is defined as: any state other than the above;
constructing the corresponding feature vector from the motion states of all video frames comprises: mapping the motion state in each video frame to a corresponding number; arranging the numbers mapped from the motion states of all video frames in time order to obtain the state sequence of the video; and determining the count of each state in the state sequence and taking each count as a feature value, thereby obtaining the feature vector over all video frames;
after obtaining the feature vector over all video frames, further comprising: setting corresponding weights for the different feature values in the feature vector.
5. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 3 when executing the computer program.
6. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 3.
CN202311331557.9A 2023-10-13 2023-10-13 Rope skipping counting implementation method and device based on yolopose model and storage medium Active CN117253290B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311331557.9A CN117253290B (en) 2023-10-13 2023-10-13 Rope skipping counting implementation method and device based on yolopose model and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311331557.9A CN117253290B (en) 2023-10-13 2023-10-13 Rope skipping counting implementation method and device based on yolopose model and storage medium

Publications (2)

Publication Number Publication Date
CN117253290A CN117253290A (en) 2023-12-19
CN117253290B true CN117253290B (en) 2024-05-10

Family

ID=89132925

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311331557.9A Active CN117253290B (en) 2023-10-13 2023-10-13 Rope skipping counting implementation method and device based on yolopose model and storage medium

Country Status (1)

Country Link
CN (1) CN117253290B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112464808A (en) * 2020-11-26 2021-03-09 成都睿码科技有限责任公司 Rope skipping posture and number identification method based on computer vision
CN114100103A (en) * 2021-10-28 2022-03-01 杭州电子科技大学 Rope skipping counting detection system and method based on key point identification
CN115171019A (en) * 2022-07-15 2022-10-11 浙江大学 Rope skipping counting method based on semi-supervised video target segmentation
CN115205750A (en) * 2022-07-05 2022-10-18 北京甲板智慧科技有限公司 Motion real-time counting method and system based on deep learning model
CN115346149A (en) * 2022-06-21 2022-11-15 浙江大沩人工智能科技有限公司 Rope skipping counting method and system based on space-time diagram convolution network
CN115359566A (en) * 2022-08-23 2022-11-18 深圳市赛为智能股份有限公司 Human behavior identification method, device and equipment based on key points and optical flow
CN115512439A (en) * 2022-09-23 2022-12-23 燕山大学 Real-time analysis method for spontaneous gait of mouse in long and narrow runway


Also Published As

Publication number Publication date
CN117253290A (en) 2023-12-19


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant