CN114495254A - Action comparison method, system, equipment and medium
- Publication number
- CN114495254A (application number CN202011268929.4A)
- Authority
- CN
- China
- Prior art keywords
- action
- target
- unit
- standard
- comparison
- Prior art date
- Legal status: Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
Abstract
The application provides an action comparison method applied to the field of artificial intelligence, comprising the following steps: receiving a target video, comparing a target action recorded in the target video with a standard action recorded in a standard video, and presenting a comparison result to a user, wherein the comparison result comprises an error position of the target action relative to the standard action, the error position indicating the position of the erroneous local action within the target action. The method improves comparison accuracy by dividing the whole set of actions into single actions for comparison. Moreover, the method provides effective information to the user, for example the error position of the target action relative to the standard action, so that the user can make targeted corrections according to the error position, improving learning or training efficiency.
Description
Technical Field
The present application relates to the field of Artificial Intelligence (AI), and in particular, to a method, a system, a device, and a computer readable storage medium for comparing actions.
Background
In various fields such as sports, dancing, film and television, medical simulation, and action skill training, the requirements on the accuracy of human body actions are high. At present, many scenarios still rely on manual inspection to determine whether an action is accurate and standard. With the rise of AI technology, especially image recognition technology, new motion detection methods have emerged.
Specifically, human body actions to be detected can be recorded in a video, and the video recorded with the human body actions and the video recorded with the standard actions are compared through AI technologies, particularly technologies such as image recognition and image processing based on AI, so that automatic action comparison can be realized, the action comparison efficiency is improved, and the labor cost is reduced.
However, current action comparison methods usually compare the whole set of actions directly. On one hand, this approach suffers from low accuracy; on the other hand, it outputs only an overall score and does not provide enough effective information to help a user correct the action.
Disclosure of Invention
The application provides an action comparison method, on one hand, the method solves the problem of low comparison accuracy by comparing a target action in a target video with a standard action in a standard video, and on the other hand, the method outputs the error position of the target action relative to the standard action and can help a user to correct the action. The application also provides a system, a device, a computer readable storage medium and a computer program product corresponding to the method.
In a first aspect, the present application provides a method for action comparison. The action comparison method can be realized by an action comparison system. The action comparison system can be deployed in a cloud environment, an edge environment or an end device, or be deployed in different environments in a distributed manner.
Specifically, the action comparison system receives the target video, for example, the target video may be received through a user interface such as a Graphical User Interface (GUI) or a Command User Interface (CUI). The target video can be a video which is recorded in advance and is to be subjected to action comparison, and can also be a video which is recorded in real time and is to be subjected to action comparison. The action comparison system compares the target action of the target video record with the standard action of the standard video record, generates a comparison result, and then presents the comparison result to the user, for example, presents the comparison result to the user through a GUI or a CUI. The comparison result comprises the error position of the target action relative to the standard action, and the error position indicates the position of the wrong local action in the target action.
In the method, the action comparison system divides the whole set of actions recorded in the target video to obtain at least one target action, and then compares each target action with the standard action recorded in the standard video. This solves the problem that the whole set of actions recorded in the target video may contain interference actions, or complex actions consisting of a plurality of continuous actions with an unfixed action combination, which would otherwise make the comparison inaccurate and the comparison result unreliable. The action comparison system can also output the error position of the target action relative to the standard action, and the user can make targeted corrections according to the error position, improving learning or training efficiency. Therefore, the method not only improves comparison accuracy but also helps the user correct erroneous actions.
In some possible implementations, the action comparison system may determine a frame in the target video where the target action is wrong with respect to the standard action, that is, a target frame, and then the action comparison system may present the comparison result to the user through the target frame of the target video.
In this way, the user can know not only the position of the erroneous local action within the target action, but also the time at which the target action goes wrong in the target video, which helps the user correct the target action. The target frame may be one frame or multiple frames. When multiple erroneous local actions occur, the user can quickly switch among them by clicking the multiple target frames displayed on the GUI, improving the user experience.
In some possible implementations, the action comparison system may also present to the user a prompt for the error location, the prompt being used to assist in correcting the target action. The prompt information can be a suggestion for the error position, and the user can carry out targeted training according to the suggestion, so that the training efficiency is improved.
For example, in a table tennis scenario, when the wrong position is at the arm, the motion comparison system can identify the cause of the specific error, such as the forearm being too low, and at this time, the motion comparison system can output prompt information, such as a recommendation to stand tightly against the table and perform a motion with bare hands for practice.
In some possible implementations, the action comparison system may also receive a user-specified action type as the action type of the target video through the GUI. In this way, the action comparison system can be instructed to select a corresponding standard video (standard action) according to the specified action type, and then compare the target action recorded in the target video with the standard action corresponding to that action type, improving the action comparison efficiency.
In some possible implementations, the action comparison system extracts action sequence features from the target video and determines action segmentation points according to the action sequence features, for example by applying a time window over the action sequence features. The target action segment is then divided into one or more single-action segments based on the action segmentation points, each single-action segment corresponding to one target action. In this way, the action comparison system can compare a single target action against a standard action of the standard video recording. Compared with comparing the whole set of actions, this improves the comparison accuracy and provides more effective information, such as the error position and error time of the action. The time window may be a fixed time window or an elastic time window.
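A minimal Python sketch of this single-action segmentation, assuming per-frame pose features have already been extracted; the window length, threshold, and function names are illustrative assumptions rather than details taken from the disclosure.

```python
# Sketch: split a whole set of actions into single-action segments by
# comparing adjacent time windows of per-frame pose features.
import numpy as np

def segment_actions(features: np.ndarray, window: int = 30, threshold: float = 1.0):
    """features: (num_frames, feature_dim) array of per-frame pose features.
    Returns the frame indices used as action segmentation points."""
    split_points = [0]
    for start in range(window, len(features) - window, window):
        prev_win = features[start - window:start]
        next_win = features[start:start + window]
        # A large change between adjacent windows suggests an action boundary.
        if np.linalg.norm(prev_win.mean(axis=0) - next_win.mean(axis=0)) > threshold:
            split_points.append(start)
    split_points.append(len(features))
    return split_points
```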
Further, each action may include at least one action unit. For example, an action in soccer may include at least the three action units of passing, catching, and shooting. For another example, an action in table tennis may include action units such as a forehand attack and a backhand attack. The action comparison system can also determine action unit segmentation points according to the action sequence characteristics and segment the target action into at least one action unit according to the action unit segmentation points. Therefore, by comparing at least one action unit of the target action with at least one action unit of the standard action, the action comparison system can improve the comparison accuracy and can more accurately provide the error position and the error time of the action.
Similar to the single action segment obtained by segmentation, the action comparison system can adopt an elastic time window to detect the boundaries of different action units, so that the action segmentation can adapt to the condition that the same action is continuous.
In some possible implementations, considering that the speed of the target object when performing the action is variable, the action comparison system may also make fine adjustments to the action boundaries or action unit boundaries in conjunction with the standard action, enabling the action segmentation to adapt to the dynamic changes in the action speed.
Specifically, the motion alignment system may determine the initial segmentation point according to the motion sequence features extracted from the target video. And determining the action sequence characteristics corresponding to the action units of the target action according to the initial segmentation points, and adjusting the initial segmentation points according to the similarity between the action sequence characteristics corresponding to the action units of the target action and the action sequence characteristics of the action units of the standard action so as to obtain more refined action unit segmentation points. The action comparison system can more accurately determine the boundary of the action unit by utilizing the more refined action unit segmentation points, so that the comparison accuracy is improved.
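An illustrative sketch of this boundary refinement, assuming a local search around the initial segmentation point; the search radius and the similarity callable (for example, a negative DTW distance) are assumptions made for illustration.

```python
# Sketch: refine an initial action-unit boundary by searching nearby offsets
# for the segment most similar to the standard action unit.
def refine_split_point(target_feats, standard_unit_feats, initial_point,
                       similarity, search_radius=10):
    """Returns the candidate boundary whose segment (from the unit start up to
    the boundary) is most similar to the standard action unit features."""
    best_point, best_score = initial_point, float("-inf")
    for offset in range(-search_radius, search_radius + 1):
        candidate = initial_point + offset
        if candidate <= 0 or candidate >= len(target_feats):
            continue
        score = similarity(target_feats[:candidate], standard_unit_feats)
        if score > best_score:
            best_point, best_score = candidate, score
    return best_point
```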
In some possible implementations, the action comparison system may further compare at least one action unit of the target action with at least one action unit of the standard action according to key points. A key point may be a joint point that affects the motion. Different key points affect motion comparison differently; in particular, when the speed varies, different key points change with speed to different degrees. For this reason, the motion comparison system may determine the joint point types of interest for each type of motion or motion unit and perform the comparison based on those joint point types.
Specifically, the action comparison system determines the joint point type used for comparison, and then aligns at least one action unit of the target action and at least one action unit of the standard action according to a target joint point of a target object executing the target action and a reference joint point of a reference object executing the standard action to obtain a posture sequence of the target object and a posture sequence of the reference object. The target joint point and the reference joint point are joint points belonging to the type of joint point. Then, the action comparison system determines a comparison result according to the attitude sequence of the target object and the attitude sequence of the reference object.
Therefore, the action units can be aligned in a targeted manner, so that a better alignment effect is obtained, and the action comparison accuracy is improved.
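A hedged sketch of restricting the pose sequences to the joint point types chosen for a given action unit; the joint lookup table and joint names below are illustrative assumptions, not part of the original disclosure.

```python
# Sketch: keep only the joints relevant to the action unit being compared,
# producing the pose sequences used for alignment.
JOINTS_OF_INTEREST = {
    "forehand_attack": ["shoulder", "elbow", "wrist"],
    "squat": ["hip", "knee"],
}

def pose_sequence(keypoints_per_frame, selected_joints):
    """keypoints_per_frame: list of dicts mapping joint name -> (x, y).
    Returns one tuple of selected joint coordinates per frame."""
    return [tuple(frame[j] for j in selected_joints) for frame in keypoints_per_frame]
```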
In some possible implementations, the motion comparison system may further perform a speed correction on the pose sequence of the target object according to the pose sequence of the reference object, taking into account differences in motion speed. For example, for an action unit of the chair pose in yoga, the reference object in the standard video may perform the action unit at 1x speed while the target object in the target video performs it at 0.5x speed; the action comparison system can then down-sample the posture sequence of the target object according to this speed relationship, thereby correcting the posture sequence of the target object.
By performing speed correction, misjudgment of the action's standardness caused by speed mismatch can be avoided, improving the accuracy of the action comparison.
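A minimal sketch of such speed correction: resampling the target pose sequence to the length of the reference pose sequence (for example, down-sampling a 0.5x-speed performance). Linear index resampling is an assumption; the disclosure does not fix a particular resampling method.

```python
# Sketch: resample the target pose sequence so its length matches the
# reference pose sequence before comparison.
import numpy as np

def speed_correct(target_poses: np.ndarray, reference_len: int) -> np.ndarray:
    """target_poses: (num_frames, num_joints, 2). Returns a resampled copy
    containing reference_len frames."""
    indices = np.linspace(0, len(target_poses) - 1, reference_len).round().astype(int)
    return target_poses[indices]
```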
In a second aspect, the present application provides an action comparison system, which includes:
a communication unit for receiving a target video;
the comparison unit is used for comparing the target action of the target video record with the standard action of the standard video record;
the display unit is used for presenting a comparison result to a user, wherein the comparison result comprises an error position of the target action relative to the standard action, and the error position indicates the position of an error local action in the target action.
In some possible implementations, the display unit is specifically configured to:
and presenting a comparison result to a user through a target frame of the target video, wherein the target frame is a frame in the target video, in which the target action is wrong relative to the standard action.
In some possible implementations, the display unit is further configured to:
presenting prompt information for the error location to the user, the prompt information being used to assist in correcting the target action.
In some possible implementations, the system further includes:
the segmentation unit is used for extracting action sequence characteristics from the target video, determining action unit segmentation points according to the action sequence characteristics, and segmenting the target action into at least one action unit according to the action unit segmentation points;
the comparison unit is specifically configured to:
and comparing the at least one action unit of the target action with the at least one action unit of the standard action.
In some possible implementations, the segmentation unit is specifically configured to:
determining an initial segmentation point according to the action sequence characteristics extracted from the target video;
and adjusting the initial segmentation point to obtain the action unit segmentation point according to the similarity between the action sequence characteristics of the action unit obtained by segmenting the target action by the initial segmentation point and the action sequence characteristics of the action unit of the standard action.
In some possible implementations, the alignment unit is specifically configured to:
determining the joint point type adopted by comparison;
according to a target joint point of a target object executing the target action and a reference joint point of a reference object executing the standard action, aligning at least one action unit of the target action and at least one action unit of the standard action to obtain a posture sequence of the target object and a posture sequence of the reference object, wherein the target joint point and the reference joint point are joint points of the joint point type.
In some possible implementations, the alignment unit is specifically configured to:
carrying out speed correction on the attitude sequence of the target object according to the attitude sequence of the reference object;
and determining a comparison result according to the corrected posture sequence of the target object and the posture sequence of the reference object.
In a third aspect, the present application provides an apparatus comprising a processor and a memory. The processor and the memory are in communication with each other. The processor is configured to execute the instructions stored in the memory to cause the apparatus to perform the action comparison method as in the first aspect or any implementation manner of the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium, where instructions are stored in the computer-readable storage medium, and the instructions instruct a device to perform the action comparison method according to the first aspect or any implementation manner of the first aspect.
In a fifth aspect, the present application provides a computer program product containing instructions that, when run on a device, cause the device to perform the method of action comparison according to the first aspect or any implementation manner of the first aspect.
The present application can further combine to provide more implementations on the basis of the implementations provided by the above aspects.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings used in the embodiments are briefly described below.
Fig. 1 is an application scenario diagram of an action comparison system according to an embodiment of the present application;
fig. 2 is an application scenario diagram of an action comparison system according to an embodiment of the present application;
FIG. 3 is a system architecture diagram of an action comparison system according to an embodiment of the present application;
fig. 4 is an interface schematic diagram of a main interface of an action comparison system according to an embodiment of the present application;
fig. 5 is an interface schematic diagram of a main interface of an action comparison system according to an embodiment of the present application;
fig. 6 is an interface schematic diagram of a result interface of an action comparison system according to an embodiment of the present application;
fig. 7 is an interface schematic diagram of a result interface of an action comparison system according to an embodiment of the present application;
fig. 8 is an interface schematic diagram of a result interface of an action comparison system according to an embodiment of the present application;
FIG. 9 is a flowchart of an action comparison system according to an embodiment of the present application;
FIG. 10 is a schematic diagram of a time window provided by an embodiment of the present application;
FIG. 11 is a flowchart illustrating action segmentation according to an embodiment of the present application;
FIG. 12 is a schematic diagram of a process of comparing actions according to an embodiment of the present application;
FIG. 13 is a schematic diagram of action knowledge provided by an embodiment of the present application;
fig. 14 is a schematic structural diagram of a computing device according to an embodiment of the present application.
Detailed Description
The terms "first" and "second" in the embodiments of the present application are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature.
Some technical terms referred to in the embodiments of the present application will be first described.
The action comparison refers to comparing the action to be compared with a preset standard action. The action to be compared refers to an action executed by the target object, and the standard action refers to a standard action used as an evaluation criterion. The target object may be a person or other organism capable of performing an action. Note that the person may be a character that exists in the real physical world, or may be a virtual character created by Computer Graphics (CG) technology. Similarly, the living body may be a real living body in the physical world or a virtual living body created by CG technology.
The action performed by the target object is specifically an action performed by the target object through its body. For example, the action performed by the target object may be a broadcast gymnastic action, a dance action, a fitness action, a rehabilitation training action, a sign language action, and so forth. In some possible implementations, the action performed by the target object may also be an action performed by the target object with the aid of another object. For example, in ball sports, the action performed by the target object may be an action performed by the target object via a ball (such as basketball, football, etc.), a racket (such as tennis, badminton, table tennis, etc.), or a club (such as baseball bat, golf club, etc.), and for example, the action performed by the target object may be a banner action performed by the target object via a banner in a traffic management scene, or a fancy performing action performed by a prop in an entertainment or sports scene, etc.
The standard action may be an action performed by an object with a corresponding professional ability (also referred to as a reference object). For example, the standard action may be an action performed by a professional such as a fitness trainer, dance teacher, or the like, in order to demonstrate the student. In some embodiments, the standard action may also be an exemplary action made by CG technology, performed by a virtual person or organism.
The action performed by the target object may be recorded in the target video and the standard action may be recorded in the standard video. Based on this, the target video and the standard video may be compared by an Artificial Intelligence (AI) technique, particularly, an image processing technique, thereby implementing action comparison. Based on the action comparison result, action standardization and normalization can be evaluated, so that teaching and training are more efficient and targeted.
Currently, the industry mainly compares the whole set of actions of the target video recording with the whole set of actions of the standard video recording. Specifically, gesture data are collected through special equipment (such as an optical motion capture system like Kinect and OptiTrack), then an action overall curve corresponding to the target video is determined according to the gesture data of actions in the target video, the action overall curve corresponding to the standard video is obtained according to the gesture data of the actions in the standard video, and an overall score can be obtained based on the similarity of the action overall curve and is used for evaluating the standard degree and the normalization of the whole set of actions of the target video.
However, there may be interference actions in the whole set of actions recorded in the target video, or complex actions made up of multiple continuous actions whose combination is not fixed. For example, during table tennis practice the player may pick up the ball, other people may appear in the frame, or the player may freely practice multiple continuous actions and complex actions with unfixed combinations. Directly comparing the whole set of actions recorded in the target video with the whole set of actions recorded in the standard video can make the overall score deviate completely from the true value, greatly reducing its credibility. Moreover, the above method outputs only an overall score and does not output enough effective information, such as the error position of the action, so targeted correction cannot be carried out, affecting the efficiency of learning or training.
In view of this, the present application provides an action comparison method. The action comparison method can be realized by an action comparison system. Specifically, the action comparison system receives a target video, a whole set of actions is recorded in the target video, the action comparison system can determine a target action from the whole set of actions recorded in the target video, the target action is specifically a single action, then the target action recorded in the target video is compared with a standard action recorded in a standard video, an error position of the target action relative to the standard action is determined, the error position is specifically a position of a local action in error in the target action, and then the error position of the target action relative to the standard action is presented to a user.
On one hand, the method divides the whole set of actions recorded by the target video to obtain at least one target action, and then compares the target action with the standard action recorded by the standard video, thereby solving the problems that the whole set of actions recorded by the target video comprise interference actions, or complex actions consisting of a plurality of continuous actions and the action combination is not fixed, so that the comparison accuracy is low, and the comparison result is not reliable. On the other hand, the method can output the error position of the target motion relative to the standard motion, and the user can carry out targeted correction according to the error position, so that the learning or training efficiency is improved.
The action comparison system provided by the embodiment of the application can be used for action auxiliary teaching or action normative evaluation. For example, in the genre field, the motion comparison system may be used for auxiliary teaching of dance motions, auxiliary teaching of sports motions, auxiliary teaching of martial arts motions, and the like, and in the medical field, the motion comparison system may be used for auxiliary teaching of rehabilitation training motions. Thus, teaching and training are more efficient and targeted. For another example, in the traffic management field, the action comparison system can be used for normative evaluation of semaphore actions, and in the education field, the action comparison system can be used for normative evaluation of experimental actions. Therefore, the professional and normative of corresponding post personnel can be improved.
The action comparison system may be a software system. In particular, the action comparison system may be deployed in a computing device in the form of computer software to implement the functions of action comparison. In some embodiments, the action comparison system may also be a hardware system. The hardware system comprises a physical device with an action comparison function.
Next, a deployment manner of the action comparison system is exemplified by taking the action comparison system as a software system.
As shown in FIG. 1, the action comparison system may be deployed in a cloud environment, and in particular, one or more computing devices (e.g., a central server) on the cloud environment. The action comparison system may also be deployed in an edge environment, specifically on one or more computing devices (edge computing devices) in the edge environment, where the edge computing devices may be edge servers, computing boxes, and the like. The cloud environment indicates a central computing device cluster owned by a cloud service provider for providing computing, storage, and communication resources; the edge environment indicates a cluster of edge computing devices geographically close to the end devices (i.e., the end-side devices) for providing computing, storage, and communication resources.
When the action comparison system is deployed in a cloud environment or an edge environment, the action comparison system can be provided for a user in a service form. Specifically, a user can access a cloud environment or an edge environment through a browser, create an instance of an action comparison system in the cloud environment, and then interact with the instance of the action comparison system through the browser, so as to compare a target action recorded by a target video with a standard action recorded by a standard video, and generate a comparison result, where the comparison result includes an error position of the target action relative to the standard action.
The action comparison system can also be deployed on the end device. The end device includes, but is not limited to, a desktop computer, a notebook computer, a smart phone, and other user terminals. By running the action comparison system on the user terminals, the target action and the standard action can be compared to generate a comparison result.
In some possible implementations, the end device may also serve as a target video providing device for providing a target video including a target action to the action comparison system. When the end device is used only for providing the target video, the end device may also be a camera or the like. The end device can also be used as a device for presenting the comparison result and is used for presenting the comparison result to the user. When the end device is only used for presenting the comparison result, the end device may also be a display screen or the like.
When the action comparison system is deployed on the end device, it is provided to the user in the form of a client. Specifically, the end device obtains an installation package of the action comparison system and installs the action comparison system by running the installation package.
As shown in fig. 2, the action comparison system includes multiple parts (e.g., includes multiple subsystems, each subsystem includes multiple units), and thus, the parts of the action comparison system may also be distributively deployed in different environments, for example, the parts of the action comparison system may be deployed on three environments among a cloud environment, an edge environment, and an end device, or on any two other environments.
The action comparison system realizes action comparison through subsystems with different functions and units with different functions. The embodiment of the present application does not limit the division manner of the subsystems and units in the system, and is described below with reference to an exemplary division manner shown in fig. 3.
As shown in FIG. 3, the action comparison system 100 includes an interaction subsystem 120 and a comparison subsystem 140. The interactive subsystem 120 is configured to provide a Graphical User Interface (GUI) for a user, receive a target video according to an operation triggered by the user through the GUI, and present a comparison result of action comparison to the user. The comparison subsystem 140 is configured to compare the target action in the whole set of actions recorded in the target video with the standard action recorded in the standard video to obtain a comparison result, where the comparison result includes an error position of the target action relative to the standard action.
The interaction subsystem 120 includes a communication unit 122 and a display unit 124. The communication unit 122 is configured to receive a target video, for example, to receive the target video through a GUI. Referring specifically to the interface schematic diagram of the main interface of the interaction subsystem 120 shown in fig. 4, a target video acquisition component 402 and a comparison control 404 are carried on the main interface 400. The target video acquisition component 402 is used to acquire a target video. In some embodiments, the target video acquisition component 402 includes at least one of an upload control 4022 and a shooting control 4024, where the upload control 4022 is configured to upload a target video stored locally or remotely (on the network side), in which case the target video is a pre-recorded video, and the shooting control 4024 is configured to perform video acquisition in real time to obtain a target video, in which case the target video is a video recorded in real time. The comparison control 404 is configured to trigger a comparison operation to instruct the comparison subsystem 140 to compare the target action with the standard action recorded in the standard video to obtain a comparison result. Optionally, the main interface of the interaction subsystem 120 may further include a target video preview area 406, where the target video preview area 406 is used to preview the target video. Further, the main interface of the interaction subsystem 120 may further include a frame preview area 408 of the target video, and the frame preview area 408 is used for previewing the video frames of the target video.
In some possible implementations, referring to the interface schematic diagram of the main interface of the interaction subsystem 120 shown in fig. 5, the main interface 400 may further include a standard video preview area 407, which is used for previewing the standard video. Further, the main interface of the interaction subsystem 120 may further include a frame preview area 409 of the standard video, which is used for previewing the video frames of the standard video.
In some possible implementations, as shown in fig. 4 or 5, the main interface 400 of the interaction subsystem 120 may also carry an action type configuration control 405. The action type configuration control 405 is used to configure the action type of the target video, and specifically, the action type configuration control 405 may receive the action type specified by the user as the action type of the target video. In this way, the comparison subsystem 140 may be instructed to select a corresponding standard video (standard action) according to the specified action type, and then compare the target action recorded by the target video with the standard action corresponding to the action type, thereby improving the action comparison efficiency.
The interactive subsystem 120 also has a function of presenting the comparison result. Specifically, the communication unit 122 is further configured to receive the comparison result, where the comparison result includes an error position of the target action relative to the standard action, and the display unit 124 is configured to present the comparison result to the user.
The display unit 124 can display the comparison result in various ways. For example, the display unit 124 may graphically display the error position of the target action relative to the standard action in the result interface. Considering the visual effect, the display unit 124 may also present the comparison result to the user through the target frame of the target video. The target frame is a frame of the target video in which the target action is wrong relative to the standard action.
The target frame may be one frame or multiple frames. The following describes the manner of presenting the comparison result by taking one of the frames as an example.
Referring to the interface schematic diagram of the result interface of the interaction subsystem 120 shown in fig. 6, the result interface 600 includes a target video display area 602, which is used to display the target video. When the interaction subsystem 120 receives a comparison result, the target video displayed in the target video display area 602 jumps to a target frame 603. The target frame 603 further includes a label box 604, and the part framed by the label box 604 may be the error position of the target action relative to the standard action. Further, the label box 604 can also include label information, which can be the location information of the error position.
Further, the result interface 600 further includes a progress bar 606, and the progress bar 606 may further include a mark 608 of at least one target frame on the progress bar 606, so that the user can quickly switch between different target frames through the mark 608, and thus quickly switch between different error positions.
In some possible implementations, referring to the interface schematic diagram of the result interface of the interaction subsystem 120 shown in fig. 7, the target video display area 602 is divided into two parts, one part used for displaying the target frame 603 and the other part used for displaying the reference frame 601 corresponding to the target frame.
Considering that the positions of some joint points change when an action is performed, the posture corresponding to the action can be further characterized by the joint point sequence of at least one joint point. Based on this, the error position of the target action relative to the standard action can be characterized by the erroneous joint point. Referring to the interface diagram of the result interface of the interactive subsystem 120 shown in fig. 8, when the interactive subsystem 120 displays the comparison result, the joint point 609 where the error occurs is also displayed.
Further, the communication unit 122 in the interaction subsystem 120 may also receive prompt information for the error location, which is used to assist in correcting the target action, for example, the prompt information may be a suggestion for the error location. Correspondingly, referring to fig. 8, the display unit 124 in the interaction subsystem 120 is also used for displaying a prompt message 610 for the error location.
Compare subsystem 140 includes communication unit 142, segmentation unit 144, and compare unit 146. The communication unit 142 is configured to receive a target video, the segmentation unit 144 is configured to segment at least one target action from the target video, and the comparison unit 146 is configured to compare the target action with a standard action recorded in a standard video to obtain a comparison result. Correspondingly, the communication unit 142 is further configured to return the comparison result to the interaction subsystem 120.
In some possible implementations, the comparison subsystem 140 further includes a detection unit 143, and the detection unit 143 is configured to perform action detection (AD) on the target video to identify non-action segments, interference action segments, and action segments of objects other than the target object in the target video. The segmentation unit 144 segments the whole set of actions recorded in the target video into at least one target action, and the comparison unit 146 compares the target action with the standard action. This solves the problem that the whole set of actions recorded in the target video may contain interference actions, or complex actions consisting of a plurality of continuous actions with an unfixed combination, which would otherwise reduce the accuracy of the comparison result; the comparison accuracy is thus improved.
Moreover, the comparing unit 146 compares the segmented target action with the standard action in a fine-grained manner, rather than comparing the whole set of actions, so that the error position of the target action relative to the standard action can be determined, and thus the error position of the target action relative to the standard action is output, so that more effective information is provided for the user, the targeted correction is facilitated, and the user experience is improved.
Further, the communication unit 142 is further configured to receive an action type specified by the user, that is, the action type of the target action recorded by the target video is the specified action type, and correspondingly, the comparison unit 146 may determine a standard video corresponding to the action type from the action knowledge base 148, and compare the target action with the standard action corresponding to the action type. Therefore, the comparison calculation amount is reduced, and the comparison efficiency is improved.
The action knowledge base 148 may also store hints information corresponding to the error location of the target action relative to the standard action. The prompt may be a corrective suggestion. The communication unit 142 may obtain the prompt information from the action knowledge base 148 and send the prompt information to the interaction subsystem 120, so that the interaction subsystem 120 presents the prompt information to the user, and the user may perform a targeted correction according to the correction suggestion.
In consideration of the fact that the action execution speed in the target video may be inconsistent with the action execution speed in the standard video, in some possible implementations, the comparison unit 146 may perform speed correction on the target action first when comparing the target action with the standard action, so that the target action is aligned with the standard action. Specifically, the comparison unit 146 may perform speed correction on the posture sequence of the at least one action unit of the target action according to the posture sequence of the at least one action unit of the standard action, so as to avoid the problem that the comparison accuracy is reduced due to mismatch between the speed of the standard action and the speed of the target action, and further improve the comparison accuracy.
In order to make the technical solution of the present application clearer and easier to understand, the following describes the motion comparison method provided in the embodiment of the present application in detail from the perspective of the motion comparison system 100.
Referring to fig. 9, a flowchart of a method for action comparison is shown, the method comprising:
s902: the action comparison system 100 receives a target video.
The action comparison system 100 may receive the target video through a user interface, such as a GUI. The target video can be a video which is recorded in advance and is to be subjected to action comparison, and can also be a video which is recorded in real time and is to be subjected to action comparison. The following describes the manner of acquiring different target videos in detail.
In some possible implementations, the action comparison system 100 may provide a user interface to a user, where the user interface carries an upload control, and the user may select a locally stored target video or a network-side stored target video and then upload the target video through the upload control. The target video stored locally or the target video stored at the network side is a pre-recorded video.
In other possible implementations, the action comparison system 100 may provide a user interface to a user, where the user interface carries a shooting control, the user may trigger the shooting control, the action comparison system 100 may invoke a camera to start video recording in response to the operation, and the action comparison system 100 receives a video recorded in real time by the camera.
Further, the action comparison system 100 may also receive a user-specified action type, which is used to characterize the action type of the action recorded by the target video. Specifically, the user interface provided by the action comparison system 100 carries an action type configuration control, and the action comparison system 100 can receive an action type specified by the user through the action type configuration control.
The action type is used to characterize the category to which the action belongs. In some embodiments, the types of actions may include dance, martial arts, gymnastics, rehabilitation exercises, and the like. Dancing can be further divided into the sub-types of Latin dance, jazz dance, street dance and the like, and martial arts can be further divided into the sub-types of taekwondo, karate and Taijiquan and the like.
S904: the action comparison system 100 compares the target action of the target video recording with the standard action of the standard video recording.
In some possible implementations, the action comparison system 100 may first perform action detection on the target video to obtain a target action segment, then segment the target action from the target action segment, and then compare the target action with the standard action of the standard video recording.
Specifically, the motion comparison system 100 may filter out segments that do not include motion, interference segments, and motion segments unrelated to the target object in the target video through a motion detection model, such as a Two-Stream Inflated 3D convolutional neural network (I3D) model, a Boundary Matching Network (BMN) model, or a long short-term memory (LSTM) network model, to obtain the target action segment. The target action segment includes the entire set of actions performed by the target object, which may include at least one target action.
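An illustrative sketch of this filtering step. The `detect_segments` callable stands in for any temporal action-detection model such as I3D, BMN, or an LSTM-based detector; its interface here is an assumption made for illustration.

```python
# Sketch: keep only frames belonging to the target object's action segments,
# dropping non-action and interference segments detected by the model.
def extract_target_segment(frames, detect_segments):
    """frames: list of video frames. detect_segments(frames) is assumed to
    return (start, end, label) tuples for the detected action segments."""
    kept = []
    for start, end, label in detect_segments(frames):
        if label == "target_action":   # drop non-action and interference segments
            kept.extend(frames[start:end])
    return kept
```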
It should be noted that the whole set of actions may include different target actions, for example, target action 1 and target action 2. In addition, the same target action may occur once, or may occur continuously or intermittently multiple times. For example, the entire set of actions may include the following sequence: {target action 1, target action 2, target action 3, target action 3, target action 2, target action 1, target action 4}. In this example, target action 3 occurs twice in succession, target action 1 and target action 2 each occur twice intermittently, and target action 4 occurs once.
Next, the action comparison system 100 can extract action sequence features from the target action segment. For example, the motion comparison system 100 detects the target object from the target action segment, and then detects the key points of the target object by a pose estimation method such as OpenPose or AlphaPose. The key points may be joint points of the target object, such as wrist joint points, finger joint points, elbow joint points, etc., and the motion comparison system 100 may extract action sequence features based on the key points of the target object. The action sequence feature may be expressed as X = [x_0, x_1, x_2, ..., x_{L_X-1}], where L_X is the length of the action sequence and x_i is the feature extracted from the key points.
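A hedged sketch of turning per-frame key points (for example from OpenPose or AlphaPose) into the action sequence feature X = [x_0, ..., x_{L_X-1}]. Using normalized joint coordinates as x_i is an assumption; the disclosure only states that x_i is extracted from the key points.

```python
# Sketch: per-frame pose keypoints -> normalized feature sequence X.
import numpy as np

def action_sequence_features(keypoints: np.ndarray) -> np.ndarray:
    """keypoints: (L_X, num_joints, 2) joint coordinates per frame.
    Returns an (L_X, num_joints * 2) feature sequence normalized per frame."""
    centered = keypoints - keypoints.mean(axis=1, keepdims=True)   # remove global position
    scale = np.linalg.norm(centered, axis=(1, 2), keepdims=True) + 1e-8
    return (centered / scale).reshape(len(keypoints), -1)
```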
The action comparison system 100 can determine an action segmentation point according to the action sequence characteristics, and segment the target action segment into at least one target action according to the action segmentation point. Specifically, the action matching system 100 may determine an action segmentation point by using a time window according to the action sequence characteristics, and segment the target action segment based on the action segmentation point to obtain one or more single-action segments. Each single-action segment corresponds to a target action. In this manner, the motion comparison system 100 can compare a single target motion to a standard motion of a standard video recording. Compared with the whole comparison, the method can improve the comparison accuracy and can provide more effective information, such as the error position, the error time and the like of the action.
The motion comparison system 100 may segment the target action segment using a fixed time window or an elastic time window. The boundary of the fixed time window is not variable, while the boundary of the elastic time window can change elastically; as shown in fig. 10, the window length of the fixed time window is a fixed value w, whereas the window lengths of the elastic time window, e.g. w_1, w_2, w_3, are not fixed. Considering that the durations of different actions in a whole set of actions may differ, the action comparison system 100 may detect the boundaries of different actions through the elastic time window, so that the action segmentation can adapt to situations where different actions are continuous. For example, if target action 1 and target action 2 are continuous in the whole set of actions, the boundary between target action 1 and target action 2 can be detected through the elastic time window.
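A sketch of the elastic time window idea: instead of a fixed window length w, the boundary after a given anchor is searched within a range of candidate lengths, so segmentation adapts to actions of different duration. The length range and the `distance` callable (for example, a data distribution distance such as MMD) are illustrative assumptions.

```python
# Sketch: choose the boundary after `anchor` that best separates two segments
# under a data-distribution distance.
def elastic_boundary(features, anchor, distance, min_len=15, max_len=60):
    """Returns the boundary index after `anchor` maximizing the distance
    between the segment ending at the boundary and the segment after it."""
    best_b, best_d = anchor + min_len, float("-inf")
    for length in range(min_len, max_len + 1):
        b = anchor + length
        if b + min_len > len(features):
            break
        d = distance(features[anchor:b], features[b:b + min_len])
        if d > best_d:
            best_b, best_d = b, d
    return best_b
```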
Further, each action may include at least one action unit. For example, an action in soccer may include at least the three action units of passing, catching, and shooting. For another example, an action in table tennis may include action units such as a forehand attack and a backhand attack. The action comparison system 100 can also determine action unit segmentation points according to the action sequence characteristics and segment the target action into at least one action unit according to the action unit segmentation points. Correspondingly, the action comparison system 100 may also implement action comparison by comparing at least one action unit of the target action with at least one action unit of the standard action, which further improves the accuracy and provides the error position and error time of the action more precisely.
Similar to the segmentation to obtain single-action segments, the action comparison system 100 may use an elastic time window to detect the boundaries of different action units, so that the action segmentation can adapt to the situation that the same action is continuous.
Let the segmentation points be S = [s_0, s_1, s_2, ..., s_{n-1}], where n is the total number of segmentation points. Both the elastic time window used to segment single-action segments (also called the action-type-scale elastic time window) and the elastic time window used to segment action units (the action-unit-scale elastic time window) require solving for S and n such that formula (1) is satisfied.
Assuming that the data distributions of different action types are inconsistent, the action comparison system 100 can set F(X_1, X_2) = D(X_1, X_2), the distance between the data distributions of the two segments, and obtain the action segmentation points S_c by optimizing formula (1), where S_c contains n_c segmentation points.
The data distribution distance D(X_1, X_2) can be computed by a data distribution distance method, including but not limited to Maximum Mean Discrepancy (MMD) and other methods that measure the difference between the data distributions of two domains.
Assuming that action unit sequences of the same action type have the highest similarity, the action comparison system 100 can set F(X_1, X_2) = C(X_1, X_2), the similarity between the two sequences, and obtain the action unit segmentation points S_u by optimizing formula (1), where S_u contains n_u segmentation points.
The sequence similarity C(X_1, X_2) can be computed by a motion similarity method, including but not limited to Euclidean distance, correlation coefficient, and Dynamic Time Warping (DTW).
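A minimal sketch of one possible choice for C(X_1, X_2): a textbook dynamic time warping cost returned as a negative value so that larger means more similar. This is a standard DTW implementation, not code from the patent.

```python
# Sketch: DTW-based sequence similarity usable as C(X1, X2) in formula (1).
import numpy as np

def dtw_similarity(x1: np.ndarray, x2: np.ndarray) -> float:
    """x1, x2: (len, feature_dim) feature sequences. Returns -DTW cost."""
    n, m = len(x1), len(x2)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(x1[i - 1] - x2[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return -cost[n, m]
```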
Considering that the speed of the target object is variable when performing the motion, the motion matching system 100 may further divide the target motion segment in combination with the standard motion, and finely adjust the motion boundary or the motion unit boundary so that the motion division can adapt to the dynamic change of the motion speed.
Specifically, the action comparison system 100 may determine initial segmentation points according to the action sequence features extracted from the target video. For example, the action comparison system 100 may use the action unit segmentation points determined by formula (1) above as initial segmentation points, determine the action sequence features of the action units of the target action according to the initial segmentation points, and adjust the initial segmentation points according to the similarity between the action sequence features of the action units of the target action and the action sequence features of the action units of the standard action, thereby obtaining more refined action unit segmentation points. With these refined segmentation points, the action comparison system 100 can determine the boundaries of the action units more accurately, thereby improving the comparison accuracy.
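One possible reading of this refinement, sketched as a local search that shifts each initial boundary within a small window to best match the standard action units; the similarity measure (Euclidean distance after resampling) and the window size are illustrative assumptions:

```python
import numpy as np

def seq_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Resample b to the length of a and return a (negative-distance) similarity."""
    idx = np.linspace(0, len(b) - 1, len(a)).round().astype(int)
    return -float(np.linalg.norm(a - b[idx], axis=1).mean())

def refine_boundaries(feat, init_points, std_units, delta=5):
    """Shift each interior initial segmentation point within +/-delta frames so
    that the resulting target action units best match the standard action units.
    init_points includes the first and last frame indices; std_units holds one
    feature array per standard action unit."""
    points = list(init_points)
    for k in range(1, len(points) - 1):               # keep the endpoints fixed
        best_p, best_score = points[k], -np.inf
        for p in range(points[k] - delta, points[k] + delta + 1):
            if not (points[k - 1] < p < points[k + 1]):
                continue
            score = (seq_similarity(feat[points[k - 1]:p], std_units[k - 1]) +
                     seq_similarity(feat[p:points[k + 1]], std_units[k]))
            if score > best_score:
                best_p, best_score = p, score
        points[k] = best_p
    return points
```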
It should be noted that the initial segmentation points determined by the action comparison system 100 according to the action sequence features extracted from the target video may be gapped or gapless; the action unit segmentation points obtained by adjusting the initial segmentation points in each of these two cases are described in detail below.
In some possible implementations, let the initial segmentation points be S = [[s_0, s_1], [s_1, s_2], …, [s_{n-2}, s_{n-1}]], where n is the total number of segmentation points, and let the action sequence feature of the action units of the standard action be Y = [y_0, y_1, y_2, …, y_{L_Y−1}], where L_Y is the length of the action sequence feature of the action units of the standard action and y_i is a feature extracted from the key points of the standard action. The standard-action-scale elastic time window then needs to solve for S and n such that they satisfy formula (2).
Assuming that the similarity between the action unit sequence of the target action and the action unit sequence of the standard action is the highest, the action comparison system 100 can set F to the similarity between the two sequences and obtain finer segmentation points S_s by optimizing formula (2), where S_s includes n_s segmentation points.
In other possible implementations, the initial segmentation points are set as segments with gaps between adjacent segments, where n is the total number of segmentation points. The action sequence feature of the action units of the standard action is again Y = [y_0, y_1, y_2, …, y_{L_Y−1}], where L_Y is the length of the action sequence feature of the action units of the standard action and y_i is a feature extracted from the key points of the standard action. The standard-action-scale elastic time window then needs to solve for S and n such that they satisfy formula (3).
Assuming that the similarity between the action unit sequence of the target action and the action unit sequence of the standard action is the highest, the action comparison system 100 can set F to the similarity between the two sequences and obtain finer segmentation points S_t by optimizing formula (3), where S_t includes n_t segmentation points.
The action comparison system 100 may optimize formula (1), formula (2), or formula (3) above in a traversal manner to obtain the segmentation points. In some embodiments, the action comparison system 100 may also compute candidate segmentation points at each scale from the data distribution or the similarity and then adjust the candidate segmentation points to obtain the final segmentation points. In other embodiments, the action comparison system 100 may directly predict the segmentation points using a neural network: specifically, the action comparison system 100 inputs the action sequence features extracted from the target video into the neural network and obtains the segmentation points output by the neural network.
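A brute-force sketch of such a traversal, assuming a generic pairwise objective F in place of the exact objective of formulas (1) to (3), which is not reproduced here:

```python
import itertools
import numpy as np

def search_segmentation(feat, n_points, objective):
    """Traverse all candidate sets of n_points interior boundaries and keep the
    one maximizing the summed pairwise objective over adjacent segments
    (a generic stand-in for the optimization behind formulas (1)-(3);
    exhaustive search, so only practical for short sequences)."""
    T = len(feat)
    best, best_score = None, -np.inf
    for combo in itertools.combinations(range(1, T), n_points):
        bounds = (0, *combo, T)
        segs = [feat[a:b] for a, b in zip(bounds[:-1], bounds[1:])]
        score = sum(objective(segs[i], segs[i + 1]) for i in range(len(segs) - 1))
        if score > best_score:
            best, best_score = combo, score
    return best

# e.g. with the mmd sketch above: boundaries = search_segmentation(feat, 2, mmd)
```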
In some possible implementations, the action comparison system 100 can extract multi-scale action sequence features when extracting action sequence features. For example, the action comparison system 100 may input the target video into a feature extraction network, such as an attention network combining bottom-up and top-down pathways, to obtain a multi-scale feature sequence. The bottom-up network may be a convolutional neural network (CNN), and the top-down network may be a network including an upsampling layer and a convolutional layer (e.g., a 1×1 convolutional layer).
The action comparison system 100 can fuse the multi-scale action sequence features to obtain a fused action sequence feature X_F = F(X_{s1}, X_{s2}, X_{s3}, …, X_{sk}), where X_{sk} represents the feature at the k-th scale. The action comparison system 100 can then segment the target action segment from the fused feature X_F using a multi-scale elastic time window to obtain at least one target action, or further obtain at least one action unit.
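One possible fusion function F, sketched as temporal interpolation of each scale to the finest resolution followed by averaging (the specific choice is an assumption for illustration):

```python
import numpy as np

def fuse_multiscale(features):
    """Fuse features extracted at k scales (each of shape (T_k, D)) into one
    sequence X_F: interpolate every scale to the finest temporal resolution
    along the time axis, then average."""
    target_len = max(f.shape[0] for f in features)
    resized = []
    for f in features:
        idx = np.linspace(0, f.shape[0] - 1, target_len)
        resized.append(np.stack([np.interp(idx, np.arange(f.shape[0]), f[:, d])
                                 for d in range(f.shape[1])], axis=1))
    return np.mean(resized, axis=0)
```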
In some possible implementations, the action comparison system 100 may determine a segmentation result from the action sequence features at each scale and then fuse the per-scale segmentation results, for example by averaging, to obtain the final segmentation result. The action comparison system 100 can then perform action comparison according to the fused segmentation result, thereby improving the comparison accuracy.
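A small sketch of fusing per-scale segmentation results by averaging, assuming every scale predicts the same number of boundaries:

```python
import numpy as np

def fuse_segmentations(per_scale_points):
    """Average the segmentation points predicted at each scale."""
    pts = np.asarray(per_scale_points, dtype=float)   # shape (num_scales, n)
    return np.round(pts.mean(axis=0)).astype(int)

# fuse_segmentations([[30, 62, 95], [28, 60, 97], [31, 64, 93]]) -> [30, 62, 95]
```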
Referring to the schematic diagram of video segmentation shown in fig. 11, the action comparison system 100 may perform feature extraction on the target video and the standard video to obtain the action sequence features of each. It may then perform fine-grained segmentation of the target video and the standard video according to those features using multi-scale elastic time windows, such as the action-type-scale, action-unit-scale, and standard-action-scale elastic time windows, to obtain the action units of the target action and the action units of the standard action. The segmentation of the standard video may be performed in real time when the target action is compared, or in advance, for example when the action knowledge base is constructed.
When comparing the action units of the target action with the action units of the standard action, the action comparison system 100 may first align them. Specifically, the action comparison system 100 can align action units by aligning key points, where the key points may be joint points that affect the action. Different key points have different effects on action alignment, and, particularly when the speed varies, different key points change with speed to different degrees. Therefore, the action comparison system 100 can determine the joint point types that each type of action or action unit is concerned with, or set weights for those joint point types, in order to align the action units and obtain a better alignment effect.
In some possible implementations, referring to fig. 12, the action comparison system 100 may determine the joint point types used for the current action comparison through learning of action-adaptive features. For example, the joint point types used for comparing a yoga action (specifically, the chair-pose action unit) may be the shoulder joint, the hip joint, the knee joint, or other joint point types.
The action comparison system 100 aligns at least one action unit of the target action with at least one action unit of the standard action according to the target joint points of the target object performing the target action and the reference joint points of the reference object performing the standard action, where the target joint points and the reference joint points are joint points of the determined joint point type.
For example, the motion alignment system 100 can align the motion units of the target motion and the motion units of the standard motion according to the shoulder joint, the hip joint, and the knee joint of the target object (e.g., the student) and the shoulder joint, the hip joint, and the knee joint of the reference object (e.g., the trainer). Here, aligning the action unit of the target action and the action unit of the standard action means that at least one frame of the action unit of the target action corresponds to a frame in the action unit of the standard action, for example, the 2 nd frame of the action unit of the target action corresponds to the 1 st frame of the action unit of the standard action, and the 100 th frame of the action unit of the target action corresponds to the 50 th frame of the action unit of the standard action.
By aligning the action units, the action comparison system 100 can obtain the pose sequence of the target object and the pose sequence of the reference object, where the pose sequence of the target object may be characterized by the target joint points of the target object and the pose sequence of the reference object may be characterized by the reference joint points of the reference object.
It should be noted that the action comparison system 100 may align at least one action unit of the target action and at least one action unit of the standard action by finding a shortest path through dynamic programming. In some embodiments, the action comparison system 100 may align them in other ways, for example by clustering with a neural network.
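A sketch of such a shortest-path alignment, implemented as a DTW warping path with backtracking (one common, but not the only, dynamic-programming formulation):

```python
import numpy as np

def dtw_align(x: np.ndarray, y: np.ndarray):
    """Return the frame correspondence (warping path) between a target action
    unit x and a standard action unit y as a list of (target_frame, std_frame)."""
    n, m = len(x), len(y)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(x[i - 1] - y[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    # backtrack the shortest path from (n, m) to (0, 0)
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = int(np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]]))
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]
```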
Further, considering differences in action speed, the action comparison system 100 may perform speed correction on the pose sequence of the target object according to the pose sequence of the reference object. Still taking the chair-pose action unit in yoga as an example, assuming that the reference object in the standard video performs the action unit at 1× speed while the target object in the target video performs it at 0.5× speed, the action comparison system 100 may correct the pose sequence of the target object according to the pose sequence of the reference object, for example by downsampling the pose sequence of the target object according to the speed ratio.
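A minimal sketch of this speed correction, assuming nearest-frame resampling of the target pose sequence to the reference length:

```python
import numpy as np

def speed_correct(target_poses: np.ndarray, ref_len: int) -> np.ndarray:
    """Resample a pose sequence of shape (T, num_joints, dims) to ref_len frames,
    e.g. downsampling a half-speed performance to the reference length."""
    idx = np.linspace(0, target_poses.shape[0] - 1, ref_len).round().astype(int)
    return target_poses[idx]

# a 200-frame, 0.5x-speed target is reduced to the 100 frames of the reference:
# corrected = speed_correct(target_poses, ref_len=100)
```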
When the action comparison system 100 aligns the action units by finding the shortest path through dynamic programming, it may also perform speed correction on the pose sequence of the target object according to the pose sequence of the reference object by means of the shortest path.
Next, the action comparison system 100 can determine the comparison result according to the pose sequence of the target object and the pose sequence of the reference object. As shown in fig. 12, the action comparison system 100 may determine key indicators from the corrected pose sequence, such as the joint angles and joint distances corresponding to a plurality of target joint points, then determine from the key indicators whether an action unit is wrong, and thus determine the error position of the target action relative to the standard action. The error position can in particular be characterized by the joint at which the error occurred.
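A sketch of two such key indicators, a joint angle and a joint distance, computed from key-point coordinates (the error threshold in the final comment is an illustrative assumption):

```python
import numpy as np

def joint_angle(a, b, c) -> float:
    """Angle in degrees at joint b formed by key points a-b-c,
    e.g. the knee angle from hip, knee and ankle coordinates."""
    v1, v2 = np.asarray(a) - np.asarray(b), np.asarray(c) - np.asarray(b)
    cos = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2) + 1e-8)
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def joint_distance(a, b) -> float:
    """Euclidean distance between two key points, another possible key indicator."""
    return float(np.linalg.norm(np.asarray(a) - np.asarray(b)))

# e.g. flag an error when |joint_angle(hip, knee, ankle) - standard_angle| > 15 degrees
```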
S906: the action comparison system 100 presents the comparison results to the user.
The comparison result at least includes the error position of the target action relative to the standard action, where the error position is specifically the position of the erroneous local action in the target action, such as the elbow or the wrist. The action comparison system 100 may present the comparison result to the user so that the user can correct the target action accordingly. The action comparison system 100 may present the comparison result through a user interface such as a GUI, or through voice broadcast and the like. The following takes presenting the comparison result through a GUI as an example.
In some possible implementations, the action comparison system 100 may determine a frame in the target video where the target action is wrong with respect to the standard action, that is, the target frame, and then the action comparison system 100 may present the comparison result to the user through the target frame of the target video.
Referring specifically to fig. 6, the action comparison system 100 may present the comparison result to the user in the target frame 603 of the target video. As shown in FIG. 6, the action comparison system 100 can mark the error location of the target action relative to the standard action in the target frame through the mark box 604.
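A sketch of drawing such a mark box on the target frame with OpenCV; the box coordinates and the label text are illustrative assumptions, not values produced by the disclosed system:

```python
import cv2

def mark_error(frame, box, label="wrong elbow position"):
    """Draw a marker box around the erroneous local action in the target frame.
    box is (x, y, w, h) in pixels; frame is a BGR image as read by OpenCV."""
    x, y, w, h = box
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 0, 255), 2)
    cv2.putText(frame, label, (x, max(0, y - 8)),
                cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 255), 2)
    return frame
```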
Further, referring to fig. 7, the action comparison system 100 may present the comparison result to the user through the target frame 603 of the target video and the corresponding reference frame 601 in the standard video; specifically, it may mark the error position of the target action relative to the standard action through mark boxes 604 in the target frame 603 and in the reference frame 601.
In some possible implementations, the action comparison system 100 generates action knowledge from the standard video and the action correction suggestions. As shown in fig. 13, the action knowledge may include correspondences between error descriptions and correction suggestions, and may further include correspondences between error manifestations and key indicators. The action comparison system 100 stores the action knowledge in the action knowledge base.
After the action comparison system 100 determines the values of the key indicators from the pose sequence and detects an action error, it may match the correction suggestion corresponding to that action error from the action knowledge base and generate prompt information according to the correction suggestion. Referring to fig. 8, the action comparison system 100 may also present to the user prompt information for the error position, which is used to assist in correcting the target action.
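A sketch of matching a correction suggestion from the action knowledge base, reduced to a dictionary lookup; the error keys and suggestion texts are made-up placeholders:

```python
ACTION_KNOWLEDGE = {
    "knee_angle_too_large": "Bend the knees further and sink the hips lower.",
    "back_not_straight": "Keep the back straight and the chest lifted.",
}

def suggest_correction(error_key: str) -> str:
    """Look up the correction suggestion corresponding to a detected action error."""
    return ACTION_KNOWLEDGE.get(error_key, "No suggestion available for this error.")

# e.g. prompt = suggest_correction("knee_angle_too_large")
```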
In summary, the embodiments of the present application provide an action comparison method. In this method, the action comparison system 100 receives a target video in which a whole set of actions is recorded, determines a single target action from the whole set of actions recorded in the target video, compares the target action with the standard action recorded in the standard video, determines the error position of the target action relative to the standard action, and presents that error position to the user.
On the one hand, the method segments the whole set of actions recorded in the target video to obtain at least one target action and then compares the target action with the standard action recorded in the standard video. This addresses the low comparison accuracy and unreliable comparison results that arise when the whole set of actions recorded in the target video includes interfering actions, or consists of complex actions composed of multiple continuous actions whose combination is not fixed. On the other hand, the method can output the error position of the target action relative to the standard action, and the user can make targeted corrections according to that error position, thereby improving learning or training efficiency.
The action comparison method provided by the embodiment of the present application is described in detail above with reference to fig. 1 to 13, and the action comparison system 100 provided by the embodiment of the present application and the computing device for implementing the function of the action comparison system are described below with reference to the accompanying drawings.
Referring to fig. 3, an action comparison system 100 is provided in the embodiment of the present application, where the system 100 includes a unit for performing method steps corresponding to any one of the implementations in the foregoing method embodiments. Next, a structure of the action comparison system 100 will be described as an example.
Specifically, the action comparison system 100 includes a communication unit 122, a comparison unit 146, and a display unit 124. The communication unit 122 is configured to receive a target video, the comparison unit 146 is configured to compare a target action recorded by the target video with a standard action recorded by the standard video, and the display unit 124 is configured to present a comparison result to a user, where the comparison result includes an error position of the target action relative to the standard action, where the error position indicates a position of an erroneous local action in the target action.
In some possible implementations, the display unit 124 is specifically configured to:
and presenting a comparison result to a user through a target frame of the target video, wherein the target frame is a frame in the target video, in which the target action is wrong relative to the standard action.
In some possible implementations, the display unit 124 is further configured to:
presenting prompt information aiming at the error position to the user, wherein the prompt information is used for assisting in correcting the target action.
In some possible implementations, the system 100 further includes:
a dividing unit 144, configured to extract motion sequence features from a target video, determine motion unit dividing points according to the motion sequence features, and divide the target motion into at least one motion unit according to the motion unit dividing points;
the comparing unit 146 is specifically configured to:
and comparing the at least one action unit of the target action with the at least one action unit of the standard action.
In some possible implementations, the system 100 further includes:
the detection unit 143 is configured to perform motion detection on the target video. Therefore, the method can identify the non-action segments, the interference action segments and the action segments of other objects except the target object in the target video, thereby avoiding the interference of the non-action segments, the interference action segments and the action segments of other objects except the target object on the action comparison process and influencing the accuracy of the action comparison.
In some possible implementations, the segmentation unit 144 is specifically configured to:
determining an initial segmentation point according to the action sequence characteristics extracted from the target video;
and adjusting the initial segmentation point to obtain the action unit segmentation point according to the similarity between the action sequence characteristics of the action unit obtained by segmenting the target action by the initial segmentation point and the action sequence characteristics of the action unit of the standard action.
In some possible implementations, the comparison unit 146 is specifically configured to:
determining the joint point type adopted by comparison;
according to a target joint point of a target object executing the target action and a reference joint point of a reference object executing the standard action, aligning at least one action unit of the target action and at least one action unit of the standard action to obtain a posture sequence of the target object and a posture sequence of the reference object, wherein the target joint point and the reference joint point are joint points of the joint point type.
In some possible implementations, the comparison unit 146 is specifically configured to:
carrying out speed correction on the posture sequence of the target object according to the posture sequence of the reference object;
and determining a comparison result according to the corrected posture sequence of the target object and the posture sequence of the reference object.
In some possible implementations, the system 100 further includes:
the action knowledge base 148 stores prompt information corresponding to the error location of the target action relative to the standard action. The prompt may be a corrective suggestion.
In this way, the communication unit 122 may retrieve the prompt information from the action knowledge base 148, and the display unit 124 may then present the prompt information to the user.
It should be noted that fig. 3 shows only one exemplary functional division of the action comparison system 100. In other possible implementations of the embodiments of the present application, the action comparison system 100 may divide the functional modules in other manners. For example, when the action comparison system 100 is deployed in a single device, such as a terminal, the action comparison system 100 may not include the communication unit 122.
The functionality of the above-described action comparison system 100 may be implemented by a computing device, such as a single computing device or a computing cluster formed by a plurality of computing devices. The following detailed description is made with reference to the accompanying drawings.
Fig. 14 provides a computing device. As shown in fig. 14, the computing device 1400 may be used to implement the functionality of the action comparison system 100 in the embodiment shown in fig. 3 and described above. The computing device 1400 includes a bus 1401, a processor 1402, a display 1403, and a memory 1404; the processor 1402, the memory 1404, and the display 1403 communicate via the bus 1401.
The bus 1401 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in FIG. 14, but this is not intended to represent only one bus or type of bus.
The processor 1402 may be any one or more of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a Micro Processor (MP), a Digital Signal Processor (DSP), and the like.
The display 1403 is an input/output (I/O) device that can display electronic documents such as images and text on a screen for a user to view. According to the manufacturing material, displays may be classified into liquid crystal displays (LCD), organic light-emitting diode (OLED) displays, and the like. Specifically, the display 1403 may display images through the GUI, receive the target video through the GUI, or present the comparison result and the like to the user through the GUI.
The memory 1404 may include a volatile memory (volatile memory), such as a Random Access Memory (RAM). The memory 1404 may also include a non-volatile memory (non-volatile memory), such as a read-only memory (ROM), a flash memory, a Hard Disk Drive (HDD), or a Solid State Drive (SSD).
The memory 1404 stores executable program code, and the processor 1402 executes the executable program code to perform the action comparison method. Specifically, the display 1403 receives the target video through the GUI and transmits the target video to the processor 1402 through the bus 1401; the processor 1402 executes the program code, compares the target action recorded in the target video with the standard action recorded in the standard video to obtain a comparison result, and transmits the comparison result to the display 1403 through the bus 1401. The display 1403 presents to the user through the GUI the comparison result including the error position of the target action relative to the standard action, the error position indicating the position of the erroneous local action in the target action.
The embodiments of the present application also provide a computer-readable storage medium. The computer-readable storage medium may be any available medium that a computing device can access, or a data storage device such as a data center containing one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, hard disk, or magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a solid state drive), among others. The computer-readable storage medium includes instructions that instruct a computing device to perform the above-described action comparison method applied to the action comparison system.
The embodiments of the present application also provide a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computing device, the processes or functions described in accordance with the embodiments of the present application are produced in whole or in part.
The computer instructions may be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another, for example, the computer instructions may be transmitted from one website site, computer, or data center to another website site, computer, or data center by wire (e.g., coaxial cable, fiber optic, Digital Subscriber Line (DSL)) or wirelessly (e.g., infrared, wireless, microwave, etc.).
The computer program product may be a software installation package, which may be downloaded and executed on a computing device in case any of the aforementioned action comparison methods needs to be used.
The description of the flow or structure corresponding to each of the above drawings has emphasis, and a part not described in detail in a certain flow or structure may refer to the related description of other flows or structures.
Claims (16)
1. A method for motion comparison, the method comprising:
receiving a target video;
comparing the target action of the target video record with the standard action of the standard video record;
and presenting a comparison result to a user, wherein the comparison result comprises an error position of the target action relative to the standard action, and the error position indicates the position of an error local action in the target action.
2. The method of claim 1, wherein presenting the comparison results to the user comprises:
and presenting a comparison result to a user through a target frame of the target video, wherein the target frame is a frame in the target video, in which the target action is wrong relative to the standard action.
3. The method according to claim 1 or 2, characterized in that the method further comprises:
presenting prompt information for the error location to the user, the prompt information being used to assist in correcting the target action.
4. The method of any one of claims 1 to 3, wherein comparing the target action of the target video recording with a standard action of a standard video recording comprises:
extracting action sequence features from the target video;
determining action unit segmentation points according to the action sequence characteristics;
dividing the target action into at least one action unit according to the action unit dividing point;
and comparing the at least one action unit of the target action with the at least one action unit of the standard action.
5. The method of claim 4, wherein determining action unit segmentation points based on the action sequence features comprises:
determining an initial segmentation point according to the action sequence characteristics extracted from the target video;
and adjusting the initial segmentation point to obtain the action unit segmentation point according to the similarity between the action sequence characteristics of the action unit obtained by segmenting the target action by the initial segmentation point and the action sequence characteristics of the action unit of the standard action.
6. The method of claim 4 or 5, wherein said comparing at least one action unit of the target action with at least one action unit of the standard action comprises:
determining the joint point type adopted by comparison;
aligning at least one action unit of the target action and at least one action unit of the standard action according to a target joint point of a target object executing the target action and a reference joint point of a reference object executing the standard action to obtain a posture sequence of the target object and a posture sequence of the reference object, wherein the target joint point and the reference joint point are joint points belonging to the joint point type;
and determining a comparison result according to the posture sequence of the target object and the posture sequence of the reference object.
7. The method of claim 6, further comprising:
and carrying out speed correction on the posture sequence of the target object according to the posture sequence of the reference object.
8. A motion alignment system, comprising:
a communication unit for receiving a target video;
the comparison unit is used for comparing the target action of the target video record with the standard action of the standard video record;
the display unit is used for presenting a comparison result to a user, wherein the comparison result comprises an error position of the target action relative to the standard action, and the error position indicates the position of an error local action in the target action.
9. The system of claim 8, wherein the display unit is specifically configured to:
and presenting a comparison result to a user through a target frame of the target video, wherein the target frame is a frame in the target video, in which the target action is wrong relative to the standard action.
10. The system of claim 8 or 9, wherein the display unit is further configured to:
presenting prompt information for the error location to the user, the prompt information being used to assist in correcting the target action.
11. The system of any one of claims 8 to 10, further comprising:
the segmentation unit is used for extracting action sequence characteristics from the target video, determining action unit segmentation points according to the action sequence characteristics, and segmenting the target action into at least one action unit according to the action unit segmentation points;
the comparison unit is specifically configured to:
and comparing the at least one action unit of the target action with the at least one action unit of the standard action.
12. The system according to claim 11, wherein the segmentation unit is specifically configured to:
determining an initial segmentation point according to the action sequence characteristics extracted from the target video;
and adjusting the initial segmentation point to obtain the action unit segmentation point according to the similarity between the action sequence characteristics of the action unit obtained by segmenting the target action by the initial segmentation point and the action sequence characteristics of the action unit of the standard action.
13. The system according to claim 11 or 12, wherein the alignment unit is specifically configured to:
determining the joint point type adopted by comparison;
according to a target joint point of a target object executing the target action and a reference joint point of a reference object executing the standard action, aligning at least one action unit of the target action and at least one action unit of the standard action to obtain a posture sequence of the target object and a posture sequence of the reference object, wherein the target joint point and the reference joint point are joint points belonging to the joint point type.
14. The system of claim 13, wherein the alignment unit is specifically configured to:
carrying out speed correction on the posture sequence of the target object according to the posture sequence of the reference object;
and determining a comparison result according to the corrected posture sequence of the target object and the posture sequence of the reference object.
15. An apparatus, comprising a processor and a memory;
the processor is to execute instructions stored in the memory to cause the device to perform the method of any of claims 1-7.
16. A computer-readable storage medium comprising instructions that direct a device to perform the method of any of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---
CN202011268929.4A | 2020-11-13 | 2020-11-13 | Action comparison method, system, equipment and medium
Publications (1)
Publication Number | Publication Date |
---|---|
CN114495254A (en) | 2022-05-13